Contains the title of the article.
Contains the content of the article. It may contain HTML tags, depending on the website or on how the content was parsed.
The URL of the article. Opening this URL should show the original article.
Contains the publication date of the article.
Unlike pubDate, it contains the date, in milliseconds since the Unix epoch, when the article was scraped.
The article's attachments. It also contains the URLs extracted from the content.
Categories that the article belongs to. The categories assigned in the source file are also added here.
The thumbnail's URL, if there is one.
Extra information about the article that does not fit any of the fields above.
The name of the source that was used to scrape the article.
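
The fields described above could be modelled roughly as follows. This is only a sketch in Python: apart from pubDate, every field name here (title, content, url, scraped_at_ms, attachments, categories, thumbnail, extra, source) is a hypothetical placeholder, and the types, including attachments as a list of URL strings, are assumptions rather than the actual schema.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Any, Dict, List, Optional


@dataclass
class Article:
    # Field names other than pubDate are hypothetical; the types are assumed.
    title: str
    content: str                       # may contain HTML tags
    url: str                           # link to the original article
    pubDate: Optional[str] = None      # publication date as given by the source
    scraped_at_ms: int = 0             # scrape time, milliseconds since the Unix epoch
    attachments: List[str] = field(default_factory=list)   # includes URLs found in the content
    categories: List[str] = field(default_factory=list)    # includes source-file categories
    thumbnail: Optional[str] = None    # thumbnail URL, if any
    extra: Dict[str, Any] = field(default_factory=dict)    # anything that fits no other field
    source: str = ""                   # name of the source that was scraped

    def scraped_at(self) -> datetime:
        """Convert the scrape timestamp from epoch milliseconds to a UTC datetime."""
        return datetime.fromtimestamp(self.scraped_at_ms / 1000, tz=timezone.utc)


# Illustrative values only; none of them come from a real feed.
example = Article(
    title="Example title",
    content="<p>Example body</p>",
    url="https://example.com/article",
    pubDate="2021-01-01T00:00:00Z",
    scraped_at_ms=1_609_459_200_000,
    source="Example Source",
)
```

Calling example.scraped_at() divides the millisecond value by 1000 before converting it, which is the main practical difference between the scrape timestamp and pubDate.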