rss-fetcher output includes URLs that story-indexer regards as "non-news", both simple domain names (archive.org) and subdomains (xyz.iheart.com):
2024-08-16 18:17:28,180 c9a6a33e93c1 rss-puller INFO: non-news: http://archive.org/details/dlibra.bibliotekaelblaska.pl.92649-2.30732645
2024-08-16 18:17:26,732 c9a6a33e93c1 rss-puller INFO: non-news: https://kentuckynewsnetwork.iheart.com/content/2024-08-16-18-year-old-teen-cowboy-ace-patton-ashford-killed-in-freak-accident/
2024-08-16 18:17:24,066 c9a6a33e93c1 rss-puller INFO: non-news: https://knrs.iheart.com/content/2024-08-16-new-poll-shows-where-harris-trump-stand-in-crucial-swing-state/
2024-08-16 18:17:23,563 c9a6a33e93c1 rss-puller INFO: non-news: https://buckeyecountry105.iheart.com/content/2024-08-16-new-poll-shows-where-harris-trump-stand-in-crucial-swing-state/
2024-08-16 18:17:19,856 c9a6a33e93c1 rss-puller INFO: non-news: https://wgy.iheart.com/content/2024-08-16-boebert-bikini-photo-supporting-colleague-reveals-massive-secret-tattoo/
story-indexer has a non_news_fqdn function for this. mediacloud/metadata-lib#91 is a request to move that function to mc_metadata.
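For illustration, here is a minimal sketch of the kind of check being discussed, matching both a bare domain (archive.org) and any of its subdomains (xyz.iheart.com). This is not the story-indexer implementation: the names `NON_NEWS_DOMAINS`, `is_non_news_fqdn`, and `is_non_news_url` are hypothetical, and the domain set contains only the two examples from this issue.

```python
from urllib.parse import urlsplit

# Hypothetical list for illustration only: just the two domains named in
# this issue. The real list is maintained in story-indexer.
NON_NEWS_DOMAINS = {"archive.org", "iheart.com"}


def is_non_news_fqdn(fqdn: str) -> bool:
    """Return True if fqdn is a listed domain or a subdomain of one."""
    fqdn = fqdn.lower().rstrip(".")
    # Match on a label boundary so "notarchive.org" does not match "archive.org".
    return any(fqdn == d or fqdn.endswith("." + d) for d in NON_NEWS_DOMAINS)


def is_non_news_url(url: str) -> bool:
    """Extract the host from a URL and apply the FQDN check."""
    host = urlsplit(url).hostname  # lowercased host, or None if absent
    return host is not None and is_non_news_fqdn(host)
```

With a check like this, rss-fetcher could skip such URLs before they are written to its output, instead of leaving them for story-indexer to discard.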
Code is in mediacloud/metadata-lib#93