This repository serves as a starting point for RSS/Atom feed scraping solutions:
- Scripts fixing broken feeds
- Scripts scraping websites without feeds
- Scripts augmenting feeds without good content
While this repo hosts a few scripts and examples to build upon, it mostly provides links to existing solutions.
This list focuses on simple, (almost) zero-setup solutions!
The examples folder provides a few scripts that illustrate how to write a scraper in different scripting languages. It focuses on languages that run out of the box on all Linux distributions, so you can use those scrapers with feed readers that support running scripts as sources, like Liferea and SnowNews.
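
To give an idea of what such a scraper can look like, here is a minimal sketch in Python using only the standard library. It is not one of the scripts from the examples folder. It assumes a hypothetical page layout where each headline is a link inside an `<h2>` element; the names `LinkCollector` and `html_to_rss` are made up for this illustration. The feed is printed to stdout so a feed reader that runs scripts as sources can read it.

```python
#!/usr/bin/env python3
"""Scraper sketch: turn headline links of an HTML page into an RSS 2.0 feed."""
import sys
import urllib.request
from html.parser import HTMLParser
from urllib.parse import urljoin
from xml.etree import ElementTree as ET


class LinkCollector(HTMLParser):
    """Collect (title, href) pairs from <a> tags found inside <h2> headings."""

    def __init__(self):
        super().__init__()
        self.in_heading = False
        self.href = None   # href of the link currently being read, if any
        self.text = []     # text fragments of the current link
        self.items = []    # collected (title, href) pairs

    def handle_starttag(self, tag, attrs):
        if tag == "h2":
            self.in_heading = True
        elif tag == "a" and self.in_heading:
            self.href = dict(attrs).get("href")
            self.text = []

    def handle_data(self, data):
        if self.href is not None:
            self.text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self.href is not None:
            self.items.append(("".join(self.text).strip(), self.href))
            self.href = None
        elif tag == "h2":
            self.in_heading = False


def html_to_rss(html, base_url, feed_title):
    """Return an RSS 2.0 document (as a string) for the headlines in html."""
    parser = LinkCollector()
    parser.feed(html)
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    ET.SubElement(channel, "title").text = feed_title
    ET.SubElement(channel, "link").text = base_url
    ET.SubElement(channel, "description").text = "Scraped from " + base_url
    for title, href in parser.items:
        item = ET.SubElement(channel, "item")
        ET.SubElement(item, "title").text = title
        # Resolve relative links against the page URL.
        ET.SubElement(item, "link").text = urljoin(base_url, href)
    return ET.tostring(rss, encoding="unicode")


# Only fetch a page when a URL is passed on the command line.
if __name__ == "__main__" and len(sys.argv) > 1:
    url = sys.argv[1]
    with urllib.request.urlopen(url) as response:
        page = response.read().decode("utf-8", errors="replace")
    print(html_to_rss(page, url, "Scraped feed"))
```

Saved as an executable file, the script takes the page URL as its only argument. For a real site, the matching logic in `LinkCollector` has to be adapted to that site's markup.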
This is a list of simple scripts you can run yourself locally (or in the cloud).
Tool | Input | Extraction | Output | Details |
---|---|---|---|---|
sjehuda/html2atom | HTML | XPath | Atom | Python script |
h43z/rssify | HTML | CSS selectors | RSS | Python script |
MitchellMcKenna/twitter-rss-google-apps-script | Twitter | auto | RSS | Google Apps hosted script for Twitter API callback |
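
The Input, Extraction and Output columns describe a small pipeline: read a document, select items with an expression, and serialize them as a feed. The following sketch illustrates that pipeline for the XPath and Atom combination. It is not the code of any tool listed above. It assumes the input is well-formed XHTML, because the standard library's `ElementTree` only parses XML and supports a limited XPath subset; real-world HTML usually needs a more tolerant parser. The function name `xhtml_to_atom` and its parameters are hypothetical.

```python
from datetime import datetime, timezone
from urllib.parse import urljoin
from xml.etree import ElementTree as ET

ATOM_NS = "http://www.w3.org/2005/Atom"


def xhtml_to_atom(xhtml, feed_url, feed_title, link_xpath, updated=None):
    """Select <a> elements via link_xpath and return an Atom feed as a string."""
    # Atom requires an "updated" timestamp; default to the current time in UTC.
    updated = updated or datetime.now(timezone.utc).isoformat(timespec="seconds")
    # Serialize Atom elements without a namespace prefix.
    ET.register_namespace("", ATOM_NS)

    def atom(parent, name, text=None, **attrib):
        element = ET.SubElement(parent, f"{{{ATOM_NS}}}{name}", attrib)
        element.text = text
        return element

    feed = ET.Element(f"{{{ATOM_NS}}}feed")
    atom(feed, "title", feed_title)
    atom(feed, "id", feed_url)
    atom(feed, "updated", updated)

    root = ET.fromstring(xhtml)
    for anchor in root.findall(link_xpath):
        href = urljoin(feed_url, anchor.get("href", ""))
        entry = atom(feed, "entry")
        atom(entry, "title", "".join(anchor.itertext()).strip())
        atom(entry, "link", href=href)
        atom(entry, "id", href)  # the item URL doubles as the entry id
        atom(entry, "updated", updated)
    return ET.tostring(feed, encoding="unicode")
```

A call such as `xhtml_to_atom(page, "https://example.org/", "Demo", ".//div[@class='post']/a")` would pick every link that is a direct child of a `div` with class `post`; the URL and the class name are placeholders.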
These are third-party services, usually provided by companies that offer subscriptions. The list is roughly ordered by usefulness and simplicity of the services. When using free plans, consider your privacy!
Tool | Input | Extraction | Output | Sign Up | Details |
---|---|---|---|---|---|
rsshub.app | Many social networks | auto | RSS | no | Simple link syntax e.g. https://rsshub.app/<service>/user/<user name> |
nitter.net | Twitter | auto | RSS | no | Simple link syntax https://nitter.net/<twitter username>/rss |
feed43.com | Any website | string pattern | RSS | no | Free for non-commercial use. Lets you specify patterns to extract |
fivefilters.org | Any website | CSS selectors | RSS | no | Returns only 5 most recent items per feed |
RSS.app | Many social networks | auto | RSS | yes | |
fetchrss | Any website | visual assistant | RSS | yes | 4 feeds are free |
Google Search | Google Search | API Query | RSS/Atom | yes | 100 requests per day, API key necessary |
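
The link syntaxes quoted in the table can be filled in programmatically. The helpers below are a small sketch based only on the two URL patterns shown above; the function names are hypothetical, and the rsshub.app pattern is just the example form from the table, so individual services may use different routes.

```python
from urllib.parse import quote


def rsshub_feed_url(service, user):
    """Build a feed URL following https://rsshub.app/<service>/user/<user name>."""
    return f"https://rsshub.app/{quote(service, safe='')}/user/{quote(user, safe='')}"


def nitter_feed_url(username):
    """Build a feed URL following https://nitter.net/<twitter username>/rss."""
    # Accept handles written with a leading "@".
    return f"https://nitter.net/{quote(username.lstrip('@'), safe='')}/rss"
```

The resulting URL can be added to a feed reader like any other subscription. Percent-encoding the path segments keeps user names with unusual characters from breaking the link.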
If you find a service or link that is broken or missing, please create a PR!