Description
Currently, Readability.js extracts structured data according to certain schema.org vocabularies; however, it only recognizes the JSON-LD syntax when doing so.
For comparison:
- Trafilatura understands schema.org RDFa, Dublin Core RDFa, JSON-LD, microformats, and microdata
- Readability understands microformats and JSON-LD
- DOM-Distiller understands microdata, legacy microformats, and IE Reading View markup tags
- mercury-parser understands legacy microformats, IE markup, Dublin Core RDFa, and microdata
Supporting microdata would give authors a single syntax that works across all three major open-source article extractors and across various search indexes (Google, Bing, Yandex).
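To make the request more concrete, here is a rough sketch of how microdata items could be read from a DOM tree. It is not taken from Readability.js: the names `extractMicrodata`, `readItem`, `propertyValue`, and `URL_ATTRIBUTES` are hypothetical. It assumes only the standard `children`, `tagName`, `textContent`, `hasAttribute`, and `getAttribute` members of DOM elements, follows a simplified version of the HTML microdata property-value rules, and ignores `itemref` and `itemid` entirely.

```javascript
// Hypothetical sketch, not part of Readability.js.
// Elements whose property value comes from a URL-bearing attribute.
const URL_ATTRIBUTES = new Map([
  ["a", "href"], ["area", "href"], ["link", "href"],
  ["img", "src"], ["audio", "src"], ["video", "src"], ["source", "src"],
  ["embed", "src"], ["iframe", "src"], ["track", "src"],
  ["object", "data"],
]);

// Simplified property-value rules: nested items become objects,
// everything else becomes a string.
function propertyValue(el) {
  const tag = el.tagName.toLowerCase();
  if (el.hasAttribute("itemscope")) return readItem(el);
  if (tag === "meta") return el.getAttribute("content") || "";
  if (URL_ATTRIBUTES.has(tag)) return el.getAttribute(URL_ATTRIBUTES.get(tag)) || "";
  if (tag === "data" || tag === "meter") return el.getAttribute("value") || "";
  if (tag === "time" && el.hasAttribute("datetime")) return el.getAttribute("datetime");
  return el.textContent.trim();
}

// Collect the properties of one item (an element with `itemscope`).
function readItem(itemEl) {
  const item = {
    type: itemEl.getAttribute("itemtype"),
    properties: Object.create(null),
  };
  const visit = (parent) => {
    for (const child of parent.children) {
      if (child.hasAttribute("itemprop")) {
        const value = propertyValue(child);
        const names = child.getAttribute("itemprop").split(/\s+/).filter(Boolean);
        for (const name of names) {
          if (!item.properties[name]) item.properties[name] = [];
          item.properties[name].push(value);
        }
      }
      // Properties inside a nested item belong to that item, so stop there.
      if (!child.hasAttribute("itemscope")) visit(child);
    }
  };
  visit(itemEl);
  return item;
}

// Return every top-level item found below `root`.
function extractMicrodata(root) {
  const items = [];
  const visit = (parent) => {
    for (const child of parent.children) {
      if (child.hasAttribute("itemscope") && !child.hasAttribute("itemprop")) {
        items.push(readItem(child));
      } else {
        visit(child);
      }
    }
  };
  visit(root);
  return items;
}
```

In a browser, or with a DOM implementation such as jsdom, calling `extractMicrodata(document.body)` on a page marked up with `itemscope itemtype="https://schema.org/Article"` would yield one item whose `properties.headline`, `properties.datePublished`, and so on could then be mapped onto the metadata fields that the JSON-LD path already fills. That mapping is left out here because it depends on decisions the maintainers would need to make.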