Many companies scrape or crawl webpages to collect data, often without identifying themselves, so a mechanism is needed that addresses more than just search engine crawler bots and goes beyond robots.txt. The proposal here is to add a new ‘noml’ value to the already-existing robots meta tag and X-Robots-Tag header.
This can be simply expressed for HTML pages using:
<meta name="robots" content="noml">
and for non-HTML using:
X-Robots-Tag: noml
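
As a rough illustration of how a well-behaved crawler might honour this, here is a minimal Python sketch. It assumes 'noml' appears as one of the comma-separated directives in either the X-Robots-Tag response header or a robots meta tag. The function name `opts_out_of_ml` and the helper names are hypothetical and not part of the proposal, and the sketch ignores user-agent-scoped header values such as `somebot: noml`.

```python
from html.parser import HTMLParser


def _split_directives(value):
    """Split a comma-separated directive string into lowercase tokens."""
    return {part.strip().lower() for part in value.split(",") if part.strip()}


class _RobotsMetaParser(HTMLParser):
    """Collects directives from <meta name="robots" content="..."> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        # Attribute values can be None when an attribute has no value.
        attr_map = {name.lower(): (value or "") for name, value in attrs}
        if attr_map.get("name", "").strip().lower() == "robots":
            self.directives.update(_split_directives(attr_map.get("content", "")))


def opts_out_of_ml(headers, html=None):
    """Return True if the response carries a 'noml' directive (hypothetical helper).

    headers: mapping of response header names to values.
    html: optional page body to scan for a robots meta tag.
    """
    for name, value in headers.items():
        if name.lower() == "x-robots-tag" and "noml" in _split_directives(value):
            return True
    if html:
        parser = _RobotsMetaParser()
        parser.feed(html)
        return "noml" in parser.directives
    return False
```

A crawler collecting training data could call `opts_out_of_ml(response_headers, body)` before storing a page and skip the page when it returns `True`.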
Full details of the NoML proposal are given in this Open Letter, which so far has 5 signatories that offer search engines and/or proxies and/or AI search.
Obviously this is directly relevant to at least section 4.5.1: Comparison with search engines, so it might be added as a bullet point in the orange box as follows:
- NoML proposes an opt-out mechanism to supplement the existing robots.txt mechanism and thus address the new challenges faced by creators in the age of AI systems. The open letter is co-signed by individuals and several relevant companies.