Open
Description
Many Blacklight applications have been struggling with bot traffic. The default robots.txt created by the Rails generator contains only a comment: # See http://www.robotstxt.org/robotstxt.html for documentation on how to use the robots.txt file.
We should add some Blacklight-specific configuration to make it easier for new Blacklight apps to manage bot traffic.
@tpendragon and @jrochkind found some configuration that helps keep bots from crawling every facet combination, which hammers Solr with complex, nonsensical queries.
Suggestion:
# See https://www.robotstxt.org/robotstxt.html for documentation on how to use the robots.txt file
#
# To ban all spiders from the entire site uncomment the next two lines:
# User-agent: *
# Disallow: /
User-agent: *
# Disallow crawling of facet-filtered URLs. It only slows discovery of useful resources; bots should page through results instead.
Disallow: /catalog*f[
Disallow: /catalog*f%5B
# Bots can't log in.
Disallow: /users
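To make the effect of the wildcard rules above easier to reason about, here is a minimal Ruby sketch of how a crawler that supports `*` wildcards might match them. The helper names (`rule_to_regexp`, `disallowed?`) and the sample URLs are hypothetical and are not part of Blacklight or Rails. Real crawlers may also normalize percent-encoding differently, which is presumably why both the `f[` and `f%5B` forms are listed.

```ruby
# Hypothetical helpers for illustration only; not a Blacklight or Rails API.
# Assumes prefix matching where "*" matches any run of characters and a
# trailing "$" anchors the end of the URL.
DISALLOW_RULES = ["/catalog*f[", "/catalog*f%5B", "/users"].freeze

# Convert one Disallow pattern into an anchored regular expression.
def rule_to_regexp(rule)
  anchored = rule.end_with?("$")
  body = rule.chomp("$").split("*", -1).map { |part| Regexp.escape(part) }.join(".*")
  Regexp.new("\\A#{body}#{anchored ? '\\z' : ''}")
end

# True if any rule matches the given path plus query string.
def disallowed?(path_and_query)
  DISALLOW_RULES.any? { |rule| rule_to_regexp(rule).match?(path_and_query) }
end

puts disallowed?("/catalog?f[format][]=Book")         # facet filter, raw brackets
puts disallowed?("/catalog?f%5Bformat%5D%5B%5D=Book") # facet filter, encoded brackets
puts disallowed?("/catalog?q=maps&page=2")            # plain paged search
puts disallowed?("/users/sign_in")                    # login page
```

Under these assumptions, the facet-filtered and login URLs are blocked, while plain search and pagination URLs remain crawlable.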
Acceptance criteria
- Generating a new Blacklight application updates the default public/robots.txt to include paths that bots generally should not crawl in Blacklight applications.