robots.txt should be served as content-type text/plain #3556

Closed
t83714 opened this issue Aug 22, 2024 · 1 comment
t83714 commented Aug 22, 2024

robots.txt should be served as content-type text/plain

We currently serve a robots.txt file that instructs search engine bots to:

  • use the supplied sitemap entry point
  • avoid indexing certain pages (e.g. the search results page)

The response currently carries the content-type header text/html, which might prevent certain search bots from recognising this file.
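
As a minimal sketch of the intended behaviour (not the project's actual server code), the response for robots.txt would carry a `text/plain` content type instead of `text/html`. The `SimpleResponse` type and the `robotsTxtResponse` helper below are hypothetical names used only for illustration, and the `charset=utf-8` suffix is an assumption.

```typescript
// Hypothetical, framework-free model of an HTTP response.
interface SimpleResponse {
    status: number;
    headers: Record<string, string>;
    body: string;
}

// Wraps robots.txt content in a response that declares it as plain text,
// so crawlers do not receive it labelled as an HTML page.
function robotsTxtResponse(body: string): SimpleResponse {
    return {
        status: 200,
        headers: { "Content-Type": "text/plain; charset=utf-8" },
        body
    };
}

// Example content only; the real rules are not shown in this issue.
const response = robotsTxtResponse("User-agent: *\nDisallow: /search\n");
```

In an Express-style server the same effect is usually achieved by setting the response type to `text/plain` before sending the body; whether that matches this project's setup is an assumption.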

We will also add instructions to prevent URLs with a `q` query parameter from being indexed. Although some existing instructions already cover this by URL path, the additional rule might help to secure this logic further.
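
A rough sketch of what such a robots.txt could look like when generated in code is shown here. The function name `buildRobotsTxt`, the example `/search` path, the example sitemap URL, and the exact `Disallow` patterns are all assumptions for illustration; the rules actually added by the fix are not quoted in this issue. The `*` wildcard used in the patterns is part of the Robots Exclusion Protocol, although individual crawlers may differ in how they handle it.

```typescript
// Illustrative sketch only: names, paths and rule patterns are assumptions,
// not the project's actual robots.txt.
function buildRobotsTxt(sitemapUrl: string, disallowedPaths: string[]): string {
    const lines: string[] = ["User-agent: *"];

    // Path-based rules, e.g. keeping the search results page out of the index.
    for (const path of disallowedPaths) {
        lines.push(`Disallow: ${path}`);
    }

    // Query-based rules: match a `q` parameter whether it appears first
    // (`?q=`) or after another parameter (`&q=`).
    lines.push("Disallow: /*?q=");
    lines.push("Disallow: /*&q=");

    // Blank line, then the sitemap entry point.
    lines.push("");
    lines.push(`Sitemap: ${sitemapUrl}`);

    return lines.join("\n") + "\n";
}

const robotsTxt = buildRobotsTxt("https://example.com/sitemap.xml", ["/search"]);
```

Keeping the path rule and the query rule side by side reflects the point above: the two overlap, and the query rule acts as a second safeguard.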

t83714 added a commit that referenced this issue Aug 22, 2024
- Added rules to prevent any URL with a `q` query parameter from being indexed
- Adjusted the sitemap URL for deployments where the default UI is not served at the root path `/`
@t83714 t83714 removed the v4.2.2 label Aug 25, 2024
@t83714 t83714 modified the milestone: v4.2.3 Aug 25, 2024
@t83714 t83714 self-assigned this Aug 25, 2024

t83714 commented Sep 16, 2024

Closed via PR #3561.

@t83714 t83714 closed this as completed Sep 16, 2024