Open
Description
Sometimes we have URLs that are canonicalized to other pages, and these should not be included in the sitemap. See Google's reference: https://developers.google.com/search/docs/advanced/sitemaps/build-sitemap
So the logic would be to look for a canonical tag (`<link rel="canonical">`) in each crawled page and check whether its URL matches the crawled URL. If it does not, then do not include that page in the sitemap.
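
Here is a rough sketch of the check I have in mind, using only the standard library. The names `CanonicalFinder`, `normalize`, and `should_include` are my own placeholders and not anything from this project's code. Ignoring the fragment and a trailing slash when comparing is also just my assumption about what should count as a "match".

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urldefrag


class CanonicalFinder(HTMLParser):
    """Collects the href of the first <link rel="canonical"> tag (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.canonical = None

    def handle_starttag(self, tag, attrs):
        # Only the first canonical link is kept; later ones are ignored.
        if tag != "link" or self.canonical is not None:
            return
        attr = dict(attrs)
        rel = (attr.get("rel") or "").lower().split()
        if "canonical" in rel and attr.get("href"):
            self.canonical = attr["href"]


def normalize(url):
    # Assumption: a fragment or trailing slash alone should not count as a mismatch.
    url, _fragment = urldefrag(url)
    return url.rstrip("/")


def should_include(page_url, html):
    """Return True if the page has no canonical tag or is canonical to itself."""
    finder = CanonicalFinder()
    finder.feed(html)
    if finder.canonical is None:
        return True
    # Resolve relative canonical hrefs against the crawled URL before comparing.
    canonical = urljoin(page_url, finder.canonical)
    return normalize(canonical) == normalize(page_url)
```

The crawler could then call `should_include(url, page_source)` before adding `url` to the sitemap. Where exactly that call belongs in the existing code is something I have not worked out yet.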
I'm working on updating your code myself to include this but I'm still new to Python.