
Katana does not exclude js and css links that carry query parameters from crawling #987

Open

Description

@CatDrinkCoffee

Normally, the crawler will not request js and css pages again, but when I used the -sb flag to watch the browser-based crawling process, I found that Katana actually has the following problem.

For example: .js and .css URLs are not visited during crawling, but if a URL carries query parameters, such as .js?ver=1.1, the crawler will visit that page anyway. This can produce a huge number of crawler requests, since many pages now append a version parameter to their js links. I think this is a defect and hope it can be fixed. Thank you.
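For what it's worth, this looks like the classic pitfall of matching extensions against the raw URL string: `path.Ext("app.js?ver=1.1")` returns `.js?ver=1.1`, which never matches a `.js` deny list. Below is a minimal Go sketch of the likely fix (the function name and deny list here are hypothetical, not Katana's actual filtering code): parse the URL first and check the extension of the path only, so the query string cannot hide the extension.

```go
package main

import (
	"fmt"
	"net/url"
	"path"
	"strings"
)

// hasDeniedExtension reports whether the URL's path ends with a
// filtered extension. Parsing with net/url first means a query
// string such as ?ver=1.1 no longer hides the .js extension;
// a naive path.Ext on the raw string would return ".js?ver=1.1".
func hasDeniedExtension(rawURL string, denied []string) bool {
	u, err := url.Parse(rawURL)
	if err != nil {
		return false
	}
	ext := strings.ToLower(path.Ext(u.Path)) // ".js", ".css", ...
	for _, d := range denied {
		if ext == d {
			return true
		}
	}
	return false
}

func main() {
	denied := []string{".js", ".css"}
	for _, link := range []string{
		"https://example.com/app.js",
		"https://example.com/app.js?ver=1.1", // the reported case
		"https://example.com/page.html",
	} {
		fmt.Println(link, "->", hasDeniedExtension(link, denied))
	}
}
```

With this check, both `app.js` and `app.js?ver=1.1` are filtered, while `page.html` is still crawled.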

The screenshot below was captured when the crawler visited such a js URL (a .js link with query parameters).




Labels

Type: Bug
