Skip to content

Conversation

@netanelitguru
Copy link

No description provided.

@netanelitguru
Copy link
Author

OK

@stasinopoulos
Copy link
Member

Thanks for submitting this pull request!

Before I proceed with the review, could you please provide a bit more information about the changes introduced here?

Specifically, it would be helpful if you could clarify:

  • What is the main purpose of this PR?
  • What issue does it resolve, or what improvement does it aim to introduce?
  • Does it add new functionality or modify existing behavior in Commix?
  • Are there any specific use cases or examples that demonstrate the impact of this PR?

A brief description will help me better understand the intent behind the changes and ensure they align with the project’s direction.

Thank you in advance for taking the time to elaborate!

@netanelitguru
Copy link
Author

Thanks for submitting this pull request!

Before I proceed with the review, could you please provide a bit more information about the changes introduced here?

Specifically, it would be helpful if you could clarify:

  • What is the main purpose of this PR?
  • What issue does it resolve, or what improvement does it aim to introduce?
  • Does it add new functionality or modify existing behavior in Commix?
  • Are there any specific use cases or examples that demonstrate the impact of this PR?

A brief description will help me better understand the intent behind the changes and ensure they align with the project’s direction.

Thank you in advance for taking the time to elaborate!

Here’s a reply you can paste back (tweak wording/details to match your exact flag names if needed):


Hi, thanks for taking the time to review this PR!
Here’s some more context about the changes:

  • Main purpose of this PR
    The goal is to limit the crawling/scan phase to a specific domain (or set of domains) so that Commix stays strictly within the intended scope.

  • Issue it resolves / improvement it introduces
    Currently, when crawling, Commix may follow links that lead to other domains. This can:

    • unintentionally scan third-party hosts,
    • make scans slower and noisier, and
    • go beyond the scope defined by the user/assessment rules.

    This PR adds logic to restrict the crawler to a specific domain so that all discovered URLs stay in-scope.

  • New functionality vs. modification of existing behavior
    It modifies the existing crawling behavior by:

    • adding a check that filters out URLs pointing to a different domain than the target (or user-specified allowed domain), and
    • optionally allowing the user to configure which domain(s) are considered in-scope.

    Apart from the scope restriction, the rest of the scanning behavior remains unchanged.

  • Use cases / examples

    • When testing https://example.com, Commix will now ignore links to domains like https://cdn.example.org or https://thirdparty.com (unless explicitly allowed), keeping the scan focused on example.com.
    • In environments with strict scopes (e.g., pentests with a single authorized domain), this helps ensure Commix doesn’t accidentally crawl or attack out-of-scope hosts.
    • It also reduces noise and scan time by not exploring external domains that are irrelevant to the assessment.

Let me know if you’d like me to expand the documentation or add more tests around the domain-filtering logic.

@stasinopoulos
Copy link
Member

The domain-filtering logic seems reasonable as an idea and makes sense as an isolated implementation.

However, at the moment it appears to exist only as a standalone code block and doesn’t seem to be fully integrated into Commix’s broader workflow (e.g., command-line options, global settings, configuration flow, or the crawler’s end-to-end execution path).

Before moving forward with the review, could you clarify whether this is intended as an initial code piece or if additional integration steps are planned?

As it stands, the filtering mechanism doesn’t appear to be connected to the main Commix flow, so I’d like to understand how you envision completing the integration.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants