cli: export tools.cli.main; security.audit: capture HTTPError, fix subdomain base and CORS scoring #20
💡 Codex Review
Here are some automated review suggestions for this pull request.
    COMMON_PUBLIC_SUFFIX_2 = {"co.uk", "org.uk", "ac.uk", "gov.uk", "com.au", "net.au", "org.au", "co.nz", "org.nz"}
    COMMON_HOST_PREFIXES = {"www", "m", "app", "beta"}


    def _apex_domain(host: str) -> str:
Avoid probing unrelated domains for public-suffix hosts
The new _apex_domain logic relies on a short hard-coded public-suffix list, so for targets hosted under other public suffixes (e.g. user.github.io, myapp.herokuapp.com, foo.cloudfront.net), the apex becomes the shared suffix (github.io, herokuapp.com, etc.). _probe_subdomains will then enumerate subdomains of that unrelated apex, producing false positives and potentially scanning off-target domains. This is a regression from probing within the original host and will mislead results whenever the suffix isn’t in COMMON_PUBLIC_SUFFIX_2.
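One possible guard, sketched under assumptions: treat a few known shared-hosting suffixes as off limits and fall back to the original host whenever the computed apex lands on one. `PRIVATE_SUFFIXES` and `safe_probe_base` are hypothetical names that do not appear in the PR, and the three entries are only the examples from the comment above; a complete fix would more likely consult the Public Suffix List.

```python
# Hypothetical guard; the names and the suffix set are illustrative only.
PRIVATE_SUFFIXES = {"github.io", "herokuapp.com", "cloudfront.net"}


def safe_probe_base(host: str, apex: str) -> str:
    """Return the apex only when it is not a shared-hosting suffix.

    `apex` is whatever an _apex_domain-style helper computed for `host`.
    """
    host = host.lower().rstrip(".")
    apex = apex.lower().rstrip(".")
    if apex in PRIVATE_SUFFIXES:
        # Enumerating subdomains of github.io would touch other tenants,
        # so stay within the original host instead.
        return host
    return apex
```

With this in place, a target such as `user.github.io` would be probed as-is, while `www.example.com` would still be probed from `example.com`.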
    if parts[0] in COMMON_HOST_PREFIXES:
        parts = parts[1:]
    if len(parts) <= 2:
        return ".".join(parts)
Apex domain strips legitimate domain at two-level TLDs
The _apex_domain function incorrectly handles domains where a common host prefix (www, m, app, beta) is actually the legitimate apex domain at a two-level TLD. For example, input m.co.uk gets stripped to co.uk because m is in COMMON_HOST_PREFIXES, but co.uk is just a public suffix. The check at lines 40-41 returns early without verifying the remaining parts aren't a known public suffix from COMMON_PUBLIC_SUFFIX_2. This causes _probe_subdomains to probe wrong domains like www.co.uk instead of www.m.co.uk.
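A minimal sketch of the missing check, assuming the two sets quoted in the diff: only strip a leading prefix when at least three labels are present and the remainder is not itself a listed public suffix. `strip_host_prefix` is a hypothetical helper name, not a function from the PR.

```python
COMMON_PUBLIC_SUFFIX_2 = {"co.uk", "org.uk", "ac.uk", "gov.uk", "com.au",
                          "net.au", "org.au", "co.nz", "org.nz"}
COMMON_HOST_PREFIXES = {"www", "m", "app", "beta"}


def strip_host_prefix(parts: list[str]) -> list[str]:
    """Drop a leading www/m/app/beta label only if a registrable name remains."""
    if len(parts) < 3 or parts[0] not in COMMON_HOST_PREFIXES:
        return parts
    rest = parts[1:]
    # If what is left is itself a public suffix (e.g. co.uk), the first
    # label was the registered name, not a host prefix.
    if ".".join(rest) in COMMON_PUBLIC_SUFFIX_2:
        return parts
    return rest
```

Under this sketch `m.co.uk` keeps all three labels, while `www.m.co.uk` is reduced to `m.co.uk`.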
    suffix2 = ".".join(parts[-2:])
    if suffix2 in COMMON_PUBLIC_SUFFIX_2 and len(parts) >= 3:
        return ".".join(parts[-3:])
    return ".".join(parts[-2:])
Incomplete suffix list breaks apex extraction for many countries
The COMMON_PUBLIC_SUFFIX_2 set only contains UK, AU, and NZ two-level TLDs, missing many common ones like .com.br, .co.jp, .com.cn, .com.mx, .com.ar, etc. For domains at these unlisted TLDs, _apex_domain falls back to returning the last two parts, which is the public suffix rather than the apex domain. For example, example.com.br returns com.br instead of example.com.br. This causes _probe_subdomains to probe completely wrong domains like www.com.br instead of www.example.com.br.
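As a rough illustration of a less list-dependent approach, the sketch below treats any generic second-level label under a two-letter country-code TLD as a public suffix. This is a heuristic swapped in for illustration, not the PR's logic: `GENERIC_SECOND_LEVEL` and `apex_with_cc_heuristic` are hypothetical names, and the rule can misclassify some names, so the Public Suffix List remains the authoritative source.

```python
# Hypothetical heuristic; the label set is an assumption, not from the PR.
GENERIC_SECOND_LEVEL = {"com", "co", "org", "net", "gov", "ac", "edu"}


def apex_with_cc_heuristic(host: str) -> str:
    """Guess the registrable domain without a full public-suffix list."""
    parts = host.lower().rstrip(".").split(".")
    if len(parts) <= 2:
        return ".".join(parts)
    tld, second = parts[-1], parts[-2]
    # "<generic label>.<two-letter ccTLD>" (com.br, co.jp, ...) is treated
    # as a public suffix, so keep three labels instead of two.
    if len(tld) == 2 and second in GENERIC_SECOND_LEVEL:
        return ".".join(parts[-3:])
    return ".".join(parts[-2:])
```

For the example in the comment above, `example.com.br` stays `example.com.br` rather than collapsing to `com.br`.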
Motivation
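- Restore the `from tools.cli import main` import contract so test collection and tooling that expect `main` succeed.
- Fix the subdomain base so probes such as `api.example.com` work when the target host is `www.example.com`.
- Fix CORS scoring so that a specific `Access-Control-Allow-Origin` is not penalized while a wildcard (`*`) is.

Description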
- Add `tools/cli/main.py` and a module runner at `tools/cli/__main__.py`, and re-export `main` from `tools/cli/__init__.py` so `from tools.cli import main` resolves.
- Update `tools_v2/categories/security_audit_tools.py` with `_fetch_status` that catches `HTTPError` and returns numeric status and `Retry-After` headers, and add `_apex_domain` to compute an apex host for subdomain probes.
- Use `_fetch_status` for rate-limit and endpoint probing, preserve non-404 status codes (including 401/403/5xx), and filter out only 404/error results where appropriate.
- Do not penalize a specific `Access-Control-Allow-Origin`, but deduct for wildcard `*` origins.

Testing
- `pytest` was not run locally against the changes.
- The affected test is `tests/test_toolbelt.py`, which expects `from tools.cli import main`, and `tools/cli/__init__.py` now exposes `main` to satisfy that contract.
- Kept `tools_v2/categories/security_audit_tools.py` under the project guidance limit (file length is 397 lines) to respect the <400 LOC constraint.
- Run `pytest -q` (and `pytest tests/test_toolbelt.py -q`) and any CI linting before merge to validate runtime behavior and lint rules.

Codex Task
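The status-preserving fetch described in the Description could look roughly like the sketch below. It assumes the tool uses the standard library's `urllib`; `fetch_status` and `keep_probe_result` are hypothetical stand-ins, and the real `_fetch_status` signature is not shown in this PR excerpt.

```python
import urllib.error
import urllib.request
from typing import Optional, Tuple


def fetch_status(url: str, timeout: float = 5.0) -> Tuple[Optional[int], Optional[str]]:
    """Return (status, Retry-After) for url, or (None, None) on failure."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status, resp.headers.get("Retry-After")
    except urllib.error.HTTPError as exc:
        # 4xx/5xx responses raise HTTPError, but the status code and
        # headers are still the information the audit needs.
        return exc.code, exc.headers.get("Retry-After")
    except (urllib.error.URLError, OSError, ValueError):
        # DNS failure, refused connection, timeout, or malformed URL.
        return None, None


def keep_probe_result(status: Optional[int]) -> bool:
    """Keep every answered probe except 404; drop failed requests."""
    return status is not None and status != 404
```

`HTTPError` is caught before `URLError` because it is a subclass of it; reversing the order would discard the 401/403/5xx codes the change is meant to preserve.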
Note
Improves CLI entrypoints and hardens the security audit tool.
- Add `tools/cli/main.py` wrapper and `tools/cli/__main__.py` for `python -m tools.cli`; re-export `main` from `tools/cli/__init__.py` to restore `from tools.cli import main`.
- Add `_fetch_status` to preserve HTTP statuses (incl. `HTTPError`) and `Retry-After`; refactor rate-limit and endpoint probes to use it and filter only `404`/errors.
- Add `_apex_domain` and use it in `_probe_subdomains` to form correct FQDNs.
- Do not penalize a specific `Access-Control-Allow-Origin`; deduct when wildcard `*` is present.

Written by Cursor Bugbot for commit 6f39a58.
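The CORS rule summarized above can be illustrated with a small sketch. The function name `cors_penalty` and the penalty size are assumptions for illustration; the PR excerpt does not show the actual scoring values.

```python
from typing import Mapping


def cors_penalty(headers: Mapping[str, str], wildcard_penalty: int = 10) -> int:
    """Points to deduct for the CORS policy (penalty size is an assumption)."""
    # Header names are case-insensitive, so normalize before the lookup.
    normalized = {name.lower(): value.strip() for name, value in headers.items()}
    origin = normalized.get("access-control-allow-origin")
    if origin == "*":
        # Any origin may read responses: deduct.
        return wildcard_penalty
    # Header absent or restricted to a specific origin: no deduction.
    return 0
```

A response carrying `Access-Control-Allow-Origin: https://app.example.com` would therefore score the same as one with no CORS header at all.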