fix(scan): resolve detection of the first endpoint in the initiate scan task #238

psyray · 2024-11-22T18:51:59Z

Fixes #237

Replace the initial HTTPx scan with Nmap, then launch HTTPx on the discovered ports
Create a reusable function to launch Nmap on the fly
Add parsing to extract ports and services from Nmap output
Add more logs to debug scans while they run
Remove the HTTP CRAWL global variable; Nmap is now the default way to retrieve the first endpoint (the starting point for all other tasks)
Adjust the is_alive parameter for tasks that need alive endpoints
Fix S3 scanner source file not found
Add more checks to prevent errors and scan crashes
Refactor endpoint saving for clearer logic and fewer errors
Improve URL validation
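
To illustrate the "parse ports and services from Nmap output" item, here is a minimal sketch and not the code from this PR. It assumes Nmap was run with XML output (`-oX`) and service detection, and it reads only the `port`, `state` and `service` elements. The sample XML, the `parse_open_services` helper and the returned dict shape are all hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical, trimmed sample of Nmap XML output (only the elements used below).
SAMPLE_XML = """<nmaprun>
  <host>
    <address addr="192.0.2.10" addrtype="ipv4"/>
    <ports>
      <port protocol="tcp" portid="22">
        <state state="open"/>
        <service name="ssh"/>
      </port>
      <port protocol="tcp" portid="80">
        <state state="open"/>
        <service name="http"/>
      </port>
      <port protocol="tcp" portid="443">
        <state state="open"/>
        <service name="http" tunnel="ssl"/>
      </port>
      <port protocol="tcp" portid="8080">
        <state state="closed"/>
        <service name="http-proxy"/>
      </port>
    </ports>
  </host>
</nmaprun>"""


def parse_open_services(xml_text):
    """Return one dict per open port: port number, service name, TLS flag."""
    root = ET.fromstring(xml_text)
    results = []
    for port in root.iter("port"):
        state = port.find("state")
        if state is None or state.get("state") != "open":
            continue  # skip closed or filtered ports
        service = port.find("service")
        name = service.get("name", "") if service is not None else ""
        tunnel = service.get("tunnel", "") if service is not None else ""
        results.append({
            "port": int(port.get("portid")),
            "service": name,
            # Nmap may report HTTPS as name="https" or as name="http" with tunnel="ssl".
            "tls": tunnel == "ssl" or name == "https",
        })
    return results


services = parse_open_services(SAMPLE_XML)
# Keep only what looks like a web service before handing ports to HTTPx.
web_ports = [s["port"] for s in services if s["service"].startswith("http")]
```

Filtering on the service name rather than on a fixed port list is one possible design choice; it lets a web server on a non-standard port still be picked up.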

Summary by Sourcery

Bug Fixes:

Fix the issue where the S3 scanner source file was not found, ensuring the correct file path is used.

Enhancements:

Refactor the endpoint saving logic to improve clarity and reduce errors.
Improve URL validation by adding a new function to check the validity of URLs, including domain:port formats.
Add more logging to provide better insights during scan execution and debugging.
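
As a rough illustration of URL validation that also accepts domain:port input, the sketch below uses only the Python standard library. It is not the `is_valid_url()` function added by this PR; the name `looks_like_valid_url` and the exact rules (http/https only, at least one dot in a hostname, no all-numeric last label) are assumptions made for the example.

```python
import ipaddress
import re
from urllib.parse import urlsplit

# One hostname label: letters, digits, hyphens; no leading or trailing hyphen.
_LABEL = re.compile(r"^(?!-)[A-Za-z0-9-]{1,63}(?<!-)$")


def _is_valid_host(host):
    """Accept IP addresses and dotted hostnames (assumed rules, see lead-in)."""
    try:
        ipaddress.ip_address(host)
        return True
    except ValueError:
        pass
    labels = host.split(".")
    if len(host) > 253 or len(labels) < 2:
        return False
    if labels[-1].isdigit():
        return False  # rejects malformed IPs such as 999.1.1.1
    return all(_LABEL.match(label) for label in labels)


def looks_like_valid_url(value):
    """Return True for http(s) URLs and bare host[:port] strings."""
    if not isinstance(value, str) or not value.strip():
        return False
    value = value.strip()
    # Give bare "domain:port" input a scheme so urlsplit fills in the netloc.
    candidate = value if "://" in value else f"http://{value}"
    try:
        parts = urlsplit(candidate)
        port = parts.port  # raises ValueError for non-numeric or out-of-range ports
    except ValueError:
        return False
    if port == 0:
        return False
    if parts.scheme not in ("http", "https") or not parts.hostname:
        return False
    return _is_valid_host(parts.hostname)
```

With these rules, `example.com:8443` and `https://example.com/login` are accepted, while `example.com:99999` and `ftp://example.com` are rejected.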

This PR prepares the ground to effectively resolve #208 & #8

Todo

Test initiate scan with Full scan
Test subscan initiation with all scan types

sourcery-ai · 2024-11-22T18:52:03Z

Reviewer's Guide by Sourcery

This PR improves the scan initialization process by replacing HTTPx with Nmap for initial endpoint detection and adds several robustness improvements. The main changes focus on better error handling, improved URL validation, and more structured service detection using Nmap. The code has been refactored to be more maintainable and reliable.

Sequence diagram for scan initiation with Nmap

sequenceDiagram
    actor User
    participant System
    participant Nmap
    participant HTTPx
    User->>System: Initiate scan
    System->>Nmap: Run Nmap to find web services
    alt Web services found
        Nmap-->>System: Return open ports and services
        System->>HTTPx: Launch HTTPx with discovered ports
        HTTPx-->>System: Return HTTP endpoints
    else No web services found
        Nmap-->>System: No open ports
        System-->>User: Scan failed
    end

Updated class diagram for scan initiation

classDiagram
    class ScanHistory {
        +int id
        +DateTime last_scan_date
        +String scan_status
        +String error_message
        +void save()
    }
    class Domain {
        +int id
        +String name
        +DateTime last_scan_date
        +void save()
    }
    class Subdomain {
        +int id
        +String name
        +void save()
    }
    class EndPoint {
        +int id
        +String http_url
        +bool is_default
        +DateTime discovered_date
        +void save()
    }
    class Nmap {
        +dict get_nmap_http_datas(String host, dict ctx)
    }
    ScanHistory --> Domain : belongs to
    Domain --> Subdomain : has
    Subdomain --> EndPoint : has
    EndPoint --> Nmap : uses
    note for Nmap "Nmap is used to detect open ports and services"

File-Level Changes

| Change | Details | Files |
| --- | --- | --- |
| Replace HTTPx with Nmap for initial endpoint detection | Created new function get_nmap_http_datas() to detect web services; Added parsing of Nmap results to identify HTTP/HTTPS services; Modified initiate_scan to use Nmap results for endpoint creation; Added support for multiple endpoints discovery from Nmap results | `web/reNgine/tasks.py` |
| Improve URL validation and endpoint handling | Added new is_valid_url() function with comprehensive URL validation; Enhanced save_endpoint() with better error checking and duplicate prevention; Added validation for domain names, IP addresses, and port numbers; Improved handling of default endpoints for subdomains | `web/reNgine/common_func.py` `web/reNgine/tasks.py` |
| Enhance error handling and logging | Added more detailed error messages and debug logging; Improved error handling in subscan initialization; Added checks for missing or invalid scan data; Enhanced logging to include domain/subdomain context | `web/reNgine/tasks.py` `web/reNgine/celery_custom_task.py` |
| Refactor Nmap parsing functionality | Split parse_nmap_results() into different parsing types (vulnerabilities, services, ports); Improved Nmap command generation with better parameter handling; Added support for different Nmap output formats; Enhanced service detection logic | `web/reNgine/tasks.py` `web/reNgine/common_func.py` |
| Update configuration handling | Created get_http_crawl_value() for centralized HTTP crawl configuration; Changed default HTTP crawl setting to false; Added support for subscan-specific configurations; Improved configuration inheritance logic | `web/reNgine/common_func.py` `web/reNgine/settings.py` |

Possibly linked issues

bug(ui): CIDR breakdown and domain lookup is not working as intended #123: PR enhances HTTP scanning by using Nmap to detect web services, aligning with issue's goal.
bug: Scan not starting when subdomain is_default is set to False #7: PR changes improve alive host detection, directly addressing the issue's scan initiation problem.

sourcery-ai

Hey @psyray - I've reviewed your changes and they look great!

Here's what I looked at during the review

🟡 General issues: 3 issues found
🟢 Security: all looks good
🟡 Testing: 1 issue found
🟡 Complexity: 1 issue found
🟢 Documentation: all looks good

- Dedicated class/methods created for this purpose
- Sanitize domain name to prevent dangerous chars
- Full path normalization
- Verify that final path is within the base folder
- Explicit permissions defined
- Manage errors with explicit message
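
The steps listed in this commit message (sanitize the domain, normalize the full path, check it stays inside the base folder, set explicit permissions, raise explicit errors) can be sketched as follows. This is an illustration under assumptions, not the class added by the commit: the function names, the allowed character set and the permission modes `0o750`/`0o640` are all hypothetical.

```python
import os
import re


def sanitize_domain(domain):
    """Replace anything outside a conservative character set (assumed rule)."""
    cleaned = re.sub(r"[^A-Za-z0-9._-]", "_", domain)
    cleaned = cleaned.strip(".")  # avoid names such as "." or ".."
    if not cleaned:
        raise ValueError(f"Domain {domain!r} has no usable characters")
    return cleaned


def safe_results_path(base_dir, domain, filename):
    """Build a normalized path and refuse anything outside base_dir."""
    base = os.path.realpath(base_dir)
    target = os.path.realpath(os.path.join(base, sanitize_domain(domain), filename))
    if os.path.commonpath([base, target]) != base:
        raise ValueError(f"Refusing to write outside {base}: {target}")
    return target


def write_results(base_dir, domain, filename, content):
    """Create the folder and file with explicit permissions, then write."""
    path = safe_results_path(base_dir, domain, filename)
    os.makedirs(os.path.dirname(path), mode=0o750, exist_ok=True)
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o640)
    with os.fdopen(fd, "w", encoding="utf-8") as handle:
        handle.write(content)
    return path
```

Comparing resolved paths with `os.path.commonpath` rather than a string prefix avoids accepting a sibling folder whose name merely starts with the base folder's name.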

psyray · 2024-11-24T18:43:33Z

@AnonymousWP

Ready to merge for me.
This one fixes an old business logic error that I have wanted to fix for a long time.
It replaces the initial HTTPx scan, which only scans the HTTP port, with Nmap, which is more accurate in this case.
Nmap quickly probes the web service ports and, depending on the result, the scan uses the right HTTP scheme, prioritizing the HTTPS port if it exists.
This will be improved in the future, but for the moment the main goal was to fix the screenshot issues (and, as a side effect, some others). It also prepares the ground for a more precise initial scan, which is the starting point of all the remaining scans.

So if the first endpoint is badly recognized, reconnaissance will fail, and a pentester could miss a critical target.
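
The scheme selection described above could look roughly like the sketch below. It is an assumption-laden illustration, not the logic merged in this PR: `pick_first_endpoint` and the `port`/`service`/`tls` dict shape are hypothetical, and the tie-break on the lowest port is a choice made for the example.

```python
def pick_first_endpoint(host, services):
    """Return the URL to start from, or None when no web service was found."""
    web = [s for s in services if s.get("service", "").startswith("http")]
    if not web:
        return None
    # Prefer TLS-enabled services; among equals, take the lowest port.
    best = min(web, key=lambda s: (not s["tls"], s["port"]))
    scheme = "https" if best["tls"] else "http"
    default_port = 443 if best["tls"] else 80
    if best["port"] == default_port:
        return f"{scheme}://{host}"
    return f"{scheme}://{host}:{best['port']}"


found = [
    {"port": 80, "service": "http", "tls": False},
    {"port": 8443, "service": "http", "tls": True},
]
first_url = pick_first_endpoint("example.com", found)
```

Returning `None` when nothing web-like is open matches the "No web services found" branch of the sequence diagram, where the scan is reported as failed.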

All my tests are OK

sourcery-ai bot reviewed Nov 22, 2024

github-advanced-security bot found potential problems Nov 22, 2024

web/reNgine/tasks.py: 5 alerts, all marked Fixed

psyray self-assigned this Nov 22, 2024

psyray requested a review from AnonymousWP November 22, 2024 18:54

psyray added the bug Something isn't working label Nov 22, 2024

psyray linked an issue Nov 22, 2024 that may be closed by this pull request

bug(scope): scanning stops while running scans on target is list of ips. #237

Closed

3 tasks

psyray added 5 commits November 24, 2024 17:03

fix: prevent race condition on save & add retry attempt on nmap scan

8412236

refactor: extract first endpoint creation logic from initiate_scan me…

fa3b654

…thod - Extract first endpoint creation logic from initiate_scan to simplify it - Improve logging for debug - Add/remove logs for production

fix(tests): remove loops in tests as recommended by sourcery

1490663

fix: remove unused vars

ec7e5a1

github-advanced-security bot found potential problems Nov 24, 2024

web/reNgine/tasks.py: 1 alert, marked Fixed

fix: remove nmap custom exception

b38f823

psyray mentioned this pull request Nov 24, 2024

bug(scan): port not correctly recognized #8

Closed

1 task

psyray requested a review from 0b3ud November 25, 2024 14:47

AnonymousWP approved these changes Nov 28, 2024

AnonymousWP merged commit 5c4d823 into release/2.1.1 Nov 28, 2024
5 checks passed

AnonymousWP deleted the fix-target-down-on-scan branch November 28, 2024 19:47

psyray mentioned this pull request Nov 29, 2024

bug(scope): Nuclei only scan the first endpoint (https://domain.com/) (HTTPS) and do not scan the next endpoints (http://domain.com/) (HTTP) #217

Open

3 tasks
