Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix(scan): resolve detection of the first endpoint in the initiate scan task #238

Merged
merged 7 commits into from
Nov 28, 2024

Conversation

psyray
Copy link
Contributor

@psyray psyray commented Nov 22, 2024

Fixes #237

  • Replace HTTPx first scan by nmap, then launch HTTPx with discovered port
  • Create a reusable function to launch nmap on the fly
  • Add parsing to get ports and services from Nmap output
  • Add more logs to debug scans while running
  • Remove the HTTP CRAWL global var, Nmap is the default to retrieve the first endpoint (the starting point for all the others tasks)
  • Adjust the is_alive parameter for tasks that need alive endpoints
  • Fix S3 scanner source file not found
  • Add more checks to prevent errors and scan crash
  • Refactor Endpoint saving for a better logic and less errors
  • Improve URLs validation

Summary by Sourcery

Bug Fixes:

  • Fix the issue where the S3 scanner source file was not found, ensuring the correct file path is used.

Enhancements:

  • Refactor the endpoint saving logic to improve clarity and reduce errors.
  • Improve URL validation by adding a new function to check the validity of URLs, including domain:port formats.
  • Add more logging to provide better insights during scan execution and debugging.

This PR prepared the ground to effectively resolve #208 & #8

Todo

  • Test initiate scan with Full scan
  • Test initiation subscan with all the scan type

…an task

- Replace HTTPx first scan by nmap, then launch HTTPx with discovered port
- Create a reusable function to launch nmap on the fly
- Add parsing to get ports and services from Nmap output
- Add more logs to debug scans while running
- Remove the HTTP CRAWL global var, Nmap is the default to retrieve the first endpoint (the starting point for all the others tasks)
- Adjust the is_alive parameter for tasks that need alive endpoints
- Fix S3 scanner source file not found
- Add more checks to prevent errors and scan crash
- Refactor Endpoint saving for a better logic and less errors
- Improve URLs validation
Copy link
Contributor

sourcery-ai bot commented Nov 22, 2024

Reviewer's Guide by Sourcery

This PR improves the scan initialization process by replacing HTTPx with Nmap for initial endpoint detection and adds several robustness improvements. The main changes focus on better error handling, improved URL validation, and more structured service detection using Nmap. The code has been refactored to be more maintainable and reliable.

Sequence diagram for scan initiation with Nmap

sequenceDiagram
    actor User
    participant System
    participant Nmap
    participant HTTPx
    User->>System: Initiate scan
    System->>Nmap: Run Nmap to find web services
    alt Web services found
        Nmap-->>System: Return open ports and services
        System->>HTTPx: Launch HTTPx with discovered ports
        HTTPx-->>System: Return HTTP endpoints
    else No web services found
        Nmap-->>System: No open ports
        System-->>User: Scan failed
    end
Loading

Updated class diagram for scan initiation

classDiagram
    class ScanHistory {
        +int id
        +DateTime last_scan_date
        +String scan_status
        +String error_message
        +void save()
    }
    class Domain {
        +int id
        +String name
        +DateTime last_scan_date
        +void save()
    }
    class Subdomain {
        +int id
        +String name
        +void save()
    }
    class EndPoint {
        +int id
        +String http_url
        +bool is_default
        +DateTime discovered_date
        +void save()
    }
    class Nmap {
        +dict get_nmap_http_datas(String host, dict ctx)
    }
    ScanHistory --> Domain : belongs to
    Domain --> Subdomain : has
    Subdomain --> EndPoint : has
    EndPoint --> Nmap : uses
    note for Nmap "Nmap is used to detect open ports and services"
Loading

File-Level Changes

Change Details Files
Replace HTTPx with Nmap for initial endpoint detection
  • Created new function get_nmap_http_datas() to detect web services
  • Added parsing of Nmap results to identify HTTP/HTTPS services
  • Modified initiate_scan to use Nmap results for endpoint creation
  • Added support for multiple endpoints discovery from Nmap results
web/reNgine/tasks.py
Improve URL validation and endpoint handling
  • Added new is_valid_url() function with comprehensive URL validation
  • Enhanced save_endpoint() with better error checking and duplicate prevention
  • Added validation for domain names, IP addresses, and port numbers
  • Improved handling of default endpoints for subdomains
web/reNgine/common_func.py
web/reNgine/tasks.py
Enhance error handling and logging
  • Added more detailed error messages and debug logging
  • Improved error handling in subscan initialization
  • Added checks for missing or invalid scan data
  • Enhanced logging to include domain/subdomain context
web/reNgine/tasks.py
web/reNgine/celery_custom_task.py
Refactor Nmap parsing functionality
  • Split parse_nmap_results() into different parsing types (vulnerabilities, services, ports)
  • Improved Nmap command generation with better parameter handling
  • Added support for different Nmap output formats
  • Enhanced service detection logic
web/reNgine/tasks.py
web/reNgine/common_func.py
Update configuration handling
  • Created get_http_crawl_value() for centralized HTTP crawl configuration
  • Changed default HTTP crawl setting to false
  • Added support for subscan-specific configurations
  • Improved configuration inheritance logic
web/reNgine/common_func.py
web/reNgine/settings.py

Possibly linked issues


Tips and commands

Interacting with Sourcery

  • Trigger a new review: Comment @sourcery-ai review on the pull request.
  • Continue discussions: Reply directly to Sourcery's review comments.
  • Generate a GitHub issue from a review comment: Ask Sourcery to create an
    issue from a review comment by replying to it.
  • Generate a pull request title: Write @sourcery-ai anywhere in the pull
    request title to generate a title at any time.
  • Generate a pull request summary: Write @sourcery-ai summary anywhere in
    the pull request body to generate a PR summary at any time. You can also use
    this command to specify where the summary should be inserted.

Customizing Your Experience

Access your dashboard to:

  • Enable or disable review features such as the Sourcery-generated pull request
    summary, the reviewer's guide, and others.
  • Change the review language.
  • Add, remove or edit custom review instructions.
  • Adjust other review settings.

Getting Help

Copy link
Contributor

@sourcery-ai sourcery-ai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey @psyray - I've reviewed your changes and they look great!

Here's what I looked at during the review
  • 🟡 General issues: 3 issues found
  • 🟢 Security: all looks good
  • 🟡 Testing: 1 issue found
  • 🟡 Complexity: 1 issue found
  • 🟢 Documentation: all looks good

Sourcery is free for open source - if you like our reviews please consider sharing them ✨
Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.

web/reNgine/tasks.py Outdated Show resolved Hide resolved
web/reNgine/tasks.py Outdated Show resolved Hide resolved
web/reNgine/tasks.py Show resolved Hide resolved
web/tests/test_nmap.py Show resolved Hide resolved
web/reNgine/tasks.py Outdated Show resolved Hide resolved
web/tests/test_nmap.py Outdated Show resolved Hide resolved
web/reNgine/tasks.py Fixed Show fixed Hide fixed
web/reNgine/tasks.py Fixed Show fixed Hide fixed
web/reNgine/tasks.py Fixed Show fixed Hide fixed
web/reNgine/tasks.py Fixed Show fixed Hide fixed
web/reNgine/tasks.py Fixed Show fixed Hide fixed
@psyray psyray self-assigned this Nov 22, 2024
@psyray psyray requested a review from AnonymousWP November 22, 2024 18:54
@psyray psyray added the bug Something isn't working label Nov 22, 2024
@psyray psyray linked an issue Nov 22, 2024 that may be closed by this pull request
3 tasks
…thod

- Extract first endpoint creation logic from initiate_scan to simplify it
- Improve logging for debug
- Add/remove logs for production
- Dedicated class/methods created for this purpose
- Sanitize domain name to prevent dangerous chars
- Full path normalization
- Verify that final path is within the base folder
- Explicit permissions defined
- Manage errors with explicit message
web/reNgine/tasks.py Fixed Show fixed Hide fixed
@psyray
Copy link
Contributor Author

psyray commented Nov 24, 2024

@AnonymousWP

Ready to merge for me.
This one fix an old business logic error that I wanted to fix from a long time ago
Replace inital HTTPx scan, that scans only the http port, by nmap which is more accurate in this case.
It quickly probes the web service ports, and according to the result it will use the good http scheme by prioritizing https port if it exists.
This will be improved in the future, but for the moment, the main goal was to fix screenshot issues (and others by collateral damage) and it prepares the ground to make a more precise initial scan, which is the starting point of all the remaining scan.

So if this one is badly recognized, reconnaissance will fail, and pentester could pass away a critical target

All my tests are OK

@psyray psyray requested a review from 0b3ud November 25, 2024 14:47
@AnonymousWP AnonymousWP merged commit 5c4d823 into release/2.1.1 Nov 28, 2024
5 checks passed
@AnonymousWP AnonymousWP deleted the fix-target-down-on-scan branch November 28, 2024 19:47
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

Successfully merging this pull request may close these issues.

bug(scope): scanning stops while running scans on target is list of ips.
2 participants