
xcrawl3r


xcrawl3r is a command-line utility designed to recursively spider webpages for URLs. It works by actively traversing websites - following links embedded in webpages, and parsing files (including sitemaps & robots.txt) - to uncover every URL.

Unlike xurlfind3r, which does not interact with the target directly, xcrawl3r spiders the target's pages in real time. This active approach allows it to discover URLs that may be hidden or unindexed, providing a complete picture of the website's navigational flow and content distribution. This makes xcrawl3r a powerful tool for security researchers, IT professionals, and anyone looking to gain insight into the URLs associated with a website.
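
For example, a minimal crawl of a single target, scoped to its own domain, looks like this (example.com is a placeholder target; the -u and -d options are documented in the Usage section below):

xcrawl3r -u https://example.com -d example.com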


Features

  • Recursively spiders webpages for URLs
  • Extracts URLs from files (including sitemaps & robots.txt)
  • Supports stdin and stdout for easy integration in automated workflows (see the pipeline example after this list)
  • Supports multiple output formats (JSONL, file, stdout)
  • Cross-Platform (Windows, Linux & macOS)
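
As a sketch of such a workflow, target URLs can be read from stdin and the discovered URLs written to stdout for the next step in the chain (targets.txt and urls.txt are placeholder file names):

cat targets.txt | xcrawl3r --depth 2 --silent > urls.txt

Both --depth and --silent are documented in the Usage section below.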

Installation

Install release binaries (Without Go Installed)

Visit the releases page and find the appropriate archive for your operating system and architecture. Download the archive from your browser or copy its URL and retrieve it with wget or curl:

  • ...with wget:

     wget https://github.com/hueristiq/xcrawl3r/releases/download/v<version>/xcrawl3r-<version>-linux-amd64.tar.gz
  • ...or, with curl:

     curl -OL https://github.com/hueristiq/xcrawl3r/releases/download/v<version>/xcrawl3r-<version>-linux-amd64.tar.gz

...then, extract the binary:

tar xf xcrawl3r-<version>-linux-amd64.tar.gz

Tip

The two steps above, download and extract, can be combined into a single step with this one-liner:

curl -sL https://github.com/hueristiq/xcrawl3r/releases/download/v<version>/xcrawl3r-<version>-linux-amd64.tar.gz | tar -xzv

Note

On Windows systems, you should be able to double-click the zip archive to extract the xcrawl3r executable.

...then, move the xcrawl3r binary to somewhere in your PATH. For example, on GNU/Linux and macOS systems:

sudo mv xcrawl3r /usr/local/bin/

Note

Windows users can follow How to: Add Tool Locations to the PATH Environment Variable in order to add xcrawl3r to their PATH.

Install from source (With Go Installed)

Before installing from source, make sure Go is installed on your system. You can install Go by following the official instructions for your operating system. The steps below assume Go is already installed.

go install ...

go install -v github.com/hueristiq/xcrawl3r/cmd/xcrawl3r@latest
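
go install places the compiled binary in your Go binary directory, typically $HOME/go/bin (or $GOBIN, if set). If that directory is not already on your PATH, add it, for example:

export PATH="$PATH:$(go env GOPATH)/bin"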

go build ... the development version

  • Clone the repository

     git clone https://github.com/hueristiq/xcrawl3r.git 
  • Build the utility

     cd xcrawl3r/cmd/xcrawl3r && \
     go build .
  • Move the xcrawl3r binary to somewhere in your PATH. For example, on GNU/Linux and macOS systems:

     sudo mv xcrawl3r /usr/local/bin/

    Windows users can follow How to: Add Tool Locations to the PATH Environment Variable in order to add xcrawl3r to their PATH.

Caution

While the development version is a good way to take a peek at xcrawl3r's latest features before they get released, be aware that it may have bugs. Officially released versions will generally be more stable.

Install on Docker (With Docker Installed)

To install xcrawl3r with Docker:

  • Pull the docker image using:

    docker pull hueristiq/xcrawl3r:latest
  • Run xcrawl3r using the image:

    docker run --rm hueristiq/xcrawl3r:latest -h
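
Beyond printing the help message, the same image can run an actual crawl. For example (example.com is a placeholder target; the options are documented in the Usage section below):

docker run --rm hueristiq/xcrawl3r:latest -u https://example.com -d example.com --depth 2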

Post Installation

xcrawl3r will work right after installation. However, some settings can be adjusted via a configuration file at $HOME/.config/xcrawl3r/config.yaml, created upon first run, or set as environment variables.

Example of environment variables:

XCRAWL3R_REQUEST_TIMEOUT=10
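
Such variables can be exported for the current shell session before running xcrawl3r (example.com is a placeholder target):

export XCRAWL3R_REQUEST_TIMEOUT=10
xcrawl3r -u https://example.com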

Usage

To start using xcrawl3r, open your terminal and run the following command for a list of options:

xcrawl3r -h

Here's what the help message looks like:


                             _ _____
__  _____ _ __ __ ___      _| |___ / _ __
\ \/ / __| '__/ _` \ \ /\ / / | |_ \| '__|
 >  < (__| | | (_| |\ V  V /| |___) | |
/_/\_\___|_|  \__,_| \_/\_/ |_|____/|_|
                                    v1.1.0

USAGE:
 xcrawl3r [OPTIONS]

CONFIGURATION:
 -c, --configuration string       (default: $HOME/.config/xcrawl3r/config.yaml)

INPUT:
 -u, --url string[]               target URL
 -l, --list string                target URLs file path

 For multiple URLs, use comma(,) separated value with `--url`,
 specify multiple `--url`, load from file with `--list` or load from stdin.

SCOPE:
 -d, --domain string[]            match domain(s)  URLs

 For multiple domains, use comma(,) separated value with `--domain`
 or specify multiple `--domain`.

     --include-subdomains bool    with domain(s), match subdomains' URLs

REQUEST:
     --delay int                  delay between each request in seconds
 -H, --header string[]            header to include in 'header:value' format

 For multiple headers, use comma(,) separated value with `--header`
 or specify multiple `--header`.

     --timeout int                time to wait for request in seconds (default: 10)

PROXY:
 -p, --proxy string[]             Proxy (e.g: http://127.0.0.1:8080)

 For multiple proxies use comma(,) separated value with `--proxy`
 or specify multiple `--proxy`.

OPTIMIZATION:
     --depth int                  maximum depth to crawl, `0` for infinite (default: 1)
 -C, --concurrency int            number of concurrent inputs to process (default: 5)
 -P, --parallelism int            number of concurrent fetchers to use (default: 5)

DEBUG:
     --debug bool                 enable debug mode

OUTPUT:
     --jsonl bool                 output in JSONL(ines)
 -o, --output string              output write file path
 -m, --monochrome bool            stdout in monochrome
 -s, --silent bool                stdout in silent mode
 -v, --verbose bool               stdout in verbose mode
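
The options above can be combined as needed. The examples below are illustrative sketches, with example.com and the file names standing in for your own targets and paths:

# crawl a single target, including its subdomains, two levels deep
xcrawl3r -u https://example.com -d example.com --include-subdomains --depth 2

# crawl targets listed in a file and write the results as JSONL
xcrawl3r -l targets.txt --jsonl -o results.jsonl

# send requests through a local proxy with a custom header
xcrawl3r -u https://example.com -p http://127.0.0.1:8080 -H 'User-Agent:xcrawl3r'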

Contributing

Contributions are welcome and encouraged! Feel free to submit Pull Requests or report Issues. For more details, check out the contribution guidelines.

A big thank you to all the contributors for your ongoing support!


Licensing

This package is licensed under the MIT license. You are free to use, modify, and distribute it, as long as you follow the terms of the license. You can find the full license text in the repository.
