Raven is a powerful and customizable web crawler written in Go. It allows you to extract internal and external links from a given website with options for concurrent crawling, depth customization, and maximum URL limits.
- Concurrent crawling to maximize efficiency.
- Customizable depth and maximum URL limits to tailor the crawling process to your needs.
- Extraction of both internal and external links for comprehensive analysis.
- Colorful logging for easy debugging and tracking of crawling progress.
- Error handling for fetching URLs to ensure robustness.
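
To illustrate the internal/external distinction, here is a minimal sketch using only Go's standard `net/url` package. It is not taken from Raven's source: the helper name `isInternal` is hypothetical, and it assumes a link counts as internal when its host matches the start URL's host exactly (so subdomains are treated as external).

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// isInternal is a hypothetical helper, not part of Raven's API.
// It resolves link against baseURL and reports whether both share the same host.
func isInternal(baseURL, link string) (bool, error) {
	base, err := url.Parse(baseURL)
	if err != nil {
		return false, err
	}
	ref, err := url.Parse(link)
	if err != nil {
		return false, err
	}
	// ResolveReference turns relative links such as "/about" into absolute URLs.
	resolved := base.ResolveReference(ref)
	// Host names are case-insensitive, so compare them with EqualFold.
	return strings.EqualFold(resolved.Hostname(), base.Hostname()), nil
}

func main() {
	base := "https://target.com/docs/"
	links := []string{"/about", "guide.html", "https://other.example/page"}
	for _, link := range links {
		internal, err := isInternal(base, link)
		if err != nil {
			fmt.Println("skipping", link, "error:", err)
			continue
		}
		fmt.Printf("%s internal=%v\n", link, internal)
	}
}
```

A real crawler may prefer a looser rule, for example treating all subdomains of the start host as internal.
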
To install Raven, use one of the following options:

- Clone the Raven repository:

      git clone https://github.com/VFA250/Raven.git

- Navigate to the project directory:

      cd Raven

- Build the project:

      go build

- Alternatively, install Raven with go get:

      go get github.com/VFA250/raven

Make the binary executable, then run it with a start URL:

    chmod +x raven
    ./raven [options] <startURL>

The following options are available:

- -maxURLs : Maximum number of URLs to crawl (default: 100)
- -maxDepth : Maximum depth of crawling (default: 3)
- -concurrency : Number of concurrent requests (default: 10)
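
The options above map naturally onto Go's standard `flag` package. The following is a minimal sketch of how such a command line could be parsed, not Raven's actual source: the `config` struct and `parseArgs` function are hypothetical names, while the flag names and defaults are the ones listed above.

```go
package main

import (
	"flag"
	"fmt"
)

// config is a hypothetical container for the documented options.
type config struct {
	maxURLs     int
	maxDepth    int
	concurrency int
	startURL    string
}

// parseArgs parses arguments of the form: [options] <startURL>.
func parseArgs(args []string) (config, error) {
	var c config
	fs := flag.NewFlagSet("raven", flag.ContinueOnError)
	fs.IntVar(&c.maxURLs, "maxURLs", 100, "maximum number of URLs to crawl")
	fs.IntVar(&c.maxDepth, "maxDepth", 3, "maximum depth of crawling")
	fs.IntVar(&c.concurrency, "concurrency", 10, "number of concurrent requests")
	if err := fs.Parse(args); err != nil {
		return c, err
	}
	// Exactly one positional argument, the start URL, must remain after the flags.
	if fs.NArg() != 1 {
		return c, fmt.Errorf("expected exactly one start URL, got %d arguments", fs.NArg())
	}
	c.startURL = fs.Arg(0)
	return c, nil
}

func main() {
	// Sample arguments are hard-coded here; a real program would pass os.Args[1:].
	cfg, err := parseArgs([]string{"-maxDepth", "5", "https://target.com"})
	if err != nil {
		fmt.Println("usage: raven [options] <startURL>")
		return
	}
	fmt.Printf("%+v\n", cfg)
}
```

Using a dedicated `flag.FlagSet` instead of the package-level flags keeps the parser easy to test with different argument lists.
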

For example:

    ./raven -maxURLs 500 -maxDepth 5 -concurrency 20 https://target.com

This command crawls https://target.com with a limit of 500 URLs, a maximum depth of 5, and 20 concurrent requests.
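
To show how the three limits can interact, here is a small sketch of a level-by-level crawl bounded by a URL cap, a depth limit, and a concurrency limit. It is an illustration under stated assumptions, not Raven's implementation: `crawl`, `fetchFunc`, `demoGraph`, and `demoFetch` are hypothetical names, the fetcher reads from an in-memory link graph instead of the network, the start URL is assumed to be at depth 0, and the start URL is assumed to count toward the URL limit.

```go
package main

import (
	"fmt"
	"sync"
)

// fetchFunc returns the links found on a page. It is a hypothetical
// stand-in for an HTTP request followed by HTML link extraction.
type fetchFunc func(u string) []string

// crawl visits pages breadth-first and returns URLs in discovery order.
func crawl(start string, maxURLs, maxDepth, concurrency int, fetch fetchFunc) []string {
	if concurrency < 1 {
		concurrency = 1 // a zero-capacity semaphore would block forever
	}
	visited := map[string]bool{start: true}
	order := []string{start}
	frontier := []string{start}

	for depth := 0; depth < maxDepth && len(frontier) > 0 && len(order) < maxURLs; depth++ {
		results := make([][]string, len(frontier))
		sem := make(chan struct{}, concurrency) // limits in-flight fetches
		var wg sync.WaitGroup
		for i, u := range frontier {
			wg.Add(1)
			sem <- struct{}{}
			go func(i int, u string) {
				defer wg.Done()
				defer func() { <-sem }()
				results[i] = fetch(u) // each goroutine writes only its own slot
			}(i, u)
		}
		wg.Wait()

		// Merge results sequentially so the output order is deterministic.
		var next []string
		for _, links := range results {
			for _, l := range links {
				if visited[l] || len(order) >= maxURLs {
					continue
				}
				visited[l] = true
				order = append(order, l)
				next = append(next, l)
			}
		}
		frontier = next
	}
	return order
}

// demoGraph is made-up sample data: page -> links found on that page.
var demoGraph = map[string][]string{
	"a": {"b", "c"},
	"b": {"d"},
	"c": {"d", "e"},
	"d": {"f"},
}

func demoFetch(u string) []string { return demoGraph[u] }

func main() {
	fmt.Println(crawl("a", 100, 3, 2, demoFetch)) // prints [a b c d e f]
	fmt.Println(crawl("a", 4, 3, 2, demoFetch))   // prints [a b c d]
}
```

Collecting each level's results before merging them avoids locking the visited set, at the cost of waiting for the slowest fetch in every level.
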

Raven depends on the following external package:

- golang.org/x/net/html : Used for HTML parsing.

You can install this dependency using the following command:

    go mod tidy

This project is licensed under the MIT License. See the LICENSE file for details.