Open Source Website Crawler
An Open Source Crawler/Spider
It can be used by anyone and runs on any Windows or Linux computer. It is not a crawler for industrial use: it is written in a slow programming language and may have issues of its own.
The project can be easily used with MongoDB.
The project can also be used for pentesting.
- Cross Platform
- Installer for Linux
- Related CLI tools (CLI access to the tool, a not-that-good search tool, etc.)
- Memory efficient (I guess)
- Pool crawling: use multiple crawlers at the same time
- Supports robots.txt
- MongoDB as the database
- Language detection
- 18+ checks / offensive content check
- Proxies
- Multithreading
- URL scanning
- Keyword, description and recurring-word logging
- Search Website - search_website.py
- Connection Tree Website - tree_website.py
- Tool for finding proxies - proxy_tool.py
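To give a rough idea of how pool crawling, robots.txt support and multithreading can fit together, here is a minimal sketch using only the Python standard library. It is not OpenCrawler's actual code: `USER_AGENT`, `build_robots`, `crawl_pool` and the `fetch` callback are hypothetical names made up for this example, and the robots.txt content is assumed to be downloaded already.

```python
from concurrent.futures import ThreadPoolExecutor
from urllib.robotparser import RobotFileParser

# Hypothetical user agent for this sketch, not the one OpenCrawler sends.
USER_AGENT = "ExampleCrawler"


def build_robots(rules_text):
    """Parse robots.txt content that has already been downloaded."""
    parser = RobotFileParser()
    parser.parse(rules_text.splitlines())
    return parser


def crawl_pool(urls, robots, fetch, workers=4):
    """Fetch the URLs allowed by robots.txt concurrently.

    `fetch` is any callable taking a URL and returning a result, so the
    networking part can be swapped out. Returns a {url: result} dict.
    """
    allowed = [u for u in urls if robots.can_fetch(USER_AGENT, u)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map keeps results in the same order as `allowed`.
        results = list(pool.map(fetch, allowed))
    return dict(zip(allowed, results))
```

The `fetch` callback is passed in on purpose: a real crawler would download and parse the page there, while a dry run can pass a stub to check which URLs would be visited.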
The first step is to install the project. The provided installer is for Linux only.
On Windows the application won't be added to PATH and the requirements won't be installed automatically, so check out the installation procedure for Windows.
git clone https://github.com/merwin-asm/OpenCrawler.git
cd OpenCrawler
chmod +x install.sh && ./install.sh

You need git, python3 and pip installed.

git clone https://github.com/merwin-asm/OpenCrawler.git
cd OpenCrawler
pip install -r requirements.txt

The project can be used for:
- Making a (not that good) search engine
- For OSINT
- For pentesting
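As an illustration of the search engine use case, the sketch below builds a tiny inverted index over already crawled page text and answers simple all-words queries. It is an assumption-laden toy, not how search_website.py works: `build_index`, `search` and the `pages` dictionary are hypothetical, and a real setup would read the pages from the database instead.

```python
import re
from collections import defaultdict


def build_index(pages):
    """Map each lowercase word to the set of URLs whose text contains it.

    `pages` is a hypothetical {url: page_text} dict standing in for
    whatever the crawler has stored.
    """
    index = defaultdict(set)
    for url, text in pages.items():
        for word in re.findall(r"[a-z0-9]+", text.lower()):
            index[word].add(url)
    return index


def search(index, query):
    """Return the URLs that contain every word of the query, sorted."""
    words = re.findall(r"[a-z0-9]+", query.lower())
    if not words:
        return []
    # index.get avoids creating empty entries for unknown words.
    hits = set.intersection(*(index.get(w, set()) for w in words))
    return sorted(hits)
```

There is no ranking here, so results only say whether a page matches, which is about what "not that good" suggests.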
To see available commands:
opencrawler help
or
man opencrawler

To see available commands when running the script directly with Python:
python opencrawler help

Contributions are what make the open source community such an amazing place to learn, inspire, and create. Any contributions you make are greatly appreciated.
- If you have suggestions for adding or removing projects, feel free to open an issue to discuss it, or directly create a pull request after you edit the README.md file with necessary changes.
- Please make sure you check your spelling and grammar.
- Create an individual PR for each suggestion.
Distributed under the MIT License. See LICENSE for more information.
- Merwin A J - CS Student - Built OpenCrawler

