Python script that crawls websites to your chosen depth and graphs the URLs each page links to, with lots of settings to configure the visual output.

zachjesus/web-graph

Python Web Graph

Wraps Graphviz, using bs4 for parsing and httpx for requests. Takes advantage of Python's built-in robots.txt parser. Respectful to servers by default. The main file can easily be modified for experimentation.
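As a rough illustration of the robots.txt handling mentioned above, here is a minimal sketch using the standard library's `urllib.robotparser`. It is not taken from main.py: the robots.txt content, the user agent string, and the `build_parser` helper are all hypothetical, and the rules are parsed from a string so the example runs without network access.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content; a real crawler would download this from the site.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
Crawl-delay: 2
"""

# Hypothetical user agent name, not the one used by this project.
USER_AGENT = "web-graph-example"


def build_parser(robots_text: str) -> RobotFileParser:
    """Build a parser from robots.txt text instead of fetching it over HTTP."""
    parser = RobotFileParser()
    parser.parse(robots_text.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)

# Ask whether a given URL may be fetched by our user agent.
allowed = parser.can_fetch(USER_AGENT, "https://example.com/index.html")
blocked = parser.can_fetch(USER_AGENT, "https://example.com/private/data.html")

# Crawl-delay (in seconds) if the site declares one, otherwise None.
delay = parser.crawl_delay(USER_AGENT)

print(allowed, blocked, delay)
```

A polite crawler would skip any URL for which `can_fetch` returns `False` and wait at least `delay` seconds between requests to the same host.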

image (1)

How to Run

Clone repo into new folder:

git clone https://github.com/zachjesus/web-graph.git .

Set up virtual environment:

python3 -m venv venv

Activate the venv:

source venv/bin/activate  # Linux / macOS
venv\Scripts\activate     # Windows

Install requirements:

pip install -r requirements.txt

Configure main.py (the settings are Graphviz's, see its docs) or keep the defaults, then run:

python3 main.py

The result is placed in a folder named "output" inside the folder where the code is installed. By default, the generated web graph is an SVG image like the one above.
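To show how a depth-limited crawl can turn into a graph file, here is a minimal sketch under stated assumptions. It does not reproduce main.py: the `LINKS` map stands in for pages fetched and parsed over HTTP, `crawl_edges` and `to_dot` are hypothetical helpers, and the graph is written as plain DOT text (the format Graphviz reads) rather than rendered through the Graphviz wrapper.

```python
from collections import deque

# Hypothetical link map standing in for pages fetched and parsed over HTTP.
LINKS = {
    "https://example.com/": ["https://example.com/a", "https://example.com/b"],
    "https://example.com/a": ["https://example.com/c"],
    "https://example.com/b": [],
    "https://example.com/c": ["https://example.com/d"],
}


def crawl_edges(start, max_depth, links=LINKS):
    """Breadth-first walk from start, following links at most max_depth hops."""
    edges = []
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        url, depth = queue.popleft()
        if depth >= max_depth:
            continue  # pages at the depth limit are not expanded
        for target in links.get(url, []):
            edges.append((url, target))
            if target not in seen:
                seen.add(target)
                queue.append((target, depth + 1))
    return edges


def to_dot(edges):
    """Serialize (source, target) pairs as a DOT directed graph."""
    lines = ["digraph web {"]
    lines += [f'    "{src}" -> "{dst}";' for src, dst in edges]
    lines.append("}")
    return "\n".join(lines)


edges = crawl_edges("https://example.com/", max_depth=2)
dot_source = to_dot(edges)
print(dot_source)
```

With `max_depth=2`, the page two hops from the start appears as a node but its own links are not followed. The resulting DOT text could be rendered to SVG by Graphviz, for example with `dot -Tsvg`.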
