tk565134/Multi-Site-Python-Crawler

Multi-Site-Python-Crawler

1. The program uses the Python libraries urllib, bs4, re, os, json, and logging. bs4 (Beautiful Soup) is a third-party package; the others are part of the standard library.

2. Run the program, then give it the URL string as input.

3. This multi-site web crawler extracts the text of every p tag on a page and recursively follows every link on that page whose domain matches the domain of the given URL.

4. Output is saved in the file 'output.json' and logs are written to the file 'log.txt', both in the current working directory (CWD).
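
The steps above can be sketched roughly as follows. This is a minimal illustration and not the repository's actual code. To stay runnable without third-party packages it swaps bs4 for the standard library's html.parser, and it walks links with an explicit stack instead of recursive calls. The names PageParser, fetch_page, crawl, main and the max_pages limit are all hypothetical, and the shape of the JSON output (a mapping from URL to a list of paragraph texts) is an assumption.

```python
import json
import logging
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin, urlparse
from urllib.request import urlopen


class PageParser(HTMLParser):
    """Collects the text of each <p> element and the href of each <a> element."""

    def __init__(self):
        super().__init__()
        self.paragraphs = []
        self.links = []
        self._in_p = False
        self._chunks = []

    def handle_starttag(self, tag, attrs):
        if tag == "p":
            self._flush()  # a new <p> implicitly closes an open one
            self._in_p = True
        elif tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)

    def handle_endtag(self, tag):
        if tag == "p":
            self._flush()

    def handle_data(self, data):
        if self._in_p:
            self._chunks.append(data)

    def close(self):
        super().close()
        self._flush()  # keep a trailing <p> that was never closed

    def _flush(self):
        # Join the collected pieces and collapse runs of whitespace.
        text = " ".join("".join(self._chunks).split())
        if self._in_p and text:
            self.paragraphs.append(text)
        self._in_p = False
        self._chunks = []


def fetch_page(url):
    """Download a page and decode it as text (hypothetical helper)."""
    with urlopen(url, timeout=10) as response:
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset, errors="replace")


def crawl(start_url, fetcher=fetch_page, max_pages=50):
    """Return {url: [paragraph text, ...]} for pages on the start URL's domain."""
    domain = urlparse(start_url).netloc
    seen = {start_url}
    stack = [start_url]
    results = {}
    while stack and len(results) < max_pages:
        url = stack.pop()
        try:
            html = fetcher(url)
        except (OSError, ValueError) as exc:
            logging.warning("skipping %s: %s", url, exc)
            continue
        parser = PageParser()
        parser.feed(html)
        parser.close()
        results[url] = parser.paragraphs
        logging.info("crawled %s (%d paragraphs)", url, len(parser.paragraphs))
        for href in parser.links:
            # Resolve relative links and drop "#fragment" parts.
            link = urldefrag(urljoin(url, href)).url
            if urlparse(link).netloc == domain and link not in seen:
                seen.add(link)
                stack.append(link)
    return results


def main():
    # File names follow the README; the prompt text is an assumption.
    logging.basicConfig(filename="log.txt", level=logging.INFO)
    start_url = input("Enter the URL to crawl: ").strip()
    with open("output.json", "w", encoding="utf-8") as fh:
        json.dump(crawl(start_url), fh, indent=2, ensure_ascii=False)
```

Calling main() prompts for a URL and writes both files to the current working directory. The seen set keeps the crawler from revisiting a page when two pages link to each other, and max_pages is a safety cap added for this sketch only.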
