1. Web Scrapers
2. Web Crawlers
3. Pen Testers
4. People who want a list of proxies in bulk
It scrapes [free-proxy-list.net] and extracts all the IP addresses, port numbers, country codes, etc. that are available on the home page and accessible via pagination.
1. requests
2. BeautifulSoup
3. html5lib
git clone https://github.com/gagangulyani/Proxy_List_Generator.git
cd Proxy_List_Generator
pip install -r requirements.txt
AttributeError: module 'html5lib.treebuilders' has no attribute '_base'
pip install --upgrade beautifulsoup4
pip install --upgrade html5lib
- Copy or move the proxy_get.py file into the current working directory of your project
- Import proxy_get in your code: from proxy_get import Proxy_List
- Call the Proxy_List() function
python proxy_get.py
Output is displayed only if the script is executed directly (not when it is imported).
If imported and Proxy_List() is called:
Returns a dictionary containing the list of proxies and the number of proxies retrieved.
Example:
{
    "list_of_proxies": [
        {
            'ip_address': '118.174.196.112',
            'port': '36314',
            'code': 'TH',
            'country': 'Thailand',
            'anonymity': 'elite proxy',
            'google': 'no',
            'https': 'yes',
            'last_checked': '29 seconds ago'
        },
        {
            'ip_address': '84.201.158.146',
            'port': '3128',
            'code': 'RU',
            'country': 'Russian Federation',
            'anonymity': 'elite proxy',
            'google': 'no',
            'https': 'yes',
            'last_checked': '29 seconds ago'
        },
        ...
    ],
    "count": 300
}
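As a rough illustration of consuming this return value, the sketch below converts each entry into the kind of proxies mapping that requests accepts. The helper name to_requests_proxies and its https_only flag are hypothetical and not part of proxy_get.py. The sample dictionary only mirrors the shape shown in the example.

```python
def to_requests_proxies(result, https_only=True):
    """Build requests-style proxy mappings from a Proxy_List()-shaped dict.

    Hypothetical helper, not part of proxy_get.py. Returns an empty list
    when result is None, which is what Proxy_List() returns on failure.
    """
    if not result:
        return []
    mappings = []
    for entry in result["list_of_proxies"]:
        # Skip proxies that do not advertise HTTPS support, if requested.
        if https_only and entry["https"] != "yes":
            continue
        url = "http://{}:{}".format(entry["ip_address"], entry["port"])
        mappings.append({"http": url, "https": url})
    return mappings


# Sample data copied from the documented return shape (trimmed to the
# fields this helper reads).
sample = {
    "list_of_proxies": [
        {"ip_address": "118.174.196.112", "port": "36314", "https": "yes"},
        {"ip_address": "84.201.158.146", "port": "3128", "https": "yes"},
    ],
    "count": 2,
}

proxy_mappings = to_requests_proxies(sample)
# Each mapping can be passed on as: requests.get(url, proxies=mapping)
```

Free proxies tend to be short-lived, so it is probably wise to wrap any request made through one of these mappings in a timeout and fall back to the next entry on failure.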
Output:
Failed to get Proxy List :(
Try Running the script again..
Return:
None
Output:
Something Went Wrong while Getting Proxy List!
Return:
None
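Since both failure cases return None, calling code may want to retry before giving up. The following is a minimal sketch of one way to do that. The name fetch_with_retry and its attempts and delay parameters are assumptions made for illustration, not something proxy_get.py provides.

```python
import time


def fetch_with_retry(fetch, attempts=3, delay=1.0):
    """Call fetch() until it returns a non-None result or attempts run out.

    Hypothetical wrapper: fetch is any zero-argument callable that returns
    None on failure, as Proxy_List() is documented to do.
    """
    for attempt in range(attempts):
        result = fetch()
        if result is not None:
            return result
        # Pause between attempts, but not after the last one.
        if attempt < attempts - 1:
            time.sleep(delay)
    return None


# Intended usage (requires proxy_get.py in the working directory):
#   from proxy_get import Proxy_List
#   proxies = fetch_with_retry(Proxy_List)
```

Passing the function itself rather than calling it keeps the wrapper independent of proxy_get, so it can be exercised with a stub in tests.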
Feel free to tweak it as much as you like.
If you have any ideas or suggestions regarding the script, mail me.
(Email: gagangulyanig@gmail.com)
1. beautifulsoup, html5lib: module object has no attribute _base
https://stackoverflow.com/questions/38447738/beautifulsoup-html5lib-module-object-has-no-attribute-base