This Python script scrapes job listings from Indeed.com across 14 different country domains for multiple job professions. It extracts job details including title, company name, location, job URL, and company URL, then exports the results to a CSV file.
- Scrapes job listings from 14 Indeed.com country domains
- Searches for 14 different job professions
- Extracts comprehensive job details including company URLs
- Implements multi-threading for faster scraping
- Uses cloudscraper to bypass anti-bot measures
- Implements retry mechanism for handling network errors
- Exports results to a CSV file
- Includes user authentication for script execution
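The retry mechanism itself is not shown in this README; a minimal sketch of how such a wrapper could look (apart from `MAX_RETRIES` and `RETRY_DELAY`, which the script defines, all names here are assumptions):

```python
import time

MAX_RETRIES = 3   # maximum attempts per request
RETRY_DELAY = 1   # seconds to wait between attempts

def fetch_with_retry(fetch, url):
    """Call fetch(url), retrying on errors up to MAX_RETRIES times."""
    last_error = None
    for attempt in range(MAX_RETRIES):
        try:
            return fetch(url)
        except Exception as exc:  # e.g. network errors raised by fetch
            last_error = exc
            if attempt < MAX_RETRIES - 1:
                time.sleep(RETRY_DELAY)
    raise last_error

# In the real script the fetch callable would wrap a cloudscraper session:
#   scraper = cloudscraper.create_scraper()  # handles Cloudflare's anti-bot page
#   html = fetch_with_retry(lambda u: scraper.get(u, timeout=30).text, url)
```

Each worker thread would call such a wrapper for its assigned search URL, so a transient failure on one domain does not abort the whole run.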
Before you begin, ensure you have met the following requirements:
- Python 3.6+
- pip (Python package manager)
- Clone this repository:
  `git clone https://github.com/yourusername/multi-country-indeed-scraper.git`
- Navigate to the project directory:
  `cd multi-country-indeed-scraper`
- Install the required packages:
  `pip install -r requirements.txt`
- Run the script:
  `python multi_country_indeed_scraper.py`
- Enter the username and password when prompted:
- Username: Professor
- Password: raja
- The script will start scraping job listings from all specified countries and job professions.
- Results will be saved in `Multi_Country_Job_results.csv` in the same directory.
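The export step can be sketched with pandas; the column names below are assumptions based on the fields listed at the top of this README, not taken from the script:

```python
import pandas as pd

# Each scraped job is a dict of the fields the script extracts
jobs = [
    {
        "title": "Python Developer",       # illustrative sample row
        "company": "Example Corp",
        "location": "Berlin",
        "job_url": "https://www.indeed.com/viewjob?jk=123",
        "company_url": "https://www.indeed.com/cmp/example-corp",
    },
]

df = pd.DataFrame(jobs)
df.to_csv("Multi_Country_Job_results.csv", index=False)
```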
- To modify the list of countries or their Indeed URLs, edit the `domains` dictionary in the `main()` function.
- To change the job professions being searched, modify the `job_professions` list in the `main()` function.
- Adjust the `MAX_RETRIES` and `RETRY_DELAY` variables to fine-tune the retry mechanism.
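The configuration described above might look like this inside `main()`. Only the variable names (`domains`, `job_professions`, `MAX_RETRIES`, `RETRY_DELAY`) come from the script; the sample countries, URLs, professions, and the URL-building pattern are assumptions for illustration:

```python
# Country name -> Indeed domain, as in the `domains` dictionary
domains = {
    "United States": "https://www.indeed.com",
    "United Kingdom": "https://uk.indeed.com",
    "Germany": "https://de.indeed.com",
    # ... remaining countries (14 in total)
}

# Professions fed into the search query
job_professions = ["python developer", "data analyst", "nurse"]

MAX_RETRIES = 3   # how many times to re-attempt a failed request
RETRY_DELAY = 5   # seconds to wait between attempts

# One search URL per (domain, profession) pair
search_urls = [
    f"{base}/jobs?q={profession.replace(' ', '+')}"
    for base in domains.values()
    for profession in job_professions
]
```

Every entry added to `domains` or `job_professions` multiplies the number of pages scraped, so keep the retry delay in mind when expanding either list.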
- cloudscraper: For bypassing Cloudflare's anti-bot page.
- BeautifulSoup: For parsing HTML and extracting data.
- pandas: For creating and exporting data to CSV.
- requests: For making HTTP requests.
- concurrent.futures: For implementing multi-threading (part of the Python standard library, no installation needed).
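The dependency list above corresponds to a `requirements.txt` along these lines (the project's actual file is not shown here, and `concurrent.futures` needs no entry since it ships with Python):

```text
cloudscraper
beautifulsoup4
pandas
requests
```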
- Python Documentation
- Web Scraping Best Practices
- Indeed.com robots.txt
- HTTP Status Codes
- Threading in Python
- Python Logging
Web scraping may be against the terms of service of some websites. Always review and respect the target website's robots.txt
file and terms of service. Use this script responsibly and ensure you have permission to scrape the target websites. The authors are not responsible for any misuse of this script.
Contributions, issues, and feature requests are welcome. Feel free to check the issues page if you want to contribute.
This project is licensed under the MIT License - see the LICENSE file for details.