Wuzzwf Data Analysis
Overview
This project collects and analyzes "Data Analysis" job data from Wuzzuf using web scraping, cleans the dataset, and presents visualizations for easy insights.
Features
Data Scraping: Uses Selenium and BeautifulSoup to extract job title, company, location, and job type.
Data Cleaning: Removes duplicates, standardizes formats, and cleans columns.
Data Visualization: Creates charts showing job distribution by company, location, and job type.
Automated Pipeline: The main.py script runs all steps automatically: scraping → cleaning → visualizing.
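The cleaning step described above can be sketched with Pandas roughly as follows. This is a minimal illustration, not the contents of dataclean.py: the function name clean_jobs, the column names, and the sample rows are all assumptions made for the example.

```python
import pandas as pd


def clean_jobs(df: pd.DataFrame) -> pd.DataFrame:
    """Return a cleaned copy: trimmed text, consistent casing, no duplicates."""
    cleaned = df.copy()
    # Assumed column names; the real dataset may label them differently.
    text_cols = ["title", "company", "location", "job_type"]
    for col in text_cols:
        # Collapse runs of whitespace into one space and trim both ends.
        cleaned[col] = (
            cleaned[col]
            .astype("string")
            .str.replace(r"\s+", " ", regex=True)
            .str.strip()
        )
    # Standardize casing so "full time" and "Full Time" compare equal.
    cleaned["job_type"] = cleaned["job_type"].str.title()
    # Drop rows that are exact duplicates after normalization.
    return cleaned.drop_duplicates().reset_index(drop=True)


# Made-up sample rows, for illustration only.
raw = pd.DataFrame(
    {
        "title": ["Data Analyst ", "Data Analyst", "BI  Analyst"],
        "company": ["Acme", "Acme", "Globex"],
        "location": ["Cairo, Egypt", "Cairo, Egypt", "Giza, Egypt"],
        "job_type": ["full time", "Full Time", "part time"],
    }
)
cleaned = clean_jobs(raw)
print(cleaned)
```

Normalizing whitespace and casing before calling drop_duplicates() matters here: the first two sample rows only become identical once the trailing space and the casing difference are removed.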
Technologies Used
Python – Main programming language
Selenium – For scraping web pages
BeautifulSoup – For parsing HTML
Pandas – For data manipulation and analysis
Matplotlib – For data visualization
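To illustrate how BeautifulSoup fits in, the sketch below parses a static HTML snippet into job records. The markup, class names, and the parse_jobs function are hypothetical; the real Wuzzuf pages use their own structure, and in the project the HTML would come from Selenium rather than a string literal.

```python
from bs4 import BeautifulSoup

# Hypothetical markup: the real Wuzzuf page uses different tags and class names.
SAMPLE_HTML = """
<div class="job-card">
  <h2 class="job-title">Data Analyst</h2>
  <span class="company">Acme</span>
  <span class="location">Cairo, Egypt</span>
  <span class="job-type">Full Time</span>
</div>
<div class="job-card">
  <h2 class="job-title">BI Analyst</h2>
  <span class="company">Globex</span>
  <span class="location">Giza, Egypt</span>
  <span class="job-type">Part Time</span>
</div>
"""

# Maps output field names to the (assumed) CSS class that holds each value.
FIELDS = {
    "title": "job-title",
    "company": "company",
    "location": "location",
    "job_type": "job-type",
}


def parse_jobs(html: str) -> list:
    """Extract one dict per job card from an HTML string."""
    soup = BeautifulSoup(html, "html.parser")
    jobs = []
    for card in soup.find_all("div", class_="job-card"):
        job = {}
        for field, css_class in FIELDS.items():
            node = card.find(class_=css_class)
            # Record None when a field is missing instead of raising.
            job[field] = node.get_text(strip=True) if node else None
        jobs.append(job)
    return jobs


jobs = parse_jobs(SAMPLE_HTML)
print(jobs)
```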
How to Use
- Clone the repository:
git clone https://github.com/mohamedamerdev-coder/Wuzzwf-Data-Analysis-
cd Wuzzwf-Data-Analysis-
- Install dependencies:
pip install selenium beautifulsoup4 pandas matplotlib
- Run the main script:
python main.py
Steps executed automatically:
- Scrapes job data from Wuzzuf
- Cleans and saves the data
- Generates charts in the charts/ folder
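The three steps above could be chained along these lines. This is a sketch of the scraping → cleaning → visualizing flow, not the actual main.py: run_pipeline and the three stand-in functions are hypothetical placeholders that take the place of the real code in scrap.py, dataclean.py, and the visualization module.

```python
from typing import Callable, Dict, List

Rows = List[Dict[str, str]]


def run_pipeline(
    scrape: Callable[[], Rows],
    clean: Callable[[Rows], Rows],
    visualize: Callable[[Rows], None],
) -> Rows:
    """Run the three stages in order and return the cleaned rows."""
    raw = scrape()
    cleaned = clean(raw)
    visualize(cleaned)
    return cleaned


# Stand-in stages (hypothetical) so the sketch runs without a browser.
def fake_scrape() -> Rows:
    return [
        {"title": "Data Analyst", "company": "Acme"},
        {"title": "Data Analyst", "company": "Acme"},  # duplicate row
        {"title": "BI Analyst", "company": "Globex"},
    ]


def fake_clean(rows: Rows) -> Rows:
    # Keep the first occurrence of each row, preserving order.
    seen = set()
    unique = []
    for row in rows:
        key = tuple(sorted(row.items()))
        if key not in seen:
            seen.add(key)
            unique.append(row)
    return unique


charted = []  # records what the visualize stage received


def fake_visualize(rows: Rows) -> None:
    charted.extend(rows)


result = run_pipeline(fake_scrape, fake_clean, fake_visualize)
print(len(result), "rows after cleaning")
```

Passing the stages in as functions keeps each one testable on its own, and lets the scraping step be swapped for saved data when a browser is not available.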
Project Structure
Wuzzwf-Data-Analysis
scrap.py # Scraping functions
dataclean.py # Data cleaning functions
visualizelize.py # Visualization functions
main.py # Main pipeline
wuzzuf_jobs.csv # Raw scraped data
jobs_cleaned.csv # Cleaned data
charts/ # Generated visualizations
Notes
You can adjust the number of pages to scrape by changing the pages argument in wuzzuf_scrap().
Make sure ChromeDriver is installed and compatible with your Chrome version.
Visualizations are saved in charts/ as high-resolution PNG files.
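As an illustration of the last note, a chart can be written as a high-resolution PNG by passing a dpi value to Matplotlib's savefig. The function name save_count_chart, the output file name, the dpi of 300, and the sample data are assumptions for this sketch; the project's own settings may differ.

```python
import os
import tempfile

import matplotlib

matplotlib.use("Agg")  # render to files without needing a display
import matplotlib.pyplot as plt
import pandas as pd


def save_count_chart(df: pd.DataFrame, column: str, out_dir: str, dpi: int = 300) -> str:
    """Save a bar chart of value counts for one column and return the file path."""
    os.makedirs(out_dir, exist_ok=True)
    counts = df[column].value_counts()

    fig, ax = plt.subplots(figsize=(8, 5))
    counts.plot(kind="bar", ax=ax)
    ax.set_title(f"Jobs by {column}")
    ax.set_xlabel(column)
    ax.set_ylabel("Number of jobs")
    fig.tight_layout()

    path = os.path.join(out_dir, f"jobs_by_{column}.png")
    fig.savefig(path, dpi=dpi)  # a higher dpi gives a sharper PNG
    plt.close(fig)  # free the figure's memory
    return path


# Made-up sample data; a temporary directory stands in for charts/.
jobs = pd.DataFrame({"location": ["Cairo", "Cairo", "Giza"]})
chart_path = save_count_chart(jobs, "location", os.path.join(tempfile.mkdtemp(), "charts"))
print("Saved", chart_path)
```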
License
This project is licensed under the MIT License.