Modular Car Data Scraper & Analyzer

This project is a modular system for scraping car data from a website, processing it through a server, and analyzing the results. The system consists of three main components: a web scraper client, a data processing server, and a data analyzer.

Project Structure

├── client.py          # Web scraper client
├── server.py          # Data processing server
└── fileAnalyzer.py    # Data analysis module

Components

1. client.py (Web Scraper)

Purpose: Scrapes car data from a target website
Functionality:
- Authenticates with the website
- Scrapes car listings from multiple pages
- Extracts car details (company, model, year, trim, mileage in kilometers, price)
- Sends extracted data to the server
- Receives processed data and saves it to file.txt
Key Libraries: requests, BeautifulSoup, json, re, socket
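
The extraction step could be sketched as below. This is a minimal illustration, not the code in client.py: the helper names `parse_int` and `build_record`, the dictionary keys, and the sample strings are all assumptions chosen to match the six fields listed above.

```python
import re


def parse_int(text):
    """Return the first integer found in text, ignoring thousands separators."""
    match = re.search(r"\d[\d,]*", text)
    if match is None:
        return None
    return int(match.group().replace(",", ""))


def build_record(company, model, year, trim, kilometer, price):
    """Assemble one car record from raw scraped strings (hypothetical layout)."""
    return {
        "company": company.strip(),
        "model": model.strip(),
        "year": parse_int(year),
        "trim": trim.strip(),
        "kilometer": parse_int(kilometer),
        "price": parse_int(price),
    }


# Made-up sample values, only to show the shape of a cleaned record.
record = build_record(" Peugeot ", "206", "2015", "Type 2", "120,000 km", "350,000,000")
print(record)
```

Normalizing numbers on the client side keeps the records that reach the server free of units and separators, which should simplify later arithmetic.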

2. server.py (Data Processor)

Purpose: Processes scraped car data using a custom spreadsheet-like language
Functionality:
- Listens for connections from the client
- Creates tables to store car data
- Implements a custom language for data manipulation
- Processes and analyzes car data
- Sends processed data back to the client
Key Features:
- Custom spreadsheet language with cell references (e.g., A1, B2)
- Arithmetic operations (+, -, *, /)
- Variable assignment and context management
- Hash-based cell addressing system
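
As a rough illustration of the features listed above, the sketch below evaluates expressions with cell references, the four arithmetic operators, and variable assignment. It is not taken from server.py: the `Sheet` class, its method names, and the `name = expr` statement syntax are assumptions, and the hash-based addressing is approximated with a plain Python dict keyed by the cell address.

```python
import re

CELL = re.compile(r"[A-Z]+[0-9]+")  # addresses such as A1 or B2


class Sheet:
    def __init__(self):
        self.cells = {}      # dict (a hash table): "A1" -> value
        self.variables = {}  # named context values, e.g. "discount" -> 0.1

    def lookup(self, name):
        store = self.cells if CELL.fullmatch(name) else self.variables
        return store[name]

    def run(self, statement):
        """Evaluate 'expr' or 'name = expr'; an assignment stores the result."""
        target, sep, expr = statement.partition("=")
        if not sep:
            return self.evaluate(statement)
        value = self.evaluate(expr)
        name = target.strip()
        store = self.cells if CELL.fullmatch(name) else self.variables
        store[name] = value
        return value

    def evaluate(self, expr):
        tokens = re.findall(r"\d+(?:\.\d+)?|[A-Za-z_]\w*|[-+*/()]", expr)
        pos = 0

        def peek():
            return tokens[pos] if pos < len(tokens) else None

        def take():
            nonlocal pos
            pos += 1
            return tokens[pos - 1]

        def factor():
            tok = take()
            if tok == "(":
                value = total()
                take()  # consume the closing ")"
                return value
            if tok == "-":
                return -factor()
            if tok[0].isdigit():
                return float(tok)
            return self.lookup(tok)

        def term():
            value = factor()
            while peek() in ("*", "/"):
                if take() == "*":
                    value *= factor()
                else:
                    value /= factor()
            return value

        def total():
            value = term()
            while peek() in ("+", "-"):
                if take() == "+":
                    value += term()
                else:
                    value -= term()
            return value

        return total()


sheet = Sheet()
sheet.run("A1 = 200")
sheet.run("B2 = 50")
sheet.run("discount = 0.1")
print(sheet.run("C1 = (A1 + B2) * (1 - discount)"))
```

The recursive-descent layout (`total`, `term`, `factor`) gives `*` and `/` higher precedence than `+` and `-` without a separate precedence table. Error handling for malformed input is left out to keep the sketch short.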

3. fileAnalyzer.py (Data Analyzer)

Purpose: Analyzes processed car data
Functionality:
- Reads processed data from file.txt
- Performs various statistical analyses:
  - Model comparison between companies
  - Production year analysis
  - Price analysis by company
  - Specific model analysis (e.g., Peugeot 206)
- Uses pandas and numpy for data manipulation
Key Libraries: json, numpy, pandas
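
The analyses listed above could look roughly like the following. The analyzer itself uses pandas and numpy; this sketch sticks to the standard library so it runs without extra packages, and the record layout, function names, and sample values are assumptions rather than the contents of file.txt.

```python
import json
from collections import defaultdict
from statistics import mean

# Assumed layout: a JSON array of records. Values are made up for illustration.
sample = json.loads("""[
  {"company": "Peugeot", "model": "206", "year": 2015, "price": 300},
  {"company": "Peugeot", "model": "206", "year": 2017, "price": 400},
  {"company": "Peugeot", "model": "405", "year": 2012, "price": 200},
  {"company": "OtherCo", "model": "X1", "year": 2016, "price": 500}
]""")


def average_price_by_company(records):
    """Price analysis by company: mean price per company."""
    prices = defaultdict(list)
    for rec in records:
        prices[rec["company"]].append(rec["price"])
    return {company: mean(values) for company, values in prices.items()}


def models_per_company(records):
    """Model comparison between companies: count of distinct models."""
    models = defaultdict(set)
    for rec in records:
        models[rec["company"]].add(rec["model"])
    return {company: len(names) for company, names in models.items()}


def select_model(records, company, model):
    """Specific model analysis: keep only one company/model pair."""
    return [r for r in records if r["company"] == company and r["model"] == model]


print(average_price_by_company(sample))
print(models_per_company(sample))
print([r["year"] for r in select_model(sample, "Peugeot", "206")])
```

With pandas, the first two functions would typically collapse into a `groupby("company")` followed by `mean()` or `nunique()` on the relevant column.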

Setup Instructions

Install Dependencies:

```
pip install requests beautifulsoup4 numpy pandas
```

Start the Server:
```
python server.py
```
The server will start listening on port 9999.

Run the Client:
```
python client.py
```
The client will scrape data, send it to the server, and save processed data to file.txt.

Analyze the Data:
```
python fileAnalyzer.py
```
The analyzer will process the data and print statistical results.

Data Flow

Scraping Phase:
- client.py scrapes car data from the website
- Data is sent to server.py via socket connection
Processing Phase:
- server.py processes data using custom spreadsheet language
- Processed data is sent back to client.py
Analysis Phase:
- fileAnalyzer.py reads processed data from file.txt
- Performs statistical analysis and prints results
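
The round trip between client.py and server.py can be sketched as one JSON message in each direction over a TCP socket. The wire format is an assumption here: newline-terminated JSON is one simple choice, and the function names are hypothetical. The demo binds to port 0 so the operating system picks a free port; the real server listens on port 9999.

```python
import json
import socket
import threading


def recv_line(conn):
    """Read until a newline arrives (newline framing is assumed by this sketch)."""
    buffer = b""
    while not buffer.endswith(b"\n"):
        chunk = conn.recv(4096)
        if not chunk:
            break
        buffer += chunk
    return buffer


def serve_once(listener, process):
    """Server side: accept one client, process its records, send them back."""
    conn, _ = listener.accept()
    with conn:
        records = json.loads(recv_line(conn))
        conn.sendall(json.dumps(process(records)).encode() + b"\n")


def exchange(host, port, records):
    """Client side: send scraped records and return the processed reply."""
    with socket.create_connection((host, port)) as conn:
        conn.sendall(json.dumps(records).encode() + b"\n")
        return json.loads(recv_line(conn))


def mark_seen(records):
    """Stand-in for the server's real processing step."""
    return [dict(rec, seen=True) for rec in records]


def demo():
    listener = socket.create_server(("127.0.0.1", 0))  # port 0: any free port
    port = listener.getsockname()[1]
    worker = threading.Thread(target=serve_once, args=(listener, mark_seen))
    worker.start()
    reply = exchange("127.0.0.1", port, [{"company": "Peugeot", "model": "206"}])
    worker.join()
    listener.close()
    return reply


print(demo())
```

In the real pipeline the client would write the reply to file.txt, for example with `json.dump`, so that fileAnalyzer.py can read it later.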

Key Features

- Modular Design: Each component has a distinct responsibility
- Custom Spreadsheet Language: Implemented in server.py for data manipulation
- Real-time Processing: Data is processed as it's scraped
- Statistical Analysis: Model, production-year, and price comparisons across companies

Notes

- The scraper uses hardcoded credentials for authentication
- The server implements a custom hash-based addressing system for cells
- Analysis results are printed to the console (can be modified to save to files)
- The system is designed to handle 500 pages of car listings

Output Files

- file.txt: Contains processed car data in JSON format
- Console output: Statistical analysis results from fileAnalyzer.py

Together, the three components cover web scraping, data processing, and analysis of car market data. Each component can be developed and tested independently, and they interoperate through shared data formats such as the JSON stored in file.txt.

Ali-Morajabi/BAMA-DataAnalysis
