🌐 Web Scraper App

A simple and extensible command-line tool for scraping web data.

Built with the following tools and technologies:

Python | Requests | BeautifulSoup | Pandas

Overview

This project is a command-line web scraping application built with Python. It allows users to extract structured data from different websites through an interactive menu. The scraped data is displayed in the console and can be optionally saved to a CSV file for further analysis.

The application is designed to be easily extensible, allowing new scrapers for other websites to be added with minimal effort.
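One way such extensibility could be arranged is a small registry that maps menu labels to scraper functions, so adding a site means adding one decorated function. The sketch below is illustrative only: `SCRAPERS`, `register`, `scrape_example`, and `menu_lines` are hypothetical names, not taken from this project's `scraper.py`, and the returned row is placeholder data.

```python
from typing import Callable, Dict, List

Row = Dict[str, str]
Scraper = Callable[[], List[Row]]

# Hypothetical registry: menu label -> scraper function.
SCRAPERS: Dict[str, Scraper] = {}


def register(label: str) -> Callable[[Scraper], Scraper]:
    """Decorator that stores a scraper function in the registry under `label`."""
    def decorator(func: Scraper) -> Scraper:
        SCRAPERS[label] = func
        return func
    return decorator


@register("Example site")
def scrape_example() -> List[Row]:
    # Placeholder row; a real scraper would fetch and parse a page here.
    return [{"title": "Sample item", "year": "2000"}]


def menu_lines() -> List[str]:
    """Build numbered menu entries from whatever is currently registered."""
    return [f"{i}. {label}" for i, label in enumerate(SCRAPERS, start=1)]


if __name__ == "__main__":
    print("\n".join(menu_lines()))
```

With this layout the menu never needs editing by hand: any function decorated with `register` shows up in it automatically.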

Features

- Interactive CLI: A user-friendly command-line interface to select a scraping target.
- Multiple Scrapers:
  - IMDb Top 250 Movies: Scrapes movie title, release year, duration, and IMDb rating.
  - Former Presidents of India: Scrapes the list of presidents from Wikipedia, including their name, lifespan, home state, and term of office.
- Data Export: Option to save the scraped data into a clean, well-formatted CSV file.
- Modular Design: The code is organized into modules for scraping, utilities, and the main application logic, promoting readability and maintainability.
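The data export step can be pictured as writing a list of row dictionaries to a file with a header row. The sketch below uses the standard library's `csv` module so it runs without extra dependencies. It is not the project's actual `utils.py`, which is built on Pandas. `save_rows_to_csv` is a hypothetical name, and the sample row in the usage snippet is placeholder data.

```python
import csv
from pathlib import Path
from typing import Dict, List


def save_rows_to_csv(rows: List[Dict[str, str]], path: str) -> int:
    """Write rows to `path` with a header row; return the number of data rows written."""
    if not rows:
        return 0  # nothing to write, so no file is created
    fieldnames = list(rows[0].keys())
    # newline="" lets the csv module control line endings on every platform.
    with Path(path).open("w", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=fieldnames)
        writer.writeheader()
        writer.writerows(rows)
    return len(rows)


if __name__ == "__main__":
    sample = [{"title": "Example Movie", "year": "2000"}]  # placeholder data
    print(save_rows_to_csv(sample, "example_output.csv"))
```

With Pandas installed, `pandas.DataFrame(rows).to_csv(path, index=False)` would produce an equivalent file.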

Getting Started

Prerequisites

Python 3.7+
The following Python libraries are required:
- requests
- beautifulsoup4
- pandas

Installation

Clone the repository (replace `your-username` with the account that hosts it):
```
git clone https://github.com/your-username/webscraper-app.git
```
Navigate to the project directory:
```
cd webscraper-app
```

It is recommended to create a virtual environment:

```
python -m venv venv
source venv/bin/activate  # On Windows, use `venv\Scripts\activate`
```

Install the required packages from requirements.txt:
```
pip install -r requirements.txt
```

Usage

Run the application from the root directory of the project:
```
python src/main.py
```
The console will display a menu of available scraping options. Follow the on-screen prompts to choose a website to scrape, view the results, and optionally save them to a CSV file.
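The prompt flow described above (show a menu, run the chosen scraper, print the rows, ask about saving) might look roughly like the following. This is a sketch under assumptions, not the contents of `src/main.py`: `run_once`, its parameters, and the prompt wording are hypothetical, and the input and output functions are injectable so the flow can be exercised without a terminal.

```python
from typing import Callable, Dict, List, Tuple

Row = Dict[str, str]


def run_once(
    options: Dict[str, Callable[[], List[Row]]],
    ask: Callable[[str], str] = input,
    show: Callable[[str], None] = print,
) -> Tuple[List[Row], bool]:
    """Show a numbered menu, run the chosen scraper, print its rows,
    and report whether the user asked to save them."""
    labels = list(options)
    for number, label in enumerate(labels, start=1):
        show(f"{number}. {label}")

    choice = ask("Select an option: ").strip()
    if not choice.isdecimal() or not 1 <= int(choice) <= len(labels):
        show("Invalid choice.")
        return [], False

    rows = options[labels[int(choice) - 1]]()
    for row in rows:
        show(", ".join(f"{key}: {value}" for key, value in row.items()))

    # The caller decides how to save; this sketch only records the answer.
    wants_save = ask("Save to CSV? (y/n): ").strip().lower() == "y"
    return rows, wants_save
```

Returning the rows together with the save decision keeps file handling out of the menu code, which fits the modular split between the main logic and the utilities.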

Project Structure

webscraper-app/
├── src/
│   ├── __init__.py
│   ├── main.py         # Main application entry point, handles user interaction
│   ├── scraper.py      # Contains all the web scraping logic
│   └── utils.py        # Utility functions (e.g., saving to CSV)
├── requirements.txt    # Lists project dependencies
└── README.md           # This file

License

This project is licensed under the MIT License. See the LICENSE file in the project root for details.

Contact

If you have any questions or feedback, feel free to reach out to me via my LinkedIn Profile.

Repository: brej-29/Logicmojo-AIML-Assignment-WebScrapperApp
