
naveennk045/AI-WebScraper


🌐 AI Web Scraper Chatbot

🚀 Unlock the Power of AI-Powered Web Scraping!

📖 Overview

Welcome to the AI Web Scraper Chatbot! This tool combines web scraping with an AI model to give you a chat interface for extracting and querying website data. Scrape a page, ask questions about its content, and pull out specific information, all from one interactive chatbot interface.

🎯 Features

  • 🖥️ Web Scraping: Enter a URL, and the tool will scrape the website content in real time.
  • 📄 DOM Content Viewer: Instantly preview the website's DOM content for better understanding.
  • 💬 Conversational Parsing: Ask questions about the content, and the chatbot will extract the information you need.
  • 🔍 Smart Data Extraction: Describe the data you want, and the AI model extracts the matching content from the page.

⚙️ Technologies Used

  • 💻 Frontend: Streamlit for a seamless and interactive user interface.
  • ⚙️ Backend: Python-based web scraping using Selenium, BeautifulSoup, and requests.
  • 🧠 AI Integration: LLaMA Model for smart text parsing and chatbot functionality.
  • 📦 Libraries:
    • Selenium for automated web interactions
    • BeautifulSoup for HTML parsing
    • LLaMA for AI-driven response generation
    • Streamlit for building interactive UI
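
As a rough illustration of the HTML parsing step, the sketch below pulls the visible text out of a page's markup and skips script and style contents. It is not this project's code: to stay dependency-free it uses Python's built-in `html.parser` rather than BeautifulSoup, and the names `VisibleTextExtractor` and `extract_visible_text` are made up for the example.

```python
from html.parser import HTMLParser


class VisibleTextExtractor(HTMLParser):
    """Collects text nodes, skipping the contents of <script> and <style>."""

    SKIPPED_TAGS = {"script", "style"}

    def __init__(self):
        super().__init__()
        self._skip_depth = 0  # > 0 while inside a skipped element
        self._parts = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED_TAGS:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED_TAGS and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep only non-empty text that is outside script/style blocks.
        if self._skip_depth == 0:
            text = data.strip()
            if text:
                self._parts.append(text)

    def text(self):
        return "\n".join(self._parts)


def extract_visible_text(html):
    """Return the visible text of an HTML string, one text node per line."""
    parser = VisibleTextExtractor()
    parser.feed(html)
    parser.close()
    return parser.text()


if __name__ == "__main__":
    sample = "<body><script>var x = 1;</script><h1>Title</h1><p>Hello</p></body>"
    print(extract_visible_text(sample))
```

With BeautifulSoup installed, the same idea is usually expressed by removing the `script` and `style` elements and then calling `get_text()` on the parsed document.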

🔧 How It Works

  1. Enter the URL you want to scrape.
  2. Preview the DOM content of the webpage.
  3. Ask questions or describe the specific data you want to extract.
  4. The AI model parses the content and returns answers relevant to your query.
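
One way steps 3 and 4 could work is sketched below: the scraped text is split into chunks small enough for the model, each chunk is wrapped in a prompt together with the user's description, and the non-empty answers are joined. This is an assumption about the approach, not the repository's implementation. The names `split_into_chunks` and `parse_with_model`, the prompt wording, the 6000-character limit, and the `ask_model` callable (a stand-in for whatever LLaMA client is used) are all hypothetical.

```python
def split_into_chunks(text, max_chars=6000):
    """Split text into consecutive pieces of at most max_chars characters."""
    if max_chars <= 0:
        raise ValueError("max_chars must be positive")
    return [text[i:i + max_chars] for i in range(0, len(text), max_chars)]


# Hypothetical prompt wording; the real project may phrase this differently.
PROMPT_TEMPLATE = (
    "Extract only the information matching this description: {description}\n"
    "Return an empty string if nothing matches.\n\n"
    "Content:\n{chunk}"
)


def parse_with_model(chunks, description, ask_model):
    """Query the model once per chunk and join the non-empty answers.

    ask_model is a hypothetical callable (prompt: str) -> str standing in
    for the project's actual LLaMA client.
    """
    answers = []
    for chunk in chunks:
        prompt = PROMPT_TEMPLATE.format(description=description, chunk=chunk)
        answer = ask_model(prompt).strip()
        if answer:
            answers.append(answer)
    return "\n".join(answers)


if __name__ == "__main__":
    # A fake model that "finds" a price only in chunks mentioning one.
    def fake_model(prompt):
        return "Price: $10" if "$10" in prompt else ""

    page_text = "Welcome to the shop. " * 5 + "The widget costs $10."
    chunks = split_into_chunks(page_text, max_chars=50)
    print(parse_with_model(chunks, "product prices", fake_model))
```

Passing the model in as a plain callable keeps the sketch testable without a running LLaMA instance.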

🛠️ Project Workflow

The system processes each request in four stages:

  1. Scrape Website → 2. Extract DOM → 3. Chat with AI → 4. Get Parsed Results

🎥 Demo Video

Check out the demo video of how this tool works:
Demo Video

🚀 Get Started

Installation:

  1. Clone the repository:

    git clone https://github.com/naveennk045/AI-WebScraper
  2. Install the required dependencies:

    pip install -r requirements.txt
  3. Run the Streamlit app:

    streamlit run app.py
  4. Usage:

    Open your browser and navigate to `http://localhost:8501`.  
    Enter a URL in the chatbox to scrape and start interacting!
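
The README does not say how the app checks the URL you enter. As a minimal sketch of one way to clean up that input before scraping, the hypothetical helper below (`normalize_url` is not a name from this project) trims whitespace, assumes HTTPS for a bare domain, and rejects anything that is not an http(s) address.

```python
from urllib.parse import urlparse


def normalize_url(raw):
    """Return a cleaned http(s) URL, or raise ValueError if it is unusable."""
    candidate = raw.strip()
    if not candidate:
        raise ValueError("URL is empty")
    # Assumption: a bare domain such as "example.com" is meant as HTTPS.
    if "://" not in candidate:
        candidate = "https://" + candidate
    parsed = urlparse(candidate)
    if parsed.scheme not in ("http", "https"):
        raise ValueError(f"unsupported scheme: {parsed.scheme}")
    if not parsed.netloc:
        raise ValueError("URL has no host")
    return candidate


if __name__ == "__main__":
    print(normalize_url("  example.com/products  "))
```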

🤝 Contributing

We welcome contributions! Feel free to fork the repository, raise issues, or submit PRs to help make this tool even better.

📞 Contact

For any queries or support, feel free to reach out:

📜 License

This project is licensed under the MIT License. See the LICENSE file for more details.

🏆 Acknowledgments

Special thanks to the open-source community and libraries that made this project possible! 🙏

About

AI-powered web scraper designed to extract, process, and analyze data from websites efficiently.
