mgrafals/search-engine-flask-bot

Custom Search Engine with Real-Time Data Aggregation

This project is a custom search engine web application. It scrapes results in real time from multiple search engines (Google, Bing, DuckDuckGo, Yahoo), stores search query data in a SQL Server database, and presents the results in a Flask web interface.

Features

  • Real-time query scraping using Selenium and BeautifulSoup
  • Data stored in a SQL Server database with structured tables for queries, URLs, and search term frequencies
  • Dynamic Flask front-end to display URLs by relevance
  • Duplicate filtering and frequency aggregation of search terms
  • Integrated search functionality from both the home and results pages
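The duplicate filtering and frequency aggregation listed above could look roughly like the sketch below. It is not taken from webscraping.py: the function names, the URL normalization rules, and the small `STOPWORDS` set (a stand-in for the NLTK stopword list) are all assumptions made for illustration, and ranking by the number of engines that returned a URL is only one plausible reading of "relevance".

```python
import re
from collections import Counter
from urllib.parse import urlsplit

# Hypothetical stand-in for NLTK's English stopword list.
STOPWORDS = {"a", "an", "and", "the", "of", "to", "in", "for", "with"}


def normalize_url(url):
    """Build a comparison key: lowercase host without 'www.', no scheme, no trailing slash."""
    parts = urlsplit(url.strip())
    host = parts.netloc.lower()
    if host.startswith("www."):
        host = host[4:]
    key = host + parts.path.rstrip("/")
    if parts.query:
        key += "?" + parts.query
    return key


def merge_results(results_by_engine):
    """Merge per-engine URL lists into (url, engine_count) pairs, most widely returned first."""
    hits = Counter()
    first_seen = {}
    for urls in results_by_engine.values():
        seen_here = set()
        for url in urls:
            key = normalize_url(url)
            if key in seen_here:
                continue  # the same engine listed this URL twice
            seen_here.add(key)
            first_seen.setdefault(key, url)  # keep the first spelling encountered
            hits[key] += 1
    # sorted() is stable, so ties keep their first-seen order.
    ranked = sorted(first_seen, key=lambda k: -hits[k])
    return [(first_seen[k], hits[k]) for k in ranked]


def term_frequencies(queries):
    """Count non-stopword terms across a list of search queries."""
    counts = Counter()
    for query in queries:
        for token in re.findall(r"[a-z0-9]+", query.lower()):
            if token not in STOPWORDS:
                counts[token] += 1
    return counts
```

In the real application the merged list would be written to the URL table and the term counts to the frequency table; the actual schema is defined in the SQL file and is not reproduced here.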

Technologies Used

  • Python, Flask
  • SQL Server (T-SQL)
  • Selenium, BeautifulSoup
  • HTML/CSS
  • Jupyter Notebook

How to Run

  1. Set Up the Database

    • Run the SQL file Custom Bot Database Query v2.0.sql in SQL Server Management Studio to initialize the schema.
  2. Set Up the Environment

    • Install required libraries:

      pip install flask selenium pyodbc nltk beautifulsoup4
    • Download NLTK corpora:

      import nltk
      nltk.download('stopwords')
      nltk.download('punkt')
      # Recent NLTK releases may also require: nltk.download('punkt_tab')
  3. Start the App

    • Run the Flask application from your terminal:

      python webscraping.py
    • Go to http://127.0.0.1:5000/ in your browser.
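Before starting the app, it can help to confirm that Python can reach the database created in step 1. The sketch below only builds an ODBC connection string of the kind `pyodbc.connect` accepts. The helper name, the database name `SearchEngineDB`, and the driver name are assumptions, not values taken from webscraping.py; substitute whatever your SQL Server instance and the SQL script actually use.

```python
def build_connection_string(server, database,
                            driver="ODBC Driver 17 for SQL Server",
                            trusted=True, user=None, password=None):
    """Assemble an ODBC connection string for SQL Server (hypothetical helper)."""
    parts = [
        f"DRIVER={{{driver}}}",   # braces are part of the ODBC syntax
        f"SERVER={server}",
        f"DATABASE={database}",
    ]
    if trusted:
        # Windows authentication; no user name or password is sent.
        parts.append("Trusted_Connection=yes")
    else:
        parts.append(f"UID={user}")
        parts.append(f"PWD={password}")
    return ";".join(parts) + ";"


# "SearchEngineDB" is a placeholder database name.
conn_str = build_connection_string("localhost", "SearchEngineDB")
print(conn_str)

# With pyodbc installed, the string would be used like this:
#   import pyodbc
#   conn = pyodbc.connect(conn_str)
```

If the connection fails, check that the named ODBC driver is installed and that the database name matches the one created by the SQL script.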

Project Structure

├── webscraping.py
├── app.ipynb
├── Custom Bot Database Query v2.0.sql
├── static/
│ ├── styles.css
│ └── logo.png
├── templates/
│ ├── index.html
│ └── search_results.html
└── presentation/
    └── Project Presentation.pptx
