This project is a custom-built search engine web application that performs real-time web scraping across multiple search engines (Google, Bing, DuckDuckGo, Yahoo), stores search query data in a SQL Server database, and presents results in an organized and user-friendly interface.
- Real-time query scraping using Selenium and BeautifulSoup
- Data stored in a SQL Server database with structured tables for queries, URLs, and search term frequencies
- Dynamic Flask front-end to display URLs by relevance
- Duplicate filtering and frequency aggregation of search terms
- Integrated search functionality from both the home and results pages
- Python, Flask
- SQL Server (T-SQL)
- Selenium, BeautifulSoup
- HTML/CSS
- Jupyter Notebook
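The duplicate filtering and search-term frequency aggregation listed above can be sketched roughly as below. This is a simplified illustration, not the project's actual code: it uses a tiny hard-coded stopword set in place of NLTK's English stopword corpus, and the function names are made up for the example.

```python
from collections import Counter

# Stand-in stopword set; the project itself uses NLTK's English
# corpus (nltk.corpus.stopwords.words("english")).
STOPWORDS = {"the", "a", "an", "of", "in", "and", "to", "for"}

def aggregate_terms(queries):
    """Count how often each non-stopword term appears across queries."""
    counts = Counter()
    for query in queries:
        for term in query.lower().split():
            if term not in STOPWORDS:
                counts[term] += 1
    return counts

def dedupe_urls(urls):
    """Drop duplicate URLs while preserving first-seen order."""
    seen = set()
    unique = []
    for url in urls:
        if url not in seen:
            seen.add(url)
            unique.append(url)
    return unique
```

In the real app these aggregates would be written to the SQL Server frequency and URL tables rather than kept in memory.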
Set Up the Database
- Run the SQL file `Custom Bot Database Query v2.0.sql` in SQL Server Management Studio to initialize the schema.
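For orientation, a schema along these lines would match the "queries, URLs, and search term frequencies" tables described above. This is a hypothetical sketch only: the authoritative definitions live in `Custom Bot Database Query v2.0.sql`, and every table and column name here is an illustrative assumption.

```sql
-- Illustrative sketch; not the actual schema from the .sql file.
CREATE TABLE Queries (
    QueryId    INT IDENTITY(1,1) PRIMARY KEY,
    QueryText  NVARCHAR(400) NOT NULL,
    SearchedAt DATETIME2 DEFAULT SYSUTCDATETIME()
);

CREATE TABLE Urls (
    UrlId   INT IDENTITY(1,1) PRIMARY KEY,
    QueryId INT NOT NULL REFERENCES Queries(QueryId),
    Url     NVARCHAR(2048) NOT NULL,
    Engine  NVARCHAR(50) NOT NULL  -- Google, Bing, DuckDuckGo, or Yahoo
);

CREATE TABLE TermFrequencies (
    Term      NVARCHAR(200) NOT NULL PRIMARY KEY,
    Frequency INT NOT NULL DEFAULT 1
);
```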
Set Up the Environment
- Install required libraries:

      pip install flask selenium pyodbc nltk beautifulsoup4

- Download NLTK corpora:

      import nltk
      nltk.download('stopwords')
      nltk.download('punkt')
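With the environment in place, the parsing half of the scraping pipeline can be sketched as below. In the real app, Selenium's `driver.page_source` supplies the HTML; here a static snippet stands in for it, and the `div.result` / `a[href]` selectors are illustrative assumptions, since each engine's markup differs and changes over time.

```python
from bs4 import BeautifulSoup

# Static snippet standing in for a scraped results page fetched via
# Selenium; real tag and class names vary by search engine.
SAMPLE_HTML = """
<div class="result"><a href="https://example.com/a">Result A</a></div>
<div class="result"><a href="https://example.com/b">Result B</a></div>
"""

def extract_links(html):
    """Return (url, title) pairs for each result link in the page."""
    soup = BeautifulSoup(html, "html.parser")
    return [(a["href"], a.get_text(strip=True))
            for a in soup.select("div.result a[href]")]
```

The extracted pairs would then be deduplicated and inserted into the URLs table.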
Start the App
- Run the Flask application from your terminal:

      python webscraping.py
- Go to http://127.0.0.1:5000/ in your browser.
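The "search from both the home and results pages" flow boils down to two routes like the minimal stand-in below. This is not the contents of `webscraping.py`: the real app renders `templates/index.html` and `templates/search_results.html` and runs the scraping pipeline, and the `q` parameter name is an assumption for illustration.

```python
from flask import Flask, request

app = Flask(__name__)

@app.route("/")
def index():
    # Stand-in for templates/index.html, which contains the search form.
    return "<form action='/search'><input name='q'></form>"

@app.route("/search")
def search():
    query = request.args.get("q", "")
    # The real handler scrapes the engines, stores results in SQL
    # Server, and renders them ordered by relevance.
    return f"Results for: {query}"

if __name__ == "__main__":
    app.run(debug=True)
```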
Project Structure
├── webscraping.py
├── app.ipynb
├── Custom Bot Database Query v2.0.sql
├── static/
│ ├── styles.css
│ └── logo.png
├── templates/
│ ├── index.html
│ └── search_results.html
├── presentation/
│ └── Project Presentation.pptx