Created by KING-258

Statistics

📈 Contribution Graph

Overview

This Python script fetches data from three sources (NewsAPI, GDELT, and Wikipedia) for a given search phrase, processes it, and sends the combined result to a Kafka message queue. The goal is to analyze news articles, global trends, and Wikipedia summaries within a fixed time limit, so that a single search never runs for long.
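
As a rough illustration of the "combine, then send" step described above, the sketch below merges results from the three sources into one dictionary and encodes it as JSON bytes, which is the form a Kafka producer would typically send. The function names `build_message` and `serialize`, the field names, and the sample values are all assumptions made for illustration; they are not taken from the repository's code.

```python
import json
from datetime import datetime, timezone


def build_message(phrase, news_articles, gdelt_events, wiki_summary):
    """Combine results from the three sources into one dictionary.

    The field names here are hypothetical, chosen only for this sketch.
    """
    return {
        "phrase": phrase,
        "fetched_at": datetime.now(timezone.utc).isoformat(),
        "newsapi": news_articles,
        "gdelt": gdelt_events,
        "wikipedia": wiki_summary,
    }


def serialize(message):
    """Encode the message as UTF-8 JSON bytes, ready to hand to a producer."""
    return json.dumps(message, ensure_ascii=False).encode("utf-8")


# Placeholder data standing in for real API responses.
message = build_message(
    "climate change",
    news_articles=[{"title": "Example headline", "source": "Example Source"}],
    gdelt_events=[{"url": "https://example.com/article"}],
    wiki_summary="Example summary text.",
)
payload = serialize(message)
```

With a client such as kafka-python, a function like `serialize` could be passed as the producer's `value_serializer`; whether this project does so is not stated here.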

Features

Multi-API Integration: Fetches data from NewsAPI, GDELT, and Wikipedia based on user input.
Time-limited Fetching: Stops API calls after 3 minutes or 5 pages of results, whichever comes first, to keep processing time bounded.
Kafka Integration: Sends the combined data to a Kafka queue for real-time streaming once web scraping completes.
Customizable Search: The time limit and page size can be adjusted.
Hadoop Usage: Hadoop stores the scraped data and is used to cross-reference new searches against earlier ones.
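
The time-limited fetching described above can be sketched as a loop that checks both a deadline and a page cap before each request. This is a minimal sketch under stated assumptions, not the project's actual implementation: `fetch_with_limits`, `fetch_page`, and `fake_fetch` are hypothetical names, and the stand-in fetcher replaces real API calls so the example runs without network access.

```python
import time

MAX_SECONDS = 180  # 3-minute budget from the feature list
MAX_PAGES = 5      # page cap from the feature list


def fetch_with_limits(fetch_page, max_seconds=MAX_SECONDS, max_pages=MAX_PAGES,
                      clock=time.monotonic):
    """Call fetch_page(page) until the time budget or page cap is reached.

    fetch_page is a hypothetical callable that takes a 1-based page number
    and returns a list of results; an empty list means no more results.
    """
    deadline = clock() + max_seconds
    results = []
    for page in range(1, max_pages + 1):
        if clock() >= deadline:
            break  # time budget exhausted before this request
        items = fetch_page(page)
        if not items:
            break  # the source has no further pages
        results.extend(items)
    return results


def fake_fetch(page):
    """Stand-in for a real API call: two items per page, seven pages total."""
    if page > 7:
        return []
    return [f"article-{page}-{i}" for i in range(2)]


articles = fetch_with_limits(fake_fetch)
```

Because the deadline is checked only before each request, a slow request that is already in flight can overshoot the budget; passing a per-request timeout to the HTTP client would tighten that bound.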

Requirements

Python 3.8 or above
Kafka
Hadoop
NewsAPI Client
GDELT API
Wikipedia API
Requests

Installation

Clone the Repository

git clone https://github.com/KING-258/BDA_Mini
cd BDA_Mini

