Analyze real-time company news, extract sentiment and key topics, and generate Hindi audio summaries using transformer-based NLP and Google Text-to-Speech (gTTS).
This project fetches the latest news articles about a given company, performs sentiment analysis, extracts key topics, and generates a Hindi spoken summary using Text-to-Speech (TTS). It includes a REST API built with Flask and a user-friendly Streamlit UI.
🔍 Real-Time News Scraping from multiple online sources.
💬 Sentiment Analysis using pre-trained transformer models (Hugging Face).
📌 Key Topic Extraction with KeyBERT.
🗣️ Hindi Audio Summarization with Google Text-to-Speech (gTTS).
🌐 RESTful API for programmatic access.
🖥️ Streamlit Web App for interactive use.
🌍 Multilingual Support for broader accessibility.
🔓 Open-source and free — no paid API dependencies.
Backend API: Flask (Python)
Frontend: Streamlit
Scraping: Requests, BeautifulSoup4
Sentiment Model: Transformers (BERT-like)
Topic Extraction: KeyBERT
Text-to-Speech: gTTS (Google Text-to-Speech)
Deployment Ready: Flask API & Streamlit (local/cloud)
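As a rough illustration of how the Flask backend could expose the analysis, the sketch below validates a JSON payload and returns a report. The `/news_report` route and the `company_name` field come from the API example in this README; `build_report`, `handle_news_report`, and `create_app` are hypothetical names, and the report body is a placeholder rather than the project's actual implementation.

```python
from typing import Any, Dict, Tuple


def build_report(company_name: str) -> Dict[str, Any]:
    """Hypothetical placeholder: the real project would scrape news,
    score sentiment, extract topics, and synthesize Hindi audio here."""
    return {
        "summary": f"No articles fetched for {company_name} (placeholder).",
        "sentiment": "Neutral",
        "topics": [],
    }


def handle_news_report(payload: Any) -> Tuple[Dict[str, Any], int]:
    """Validate the request body and return (response_body, http_status)."""
    if not isinstance(payload, dict):
        return {"error": "Request body must be a JSON object."}, 400
    name = payload.get("company_name")
    if not isinstance(name, str) or not name.strip():
        return {"error": "'company_name' must be a non-empty string."}, 400
    return build_report(name.strip()), 200


def create_app():
    """Wire the handler into Flask. Flask is imported lazily so the
    validation logic above can be exercised without it installed."""
    from flask import Flask, jsonify, request

    app = Flask(__name__)

    @app.route("/news_report", methods=["POST"])
    def news_report():
        # silent=True makes get_json return None instead of raising on bad JSON.
        body, status = handle_news_report(request.get_json(silent=True))
        return jsonify(body), status

    return app

# To serve locally (assumed port 5000, matching the cURL example):
#   create_app().run(port=5000)
```

Keeping validation in a plain function separate from the route makes it easy to unit-test without starting a server.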
git clone https://github.com/amruthadevops/NewsSummarization
cd NewsSummarization
# For Windows
python -m venv venv
venv\Scripts\activate
# For macOS/Linux
python3 -m venv venv
source venv/bin/activate
pip install -r requirements.txt
python app.py
streamlit run app.py
Request Payload:
{
"company_name": "Tesla"
}
Example cURL:
curl -X POST -H "Content-Type: application/json" \
  -d '{"company_name": "Tesla"}' \
  http://localhost:5000/news_report
Response:
{
"summary": "Tesla stock surges amid strong quarterly results...",
"sentiment": "Positive",
"topics": ["electric vehicles", "earnings", "Elon Musk"]
}
Sentiment Analysis: Transformer models fine-tuned on financial/news datasets.
Topic Extraction: KeyBERT leverages BERT embeddings for keyword generation.
Hindi TTS: gTTS converts the summary text to Hindi speech, saved as an MP3 file for playback.
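The three steps above might be chained roughly as follows. This is a sketch under assumptions, not the project's code: it uses the default Hugging Face `sentiment-analysis` pipeline (whose model may differ from the one used here), KeyBERT's default embedding model, and hypothetical helper names (`normalize_label`, `analyze_article`, `speak_hindi`). gTTS only converts text to speech, so the summary passed to it is assumed to be in Hindi already; how the project produces the Hindi text is not described in this README.

```python
from typing import Dict


def normalize_label(raw_label: str) -> str:
    """Map model-specific labels (e.g. 'POSITIVE', 'neg') to report labels."""
    label = raw_label.strip().lower()
    if label.startswith("pos"):
        return "Positive"
    if label.startswith("neg"):
        return "Negative"
    return "Neutral"


def analyze_article(text: str, top_n: int = 3) -> Dict[str, object]:
    """Score sentiment and extract key topics for one article."""
    # Heavy dependencies are imported lazily; both download models on first use.
    from transformers import pipeline
    from keybert import KeyBERT

    classifier = pipeline("sentiment-analysis")
    result = classifier(text, truncation=True)[0]  # {'label': ..., 'score': ...}

    keywords = KeyBERT().extract_keywords(
        text, keyphrase_ngram_range=(1, 2), stop_words="english", top_n=top_n
    )  # list of (phrase, similarity) tuples

    return {
        "sentiment": normalize_label(result["label"]),
        "topics": [phrase for phrase, _score in keywords],
    }


def speak_hindi(hindi_text: str, out_path: str = "summary_hi.mp3") -> str:
    """Synthesize Hindi speech with gTTS and return the saved file path."""
    from gtts import gTTS

    gTTS(text=hindi_text, lang="hi").save(out_path)
    return out_path
```

In a long-running service, the classifier and the KeyBERT model would normally be created once at startup rather than on every call, since loading them is slow.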
Website structure changes may break web scraping.
Sentiment analysis may misclassify sarcasm or mixed sentiments.
TTS output is limited to gTTS’s pronunciation capabilities.
Currently optimized for Hindi; support for other languages is experimental.
No rate limiting or authentication yet — not production-hardened.
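To make the last limitation concrete, here is one possible shape for a per-client rate limit, written as a framework-independent sketch. `SlidingWindowLimiter` and its parameters are hypothetical and not part of this project; a real deployment would more likely rely on an existing Flask extension or an API gateway.

```python
import time
from collections import defaultdict, deque
from typing import Callable, Deque, Dict


class SlidingWindowLimiter:
    """Allow at most `max_requests` per client within `window_seconds`."""

    def __init__(self, max_requests: int, window_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self._clock = clock  # injectable so the limiter can be tested
        self._hits: Dict[str, Deque[float]] = defaultdict(deque)

    def allow(self, client_id: str) -> bool:
        """Record the request and return True if it is within the limit."""
        now = self._clock()
        hits = self._hits[client_id]
        # Drop timestamps that have fallen out of the window.
        while hits and now - hits[0] >= self.window_seconds:
            hits.popleft()
        if len(hits) >= self.max_requests:
            return False
        hits.append(now)
        return True
```

A request handler could call `allow()` with the caller's IP address or API key and respond with HTTP 429 when it returns `False`. State is kept in process memory, so this only works for a single-process server.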
📰 Add more diverse and resilient news sources.
🧠 Train or fine-tune sentiment models on financial news.
🌐 Add support for other languages (e.g., Tamil, Bengali, English).
🎙️ Integrate advanced Hindi TTS like Coqui or NVIDIA FastSpeech.
☁️ Deploy on AWS/GCP with public endpoints and authentication.
This project is licensed under the MIT License — feel free to use, contribute, and share!