This repository contains the code for a real-time election project. The system is built using Python, Kafka, Spark Streaming, Postgres, Streamlit, and Docker.

Naman-18/RealtimeElectionPipeline

Realtime Election Analysis

This project builds a system that generates real-time insights into election voting statistics and leaderboards, with the aim of a transparent and efficient voting process. The architecture streams, processes, and visualizes voting data so that end users see updates as votes arrive.

  • Real-Time Insights: Continuously updated voting statistics and leaderboards for instant visibility into election progress.
  • Transparency: Promotes fairness and openness by providing stakeholders with up-to-date and accurate data.
  • Scalability and Efficiency: Designed to handle high volumes of concurrent data without compromising performance.

Technologies Used

  • Kafka: For real-time data streaming and message brokering.
  • PostgreSQL: For storing and querying election data.
  • Python: The primary programming language used for data processing and analysis.
  • Spark Streaming: For processing data in real-time and performing analytics.
  • Streamlit: For creating the interactive web application.
  • Docker: For containerization, which simplifies deployment and keeps the application running consistently across environments.

Features

  • Real-time Data Visualization: Get live updates on voting statistics and metrics.
  • Dynamic Charts: Visualize data using pie charts and bar charts for better insights.
  • User-friendly Interface: Easy navigation through the dashboard for viewing election data.
  • Custom Refresh Interval: Users can set a refresh interval for real-time data updates.

[Image: app demo]

System Architecture

[Image: system architecture]

Data Model

[Image: data model]

Project Setup

Prerequisites

  • Python 3.9 or above installed on your machine
  • Docker Compose installed on your machine
  • Docker installed on your machine

Steps to Set Up the Environment

  1. Clone this repository.
  2. Navigate to the repository root, which contains the Docker Compose file.
  3. Run the following command to start the Zookeeper, Kafka, and Postgres containers in detached mode
docker-compose up -d
  4. Set up a virtual environment
python -m venv venv
source venv/bin/activate  # On Windows use `venv\Scripts\activate`
  5. Install the required packages
pip install -r requirements.txt
  6. Update config.json to match your system. The comments below are annotations for this README only; standard JSON parsers, including Python's json module, reject comments, so remove them from the actual file.
{
    // Database credentials
    "database": {
        "host": "localhost",
        "username": "election_user",
        "password": "election_pass",
        "db_name": "voting",
        "port": 5433
    },
    "tables": ["candidates", "voters", "votes"], // List of tables
    "randomuser_url": "https://randomuser.me/api/?nat=in", // RandomUser API base URL
    "parties": ["BJP", "INC", "TDP", "BSP", "SP", "AAP"], // List of political parties
    "total_candidates": 12, // Total number of candidates to generate
    "total_voters": 1000, // Total number of voters to generate
    "voting_interval": 0.5, // Voting simulation interval
    "kafka_topics": { // Kafka topics
        "votes_topic": "votes_topic"
    },
    "base_dir": "/Users/naman/Desktop/DataEngineering/RealtimeElectionPipeline/" // Base directory or root path
}
  7. Run setup.py to create the Postgres tables and generate data
python3 setup.py
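
As an illustration of how the settings in config.json shown above might be consumed, here is a minimal sketch that parses the database section and builds a libpq-style connection string. The helper names `load_config` and `build_dsn` are hypothetical and not taken from this repository, and the sketch assumes the annotation comments have already been removed from the file.

```python
import json

# Inline copy of the "database" section from config.json (comments removed),
# used here so the sketch runs without touching the file system.
SAMPLE_CONFIG = """
{
    "database": {
        "host": "localhost",
        "username": "election_user",
        "password": "election_pass",
        "db_name": "voting",
        "port": 5433
    }
}
"""


def load_config(text: str) -> dict:
    """Parse config.json content (hypothetical helper, not from the repo)."""
    return json.loads(text)


def build_dsn(db: dict) -> str:
    """Build a keyword/value connection string from the database settings."""
    return (
        f"host={db['host']} port={db['port']} dbname={db['db_name']} "
        f"user={db['username']} password={db['password']}"
    )


config = load_config(SAMPLE_CONFIG)
dsn = build_dsn(config["database"])
print(dsn)
```

A string in this form can be handed to a Postgres driver such as psycopg2 via `psycopg2.connect(dsn)`; whether setup.py connects this way is an assumption.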

Steps to Run the App

Terminal 1 -> Read the voter information from Postgres, generate voting data, and produce it to the Kafka topic:

python3 voting_app.py
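
The vote simulation described for Terminal 1 could look roughly like the sketch below. It is not the code in voting_app.py: the message fields (`voter_id`, `candidate_id`, `voting_time`, `vote`) and the function names are assumptions made for illustration, and the Kafka producer is replaced by a plain `send` callable so the example runs without a broker.

```python
import json
import random
import time
from datetime import datetime, timezone


def make_vote(voter_id, candidate_ids, rng):
    """Build one vote message for a voter (field names are assumptions)."""
    return {
        "voter_id": voter_id,
        "candidate_id": rng.choice(candidate_ids),
        "voting_time": datetime.now(timezone.utc).isoformat(),
        "vote": 1,
    }


def simulate_voting(voter_ids, candidate_ids, send, interval=0.0, rng=None):
    """Emit one JSON-encoded vote per voter through the `send` callable.

    In a real run, `send` would wrap a Kafka producer publishing to
    votes_topic, and `interval` would come from "voting_interval" in config.json.
    """
    rng = rng or random.Random()
    for voter_id in voter_ids:
        vote = make_vote(voter_id, candidate_ids, rng)
        send(json.dumps(vote).encode("utf-8"))
        if interval > 0:
            time.sleep(interval)


# Usage: collect messages in a list instead of sending them to Kafka.
sent = []
simulate_voting(["v1", "v2", "v3"], ["c1", "c2"], sent.append, rng=random.Random(42))
print(len(sent), "votes generated")
```

Passing the sender in as a callable keeps the generation logic independent of any particular Kafka client library.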

Terminal 2 -> Run the Spark Streaming job, which consumes the voting data from the Kafka topic, enriches it, calculates aggregates, and produces the results to specific Kafka topics:

python3 spark-streaming.py
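
To make the enrich-and-aggregate step more concrete, here is a plain-Python illustration of a per-candidate tally and a per-party total. It deliberately does not use Spark, and it is only a guess at the kind of aggregates the streaming job produces: the README does not list them, and the field names and sample candidates below are hypothetical.

```python
from collections import Counter


def build_leaderboard(votes, candidates):
    """Count votes per candidate and attach candidate details (enrichment)."""
    per_candidate = Counter(v["candidate_id"] for v in votes)
    return [
        {
            "candidate_id": cid,
            "name": candidates[cid]["name"],
            "party": candidates[cid]["party"],
            "total_votes": count,
        }
        for cid, count in per_candidate.most_common()  # highest count first
    ]


def party_totals(leaderboard):
    """Sum the candidate totals per party."""
    totals = Counter()
    for row in leaderboard:
        totals[row["party"]] += row["total_votes"]
    return dict(totals)


# Hypothetical sample data; party names are taken from config.json.
candidates = {
    "c1": {"name": "Candidate A", "party": "BJP"},
    "c2": {"name": "Candidate B", "party": "INC"},
    "c3": {"name": "Candidate C", "party": "BJP"},
}
votes = [{"candidate_id": c} for c in ["c2", "c1", "c2", "c3", "c2"]]

leaderboard = build_leaderboard(votes, candidates)
totals = party_totals(leaderboard)
print(leaderboard[0]["name"], totals)
```

In a streaming job, the same grouping would run continuously over incoming votes rather than over a fixed list.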

Terminal 3 -> Run the Streamlit app:

streamlit run streamlit_app/app.py
