Skip to content

This project demonstrates how to use Apache Kafka Streams to detect fraudulent activities by analyzing IP logs in real-time. By processing the streaming data, the system flags potential fraud by identifying suspicious patterns, such as repeated login attempts or access from unusual IP addresses.

Notifications You must be signed in to change notification settings

Sabareh/Fraud-Detection-Using-Kafka-Streams

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

11 Commits
 
 
 
 
 
 
 
 
 
 

Repository files navigation


Fraud Detection from IP Logs using Kafka Streams

Project Overview

This project demonstrates how to use Apache Kafka Streams to detect fraudulent activities by analyzing IP logs in real-time. By processing the streaming data, the system flags potential fraud by identifying suspicious patterns, such as repeated login attempts or access from unusual IP addresses. Kafka Streams offers a scalable and fault-tolerant solution for processing large-scale log data, making it ideal for real-time fraud detection.

Features

  • Real-time Processing: Streams IP logs continuously from Kafka topics for real-time detection.
  • Fraud Detection: Identifies and flags suspicious activities based on predefined rules and patterns.
  • Scalable Architecture: Leverages Kafka Streams for horizontal scalability and fault tolerance.
  • Customizable Rules: Allows for configurable fraud detection rules to adjust sensitivity.

Technologies Used

  • Kafka Streams: Core for stream processing.
  • Apache Kafka: Message broker for distributing IP log data.
  • Java: Programming language used for building Kafka Streams application.
  • Maven: Dependency management and build tool.
  • Zookeeper: Manages Kafka cluster and configuration.
  • Docker (Optional): For containerized deployment of Kafka and Zookeeper.
  • Prometheus & Grafana (Optional): For monitoring and visualizing Kafka metrics.

Getting Started

Prerequisites

  • Apache Kafka installed locally or available via a cluster (or Docker setup).
  • Java 8+ installed.
  • Maven installed.
  • Zookeeper installed (if running Kafka locally).

Installation

  1. Clone the repository:

    git clone https://github.com/sabareh/fraud-detection-using-kafka-streams.git
    cd fraud-detection-using-kafka-streams
  2. Install Dependencies:

    Navigate to the project directory and install Maven dependencies:

    mvn clean install
  3. Start Kafka and Zookeeper:

    If using Docker, you can use the provided docker-compose.yml:

    docker-compose up

    Alternatively, start Kafka and Zookeeper manually.

  4. Configure Kafka Topics:

    Create the necessary topics for streaming logs and fraud detection alerts.

    kafka-topics.sh --create --topic ip-logs --bootstrap-server localhost:9092 --partitions 3 --replication-factor 1
    kafka-topics.sh --create --topic fraud-alerts --bootstrap-server localhost:9092 --partitions 3 --replication-factor 1
  5. Run the Application:

    Use Maven to run the application:

    mvn exec:java -Dexec.mainClass=com.yourpackage.FraudDetectionApp
  6. Send IP Logs to Kafka:

    You can simulate IP log data by producing messages to the ip-logs topic. For example:

    kafka-console-producer.sh --topic ip-logs --bootstrap-server localhost:9092

    Send some test log data to see how the system reacts.

Customizing Fraud Detection Rules

Modify the CacheIPLookup.java file to adjust the detection logic. For example, you can change the rule that flags IPs with more than X failed logins within Y minutes or alerts on access from specific geo-locations.

Project Structure

  • src/main/java: Contains the Kafka Streams application and logic for fraud detection.
  • pom.xml: Maven project file containing dependencies and build configuration.
  • docker-compose.yml: Optional Docker configuration for running Kafka and Zookeeper locally.
  • prometheus.yml: (Optional) Configuration file for setting up Prometheus for monitoring.

Example Use Case

Imagine monitoring login activity for an online banking system. The fraud detection system identifies abnormal login patterns, such as multiple failed attempts from a single IP or login attempts from IP addresses associated with known proxies or malicious sources.

Future Enhancements

  • Implement more advanced fraud detection using machine learning models.
  • Integration with other real-time alerting services like Slack or email.
  • Store detected fraud events in a database for audit purposes.
  • Support for geo-fencing by integrating with a geo-location API.

Contributing

Feel free to fork this repository and make pull requests. Any contributions, suggestions, or improvements are welcome!

License

This project is licensed under the MIT License. See the LICENSE file for details.

About

This project demonstrates how to use Apache Kafka Streams to detect fraudulent activities by analyzing IP logs in real-time. By processing the streaming data, the system flags potential fraud by identifying suspicious patterns, such as repeated login attempts or access from unusual IP addresses.

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages