In today’s fast-paced digital world, real-time streaming analytics has become increasingly important because companies need to understand what their customers, applications, and products are doing right now and to react promptly. For example, companies analyse data in real time to continuously monitor an application for high service uptime, and to personalize promotional offers and product recommendations for customers.
This repository contains the sample code to build a real-time streaming analytics application using Apache Kafka on AWS. It forms the basis of the following tutorial:
The source code builds an Apache Flink application, and the provided AWS CloudFormation template creates the following resources on AWS:
- Amazon OpenSearch Cluster
- Amazon ECS Cluster and a Task definition
- Amazon Kinesis Data Analytics streaming application
- Amazon EC2 Instance to serve as Kafka client
- Amazon EC2 Instance to serve as Nginx proxy
- Security groups and AWS IAM roles
Use the following steps to build the Apache Flink application:
- Install the following dependencies:
- Move to the right directory:
cd flink-clickstream-consumer
- Build the Apache Flink application file:
mvn clean package
💡 A file named flink-clickstream-consumer/target/ClickStreamProcessor-1.0.jar
will be created. This is your Apache Flink application.
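The build steps above can be wrapped in a small script that fails early if the JAR was not produced. This is a minimal sketch, assuming Maven and a JDK are already on the `PATH` and that the script is run from the repository root. The helper names `build_flink_app` and `verify_jar` are illustrative and not part of this repository.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Artifact the build is expected to produce (path taken from the steps above).
JAR_PATH="flink-clickstream-consumer/target/ClickStreamProcessor-1.0.jar"

# Hypothetical helper: succeeds only if the given file exists and is not empty.
verify_jar() {
  local jar="$1"
  if [ -s "$jar" ]; then
    echo "Found Flink application: $jar"
  else
    echo "Build artifact not found: $jar" >&2
    return 1
  fi
}

# Hypothetical helper: runs the documented build in a subshell so the
# caller's working directory is left unchanged.
build_flink_app() {
  (cd flink-clickstream-consumer && mvn clean package)
}

# Usage (from the repository root):
#   build_flink_app && verify_jar "$JAR_PATH"
```

Keeping the check separate from the build makes it possible to verify an existing JAR without rebuilding it.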
Once you have built the Flink application, you can deploy the required resources on AWS to start building your real-time streaming analytics application. The required steps are detailed in the following tutorial:
See CONTRIBUTING for more information.
This project is licensed under the MIT-0 License. See the LICENSE file.