The example application consists of two services written in Scala: an ingestion service (code) and an aggregation service (code). The ingestion service subscribes to the Twitter Streaming API and receives fresh tweets filtered by a list of terms. Each raw tweet is published as JSON to the Kafka topic 'tweets'. The aggregation service consumes the raw tweets, parses them, and aggregates word counts in tumbling time windows, see the code here. Kafka Streams uses an embedded RocksDB instance to maintain local state. Every change to an aggregate is propagated to the topic 'aggregate'.
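The aggregation step described above can be sketched with the Kafka Streams DSL roughly as follows. This is a minimal illustration, not the project's actual code: the application id, bootstrap address, tokenization, and window size are assumptions, and error handling is omitted.

```scala
import java.time.Duration
import java.util.Properties

import org.apache.kafka.common.serialization.Serdes
import org.apache.kafka.streams.{KafkaStreams, KeyValue, StreamsBuilder, StreamsConfig}
import org.apache.kafka.streams.kstream.TimeWindows

object AggregationSketch extends App {
  val props = new Properties()
  // Hypothetical settings; the real project reads these from application.conf.
  props.put(StreamsConfig.APPLICATION_ID_CONFIG, "tweet-aggregation")
  props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092")
  props.put(StreamsConfig.DEFAULT_KEY_SERDE_CLASS_CONFIG, Serdes.String().getClass)
  props.put(StreamsConfig.DEFAULT_VALUE_SERDE_CLASS_CONFIG, Serdes.String().getClass)

  val builder = new StreamsBuilder()
  builder
    .stream[String, String]("tweets")                       // raw JSON tweets
    .flatMapValues { text: String =>                        // naive tokenizer (assumption:
      java.util.Arrays.asList(                              // real code parses JSON first)
        text.toLowerCase.split("\\W+"): _*)
    }
    .groupBy((_: String, word: String) => word)             // re-key by word
    .windowedBy(TimeWindows.ofSizeWithNoGrace(Duration.ofMinutes(1))) // tumbling window
    .count()                                                // word count per window,
                                                            // backed by local RocksDB
    .toStream
    .map { (windowedWord, count) =>                         // changes flow downstream
      KeyValue.pair(windowedWord.key(), count.toString)
    }
    .to("aggregate")

  val streams = new KafkaStreams(builder.build(), props)
  streams.start()
  sys.addShutdownHook(streams.close())
}
```

Because the count is materialized in the local RocksDB state store, every update to a window's count is emitted downstream as a changelog record, which is what lands on the 'aggregate' topic.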
Both services live in the same SBT project and are packaged into a single fat JAR containing all dependencies, which makes it easy to share code in this small example project. Both applications read application.conf at runtime via the Settings object, see code. I wrote a small build script that compiles the services, builds the Docker images, and runs the containers.
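A shared Settings object on top of Typesafe Config might look like the sketch below. The config keys shown here are assumptions for illustration; the project's actual application.conf may use different names.

```scala
import com.typesafe.config.{Config, ConfigFactory}

// Hypothetical shape of the shared Settings object. Assumed application.conf:
//
//   kafka {
//     bootstrap-servers = "localhost:9092"
//     topics {
//       tweets    = "tweets"
//       aggregate = "aggregate"
//     }
//   }
object Settings {
  // ConfigFactory.load() reads application.conf from the classpath,
  // so both services in the fat JAR see the same configuration.
  private val config: Config = ConfigFactory.load()

  val bootstrapServers: String = config.getString("kafka.bootstrap-servers")
  val tweetsTopic: String      = config.getString("kafka.topics.tweets")
  val aggregateTopic: String   = config.getString("kafka.topics.aggregate")
}
```

Keeping all topic names and broker addresses in one object avoids duplicating string literals across the ingestion and aggregation services.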
Read the full article:
- http://www.confluent.io/blog/introducing-kafka-streams-stream-processing-made-simple
- https://kafka.apache.org/documentation.html
- http://docs.confluent.io/3.0.0/streams/javadocs/index.html
- http://docs.confluent.io/3.0.0/streams/developer-guide.html#kafka-streams-dsl