Load data from files to MongoDB using Kafka and Spark
- Load data from a CSV file into Kafka (Kafka producer); a minimal producer sketch follows this list.
- Read messages from Kafka (Kafka consumer) as a Spark RDD and save them into MongoDB; an illustrative consumer sketch appears after the run instructions below.
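A minimal sketch of the producer step, assuming the standard kafka-clients producer API; the CSV path here is illustrative, and the topic name matches the one created below:

```scala
import java.util.Properties
import scala.io.Source
import org.apache.kafka.clients.producer.{KafkaProducer, ProducerConfig, ProducerRecord}
import org.apache.kafka.common.serialization.StringSerializer

object CsvToKafkaSketch {
  def main(args: Array[String]): Unit = {
    val props = new Properties()
    props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092")
    props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, classOf[StringSerializer].getName)
    props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, classOf[StringSerializer].getName)

    val producer = new KafkaProducer[String, String](props)
    val source = Source.fromFile("data/BX-Books.csv") // illustrative path
    try {
      // Skip the header line, then send each CSV row as one Kafka message.
      source.getLines().drop(1).foreach { line =>
        producer.send(new ProducerRecord[String, String]("TopicBooks", line))
      }
    } finally {
      source.close()
      producer.close() // flushes any buffered records before exiting
    }
  }
}
```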
Start ZooKeeper and the Kafka broker (each in its own terminal):
cd /mnt/work/installedApps/kafka_2.11-2.1.0
bin/zookeeper-server-start.sh config/zookeeper.properties
bin/kafka-server-start.sh config/server.properties
Create the topic the loader writes to:
bin/kafka-topics.sh --create --zookeeper localhost:2181 --replication-factor 1 --partitions 1 --topic TopicBooks
Optionally, watch messages arriving on the topic:
bin/kafka-console-consumer.sh --bootstrap-server localhost:9092 --topic TopicBooks --from-beginning
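The topic can also be created programmatically; a sketch using Kafka's AdminClient, with partition count and replication factor mirroring the kafka-topics.sh command above:

```scala
import java.util.{Collections, Properties}
import org.apache.kafka.clients.admin.{AdminClient, AdminClientConfig, NewTopic}

object CreateTopicSketch {
  def main(args: Array[String]): Unit = {
    val props = new Properties()
    props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092")

    val admin = AdminClient.create(props)
    try {
      // 1 partition, replication factor 1, same as the shell command
      val topic = new NewTopic("TopicBooks", 1, 1.toShort)
      admin.createTopics(Collections.singleton(topic)).all().get()
    } finally {
      admin.close()
    }
  }
}
```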
Alternatively, run a containerized Kafka cluster with a broker listening on localhost:9092. A docker-compose config file is provided that runs a single-broker Kafka cluster and a single-node ZooKeeper.
To start it: docker-compose -f kafka-docker/docker-compose.yml up -d
To shut it down: docker-compose -f kafka-docker/docker-compose.yml down
The ports are 2181 for ZooKeeper and 9092 for Kafka.
Build and run the loader:
sbt test:compile
sbt "runMain org.repl.kafkasparkmongo.BXBooksLoader"
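BXBooksLoader handles the consume-and-save step; the following is only an illustrative sketch of that pattern, assuming Spark's streaming kafka-0-10 integration and the MongoDB Spark connector (the class name, Mongo URI, and document shape are assumptions, not the project's actual code):

```scala
import org.apache.kafka.common.serialization.StringDeserializer
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}
import org.apache.spark.streaming.kafka010.KafkaUtils
import org.apache.spark.streaming.kafka010.LocationStrategies.PreferConsistent
import org.apache.spark.streaming.kafka010.ConsumerStrategies.Subscribe
import com.mongodb.spark.MongoSpark
import org.bson.Document

object KafkaToMongoSketch {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf()
      .setMaster("local[*]")
      .setAppName("KafkaToMongoSketch")
      .set("spark.mongodb.output.uri", "mongodb://localhost:27017/books.bx") // assumed database.collection
    val ssc = new StreamingContext(conf, Seconds(5))

    val kafkaParams = Map[String, Object](
      "bootstrap.servers" -> "localhost:9092",
      "key.deserializer" -> classOf[StringDeserializer],
      "value.deserializer" -> classOf[StringDeserializer],
      "group.id" -> "bx-books-loader",
      "auto.offset.reset" -> "earliest"
    )

    val stream = KafkaUtils.createDirectStream[String, String](
      ssc, PreferConsistent, Subscribe[String, String](Seq("TopicBooks"), kafkaParams))

    // Each micro-batch arrives as an RDD of Kafka records; wrap each
    // message value in a BSON document and save the batch to MongoDB.
    stream.foreachRDD { rdd =>
      val docs = rdd.map(record => new Document("raw", record.value()))
      MongoSpark.save(docs)
    }

    ssc.start()
    ssc.awaitTermination()
  }
}
```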