AWS framework to stream, classify, index and store data using Kinesis, S3, Elasticsearch and Lambda.
- Since we are using boto3 to access AWS components, follow all the instructions on this page.
- Install all Python dependencies (we are using Python 2.7): `pip install -r requirements.txt`
- Update the config in `config.py` with your s3_host, kinesis_stream and Elasticsearch endpoint.
- Upload `models/nb.model` to your S3 bucket.
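The contents of `config.py` are not shown here, but based on the names above it might look roughly like the sketch below. The bucket name, stream name and object key are illustrative assumptions; only the variable names and the Elasticsearch endpoint come from this README.

```python
# Hypothetical sketch of config.py -- s3_host, kinesis_stream and the
# Elasticsearch endpoint are named in the README; the bucket name,
# stream name and model key below are assumptions for illustration.
s3_host = "s3.amazonaws.com"
s3_bucket = "my-news-models"    # assumed bucket that holds models/nb.model
kinesis_stream = "news-stream"  # assumed stream name
es_endpoint = ("https://search-news-group-pcagpupl573mnu3scbh3wp63vu"
               ".us-east-1.es.amazonaws.com")


def upload_model(local_path="models/nb.model", key="nb.model"):
    """Upload the pre-trained model to S3.

    Requires AWS credentials already configured for boto3; boto3 is
    imported here so the config values can be read without it installed.
    """
    import boto3
    s3 = boto3.client("s3")
    s3.upload_file(local_path, s3_bucket, key)
```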
- Run the `get_kinesis` script and wait for it to be ready: `python get_kinesis.py`. This will classify, index and store the records from the Kinesis stream. It sometimes takes a little while before it starts consuming the stream, so please be patient.
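A consumer along the lines of `get_kinesis.py` typically fetches a shard iterator and then loops over `get_records`. The sketch below is a hedged outline, not the repo's actual code: `classify` and `index_doc` are hypothetical stand-ins for the script's model and Elasticsearch logic, and it reads only the first shard.

```python
import json


def decode_records(records):
    """Decode raw Kinesis records into Python dicts.

    Assumes each record's Data blob is a JSON document."""
    return [json.loads(r["Data"]) for r in records]


def consume(stream_name, classify, index_doc):
    """Hypothetical outline of get_kinesis.py: read records from the
    stream, classify each one, and index it into Elasticsearch."""
    import boto3  # deferred so the pure helper above is testable offline
    kinesis = boto3.client("kinesis")
    shard_id = kinesis.describe_stream(StreamName=stream_name)[
        "StreamDescription"]["Shards"][0]["ShardId"]
    it = kinesis.get_shard_iterator(
        StreamName=stream_name, ShardId=shard_id,
        ShardIteratorType="TRIM_HORIZON")["ShardIterator"]
    while True:
        out = kinesis.get_records(ShardIterator=it, Limit=100)
        for doc in decode_records(out["Records"]):
            doc["category"] = classify(doc)  # e.g. "comp.graphics"
            index_doc(doc)                   # store in Elasticsearch
        it = out["NextShardIterator"]
```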
- To create a stream of data into Kinesis, run the `put_kinesis` script in another terminal: `python put_kinesis.py`
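For reference, a minimal producer in the spirit of `put_kinesis.py` would call `put_record` once per document. This is a sketch under stated assumptions -- the stream name comes from your config, and the partition key here is an arbitrary choice:

```python
import json


def make_record(doc, partition_key="news"):
    """Build the kwargs for kinesis.put_record from a document dict.

    The "news" partition key is an assumption for illustration."""
    return {"Data": json.dumps(doc), "PartitionKey": partition_key}


def produce(stream_name, docs):
    """Hypothetical outline of put_kinesis.py: push each document
    onto the Kinesis stream as one record."""
    import boto3  # deferred so make_record is testable offline
    kinesis = boto3.client("kinesis")
    for doc in docs:
        kinesis.put_record(StreamName=stream_name, **make_record(doc))
```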
- You can check the indexing into Elasticsearch using a curl command like the example below (quoting the URL so the shell does not interpret `?` and `&`):
curl -XGET 'https://search-news-group-pcagpupl573mnu3scbh3wp63vu.us-east-1.es.amazonaws.com/news/news/_search?pretty&q=category:comp.graphics'
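The same search URL can also be assembled from Python with the standard library. The helper name below is ours; the endpoint and the `news/news` index path come from the curl command above:

```python
# Build the Elasticsearch URI-search URL for a given category,
# mirroring the curl example. Works on Python 2.7 and Python 3.
try:
    from urllib import urlencode          # Python 2.7, as used in this repo
except ImportError:
    from urllib.parse import urlencode    # Python 3

ES_ENDPOINT = ("https://search-news-group-pcagpupl573mnu3scbh3wp63vu"
               ".us-east-1.es.amazonaws.com")


def search_url(category, endpoint=ES_ENDPOINT):
    """Return the _search URL that queries the news index by category."""
    params = urlencode({"pretty": "true", "q": "category:" + category})
    return "{0}/news/news/_search?{1}".format(endpoint, params)
```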