Commit 1fde040

Merge pull request #1 from smallintro/python_es_kafka
article save and query using elasticsearch
2 parents c8fd40b + 33dcb42 commit 1fde040

20 files changed

+644
-128
lines changed

.gitignore

Lines changed: 1 addition & 128 deletions
Original file line number | Diff line number | Diff line change
@@ -1,129 +1,2 @@
1-
# Byte-compiled / optimized / DLL files
2-
__pycache__/
3-
*.py[cod]
4-
*$py.class
1+
.idea
52

6-
# C extensions
7-
*.so
8-
9-
# Distribution / packaging
10-
.Python
11-
build/
12-
develop-eggs/
13-
dist/
14-
downloads/
15-
eggs/
16-
.eggs/
17-
lib/
18-
lib64/
19-
parts/
20-
sdist/
21-
var/
22-
wheels/
23-
pip-wheel-metadata/
24-
share/python-wheels/
25-
*.egg-info/
26-
.installed.cfg
27-
*.egg
28-
MANIFEST
29-
30-
# PyInstaller
31-
# Usually these files are written by a python script from a template
32-
# before PyInstaller builds the exe, so as to inject date/other infos into it.
33-
*.manifest
34-
*.spec
35-
36-
# Installer logs
37-
pip-log.txt
38-
pip-delete-this-directory.txt
39-
40-
# Unit test / coverage reports
41-
htmlcov/
42-
.tox/
43-
.nox/
44-
.coverage
45-
.coverage.*
46-
.cache
47-
nosetests.xml
48-
coverage.xml
49-
*.cover
50-
*.py,cover
51-
.hypothesis/
52-
.pytest_cache/
53-
54-
# Translations
55-
*.mo
56-
*.pot
57-
58-
# Django stuff:
59-
*.log
60-
local_settings.py
61-
db.sqlite3
62-
db.sqlite3-journal
63-
64-
# Flask stuff:
65-
instance/
66-
.webassets-cache
67-
68-
# Scrapy stuff:
69-
.scrapy
70-
71-
# Sphinx documentation
72-
docs/_build/
73-
74-
# PyBuilder
75-
target/
76-
77-
# Jupyter Notebook
78-
.ipynb_checkpoints
79-
80-
# IPython
81-
profile_default/
82-
ipython_config.py
83-
84-
# pyenv
85-
.python-version
86-
87-
# pipenv
88-
# According to pypa/pipenv#598, it is recommended to include Pipfile.lock in version control.
89-
# However, in case of collaboration, if having platform-specific dependencies or dependencies
90-
# having no cross-platform support, pipenv may install dependencies that don't work, or not
91-
# install all needed dependencies.
92-
#Pipfile.lock
93-
94-
# PEP 582; used by e.g. github.com/David-OConnor/pyflow
95-
__pypackages__/
96-
97-
# Celery stuff
98-
celerybeat-schedule
99-
celerybeat.pid
100-
101-
# SageMath parsed files
102-
*.sage.py
103-
104-
# Environments
105-
.env
106-
.venv
107-
env/
108-
venv/
109-
ENV/
110-
env.bak/
111-
venv.bak/
112-
113-
# Spyder project settings
114-
.spyderproject
115-
.spyproject
116-
117-
# Rope project settings
118-
.ropeproject
119-
120-
# mkdocs documentation
121-
/site
122-
123-
# mypy
124-
.mypy_cache/
125-
.dmypy.json
126-
dmypy.json
127-
128-
# Pyre type checker
129-
.pyre/

README.md

Lines changed: 91 additions & 0 deletions
@@ -1,2 +1,93 @@
11
# python-elasticsearch-with-kafka
22
Python Restful service with elasticsearch and kafka integration
3+
4+
### 1. Server setup
5+
1.1 Download and install Java 1.8 or higher
6+
1.2 Download and install [Kafka Server](https://kafka.apache.org/downloads)
7+
1.3 Download and install [Elasticsearch Server](https://www.elastic.co/downloads/elasticsearch)
8+
9+
### 2. Client setup
10+
2.1 Download and install [Python](https://www.python.org/downloads/)
11+
12+
2.2 Create a Python virtual environment
13+
> py -m venv eskafkavenv # can use any name in place of eskafkavenv
14+
15+
2.3 Activate the created virtual environment
16+
> eskafkavenv\Scripts\activate
17+
18+
### 3. Install Python packages
19+
3.1 pip install kafka-python
20+
3.2 pip install urllib3
21+
3.3 pip install certifi
22+
3.4 pip install elasticsearch
23+
Alternatively, download a package's tar.gz archive from its download page and install it with:
24+
> py -m pip install .\<python-package-name>.tar.gz # replace the <python-package-name>
25+
26+
## 4. Start Kafka server
27+
4.1 Start Zookeeper node instance
28+
> linux$ bin/zookeeper-server-start.sh config/zookeeper.properties
29+
> windows$ bin\windows\zookeeper-server-start.bat config\zookeeper.properties
30+
31+
4.2 Start Kafka server
32+
> linux$ bin/kafka-server-start.sh config/server.properties
33+
> windows$ bin\windows\kafka-server-start.bat config\server.properties
34+
35+
4.3 Once the Kafka service is up, create the Kafka topic
36+
> bin/kafka-topics.sh --create --zookeeper localhost:2181 --replication-factor 1 --partitions 1 --topic kafka-message-topic
37+
38+
4.4 Consume test messages
39+
> bin/kafka-console-consumer.sh --bootstrap-server localhost:9092 --topic kafka-message-topic --from-beginning
40+
41+
4.5 Produce a test message
42+
> bin/kafka-console-producer.sh --bootstrap-server localhost:9092 --topic kafka-message-topic
43+
> Hello Listeners
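
The console producer step above can also be sketched from Python with the kafka-python package installed in step 3.1. This is a minimal sketch, not part of the commit: it assumes the broker listens on localhost:9092 and reuses the topic name from step 4.3. The helper names `encode_message` and `send_test_message` are hypothetical, and the `kafka` import is deferred so that the module loads even where the package or broker is unavailable.

```python
import json

TOPIC = "kafka-message-topic"          # topic created in step 4.3
BOOTSTRAP_SERVERS = "localhost:9092"   # assumed default broker address


def encode_message(payload: dict) -> bytes:
    """Serialize a payload to UTF-8 JSON bytes, the form sent to Kafka."""
    return json.dumps(payload, sort_keys=True).encode("utf-8")


def send_test_message(text: str = "Hello Listeners") -> None:
    """Publish one message; needs kafka-python and a running broker."""
    from kafka import KafkaProducer  # third-party, imported lazily

    producer = KafkaProducer(
        bootstrap_servers=BOOTSTRAP_SERVERS,
        value_serializer=encode_message,  # applied to every value passed to send()
    )
    producer.send(TOPIC, {"message": text})
    producer.flush()   # block until buffered messages have been sent
    producer.close()
```

Calling `send_test_message()` while the console consumer from step 4.4 is running should show the JSON-encoded message in that terminal.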
44+
45+
## 5. Start Elasticsearch server
46+
Elasticsearch is a distributed, near-real-time search and analytics engine.
47+
Elasticsearch stores data as JSON documents, so it can also be used as a NoSQL database.
48+
> bin/elasticsearch.bat
49+
50+
- index: An index is roughly equivalent to a database in a relational database system
51+
- mapping: A mapping is roughly equivalent to a schema in a relational database system
52+
53+
## 6. Elasticsearch REST APIs
54+
6.1 Check that Elasticsearch is running
55+
> http://localhost:9200
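
The same check can be scripted. The sketch below uses only the Python standard library and assumes an unsecured local node at the URL above. The function name `elasticsearch_is_up` is hypothetical. A node with security enabled would answer with an HTTP error, which this sketch also reports as not available.

```python
import json
import urllib.request


def elasticsearch_is_up(base_url: str = "http://localhost:9200",
                        timeout: float = 2.0) -> bool:
    """Return True if the root endpoint answers with a JSON object."""
    try:
        with urllib.request.urlopen(base_url, timeout=timeout) as response:
            info = json.load(response)
    except (OSError, ValueError):
        # OSError covers refused connections, timeouts and HTTP errors;
        # ValueError covers a body that is not valid JSON.
        return False
    return isinstance(info, dict)
```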
56+
57+
6.2 Index APIs
58+
6.2.1 Create an index named articles
59+
> PUT http://localhost:9200/articles
60+
61+
6.2.2 Get the settings and mappings of the index named articles
62+
> GET http://localhost:9200/articles
63+
64+
6.2.3 Delete the index named articles
65+
> DELETE http://localhost:9200/articles
66+
67+
6.2.4 Create the index with explicit mappings (delete the index first if it already exists)
68+
> PUT http://localhost:9200/articles
69+
{
70+
"mappings": {
71+
"dynamic": "strict",
72+
"properties": {
73+
"author": {"type": "text"},
74+
"title": {"type": "text"},
75+
"publish_date": {"type": "date"}
76+
}
77+
}
78+
}
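
For scripted setups, the request above can be built with the Python standard library. This is a sketch under stated assumptions: the node runs at localhost:9200 without authentication, and `build_create_index_request` is a hypothetical helper name. Elasticsearch rejects a PUT for an index that already exists, so the index has to be deleted before it can be recreated with new mappings.

```python
import json
import urllib.request

# Same mapping as in the request body above; "strict" rejects unknown fields.
ARTICLES_MAPPING = {
    "mappings": {
        "dynamic": "strict",
        "properties": {
            "author": {"type": "text"},
            "title": {"type": "text"},
            "publish_date": {"type": "date"},
        },
    }
}


def build_create_index_request(base_url: str = "http://localhost:9200",
                               index: str = "articles") -> urllib.request.Request:
    """Build (but do not send) the PUT request that creates the index."""
    body = json.dumps(ARTICLES_MAPPING).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/{index}",
        data=body,
        method="PUT",
        headers={"Content-Type": "application/json"},
    )
```

Passing the returned object to `urllib.request.urlopen` sends it to a running node.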
79+
80+
6.3 Document APIs
81+
6.3.1 Add a document to the index. The id at the end of the URL is optional; if it is omitted, Elasticsearch generates one
82+
> POST http://localhost:9200/articles/_doc/1
83+
{"author": "Sushil", "title": "Small intro to Elasticsearch using Python", "publish_date": "2021-11-14"}
84+
85+
6.3.2 Get a document from index by id
86+
> GET http://localhost:9200/articles/_doc/1
87+
88+
6.3.3 Get all documents from index
89+
> GET http://localhost:9200/articles/_search
90+
91+
6.3.4 Delete a document
92+
> DELETE http://localhost:9200/articles/_doc/1
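
The add, get and delete calls share one URL shape, `/<index>/_doc/<id>`, which a small helper can capture. The sketch below uses only the standard library and builds requests without sending them. `document_request` is a hypothetical name, and the base URL is assumed to be an unsecured local node.

```python
import json
import urllib.request
from typing import Optional


def document_request(method: str, index: str, doc_id: Optional[int] = None,
                     body: Optional[dict] = None,
                     base_url: str = "http://localhost:9200") -> urllib.request.Request:
    """Build a request against the _doc endpoint of the given index."""
    url = f"{base_url}/{index}/_doc"
    if doc_id is not None:
        url += f"/{doc_id}"          # omit the id to let Elasticsearch assign one
    data = json.dumps(body).encode("utf-8") if body is not None else None
    headers = {"Content-Type": "application/json"} if data is not None else {}
    return urllib.request.Request(url, data=data, method=method, headers=headers)


article = {
    "author": "Sushil",
    "title": "Small intro to Elasticsearch using Python",
    "publish_date": "2021-11-14",
}
add_req = document_request("POST", "articles", 1, article)
get_req = document_request("GET", "articles", 1)
delete_req = document_request("DELETE", "articles", 1)
```

Each request can be sent with `urllib.request.urlopen` once the node is running. Note that the delete URL carries the index name, just like the get URL.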
93+

article-search/.gitignore

Lines changed: 3 additions & 0 deletions
@@ -0,0 +1,3 @@
1+
.idea
2+
*.iml
3+

article-search/README.md

Lines changed: 9 additions & 0 deletions
@@ -0,0 +1,9 @@
1+
## article-search service
2+
3+
- This is a demo application that saves and searches article information using Elasticsearch.
4+
- Kafka is used to send an error message as an alarm when an exception occurs.
5+
- The article-search service exposes a REST API using FastAPI and uvicorn.
6+
- Read the [README](https://github.com/smallintro/python-elasticsearch-with-kafka/README) file to know how to set up the environment to run and test this application.
7+
- Run the main.py file to start the service.
8+
- Access the REST service at [127.0.0.1:8080/docs](http://127.0.0.1:8080/docs)
9+

article-search/app/.gitignore

Lines changed: 130 additions & 0 deletions
@@ -0,0 +1,130 @@
1+
# Byte-compiled / optimized / DLL files
2+
__pycache__/
3+
*.py[cod]
4+
*$py.class
5+
6+
# C extensions
7+
*.so
8+
9+
# Distribution / packaging
10+
.Python
11+
build/
12+
develop-eggs/
13+
dist/
14+
downloads/
15+
eggs/
16+
.eggs/
17+
lib/
18+
lib64/
19+
parts/
20+
sdist/
21+
var/
22+
wheels/
23+
pip-wheel-metadata/
24+
share/python-wheels/
25+
*.egg-info/
26+
.installed.cfg
27+
*.egg
28+
MANIFEST
29+
30+
# PyInstaller
31+
# Usually these files are written by a python script from a template
32+
# before PyInstaller builds the exe, so as to inject date/other infos into it.
33+
*.manifest
34+
*.spec
35+
36+
# Installer logs
37+
pip-log.txt
38+
pip-delete-this-directory.txt
39+
40+
# Unit test / coverage reports
41+
htmlcov/
42+
.tox/
43+
.nox/
44+
.coverage
45+
.coverage.*
46+
.cache
47+
nosetests.xml
48+
coverage.xml
49+
*.cover
50+
*.py,cover
51+
.hypothesis/
52+
.pytest_cache/
53+
54+
# Translations
55+
*.mo
56+
*.pot
57+
58+
# Django stuff:
59+
*.log
60+
local_settings.py
61+
db.sqlite3
62+
db.sqlite3-journal
63+
64+
# Flask stuff:
65+
instance/
66+
.webassets-cache
67+
68+
# Scrapy stuff:
69+
.scrapy
70+
71+
# Sphinx documentation
72+
docs/_build/
73+
74+
# PyBuilder
75+
target/
76+
77+
# Jupyter Notebook
78+
.ipynb_checkpoints
79+
80+
# IPython
81+
profile_default/
82+
ipython_config.py
83+
84+
# pyenv
85+
.python-version
86+
87+
# pipenv
88+
# According to pypa/pipenv#598, it is recommended to include Pipfile.lock in version control.
89+
# However, in case of collaboration, if having platform-specific dependencies or dependencies
90+
# having no cross-platform support, pipenv may install dependencies that don't work, or not
91+
# install all needed dependencies.
92+
#Pipfile.lock
93+
94+
# PEP 582; used by e.g. github.com/David-OConnor/pyflow
95+
__pypackages__/
96+
97+
# Celery stuff
98+
celerybeat-schedule
99+
celerybeat.pid
100+
101+
# SageMath parsed files
102+
*.sage.py
103+
104+
# Environments
105+
.env
106+
.venv
107+
env/
108+
venv/
109+
ENV/
110+
env.bak/
111+
venv.bak/
112+
*.iml
113+
114+
# Spyder project settings
115+
.spyderproject
116+
.spyproject
117+
118+
# Rope project settings
119+
.ropeproject
120+
121+
# mkdocs documentation
122+
/site
123+
124+
# mypy
125+
.mypy_cache/
126+
.dmypy.json
127+
dmypy.json
128+
129+
# Pyre type checker
130+
.pyre/

article-search/app/requirements.txt

Lines changed: 9 additions & 0 deletions
@@ -0,0 +1,9 @@
1+
requests
2+
starlette
3+
pydantic
4+
fastapi
5+
uvicorn
6+
urllib3
7+
certifi
8+
kafka-python
9+
elasticsearch

article-search/app/src/__init__.py

Whitespace-only changes.
