|
| 1 | +# Care Pet ScyllaDB IoT example |
| 2 | + |
| 3 | +This is an example project that demonstrates a generic IoT use case with |
| 4 | +ScyllaDB in Python. |
| 5 | + |
| 6 | +The project simulates an IoT application for pet owners to monitor a variety |
| 7 | +of metrics about their pets (for example heart rate or temperature). |
| 8 | + |
| 9 | +The application has three modules: |
| 10 | + |
| 11 | +* Migrate (`python src/migrate.py`) - creates keyspace and tables in ScyllaDB |
| 12 | +* Sensor (`python src/sensor.py`) - generates random IoT data and inserts it into ScyllaDB |
| 13 | +* API (`python src/api.py`) - REST API service to fetch data from ScyllaDB |
| 14 | + |
| 15 | +## Get started |
| 16 | + |
| 17 | +### Prerequisites: |
| 18 | +* [Python 3.7+](https://www.python.org/downloads/) |
| 19 | +* [Virtualenv](https://virtualenv.pypa.io/en/latest/installation.html) |
| 20 | +* [docker](https://www.docker.com/) |
| 21 | +* [docker-compose](https://docs.docker.com/compose/) |
| 22 | + |
| 23 | +### Clone repository and install dependencies |
| 24 | +Clone the repository and open the root directory of the project: |
| 25 | +```bash |
| 26 | +git clone https://github.com/scylladb/care-pet |
| 27 | +cd care-pet/python |
| 28 | +``` |
| 29 | + |
| 30 | +Create a new virtual environment and activate it: |
| 31 | +```bash |
| 32 | +virtualenv env |
| 33 | +source env/bin/activate |
| 34 | +``` |
| 35 | + |
| 36 | +Install all Python dependencies: |
| 37 | +```bash |
| 38 | +pip install -r requirements.txt |
| 39 | +``` |
| 40 | + |
| 41 | +### Start Docker containers (skip this if you use Scylla Cloud) |
| 42 | +Spin up a local ScyllaDB cluster with three nodes using `docker` and `docker-compose`: |
| 43 | +```bash |
| 44 | +docker-compose up -d |
| 45 | + |
| 46 | +Creating carepet-scylla3 ... done |
| 47 | +Creating carepet-scylla2 ... done |
| 48 | +Creating carepet-scylla1 ... done |
| 49 | +``` |
| 50 | + |
| 51 | +This command starts three ScyllaDB nodes in containers: |
| 52 | +* `carepet-scylla1` |
| 53 | +* `carepet-scylla2` |
| 54 | +* `carepet-scylla3` |
| 55 | + |
| 56 | +You can inspect any of these nodes by using the `docker inspect` command, |
| 57 | +for example: |
| 58 | +```bash |
| 59 | +docker inspect carepet-scylla1 |
| 60 | + |
| 61 | +[ |
| 62 | + { |
| 63 | + "Id": "c87128b7d0ca4a31a84da78875c8b4181283c34783b6b0a78bffbacbbe45fcc2", |
| 64 | + "Created": "2023-01-08T21:17:13.212585687Z", |
| 65 | + "Path": "/docker-entrypoint.py", |
| 66 | + "Args": [ |
| 67 | + "--smp", |
| 68 | + "1" |
| 69 | + ], |
| 70 | + "State": { |
| 71 | + "Status": "running", |
| 72 | + "Running": true, |
| 73 | +... |
| 74 | +``` |
| 75 | +
|
| 76 | +### Connect to ScyllaDB and create the database schema |
| 77 | +To connect to your ScyllaDB storage within the container, you need to know the |
| 78 | +IP address of one of the running nodes. |
| 79 | +This is how you can get the IP address of the first node running in the container: |
| 80 | +```bash |
| 81 | +docker inspect -f '{{range .NetworkSettings.Networks}}{{.IPAddress}}{{end}}' carepet-scylla1 |
| 82 | +``` |
| 83 | +
|
| 84 | +You will need to reference this value multiple times later, so you may want |
| 85 | +to save it in a variable called `NODE1`: |
| 86 | +```bash |
| 87 | +NODE1=$(docker inspect -f '{{range .NetworkSettings.Networks}}{{.IPAddress}}{{end}}' carepet-scylla1) |
| 88 | +``` |
| 89 | +
|
| 90 | +Now you can run the migration script that creates the required keyspace and tables: |
| 91 | +```bash |
| 92 | +python src/migrate.py -h $NODE1 |
| 93 | +
|
| 94 | +Creating keyspace... |
| 95 | +Done. |
| 96 | +Migrating database... |
| 97 | +Done. |
| 98 | +``` |
| 99 | +
|
| 100 | +See the database schema using [cqlsh](https://cassandra.apache.org/doc/latest/cassandra/tools/cqlsh.html) in the container: |
| 101 | +
|
| 102 | +```bash |
| 103 | +docker exec -it carepet-scylla1 cqlsh |
| 104 | +cqlsh> DESCRIBE KEYSPACES; |
| 105 | +
|
| 106 | +carepet system_auth system_distributed_everywhere system_traces |
| 107 | +system_schema system system_distributed |
| 108 | +
|
| 109 | +cqlsh> USE carepet; |
| 110 | +cqlsh:carepet> DESCRIBE TABLES; |
| 111 | +
|
| 112 | +owner pet sensor sensor_avg measurement |
| 113 | +
|
| 114 | +cqlsh:carepet> DESCRIBE TABLE pet; |
| 115 | +
|
| 116 | +CREATE TABLE carepet.pet ( |
| 117 | + owner_id uuid, |
| 118 | + pet_id uuid, |
| 119 | + address text, |
| 120 | + age int, |
| 121 | + name text, |
| 122 | + weight float, |
| 123 | + PRIMARY KEY (owner_id, pet_id) |
| 124 | +) WITH CLUSTERING ORDER BY (pet_id ASC) |
| 125 | + AND bloom_filter_fp_chance = 0.01 |
| 126 | + AND caching = {'keys': 'ALL', 'rows_per_partition': 'ALL'} |
| 127 | + AND comment = '' |
| 128 | + AND compaction = {'class': 'SizeTieredCompactionStrategy'} |
| 129 | + AND compression = {'sstable_compression': 'org.apache.cassandra.io.compress.LZ4Compressor'} |
| 130 | + AND crc_check_chance = 1.0 |
| 131 | + AND dclocal_read_repair_chance = 0.0 |
| 132 | + AND default_time_to_live = 0 |
| 133 | + AND gc_grace_seconds = 864000 |
| 134 | + AND max_index_interval = 2048 |
| 135 | + AND memtable_flush_period_in_ms = 0 |
| 136 | + AND min_index_interval = 128 |
| 137 | + AND read_repair_chance = 0.0 |
| 138 | + AND speculative_retry = '99.0PERCENTILE'; |
| 139 | +
|
| 140 | +cqlsh:carepet> exit; |
| 141 | +
|
| 142 | +
|
| 143 | +``` |
| 144 | +
|
| 145 | +At this point you have ScyllaDB running with the correct keyspace and tables. |
| 146 | +
|
| 147 | +### Generate and ingest IoT data |
| 148 | +Start ingesting IoT data. Do this in a separate terminal, because the |
| 149 | +process runs indefinitely. Make sure you're still in the virtual |
| 150 | +environment: |
| 151 | +```bash |
| 152 | +source env/bin/activate |
| 153 | +NODE1=$(docker inspect -f '{{range .NetworkSettings.Networks}}{{.IPAddress}}{{end}}' carepet-scylla1) |
| 154 | +python src/sensor.py -h $NODE1 --measure 2 --buffer-interval 10 |
| 155 | +
|
| 156 | +Welcome to the Pet collar simulator |
| 157 | +New owner # 1cfbc0e5-6b05-476d-b170-2660cf40c02a |
| 158 | +New pet # 1a0800ee-7643-4794-af7b-2ecaaf7078fc |
| 159 | +New sensor(0) # b6155934-bd4e-47de-8649-1fad447aa036 |
| 160 | +New sensor(1) # d2c62c4d-9621-469d-b62c-41ef2271fca7 |
| 161 | +sensor # b6155934-bd4e-47de-8649-1fad447aa036 type T, new measure: 100.55118431400851, ts: 2023-01-08 17:36:17.126374 |
| 162 | +sensor # d2c62c4d-9621-469d-b62c-41ef2271fca7 type L, new measure: 37.486651732296835, ts: 2023-01-08 17:36:17.126516 |
| 163 | +``` |
| 164 | +
|
| 165 | +This command starts a script that generates random IoT data from two |
| 166 | +sensors every two seconds and inserts the data into ScyllaDB in batches |
| 167 | +every ten seconds. Data is actually inserted into ScyllaDB whenever |
| 168 | +`Pushing data` appears in the command line. |
| 169 | +
|
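The interplay between `--measure` and `--buffer-interval` can be illustrated with a small simulation. This is a minimal sketch, not the code in `src/sensor.py`: the `BufferedSensor` class, the `simulate` helper, the single simulated sensor, and the whole-second clock are all assumptions made for illustration. It only shows how readings accumulate in a buffer and are pushed as one batch per interval.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class BufferedSensor:
    """Hypothetical stand-in for one sensor loop (not the project's implementation)."""

    measure: int          # seconds between readings, like --measure
    buffer_interval: int  # seconds between batch inserts, like --buffer-interval
    buffer: List[Tuple[int, float]] = field(default_factory=list)
    flushed: List[List[Tuple[int, float]]] = field(default_factory=list)

    def tick(self, now: int, value: float) -> None:
        """Process one simulated second: maybe take a reading, maybe flush."""
        if now % self.measure == 0:
            self.buffer.append((now, value))
        if now % self.buffer_interval == 0:
            self.flush()

    def flush(self) -> None:
        """Move buffered readings into one batch (where a real script would insert)."""
        if self.buffer:
            self.flushed.append(self.buffer)
            self.buffer = []


def simulate(measure: int, buffer_interval: int, seconds: int) -> List[int]:
    """Return the size of each batch pushed during `seconds` simulated seconds."""
    sensor = BufferedSensor(measure, buffer_interval)
    for now in range(1, seconds + 1):
        sensor.tick(now, value=37.0)  # constant placeholder reading
    return [len(batch) for batch in sensor.flushed]


batch_sizes = simulate(measure=2, buffer_interval=10, seconds=20)
print(batch_sizes)
```

With `measure=2` and `buffer_interval=10`, each batch in this sketch holds five readings per sensor, which is why a longer buffer interval means fewer but larger writes.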
| 170 | +Optional: You can modify the frequency of the generated data by changing the |
| 171 | +`--measure` and `--buffer-interval` arguments. For example, |
| 172 | +you can generate new data points every three seconds and insert the batches |
| 173 | +every 30 seconds: |
| 174 | +```bash |
| 175 | +source env/bin/activate |
| 176 | +NODE1=$(docker inspect -f '{{range .NetworkSettings.Networks}}{{.IPAddress}}{{end}}' carepet-scylla1) |
| 177 | +python src/sensor.py -h $NODE1 --measure 3 --buffer-interval 30 |
| 178 | +``` |
| 179 | +
|
| 180 | +You can run multiple ingestion processes in parallel if you wish. |
| 181 | +
|
| 182 | +### Set up and test REST API |
| 183 | +In a new terminal, start running the API server (make sure that `port 8000` is free): |
| 184 | +```bash |
| 185 | +source env/bin/activate |
| 186 | +NODE1=$(docker inspect -f '{{range .NetworkSettings.Networks}}{{.IPAddress}}{{end}}' carepet-scylla1) |
| 187 | +python src/api.py -h $NODE1 |
| 188 | +
|
| 189 | +INFO: Started server process [696274] |
| 190 | +INFO: Waiting for application startup. |
| 191 | +INFO: Application startup complete. |
| 192 | +INFO: Uvicorn running on http://127.0.0.1:8000 (Press CTRL+C to quit) |
| 193 | +``` |
| 194 | +
|
| 195 | +The API server is running on `http://127.0.0.1:8000`. Use your browser |
| 196 | +or `curl` to check that it works properly: |
| 197 | +```bash |
| 198 | +curl http://127.0.0.1:8000 |
| 199 | +
|
| 200 | +{"message":"Pet collar simulator API"} |
| 201 | +``` |
| 202 | +
|
| 203 | +Next, you will test the following API endpoints: |
| 204 | +* `/api/owner/{owner_id}` |
| 205 | +
|
| 206 | + Returns all available data fields about the owner. |
| 207 | +* `/api/owner/{owner_id}/pets` |
| 208 | +
|
| 209 | + Returns the owner's pets. |
| 210 | +* `/api/pet/{pet_id}/sensors` |
| 211 | +
|
| 212 | + Returns all the sensors of a pet. |
| 213 | +
|
| 214 | +To test these endpoints, you need to provide either an `owner_id` or a `pet_id` |
| 215 | +as a URL path parameter. You can get these values by copying them from the |
| 216 | +beginning of the output of the ingestion script: |
| 217 | +```bash |
| 218 | +source env/bin/activate |
| 219 | +NODE1=$(docker inspect -f '{{range .NetworkSettings.Networks}}{{.IPAddress}}{{end}}' carepet-scylla1) |
| 220 | +python src/sensor.py -h $NODE1 --measure 1 --buffer-interval 6 |
| 221 | +
|
| 222 | +Welcome to the Pet collar simulator |
| 223 | +New owner # 1cfbc0e5-6b05-476d-b170-2660cf40c02a <-- This is what you need! |
| 224 | +New pet # 1a0800ee-7643-4794-af7b-2ecaaf7078fc <-- This is what you need! |
| 225 | +New sensor(0) # b6155934-bd4e-47de-8649-1fad447aa036 |
| 226 | +New sensor(1) # d2c62c4d-9621-469d-b62c-41ef2271fca7 |
| 227 | +``` |
| 228 | +
|
| 229 | +Copy the UUID values right after "New owner #" and "New pet #". A UUID value |
| 230 | +looks like this: |
| 231 | +``` |
| 232 | +1cfbc0e5-6b05-476d-b170-2660cf40c02a |
| 233 | +``` |
| 234 | +
|
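If you prefer not to copy the IDs by hand, they can be pulled out of the ingestion script's output with a regular expression. The snippet below is a sketch under the assumption that the output lines keep the `New owner # <uuid>` and `New pet # <uuid>` shape shown above; `extract_ids` and `ID_LINE` are hypothetical names, not part of the project.

```python
import re
import uuid
from typing import Dict

# Matches lines such as "New owner # 1cfbc0e5-..." and "New pet # 1a0800ee-...".
# Sensor lines ("New sensor(0) # ...") are intentionally not matched.
ID_LINE = re.compile(r"^New (owner|pet)\s+#\s+([0-9a-fA-F-]{36})", re.MULTILINE)


def extract_ids(output: str) -> Dict[str, uuid.UUID]:
    """Return the first owner and pet UUIDs found in the script output."""
    ids: Dict[str, uuid.UUID] = {}
    for kind, raw in ID_LINE.findall(output):
        # uuid.UUID raises ValueError if the captured text is not a valid UUID.
        ids.setdefault(kind, uuid.UUID(raw))
    return ids


sample_output = """Welcome to the Pet collar simulator
New owner # 1cfbc0e5-6b05-476d-b170-2660cf40c02a
New pet # 1a0800ee-7643-4794-af7b-2ecaaf7078fc
New sensor(0) # b6155934-bd4e-47de-8649-1fad447aa036
New sensor(1) # d2c62c4d-9621-469d-b62c-41ef2271fca7
"""

ids = extract_ids(sample_output)
print(ids["owner"])
print(ids["pet"])
```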
| 235 | +**Test `/api/owner/{owner_id}`** |
| 236 | +
|
| 237 | +Paste the owner id from the terminal into the endpoint URL and open it with |
| 238 | +your browser or use `curl`, for example: |
| 239 | +```bash |
| 240 | +curl http://127.0.0.1:8000/api/owner/4f42fb80-c209-4d19-8c43-daf554f1be23 |
| 241 | +
|
| 242 | +{"owner_id":"4f42fb80-c209-4d19-8c43-daf554f1be23","address":"home","name":"Vito Russell"} |
| 243 | +``` |
| 244 | +
|
| 245 | +**Test `/api/owner/{owner_id}/pets`** |
| 246 | +
|
| 247 | +Use the same owner id value to test this endpoint, for example: |
| 248 | +```bash |
| 249 | +curl http://127.0.0.1:8000/api/owner/4f42fb80-c209-4d19-8c43-daf554f1be23/pets |
| 250 | +
|
| 251 | +[{"owner_id":"4f42fb80-c209-4d19-8c43-daf554f1be23","pet_id":"44f1624e-07c2-4971-85a5-85b9ad1ff142","address":"home","age":20,"name":"Duke","weight":14.41481876373291}] |
| 252 | +``` |
| 253 | +
|
| 254 | +**Test `/api/pet/{pet_id}/sensors`** |
| 255 | +
|
| 256 | +Finally, use a pet id to test this endpoint, for example: |
| 257 | +```bash |
| 258 | +curl http://127.0.0.1:8000/api/pet/44f1624e-07c2-4971-85a5-85b9ad1ff142/sensors |
| 259 | +
|
| 260 | +[{"pet_id":"44f1624e-07c2-4971-85a5-85b9ad1ff142","sensor_id":"4bb1d214-712b-453b-b53a-ac5d4df4a1f8","type":"T"},{"pet_id":"44f1624e-07c2-4971-85a5-85b9ad1ff142","sensor_id":"e81915d6-1155-45e4-9174-c58e4cb8cecf","type":"L"}] |
| 261 | +``` |
| 262 | +
|
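The three `curl` requests above can also be scripted with the Python standard library. This is a minimal sketch, assuming the API server is listening on `http://127.0.0.1:8000` as in the steps above; `endpoint_urls` and `fetch_json` are hypothetical helper names, not part of the project.

```python
import json
import uuid
from typing import Any, Dict
from urllib.request import urlopen

BASE_URL = "http://127.0.0.1:8000"  # address printed by Uvicorn in the steps above


def endpoint_urls(owner_id: str, pet_id: str, base_url: str = BASE_URL) -> Dict[str, str]:
    """Build the three endpoint URLs, validating both IDs as UUIDs first."""
    owner = uuid.UUID(owner_id)  # raises ValueError on a malformed ID
    pet = uuid.UUID(pet_id)
    return {
        "owner": f"{base_url}/api/owner/{owner}",
        "pets": f"{base_url}/api/owner/{owner}/pets",
        "sensors": f"{base_url}/api/pet/{pet}/sensors",
    }


def fetch_json(url: str, timeout: float = 5.0) -> Any:
    """GET a URL and decode the JSON body (requires the API server to be running)."""
    with urlopen(url, timeout=timeout) as response:
        return json.load(response)


urls = endpoint_urls(
    "4f42fb80-c209-4d19-8c43-daf554f1be23",
    "44f1624e-07c2-4971-85a5-85b9ad1ff142",
)
for name, url in urls.items():
    print(name, url)
```

With the API server running, calling `fetch_json(urls["owner"])` should return the same JSON document that the corresponding `curl` command prints.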
| 263 | +## Structure |
| 264 | +Package structure: |
| 265 | +
|
| 266 | +| Name | Purpose | |
| 267 | +| ----------------------------------------| -------------------------------------| |
| 268 | +| [/src/db](/src/db) | Database config and client folder | |
| 269 | +| [/src/db/cql](/src/db/cql) | CQL scripts | |
| 270 | +| [/src/db/client.py](/src/db/client.py) | ScyllaDB client library | |
| 271 | +| [/src/server](/src/server) | FastAPI application folder | |
| 272 | +| [/src/server/app.py](/src/server/app.py)| FastAPI application | |
| 273 | +| [/src/api.py](/src/api.py) | Script to start the API server | |
| 274 | +| [/src/migrate.py](/src/migrate.py) | Schema creation | |
| 275 | +| [/src/sensor.py](/src/sensor.py) | IoT data ingestion | |