REST API for:
- Creating and managing event streams
- Sending data to event streams
Tests are run using tox: make test
For testing and linting we use pytest, flake8 and black.
Set up the local environment:
make init
Log in to AWS:
make login-dev
Set necessary environment variables:
export AWS_PROFILE=okdata-dev
export AWS_REGION=eu-west-1
export OKDATA_ENVIRONMENT=dev
Start the Flask app locally (it binds to port 5000 by default):
make run
Note: make init will not install the boto3 library, since this dependency is already installed on the server. Therefore you must either run make test (which installs boto3) or run .build_venv/bin/python -m pip install boto3 before make run.
Change port/environment:
export FLASK_ENV=development
export FLASK_RUN_PORT=8080
Deployment to both dev and prod is automatic via GitHub Actions on push to main. Alternatively, you can deploy from your local machine (requires saml2aws) with make deploy or make deploy-prod.
Create a new event stream: curl -H "Authorization: bearer $TOKEN" -H "Content-Type: application/json" --data '{}' -XPOST http://127.0.0.1:8080/{dataset-id}/{version}
Enable an event sink: curl -H "Authorization: bearer $TOKEN" -H "Content-Type: application/json" --data '{"type":"s3"}' -XPOST http://127.0.0.1:8080/{dataset-id}/{version}/sinks
Get all sinks: curl -H "Authorization: bearer $TOKEN" -XGET http://127.0.0.1:8080/{dataset-id}/{version}/sinks
Get a single sink: curl -H "Authorization: bearer $TOKEN" -XGET http://127.0.0.1:8080/{dataset-id}/{version}/sinks/{sink_type}
Disable an event sink: curl -H "Authorization: bearer $TOKEN" -H "Content-Type: application/json" -XDELETE http://127.0.0.1:8080/{dataset-id}/{version}/sinks/{sink_type}
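The same workflow can be scripted. Below is a minimal sketch using Python's requests library (an assumption, not a project dependency); BASE_URL, TOKEN, the dataset id and the version are placeholders you must supply yourself:

```python
import os

import requests  # assumed helper library, not a project dependency

BASE_URL = "http://127.0.0.1:8080"  # local Flask app from the steps above
TOKEN = os.environ["TOKEN"]         # bearer token, obtained elsewhere
DATASET_ID = "my-dataset"           # placeholder dataset id
VERSION = "1"                       # placeholder version

headers = {
    "Authorization": f"bearer {TOKEN}",
    "Content-Type": "application/json",
}

# Create a new event stream
r = requests.post(f"{BASE_URL}/{DATASET_ID}/{VERSION}", json={}, headers=headers)
r.raise_for_status()

# Enable an S3 event sink
r = requests.post(
    f"{BASE_URL}/{DATASET_ID}/{VERSION}/sinks", json={"type": "s3"}, headers=headers
)
r.raise_for_status()

# List all sinks on the stream
r = requests.get(f"{BASE_URL}/{DATASET_ID}/{VERSION}/sinks", headers=headers)
print(r.json())
```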
This is the base resource of your event stream. In other words, the Stream resource can be regarded as the event stream itself, while the Subscribable and Sink resources can be regarded as features on the event stream. The Stream's CloudFormation stack contains the following resources (a small write sketch follows the list):
- RawDataStream: Kinesis data stream dp.{confidentiality}.{dataset_id}.raw.{version}.json.
- RawPipelineTrigger: Lambda event source mapping from RawDataStream to pipeline-router.
- ProcessedDataStream: Kinesis data stream dp.{confidentiality}.{dataset_id}.processed.{version}.json.
- ProcessedPipelineTrigger: Lambda event source mapping from ProcessedDataStream to pipeline-router.
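As an illustration of writing to the raw stream directly, here is a minimal boto3 sketch. The confidentiality, dataset id, version and partition key below are hypothetical placeholders; the actual stream name must follow the pattern shown in the list above:

```python
import json

import boto3

# Stream name follows the pattern dp.{confidentiality}.{dataset_id}.raw.{version}.json;
# the values used here are placeholders.
stream_name = "dp.green.my-dataset.raw.1.json"

kinesis = boto3.client("kinesis", region_name="eu-west-1")

# Put a single event onto the raw data stream; pipeline-router is then invoked
# via the RawPipelineTrigger event source mapping.
kinesis.put_record(
    StreamName=stream_name,
    Data=json.dumps({"hello": "world"}).encode("utf-8"),
    PartitionKey="example-partition-key",
)
```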
The Subscribable resource can be regarded as a feature on the event stream that can be either enabled or disabled. If enabled, you can connect to the event-data-subscription websocket API and listen to events on your event stream (see the sketch below the resource list). The subscribable's CloudFormation stack consists of the following AWS resources:
- SubscriptionSource: Lambda event source mapping from ProcessedDataStream to event-data-subscription.
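For illustration only, here is a minimal sketch of listening to a subscribable stream with the Python websockets library. The endpoint URL and message format are assumptions; consult the event-data-subscription documentation for the actual API:

```python
import asyncio

import websockets  # assumed helper library, not a project dependency

# Hypothetical websocket endpoint; the real URL is defined by event-data-subscription.
WS_URL = "wss://example-websocket-endpoint/?dataset_id=my-dataset&version=1"


async def listen():
    async with websockets.connect(WS_URL) as ws:
        async for message in ws:
            # Each message is assumed to be an event from the ProcessedDataStream.
            print(message)


asyncio.run(listen())
```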
The Sink resources can be regarded as destinations that your event stream writes to, and that entities with access can read from. So far there are two different event sinks.
The S3 sink's CloudFormation stack contains the following AWS resources:
- SinkS3Resource: Kinesis Firehose delivery stream with source=ProcessedDataStream and destination=S3.
- SinkS3ResourceIAM: IAM role for consuming data from ProcessedDataStream and writing objects to S3. The role is used by SinkS3Resource.
The Elasticsearch sink's CloudFormation stack contains the following AWS resources:
- SinkElasticsearchResource: Kinesis Firehose delivery stream with source=ProcessedDataStream, destination=Elasticsearch, and S3 backup for failed documents.
- SinkElasticsearchResourceIAM: IAM role for consuming data from ProcessedDataStream and posting documents to Elasticsearch. The role is used by SinkElasticsearchResource.
- SinkElasticsearchS3BackupResourceIAM: IAM role for writing objects to S3. The role is used by SinkElasticsearchResource.
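To check that a sink's Firehose delivery stream was actually created, you can list the delivery streams with boto3. This is only a minimal sketch: the delivery stream naming convention is not documented here, so the substring filter on the dataset id is an assumption:

```python
import boto3

firehose = boto3.client("firehose", region_name="eu-west-1")

# List Firehose delivery streams in the account and keep those that mention
# the dataset id; the naming convention is assumed, not documented here.
response = firehose.list_delivery_streams(Limit=100)
matching = [name for name in response["DeliveryStreamNames"] if "my-dataset" in name]
print(matching)
```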
When an Elasticsearch sink is enabled and data has been stored (not backward compatible), you can access the data for a given date range through: {url}/streams/{dataset-id}/{version}/events?from_date={from_date}&to_date={to_date}
- Example (prod): https://api.data.oslo.systems/streams/renovasjonsbiler-status/1/events?from_date=2020-10-18&to_date=2020-10-19
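A minimal sketch of fetching events for a date range with Python's requests library (an assumption, not a project dependency), assuming the same bearer-token auth as the other endpoints:

```python
import os

import requests  # assumed helper library, not a project dependency

TOKEN = os.environ["TOKEN"]  # bearer token, obtained elsewhere

# Prod example from above: dataset renovasjonsbiler-status, version 1.
url = "https://api.data.oslo.systems/streams/renovasjonsbiler-status/1/events"
params = {"from_date": "2020-10-18", "to_date": "2020-10-19"}

r = requests.get(url, params=params, headers={"Authorization": f"bearer {TOKEN}"})
r.raise_for_status()
print(r.json())
```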