This repository is used to set up infrastructure when developing locally with Kafka/OpenSearch.
The repository consists of a set of docker-compose files, all of which are referenced in the `.env` file. This allows invoking `docker compose up <service-name>` on a service defined in any of the docker-compose files, from the root of the repository.
See also: https://docs.cheetah.trifork.dev/reference/development-infrastructure
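This works via the `COMPOSE_FILE` variable, which docker compose reads from `.env` as a colon-separated list of files. A hypothetical sketch (the actual file names in this repository's `.env` may differ):

```sh
# .env (illustrative) - lets `docker compose` see services from every listed file
COMPOSE_FILE=docker-compose/kafka.yaml:docker-compose/opensearch.yaml:docker-compose/keycloak.yaml
```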
For example:

```sh
docker compose up --quiet-pull
```
Prerequisites:

- Follow: https://docs.cheetah.trifork.dev/getting-started/guided-tour/prerequisites#run-standard-jobs
- Run:

  ```sh
  docker network create cheetah-infrastructure
  ```
The infrastructure requires a lot of resources, especially memory when running all services at once.
Here is some basic profiling done while running through WSL2 with 16GB RAM:
```sh
# See if your docker supports memory limits
docker info --format '{{json .MemoryLimit}}'
# Get total memory for docker
docker info --format '{{json .MemTotal}}' | numfmt --from=auto --to=iec
# Get total CPUs for docker
docker info --format '{{json .NCPU}}'
```
| Profile | MEM USAGE / LIMIT |
|---|---|
| core | 2.4GB / 4.4GB |
| kafka | 1.3GB / 2.2GB |
| opensearch | 1.9GB / 2.9GB |
| full | 2.9GB / 5.2GB |
Estimated requirements:

| Profile | CPUs | Docker available memory (RAM) | Disk space (Images) |
|---|---|---|---|
| Minimum | 2 | 4GB | >6.6GB |
| Recommended | 8 | 8GB | >20GB |
| Best | 16 | 16GB | >40GB |
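If you run Docker through WSL2, as in the profiling above, the memory and CPUs available to Docker can be tuned via a `.wslconfig` file in your Windows user profile; a minimal sketch (the values shown are examples, not requirements):

```ini
[wsl2]
memory=8GB
processors=8
```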
The development infrastructure follows the Reference Security Model.
For local development we use Keycloak, defined in `docker-compose/keycloak.yaml`, as a local IDP.
See sections below for details on security model configuration.
The Kafka setup consists of the following services:

- `kafka` - Strimzi Kafka with the Cheetah Kafka Authorizer
- `zookeeper` - Strimzi Zookeeper
- `redpanda` - Redpanda Console, which provides a user interface to manage multiple Kafka Connect clusters. See https://docs.redpanda.com/docs/manage/console/
- `kafka-setup` - A bash script which sets up a Kafka user for Redpanda to use when connecting to Kafka, as well as some predefined topics. The topics to be created are determined by the environment variable `INITIAL_KAFKA_TOPICS`, which can be set in the `.env` file or overridden in your local environment.
- `schema-registry` - Schema registry
- `kafka-minion` - Kafka Prometheus exporter
Run:

```sh
docker compose --profile=kafka --profile=oauth --profile=schemaregistry --profile=redpanda up -d
```
When all of the services are running, you can go to:

- http://localhost:9898/topics to see the different topics in Redpanda Console.
- http://localhost:8081 to see the schema-registry UI.
- http://localhost:8081/apis to see the schema-registry API documentation.
Five different listeners are set up for Kafka on different internal and external ports (see server.properties for the configuration):

- `localhost:9092` - Used for connecting to Kafka with OAuth2 authentication from outside the docker environment.
- `localhost:9093` - Used for connecting to Kafka without authentication from outside the docker environment.
- `kafka:19092` - Used for connecting to Kafka with OAuth2 authentication from a docker container in the `cheetah-infrastructure` docker network.
- `kafka:19093` - Used for connecting to Kafka without authentication from a docker container in the `cheetah-infrastructure` docker network.
- `kafka:19094` - Only used by Redpanda, since it does not support OAuth2.
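For a quick smoke test of the unauthenticated external listener, you can list broker metadata; a sketch assuming `kcat` (formerly kafkacat) is installed:

```sh
# List brokers, topics and partitions via the unauthenticated listener
kcat -b localhost:9093 -L
```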
To require OAuth2 authentication when connecting to Kafka, you can remove `;User:ANONYMOUS` from the `super.users` property in server.properties. This will cause all connections from unauthenticated sources to be rejected by the `CheetahKafkaAuthorizer`.
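For reference, a client connecting through one of the OAuth2 listeners would use SASL/OAUTHBEARER. A sketch with `kcat` (requires a build against librdkafka 1.9.2+ for OIDC support), using the `default-access` client described in the Keycloak section below; the `SASL_PLAINTEXT` protocol and the `kafka` scope are assumptions to verify against server.properties and your client's scopes:

```sh
kcat -b localhost:9092 -L \
  -X security.protocol=SASL_PLAINTEXT \
  -X sasl.mechanism=OAUTHBEARER \
  -X sasl.oauthbearer.method=oidc \
  -X sasl.oauthbearer.client.id=default-access \
  -X sasl.oauthbearer.client.secret=default-access-secret \
  -X sasl.oauthbearer.scope=kafka \
  -X sasl.oauthbearer.token.endpoint.url=http://localhost:1852/realms/local-development/protocol/openid-connect/token
```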
The OpenSearch setup consists of the following services:

- OpenSearch - OpenSearch data storage solution
- OpenSearch Dashboard - Dashboard solution for interacting with the OpenSearch API
- OpenSearch Configurer - Uses the OpenSearch Template Configuration Script to set up Index Templates and more. Files placed in any subdirectory of config/opensearch-configurer/ are automatically applied to the OpenSearch instance; a sketch of such a file follows below.
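As an illustration, such a file could contain an ordinary OpenSearch index template. This is a hypothetical example; the file name and the `my-data-*` index pattern are made up, and the exact layout the configurer expects should be checked against the OpenSearch Template Configuration Script documentation:

```json
{
  "index_patterns": ["my-data-*"],
  "template": {
    "settings": { "number_of_shards": 1 },
    "mappings": {
      "properties": {
        "timestamp": { "type": "date" },
        "message": { "type": "text" }
      }
    }
  }
}
```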
Run:

```sh
docker compose --profile=opensearch --profile=opensearch_dashboard up -d
```
When all of the services are running, you can go to:

- http://localhost:9200/ to reach the OpenSearch API
- http://localhost:5602 to see the dashboard UI
- http://localhost:9200/_cat/indices to see all current indices
Services should connect using the OAuth2 protocol. You can choose to set `DISABLE_SECURITY_DASHBOARDS_PLUGIN=true` and `DISABLE_SECURITY_PLUGIN=true` to disable security completely.

When working locally, you can use the `admin:admin` user and query OpenSearch like this:

```sh
curl -k -s -u "admin:admin" $OPENSEARCH_URL/_cat/indices
```
If you do not want to use basic auth locally, you can get a token using this curl command:

```sh
ACCESS_TOKEN=$(curl -s -X POST $OPENSEARCH_TOKEN_URL \
  -H "Content-Type: application/x-www-form-urlencoded" \
  -d "grant_type=client_credentials&client_id=$OPENSEARCH_CLIENT_ID&client_secret=$OPENSEARCH_CLIENT_SECRET&scope=$OPENSEARCH_SCOPE" \
  | jq -r '.access_token')
# Alternative without jq: replace the jq line with
#   | grep -o '"access_token":"[^"]*' | grep -o '[^"]*$')
```
And query OpenSearch like this:

```sh
curl -k -s -H "Authorization: Bearer $ACCESS_TOKEN" $OPENSEARCH_URL/_cat/indices
```
The Timescale setup consists of the following services:

- TimescaleDB - PostgreSQL with the timescale extension
- PgAdmin - GUI for managing TimescaleDB
Run:

```sh
docker compose --profile=timescale up -d
```
When all of the services are running, you can go to:

- localhost:5432 to connect to TimescaleDB (PostgreSQL wire protocol, not HTTP)
- http://localhost:5050 to see the PgAdmin UI

By default a single user is set up:

- Username: `postgres`, Password: `admin`
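To verify the database and the timescale extension from the command line, assuming `psql` is installed locally and the default `postgres` database is used (an assumption):

```sh
# Connect with the default credentials and check the extension is available
psql "postgresql://postgres:admin@localhost:5432/postgres" \
  -c "SELECT extname, extversion FROM pg_extension WHERE extname = 'timescaledb';"
```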
List of profiles:
- full
- core
- kafka
- opensearch
- observability
- timescale
The table below shows which services each profile starts.
| Images / profiles | kafka-core | opensearch-core | schema-registry-core | core | kafka | opensearch | observability | timescale | full |
|---|---|---|---|---|---|---|---|---|---|
| Keycloak | x | x | x | x | x | x | x | x | |
| Kafka | x | x | x | x | x | x | | | |
| Redpanda console | x | x | | | | | | | |
| Opensearch | x | x | x | x | | | | | |
| Opensearch dashboard | x | x | | | | | | | |
| Opensearch configurer | x | x | x | x | | | | | |
| Schema registry | x | x | x | x | | | | | |
| Prometheus | x | x | | | | | | | |
| Grafana | x | x | | | | | | | |
| Timescale | x | | | | | | | | |
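Profiles can be combined on the command line. For example, to start the core services together with Timescale:

```sh
docker compose --profile=core --profile=timescale up -d
```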
Keycloak is used as a local identity provider, to be able to mimic a production security model with service-to-service authentication.
- OpenID Endpoint Configuration: http://localhost:1852/realms/local-development/.well-known/openid-configuration
- Token Endpoint: http://localhost:1852/realms/local-development/protocol/openid-connect/token
A set of default clients has been defined which covers the most common use cases.
All roles are mapped to the `roles` claim in the JWT. This configuration is defined in local-development.json and is applied to Keycloak using the `keycloak-setup` service.

To modify the configuration, either go to the admin console (Username: `admin`, Password: `admin`) or edit `local-development.json` following this guide.
- Default access
  - Description: Read and write access to all data in Kafka, OpenSearch and Schema registry
  - client_id: `default-access`
  - client_secret: `default-access-secret`
  - default_scopes: [ ]
  - optional_scopes:
    - `kafka`
      - Roles: `Kafka_*_all`
    - `opensearch`
      - Roles: `opensearch_default_access`
    - `schema-registry`
      - Roles: `sr-producer`
- Default write
  - Description: Write access to all data in Kafka, OpenSearch and Schema registry
  - client_id: `default-write`
  - client_secret: `default-write-secret`
  - default_scopes: [ ]
  - optional_scopes:
    - `kafka`
      - Roles: `Kafka_*_write`
    - `opensearch`
      - Roles: `opensearch_default_write`, `opensearch_default_delete`
    - `schema-registry`
      - Roles: `sr-producer`
- Default read
  - Description: Read access to all data in Kafka, OpenSearch and Schema registry
  - client_id: `default-read`
  - client_secret: `default-read-secret`
  - default_scopes: [ ]
  - optional_scopes:
    - `kafka`
      - Roles: `Kafka_*_read`
    - `opensearch`
      - Roles: `opensearch_default_read`
- Users
  - Description: User login via browser, such as OpenSearch Dashboard (see the Users list below for user details)
  - client_id: `users`
  - client_secret: `users-secret`
  - default_scopes: [ ]
  - optional_scopes: `kafka`, `opensearch`, `schema-registry`
- Custom client
  - Description: A custom client which can be configured using environment variables. Useful for pipelines where services require custom roles.
  - client_id: `$DEMO_CLIENT_NAME`
  - client_secret: `$DEMO_CLIENT_SECRET`
  - default_scopes:
    - `custom-client`
      - Roles: `$DEMO_CLIENT_ROLES` - should be a comma-separated list, e.g. `my_view_role,my_edit_role,my_admin_role`
  - optional_scopes: [ ]
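To sanity-check any of these clients, you can request a token directly from the token endpoint; a sketch using the `default-access` client and `jq` (the `kafka` scope is one of its optional scopes listed above). The returned JWT carries the client's roles in the `roles` claim:

```sh
TOKEN=$(curl -s -X POST http://localhost:1852/realms/local-development/protocol/openid-connect/token \
  -d "grant_type=client_credentials" \
  -d "client_id=default-access" \
  -d "client_secret=default-access-secret" \
  -d "scope=kafka" \
  | jq -r '.access_token')
echo "$TOKEN"
```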
Users:

- developer
  - Username: `developer`
  - Password: `developer`
  - Roles: `opensearch_developer`, `opensearch_default_read`, `Kafka_*_all`, `sr-producer`
- Username: