
Updated Readme (#53) (#54)
* Update release.yaml

- increased access token lifetime

* Updated local setup guide

* Update deploy.yaml

- added curly braces  for array values to mongodb config injection
- create monitoring namespace on install
- quoted consumer secrets
- changed fallback tag
- use --set-json for injecting kickstart data
- remove kickstart injection for now as the db is already set up

* Small Adjustments

- More fine-grained time selection for queries
- Fixed cron-job running multiple times after scheduling

* Added Dataflow Architecture Images

* Update release.yaml

- adjusted env variables for consistency

* Adjusted Readme

- included breaking-changes for artifact availability
- setup guide for gcloud sql as fusionauth database
- more in-depth description of realtime-analytics data flow and report generation

* Update README.md

- fixed heading link
BenjaminBruenau authored Feb 23, 2024
1 parent 6ecd843 commit cadda84
Showing 12 changed files with 162 additions and 15 deletions.
Binary file added .github/assets/realtime-analytics-dataflow.png
Binary file added .github/assets/ruettel-report-generation.png
Binary file added .github/assets/ruettel-report-report.png
Binary file added .github/assets/ruettel-report-transparent.png
4 changes: 2 additions & 2 deletions .github/workflows/release.yaml
@@ -53,8 +53,8 @@ jobs:
run: |
sbt assembly
docker build -t analysis:0.1.0 .
-      docker tag analysis:0.1.0 ${{ vars.GCLOUD_REGION }}-docker.pkg.dev/${{vars.PROJECT_ID}}/ruettel-report/analysis:${{ steps.get-tag.outputs.short_ref }}
-      docker push ${{ vars.GCLOUD_REGION }}-docker.pkg.dev/${{vars.PROJECT_ID}}/ruettel-report/analysis:${{ steps.get-tag.outputs.short_ref }}
+      docker tag analysis:0.1.0 ${{ env.GAR_LOCATION }}-docker.pkg.dev/${{vars.PROJECT_ID}}/ruettel-report/analysis:${{ steps.get-tag.outputs.short_ref }}
+      docker push ${{ env.GAR_LOCATION }}-docker.pkg.dev/${{vars.PROJECT_ID}}/ruettel-report/analysis:${{ steps.get-tag.outputs.short_ref }}
working-directory: ./backend/Analysis

- name: Build Images
4 changes: 2 additions & 2 deletions .helm/ruettel-chart-local-values.yaml
@@ -1,10 +1,10 @@


image:
-  repository: europe-west6-docker.pkg.dev/instant-heading-405914/ruettel-report-dev/ # or europe-west6-docker.pkg.dev/instant-heading-405914/ruettel-report/
+  repository: europe-west6-docker.pkg.dev/instant-heading-405914/ruettel-report/ # or europe-west6-docker.pkg.dev/instant-heading-405914/ruettel-report/
   pullPolicy: IfNotPresent #Always
   #Overrides the image tag whose default is the chart appVersion.
-  tag: "1.0.6"
+  tag: "1.0.8"

replicas:
free: 1
@@ -1,7 +1,7 @@
apiVersion: "sparkoperator.k8s.io/v1beta2"
kind: ScheduledSparkApplication
metadata:
-  name: spark-pi-scheduled
+  name: spark-analysis-scheduled
namespace: premium
spec:
schedule: {{ .Values.analysis.schedule }}
@@ -26,7 +26,7 @@ spec:
spark.jars.ivy: "/tmp/ivy"
spark.kubernetes.memoryOverheadFactor: "0.2"
restartPolicy:
-    type: Always
+    type: Never
volumes:
- name: "test-volume"
hostPath:
109 changes: 103 additions & 6 deletions README.md
@@ -7,9 +7,21 @@
[![codecov](https://codecov.io/gh/BenjaminBruenau/RuettelReport/graph/badge.svg?token=7OKGD5WV2H)](https://codecov.io/gh/BenjaminBruenau/RuettelReport)


-## Local Setup Guide
+## !Important!

As of 21.2.2024 our test period for Google Cloud has expired, meaning our Artifact Registry repositories for both develop and production
are no longer accessible. To set up and run/deploy the application, the artifacts need to be released to your own registry
(e.g. by changing the action environment variables `GCLOUD_REGION`, `GAR_LOCATION` and `PROJECT_ID`).
The values for `image.repository` in the [ruettel-chart](.helm/ruettel-chart/values.yaml) or in [ruettel-chart-local-values.yaml](.helm/ruettel-chart-local-values.yaml)
then need to be adjusted accordingly.

## Setup Guide
**(with local Postgres DB for FusionAuth)**

To set up the application in Google Cloud with FusionAuth using a GCloud SQL DB, see [this](#cloudsql-for-fusionauth-guide).

This setup flow is also suitable for a deployment to the Kubernetes Engine of a cloud provider; the kubeconfig only needs to
point to the K8s cluster in the cloud.
### Clone Project

````shell
@@ -61,11 +73,12 @@ kubectl apply -f kong-prometheus-plugin.yaml

1. [Port Forward](#FusionAuth) FusionAuth
2. Access its UI in the browser
-3. Go to `Settings` -> `Key Manager`
-4. View the `premium` and `free` key and copy both their public key entries
-5. Replace the values for `kong.premiumConsumerSecret` and `kong.freeConsumerSecret` (in `ruettel-chart-local-values.yaml`)
+3. Login with the values defined for the admin account in the kickstart property inside `local-fa-values.yaml`
+4. Go to `Settings` -> `Key Manager`
+5. View the `premium` and `free` key and copy both their public key entries
+6. Replace the values for `kong.premiumConsumerSecret` and `kong.freeConsumerSecret` (in `ruettel-chart-local-values.yaml`)
    with their corresponding public key value
-6. Proceed with the Application Chart Installation
+7. Proceed with the Application Chart Installation

```shell
helm install ruettel-chart ./ruettel-chart -f ruettel-chart-local-values.yaml --set image.tag=<your desired release version / latest>
@@ -105,8 +118,92 @@ Get Logs of specific SparkApplication Job:
kubectl logs spark-analysis-driver -n premium
````

## CloudSQL for FusionAuth Guide
(with a Postgres DB provided by Google Cloud for FusionAuth - [Reference](https://fusionauth.io/docs/get-started/download-and-install/kubernetes/gke#create-a-database))

- this will only work with a VPC-native GKE cluster, as the created DB will have no external endpoint
- when using Terraform to set up the cluster, it will be `VPC_NATIVE` by default per the configuration (see [cluster.tf](.terraform/cluster.tf))

```shell
export PROJECT_ID=<your-project-id>
export DB_NAME=<your-db-name>
export REGION=<your-gcloud-region>
```

**Setup the database**
````shell
gcloud beta sql instances create "${DB_NAME}" \
--project="${PROJECT_ID}" \
--database-version=POSTGRES_12 \
--tier=db-g1-small \
--region="${REGION}" \
--network=default \
--no-assign-ip
````

**Configure the default user**
````shell
gcloud sql users set-password postgres \
--instance="${DB_NAME}" \
--password=<your-password>
````


**Verify installation**
````shell
gcloud sql instances list
````

The following values need to be adjusted in [fa-values.yaml](.helm/fa-values.yaml):
- `database.host` (-> internal endpoint of the db)
- `database.root.user` (-> postgres)
- `database.root.password` (-> password you set up for the default user)
- `database.user`
- `database.password`
- optionally: `kickstart.data` if FusionAuth should not be configured manually

### Point Kubeconfig to GKE Cluster

`gcloud container clusters get-credentials <cluster-name> --region europe-west6`

## Architecture


-![Ruettel-Report Banner](./.github/assets/architecture-transparent.png)
+![architecture](./.github/assets/architecture-transparent.png)



## (Soft) Realtime Analytics - Data Flow

- when querying data (either via the API of the QueryService or the DataTransformer) the results are written to Kafka
- the Spark Structured Streaming Analysis Service aggregates the queried data in a streaming manner in two ways:
  - Per Batch = aggregations per query of the user
  - Complete = aggregations are continuously updated with each incoming batch of query data
- the results are then written to a MongoDB collection and enriched with the tenantId and userId, so that the
  analysis results are only provided to the user inside a tenancy who queried the data that was aggregated
- when accessing the UI the latest complete aggregations are fetched
- a socket connection to the project-management service is established to receive the latest analysis results once they are written to the database
  - this is done by registering a listener on the MongoDB Change Streams for each user (after authenticating the socket connection)
  - only inserts into the realtime-analytics collections that match the user's id and tenant id are listened to
  - once a change stream event matches these criteria, a message with the new analysis results is sent back to the client so it can be visualized
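The per-user change stream filtering described above can be sketched in Python. This is an illustrative, pymongo-style pipeline; the `fullDocument.tenantId`/`fullDocument.userId` field names follow the enrichment described above, but the exact document layout of the actual service is an assumption:

```python
def change_stream_pipeline(tenant_id, user_id):
    """Build a MongoDB change-stream $match stage that only passes inserts
    into the realtime-analytics collection for this tenant and user."""
    return [{
        "$match": {
            "operationType": "insert",
            "fullDocument.tenantId": tenant_id,
            "fullDocument.userId": user_id,
        }
    }]


def matches(event, tenant_id, user_id):
    """Pure-Python equivalent of the $match stage above, for illustration."""
    doc = event.get("fullDocument", {})
    return (
        event.get("operationType") == "insert"
        and doc.get("tenantId") == tenant_id
        and doc.get("userId") == user_id
    )
```

With pymongo, `collection.watch(change_stream_pipeline(tenant_id, user_id))` would then only deliver the matching insert events to the listener for that socket connection.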

![architecture](./.github/assets/realtime-analytics-dataflow-transparent.png)



## Report Generation (the actual Rüttel Report)

- every day at 8 a.m. a cron job is run as a scheduled SparkApplication
- this job aggregates the data available in Kafka for each tenant (retention time = 1 week) into a report collection per tenant
- Spark computes metrics which can later be used for statistical computations and basic predictions
  (e.g. the probability of x events of a certain type happening in a row, or the probability of an event happening at a specific timestamp/time range)
- each report can then be visualized in the Frontend and exported as a PDF
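The "x events in a row" prediction mentioned above can be made concrete with a minimal model: treat events as independent, estimate the per-event probability of a type from the report data, and raise it to the power of the run length. This is only a sketch of the idea, not the actual Spark implementation:

```python
def empirical_type_probability(events, event_type):
    """Empirical frequency of an event type in a list of observed events,
    as a report metric might expose it."""
    return events.count(event_type) / len(events) if events else 0.0


def run_probability(p, x):
    """Probability of x events of a given type in a row, assuming independent
    events each occurring with probability p (illustrative model only)."""
    return p ** x
```

For example, if 75% of observed events were earthquakes, the modeled probability of three earthquake events in a row would be `run_probability(0.75, 3)`.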


![architecture](./.github/assets/ruettel-report-transparent.png)

<br>

![report visualization](.github/assets/ruettel-report-report.png)

Non-transparent versions of the architecture images are available [here](.github/assets).
51 changes: 50 additions & 1 deletion backend/Analysis/Analysis.md
@@ -42,4 +42,53 @@
"type": "Feature"
}
```



# DataFrames are grouped by tenant and user (so that only their own aggregations for queried data are taken into account, e.g. for visualization)

+------------+----------+--------------------+
| tenantId| userId| value|
+------------+----------+--------------------+
|test-tenant2|marcooo123|{"geometry":{"coo...|
|test-tenant2|marcooo123|{"geometry":{"coo...|
|test-tenant2|marcooo123|{"geometry":{"coo...|
|test-tenant2| andiii123|{"geometry":{"coo...|
|test-tenant2| andiii123|{"geometry":{"coo...|
|test-tenant2| andiii123|{"geometry":{"coo...|
| test-tenant| benni123|{"geometry":{"coo...|
| test-tenant| benni123|{"geometry":{"coo...|
| test-tenant| benni123|{"geometry":{"coo...|
| test-tenant| joel123|{"geometry":{"coo...|
| test-tenant| joel123|{"geometry":{"coo...|
| test-tenant| joel123|{"geometry":{"coo...|
| test-tenant| joel123|{"geometry":{"coo...|
| test-tenant| joel123|{"geometry":{"coo...|
| test-tenant| joel123|{"geometry":{"coo...|
| test-tenant| maggo123|{"geometry":{"coo...|
| test-tenant| maggo123|{"geometry":{"coo...|
| test-tenant| maggo123|{"geometry":{"coo...|
| test-tenant| maggo123|{"geometry":{"coo...|
| test-tenant| maggo123|{"geometry":{"coo...|
+------------+----------+--------------------+


+------------+----------+-----+------------------+--------------------+
| tenantId| userId|count| avg_magnitude|high_magnitude_count|
+------------+----------+-----+------------------+--------------------+
| test-tenant| benni123| 6|1.6433333433333335| 0|
| test-tenant| maggo123| 6|1.6433333433333335| 0|
| test-tenant| joel123| 6|1.6433333433333335| 0|
|test-tenant2|marcooo123| 3|1.6433333433333333| 0|
|test-tenant2| andiii123| 3|1.6433333433333333| 0|
+------------+----------+-----+------------------+--------------------+


+------------+----------+---+---+---+---+----+
| tenantId| userId|0-2|2-4|4-6|6-8|8-10|
+------------+----------+---+---+---+---+----+
| test-tenant| benni123| 6| 0| 0| 0| 0|
| test-tenant| maggo123| 6| 0| 0| 0| 0|
| test-tenant| joel123| 6| 0| 0| 0| 0|
|test-tenant2|marcooo123| 3| 0| 0| 0| 0|
|test-tenant2| andiii123| 3| 0| 0| 0| 0|
+------------+----------+---+---+---+---+----+
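The grouped aggregation shown in the tables above (count, average magnitude, and a magnitude histogram with 2-wide buckets) can be mimicked in plain Python to make the logic concrete. This is a sketch only: the real job runs on Spark, and the `high_magnitude_count` threshold of 6 used here is an assumption:

```python
from collections import defaultdict

# Bucket edges matching the 0-2 ... 8-10 histogram columns above
BUCKETS = [(0, 2), (2, 4), (4, 6), (6, 8), (8, 10)]


def aggregate(rows):
    """Group (tenantId, userId, magnitude) rows into count, avg_magnitude
    and a per-bucket magnitude histogram, mirroring the Spark output."""
    groups = defaultdict(list)
    for tenant_id, user_id, magnitude in rows:
        groups[(tenant_id, user_id)].append(magnitude)

    result = {}
    for key, mags in groups.items():
        hist = {f"{lo}-{hi}": sum(1 for m in mags if lo <= m < hi)
                for lo, hi in BUCKETS}
        result[key] = {
            "count": len(mags),
            "avg_magnitude": sum(mags) / len(mags),
            "high_magnitude_count": sum(1 for m in mags if m >= 6),  # threshold is an assumption
            **hist,
        }
    return result
```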
1 change: 1 addition & 0 deletions frontend/components/api/ApiFilterBlock.vue
@@ -231,6 +231,7 @@ const handleApiSave = (event: MouseEvent) => {
<div v-if="item[1].type === 'dateTime'">
<PrimeCalendar
   showIcon
+  showTime
   iconDisplay="input"
v-model="item[1].value"
@input="updateDateValue(item[1], $event)"
4 changes: 2 additions & 2 deletions frontend/components/form/Settings.vue
@@ -82,8 +82,8 @@ watch(darkMode,(b)=>{
projectSettingsStore.$subscribe((mutation, state) => {
-  if (mutation.events.key && Object.keys(projectSettingsStore.theme).includes(mutation.events.key)) {
-    console.log('Theme Change')
+  if (mutation && state) {
+    console.debug('Possible Theme Change')
projectSettingsStore.setupTheme()
}
})
