
Ilum Orchestrate

Built on top of Ilum, which uses Apache Spark for Big Data solutions.

Ilum Orchestrate is a framework for organising complex Big Data flows, driven by Apache Spark on Kubernetes, into distinct stages with clear input/output formats and automated testing, while abstracting complex system logic away from data scientists behind a Java library.

How does it work?

  1. The Data Flow Architect builds a graph where each node corresponds to a stage, specifying the headers of the input/output data and the type of storage or stream the data moves through.
  2. The data scientist uses the annotation library when writing a script. This way they do not create a Spark session or connect to the data locations manually, but instead use the labels specified by the architect.
  3. Testers write their test scripts in a similar way to data scientists. Their scripts return JSON objects with custom metrics. All the scripts and tests related to a single stage are pushed to a queue and an Ilum job is launched, which runs each script and test and then generates job results, allowing you to compare and choose the best algorithm.
  4. The project administrator can easily manage the roles of each user, restricting their access to stages they should not work with.

What is the main idea behind Ilum Orchestrate?

  1. Data scientists should not have to care about other stages of the data flow, or about technical details of the system. The only things they should care about are the input data, the output data, and the algorithms/Spark SQL queries they use to convert one into the other, aiming for the best ranking according to the metrics calculated by testers.

  2. The data flow layer should not depend on the storage or streaming technology used. If anything changes in the cluster, the architecture can be adapted easily.

Quick start:

  1. Start minikube:
minikube start --cpus 4 --memory 8192 --addons metrics-server
  2. Deploy Ilum:
helm repo add ilum https://charts.ilum.cloud
helm install ilum ilum/ilum
  3. Deploy Ilum Orchestrate:
helm repo add jobs-manager https://Delienelu310.github.io/jobs_manager/jobs-manager-helm-repo/
helm install jobsmanager jobs-manager/jobs-manager-helm-chart
  4. Port-forward to use the client:
kubectl port-forward svc/jobs-manager-client 3000:80
  5. Finally, go to localhost:3000 in your browser.
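
If the page does not load, you can first check that the pods came up correctly (standard kubectl commands; exact pod names depend on the charts):

kubectl get pods
kubectl get svc jobs-manager-client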

Jobs Manager Helm chart configuration (default values):

jobs_manager:
  admin: 
    username: admin
    password: admin
    fullname: null

mongodb:
  admin:
    username: mongoadmin
    password: mongoadmin
  uri: "mongodb://mongoadmin:mongoadmin@jobs-manager-mongodb:27017"

minio:
  admin:
    username: minioadmin
    password: minioadmin

Important notes:

  1. The MinIO admin credentials should be the same as in the Ilum part of the cluster
  2. The MongoDB admin credentials must correspond to the uri

Example of launching the helm chart with a custom config:

helm install jobsmanager jobs-manager/jobs-manager-helm-chart \
    --set jobs_manager.admin.username=myadmin,jobs_manager.admin.password=mypassword
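
Similarly, if you override the MongoDB credentials, keep the uri in sync with them, as required by the notes above (the values shown are hypothetical):

helm install jobsmanager jobs-manager/jobs-manager-helm-chart \
    --set mongodb.admin.username=myuser,mongodb.admin.password=mypass \
    --set mongodb.uri="mongodb://myuser:mypass@jobs-manager-mongodb:27017"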

Usage

  1. Build the data flow architecture
  • Go to the project page and use the interactive canvas.

  • Use the JobNode mode to create stages.

  • Using the Cursor mode, click on bars to open the bar menu, where you can add inputs/outputs.

  • Use the Connect mode to connect inputs and outputs, while configuring the channel headers and types.

Video example: https://github.com/user-attachments/assets/8cc0933b-f69b-4eb0-8952-82173fe5e120

  2. Create a script
  • Create a Java project with Maven and JDK 8+

  • Add dependency from Ilum:

<dependency>
    <groupId>cloud.ilum</groupId>
    <artifactId>ilum-job-api</artifactId>
    <version>6.0.0</version>
</dependency>
  • Add the dependency from Ilum Orchestrate (see the note on GitHub Packages authentication after the example below):
<repositories>
    <repository>
      <id>github</id>
      <url>https://maven.pkg.github.com/Delienelu310/jobs_connection_lib</url>
    </repository>
</repositories>

<dependency>
      <groupId>com.ilumusecase</groupId>
      <artifactId>jobs_connection_lib</artifactId>
      <version>1.0</version>
</dependency>
  • Create a class that implements the Job interface from ilum-job-api
  • Use the @JobNode annotation on the class, and @InputChannel and @OutputChannel on its static fields of type Dataset<Row>
  • In the run method, before your main code, create a JobProcessor object and start it
  • In your code, read data from the datasets annotated with @InputChannel and assign output datasets to the fields annotated with @OutputChannel
  • At the end of your code, call the finish method of JobProcessor

How it should look:

import java.util.Map;

import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;

import scala.Option;
import scala.Some;
import scala.collection.JavaConverters;

// plus the Job interface from ilum-job-api and the JobProcessor
// and channel annotations from jobs_connection_lib

@JobNode
public final class SoloRSI implements Job {

    // Input dataset, filled in by the framework under the "RSI" label
    @InputChannel(label = "RSI")
    public static Dataset<Row> RSI;

    // Output dataset, written out by the framework under the "Signal" label
    @OutputChannel(label = "Signal")
    public static Dataset<Row> signals;

    @Override
    public Option<String> run(SparkSession sparkSession, scala.collection.immutable.Map<String, Object> config) {
        Map<String, Object> configJava = JavaConverters.mapAsJavaMapConverter(config).asJava();

        // Connects the annotated channels to the configured storage/streams
        JobProcessor jobProcessor = new JobProcessor(SoloRSI.class, sparkSession, configJava);
        jobProcessor.start();

        // ... your code: read from SoloRSI.RSI and compute an output dataset
        Dataset<Row> outputDataset = SoloRSI.RSI; // placeholder for your own logic

        SoloRSI.signals = outputDataset;

        jobProcessor.finish();

        return Some.apply("DONE");
    }
}

For more examples, look into the trading_bot folder of the project.
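
Note: the jobs_connection_lib dependency is served from GitHub Packages, which normally requires Maven authentication even for public packages. A typical ~/.m2/settings.xml entry looks like the sketch below; the server id must match the repository id declared above, and the password is a GitHub personal access token with the read:packages scope (the placeholder values are hypothetical):

<settings>
  <servers>
    <server>
      <id>github</id>
      <username>YOUR_GITHUB_USERNAME</username>
      <password>YOUR_PERSONAL_ACCESS_TOKEN</password>
    </server>
  </servers>
</settings>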

  3. Create a script-tester in a similar way (see the sketch below)
  • Use @TestJob instead of @JobNode, TestJobProcessor instead of JobProcessor, and @OutputChannelTestDataset instead of @OutputChannel
  • Read from the @OutputChannelTestDataset fields; don't assign datasets to them in your code
  • At the end of your code, return a JSON string of the format
{
  "metric_1" : "value_1",
  ...
  "metric_n" : "value_n"
}

For more examples, look into the trading_bot/test-indicators folder of the project.
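
A minimal sketch of such a tester, mirroring the SoloRSI example above (not taken from the repository: it assumes @OutputChannelTestDataset takes a label like @OutputChannel, that TestJobProcessor mirrors JobProcessor's start/finish methods, and that the JSON metrics are returned as the run result; the class and metric names are hypothetical):

@TestJob
public final class SignalTester implements Job {

    // Dataset produced by the job script under test; read from it, never assign to it
    @OutputChannelTestDataset(label = "Signal")
    public static Dataset<Row> signals;

    @Override
    public Option<String> run(SparkSession sparkSession, scala.collection.immutable.Map<String, Object> config) {
        Map<String, Object> configJava = JavaConverters.mapAsJavaMapConverter(config).asJava();

        TestJobProcessor testJobProcessor = new TestJobProcessor(SignalTester.class, sparkSession, configJava);
        testJobProcessor.start();

        // Compute custom metrics from the dataset under test
        long signalCount = SignalTester.signals.count();

        testJobProcessor.finish();

        // Return the metrics as a JSON string
        return Some.apply("{ \"signal_count\" : \"" + signalCount + "\" }");
    }
}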

  4. Upload a script and add it to the queue
  • Go to the project page, choose a stage, scroll down to the Job Node details and click More to navigate to the Job Node page
  • Choose the JobsFiles folder and use the Upload File form to upload a file
  • Choose the Job Scripts folder and create a script, specifying its full class name
  • In the Job Script menu, add the jobs file as a dependency
  • In the Job Script menu, add the job script to a queue, specifying the queue type: jobsQueue for job scripts, testingQueue for tester scripts

Video example: https://github.com/user-attachments/assets/99003533-2296-46f4-8612-09e0e7d4c241

  5. Run the Job Node to test the job scripts
  • Go to the Job Node page
  • Fill in the Ilum group details
  • Click Start
  • Wait for a minute
  • Go to the Job Results and Job Errors folders to see the results or errors

Video example: https://github.com/user-attachments/assets/ede68e3c-e90e-4246-8dcf-3e95af725af0

  6. Manage the roles of a project and a job node

Contribution

How to contribute?

  • Clone the project code
git clone https://github.com/Delienelu310/jobs_manager.git
cd jobs_manager
  • working on client

    1. Deploy the cluster as in the quick start, but don't port-forward the client at the end
    2. Port-forward the jobs_manager backend to localhost:
      kubectl port-forward svc/jobs-manager 8080:8080
    3. Run the client:
      cd ./jobs_manager_client
      npm start
  • working on backend

    1. Deploy the cluster as in the quick start, using the default configuration
    2. Remove the existing jobs_manager backend:
      kubectl delete deployment jobs-manager
    3. Build and deploy your own backend:
      cd ./jobs_manager
      mvn package
      eval $(minikube docker-env)
      docker build -t jobs_manager .
      kubectl delete -f deployment.yaml
      kubectl apply -f deployment.yaml
      kubectl port-forward svc/jobs-manager 8080:8080
  • working with jobs_connection_lib

    1. Build the library:
      cd ./jobs_connection_lib
      mvn package
    2. Move the packaged jar to the backend resources:
      cd ../
      mv ./jobs_connection_lib/target/jobs_connection_lib-1.0.jar ./jobs_manager/src/main/resources/jobs_connection_lib.jar
    3. Build the backend as in the previous step

Submit a pull request

If you'd like to contribute, please fork the repository and open a pull request to the main branch.

What to contribute?

  1. A Python connection library
  2. New channel types
