Rucat

Rucat is designed to be a server for managing multiple kinds of big-data applications across different platforms. For now, it only supports Apache Spark 3.5.3 (with spark-connect enabled by default) on Kubernetes. More engines and platforms will be supported in the future.

Rucat is a boy / girl name meaning Guider, Discipline and Adventurer. The numerology number for the name Rucat is 9.

Note: This project is still at an early stage of development and is not recommended for production use.

Architecture

Idea

Fully async.
Decouple the REST server from k8s and Apache Spark.

flowchart
    server(rucat server)
    engine(Apache Spark, ...)
    monitor(rucat state monitor)
    resource-manager(k8s, ...)
    database[(surreal, ...)]
    user -- REST requests --> server

    server -- create, stop, delete engine / get engine info --> database
    monitor -- check / update engine state --> database
    monitor -- trigger operation / get resource info --> resource-manager
    resource-manager -- create / delete / get info --> engine
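
The decoupling in the flowchart above can be illustrated with a small sketch. It is synchronous and in-memory for brevity, and every name in it (`Database`, `ResourceManager`, `server_create_engine`, `monitor_tick`, `RecordingManager`) is a hypothetical stand-in, not Rucat's actual code. The point it shows is that the server only writes to the database, while the monitor is the only component that talks to the resource manager.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for the resource manager (k8s in the flowchart).
pub trait ResourceManager {
    fn create_engine(&mut self, engine_id: &str);
}

/// Hypothetical in-memory stand-in for the database (SurrealDB in the flowchart).
#[derive(Default)]
pub struct Database {
    /// Maps engine id to the name of its current state.
    pub engines: HashMap<String, String>,
}

/// Server side: a create request only records the engine in the database.
pub fn server_create_engine(db: &mut Database, engine_id: &str) {
    db.engines
        .insert(engine_id.to_string(), "WaitToStart".to_string());
}

/// Monitor side: one pass over the database that drives the resource manager.
/// Returns how many engines were picked up in this pass.
pub fn monitor_tick(db: &mut Database, rm: &mut dyn ResourceManager) -> usize {
    let mut picked = 0;
    for (id, state) in db.engines.iter_mut() {
        if state.as_str() == "WaitToStart" {
            // The monitor takes the engine, then asks the resource manager to create it.
            *state = "TriggerStart".to_string();
            rm.create_engine(id);
            *state = "StartInProgress".to_string();
            picked += 1;
        }
    }
    picked
}

/// Test double that records which engines it was asked to create.
#[derive(Default)]
pub struct RecordingManager {
    pub created: Vec<String>,
}

impl ResourceManager for RecordingManager {
    fn create_engine(&mut self, engine_id: &str) {
        self.created.push(engine_id.to_string());
    }
}
```

Calling `server_create_engine` never touches the `ResourceManager`; only a later `monitor_tick` does, which is what lets the server and the monitor run and fail independently.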

Rucat Engine State

stateDiagram

    [*] --> WaitToStart: START
    WaitToStart --> Terminated: STOP
    WaitToStart --> [*]: DELETE
    WaitToStart --> TriggerStart: (one state monitor takes the engine)

    TriggerStart --> StartInProgress: create k8s pod
    TriggerStart --> ErrorClean: create resource error

    StartInProgress --> Running: pod running
    StartInProgress --> WaitToTerminate: STOP
    StartInProgress --> ErrorWaitToClean: resource in error state

    Running --> WaitToTerminate: STOP

    WaitToTerminate --> Running: RESTART
    WaitToTerminate --> TriggerTermination: (one state monitor takes the engine)

    TriggerTermination --> TerminateInProgress: delete pod

    TerminateInProgress --> Terminated: pod removed

    Terminated --> WaitToStart: RESTART
    Terminated --> [*]: DELETE

    ErrorWaitToClean --> ErrorTriggerClean: (one state monitor takes the engine)
    ErrorTriggerClean --> ErrorCleanInProgress: delete pod
    ErrorCleanInProgress --> ErrorClean: pod removed
    ErrorClean --> [*]: DELETE
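
As a reading aid, the user-driven edges of the state diagram above (STOP, RESTART, DELETE) can be written as a transition function. This is a minimal sketch: the state names are taken from the diagram, but `UserRequest`, `Outcome` and `apply_request` are hypothetical names chosen for illustration, and the edges driven by the state monitor are left out.

```rust
/// Engine states, named as in the state diagram.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum EngineState {
    WaitToStart,
    TriggerStart,
    StartInProgress,
    Running,
    WaitToTerminate,
    TriggerTermination,
    TerminateInProgress,
    Terminated,
    ErrorWaitToClean,
    ErrorTriggerClean,
    ErrorCleanInProgress,
    ErrorClean,
}

/// User requests that act on an existing engine.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum UserRequest {
    Stop,
    Restart,
    Delete,
}

/// Result of applying a user request to an engine in a given state.
#[derive(Debug, PartialEq, Eq)]
pub enum Outcome {
    /// The engine moves to a new state.
    MoveTo(EngineState),
    /// The engine is removed (the `[*]` end node in the diagram).
    Removed,
    /// The diagram has no edge for this request in this state.
    NotAllowed,
}

/// Looks up the edge for `request` leaving `state`, as drawn in the diagram.
pub fn apply_request(state: EngineState, request: UserRequest) -> Outcome {
    use EngineState::*;
    use UserRequest::*;
    match (state, request) {
        (WaitToStart, Stop) => Outcome::MoveTo(Terminated),
        (StartInProgress, Stop) | (Running, Stop) => Outcome::MoveTo(WaitToTerminate),
        (WaitToTerminate, Restart) => Outcome::MoveTo(Running),
        (Terminated, Restart) => Outcome::MoveTo(WaitToStart),
        (WaitToStart, Delete) | (Terminated, Delete) | (ErrorClean, Delete) => Outcome::Removed,
        _ => Outcome::NotAllowed,
    }
}
```

For example, `apply_request(EngineState::Running, UserRequest::Delete)` yields `Outcome::NotAllowed`, because the diagram only allows DELETE from `WaitToStart`, `Terminated` and `ErrorClean`.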

How to test

cargo test

Configurations

rucat server configurations

command line arguments:

--config-path <path>  # the path of the configuration file

configuration file:

{
    "auth_enable": <bool>,  # whether requests to the rucat server require authentication. Only a hard-coded username and password are supported for now.
    "database": { # database configurations. Only SurrealDB is supported for now.
        "credentials": { # credentials for connecting to the database. Only a hard-coded username and password are supported for now.
            "username": "admin",
            "password": "admin"
        },
        "uri": "rucat-surrealdb:8000" # URI of the database server.
    }
}

rucat state monitor configurations

command line arguments: Rucat state monitor does not have any command line arguments for now.
configuration file: Path of the configuration file is hard-coded as /rucat_state_monitor/config.json.

{
    "check_interval_millis": < non zero u64 >, # the interval of checking the engine state in milliseconds.
    "database": { # same as the database configurations in rucat server.
      "credentials": {
          "username": "admin",
          "password": "admin"
      },
      "uri": "rucat-surrealdb:8000"
    }
}
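
The "non zero u64" constraint on `check_interval_millis` can be enforced when the value is loaded. The sketch below uses only the standard library; `MonitorSettings` and `monitor_settings` are hypothetical names, and how Rucat itself parses and validates the file is not shown here.

```rust
use std::num::NonZeroU64;
use std::time::Duration;

/// Hypothetical in-memory form of the state monitor settings.
#[derive(Debug, PartialEq, Eq)]
pub struct MonitorSettings {
    /// How long the monitor waits between two engine state checks.
    pub check_interval: Duration,
}

/// Builds the settings from the raw `check_interval_millis` value,
/// rejecting zero as the configuration format requires.
pub fn monitor_settings(check_interval_millis: u64) -> Result<MonitorSettings, String> {
    let millis = NonZeroU64::new(check_interval_millis)
        .ok_or_else(|| "check_interval_millis must be non-zero".to_string())?;
    Ok(MonitorSettings {
        check_interval: Duration::from_millis(millis.get()),
    })
}
```

With this shape, `monitor_settings(0)` returns an error instead of producing a monitor that polls the database in a tight loop.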

REST APIs

Create engine: create a new engine

POST /engine

request body:

{
  "name": <non empty string>, # the name of the engine
  "engine_type": <string>, # the type of the engine. Only "Spark" is supported for now.
  "version": <string>, # version of the engine.
  "configs": { # the configurations of the engine (Spark configurations for now)
    "spark.executor.instances": "1"
  }
}

return:

{ "id": <string> engine id}
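
To make the request body concrete, here is a sketch that assembles it as a JSON string using only the standard library. `create_engine_body` and `json_string` are hypothetical helpers written for this example; a real client would more likely use a JSON library. The function rejects an empty name, matching the "non empty string" requirement above, and fixes `engine_type` to "Spark" since that is the only supported type for now.

```rust
/// Wraps `value` in quotes and escapes characters that JSON does not allow raw.
fn json_string(value: &str) -> String {
    let mut out = String::with_capacity(value.len() + 2);
    out.push('"');
    for c in value.chars() {
        match c {
            '"' => out.push_str("\\\""),
            '\\' => out.push_str("\\\\"),
            '\n' => out.push_str("\\n"),
            '\r' => out.push_str("\\r"),
            '\t' => out.push_str("\\t"),
            c if (c as u32) < 0x20 => out.push_str(&format!("\\u{:04x}", c as u32)),
            c => out.push(c),
        }
    }
    out.push('"');
    out
}

/// Builds the body for `POST /engine`. Returns an error for an empty name.
pub fn create_engine_body(
    name: &str,
    version: &str,
    configs: &[(&str, &str)],
) -> Result<String, String> {
    if name.is_empty() {
        return Err("engine name must not be empty".to_string());
    }
    // Each config entry becomes one `"key":"value"` member.
    let members: Vec<String> = configs
        .iter()
        .map(|(k, v)| format!("{}:{}", json_string(k), json_string(v)))
        .collect();
    Ok(format!(
        "{{\"name\":{},\"engine_type\":\"Spark\",\"version\":{},\"configs\":{{{}}}}}",
        json_string(name),
        json_string(version),
        members.join(",")
    ))
}
```

Calling `create_engine_body("demo", "3.5.3", &[("spark.executor.instances", "1")])` produces a single-line JSON object with the four fields listed above; the engine name "demo" is only an example value.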

Get engine: get the engine info

GET /engine/<engine_id>

return:

{
  "name": <string> engine name,
  "engine_type": <string> type of engine,
  "version": <string> version of the engine,
  "state": <string> engine state,
  "configs": { # the configurations of the engine
    "spark.executor.instances": "1"
  },
  "create_time": <date> created time of the engine
}

List engines: list all engines

GET /engines

return:

[
    {id: <string> engine id},
]

Stop engine: stop the engine

POST /engine/<engine_id>/stop

return: None

Restart engine: make a stopped engine run again

POST /engine/<engine_id>/restart

return: None

Delete engine: remove all resources and info of the engine

DELETE /engine/<engine_id>

return: None
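
The six endpoints above fit in one lookup, which can be handy when writing a client. This is a sketch only: `EngineOp` and `endpoint` are hypothetical names, and the methods and paths are copied from the REST API list in this README.

```rust
/// The REST operations listed above. Variants that act on one engine carry its id.
pub enum EngineOp<'a> {
    Create,
    Get(&'a str),
    List,
    Stop(&'a str),
    Restart(&'a str),
    Delete(&'a str),
}

/// Returns the HTTP method and request path for an operation.
pub fn endpoint(op: &EngineOp<'_>) -> (&'static str, String) {
    match op {
        EngineOp::Create => ("POST", "/engine".to_string()),
        EngineOp::Get(id) => ("GET", format!("/engine/{id}")),
        EngineOp::List => ("GET", "/engines".to_string()),
        EngineOp::Stop(id) => ("POST", format!("/engine/{id}/stop")),
        EngineOp::Restart(id) => ("POST", format!("/engine/{id}/restart")),
        EngineOp::Delete(id) => ("DELETE", format!("/engine/{id}")),
    }
}
```

Note that Get and Delete share the path `/engine/<engine_id>` and differ only in the HTTP method.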

How to deploy on k8s and use

build the Docker images:

cd rucat/kubernetes/docker/
bash build rucat_server.sh
bash build rucat_state_monitor.sh

deploy on k8s: helm install rucat rucat
create an Apache Spark engine using the REST API
connect to the Spark Connect server

TODO

Handle timeout for Trigger* states.
catch the spark driver log before deleting?
implement rucat-client (based on spark-connect-rs)
mock resource client. https://github.com/asomers/mockall
rucat server HA
multi rucat state monitors
More resource clients: Yarn, Spark standalone, Spark local, rust shuttle etc.
expose spark rpc port and web ui port
update UpdateEngineStateResponse as enum

Debug

Dummy command that keeps a pod running forever: tail -f /dev/null


HaoYang670/rucat
