23 Oct 14:21

mickours

0411f34

24.10.0 Latest

We are proud to announce the release of:

✨ ✨ ✨ ✨ ✨ ✨ ✨ ✨ ✨

Ryax 24.10.0

✨ ✨ ✨ ✨ ✨ ✨ ✨ ✨ ✨

Multi-site full power!

New features

A new service called Ryax Worker can now be used to attach any Slurm or Kubernetes cluster resources
Ryax can now run any action on Slurm and Kubernetes seamlessly
Actions are now scheduled according to user-defined constraints and objectives
Add the possibility to pin Ryax services to dedicated resources (nodeSelector)
Enhance Ryax documentation with updated content (doc)
New Jupyter Notebook action with GPU support in default actions
Action builds can now be canceled
Kubernetes addon now supports injection of services

Bug fixes and Improvements

Fix volume permission for NFS based storage volumes (defaults to 1200 now)
Builds now fail properly when a pip install fails

Upgrade to this version

This is a major release of Ryax, which requires some extra steps for the upgrade.

Update configuration

This release introduces a new service, the Worker. In order to define the nodes that will be used by your actions, the Worker requires a site configuration. Please add one to your Ryax installation configuration file. In the following example, the local cluster has a node pool named default, with the label my.provider.com/pool-name: default on each node and 4 CPUs and 8G of memory per node.

worker:
  values:
     config:
       site:
         name: local
         spec:
           nodePools:
           - cpu: 4
             memory: 8G
             name: default
             selector:
               my.provider.com/pool-name: default

See the Worker configuration documentation for more details.
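
Before applying, it can help to confirm that the nodes really carry the label used in the selector. The following is a minimal sketch, assuming kubectl is configured for the target cluster. The helper names pool_selector and list_pool_nodes are hypothetical and not part of Ryax; the label is the one from the example above.

```shell
# Hypothetical helper: turn a "key: value" selector entry, as written in the
# site configuration, into the "key=value" form expected by `kubectl -l`.
pool_selector() {
  printf '%s\n' "$1" | sed 's/:[[:space:]]*/=/'
}

# Hypothetical helper: list the nodes matching the selector together with
# their allocatable CPU and memory, to compare with the cpu/memory values
# declared for the node pool.
list_pool_nodes() {
  kubectl get nodes -l "$(pool_selector "$1")" \
    -o custom-columns=NAME:.metadata.name,CPU:.status.allocatable.cpu,MEMORY:.status.allocatable.memory
}

# Usage (requires cluster access):
#   list_pool_nodes "my.provider.com/pool-name: default"
```

An empty node list would suggest that the selector in the site configuration does not match any node.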

Update DNS

If you use a public IP with TLS enabled, you need to create a new DNS entry that covers all subdomains of your cluster. This is used, for example, by an external Worker to access the internal container repository.
Please add an entry in your DNS using wildcard (star) notation:
*.<clusterName>.<domainName>

See installation doc for more details.
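
Once the wildcard entry exists, any name under it should resolve. The sketch below builds the record name and a probe name to test with. Both helper names, the "wildcard-check" label, and the mycluster/example.com values are assumptions chosen for illustration, not real Ryax hosts.

```shell
# Hypothetical helper: build the wildcard record for a cluster and domain.
wildcard_record() {
  printf '*.%s.%s\n' "$1" "$2"
}

# Hypothetical helper: build an arbitrary name covered by the wildcard.
# "wildcard-check" is a made-up label used only to test resolution.
probe_name() {
  printf 'wildcard-check.%s.%s\n' "$1" "$2"
}

# Usage (requires network access and getent, e.g. on Linux):
#   wildcard_record mycluster example.com
#   getent hosts "$(probe_name mycluster example.com)"
```

If the probe name does not resolve to the public IP of the cluster, the wildcard entry is probably missing or not yet propagated.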

Add HPC site

The users of HPC actions have to install a Worker dedicated to each cluster following this documentation.

Apply and clean

Once configured, you can apply the configuration with ryax-adm as usual.

The log capture service, Loki, was moved into the ryaxns namespace.
After applying the configuration, remove the old Loki deployment:

helm uninstall -n ryaxns-monitoring loki
kubectl delete pvc -n ryaxns-monitoring storage-loki-0

The Worker now handles deployments. To avoid dangling actions and failing deployments, you have to clean the Runner state.
Be aware that this will reset the execution history and stop all running workflows.

ryax-adm clean runner worker
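
Because these cleanup commands are destructive, it may be worth previewing them first. The following is a minimal sketch, assuming all three commands are run after the configuration has been applied, in the order they appear above. The run wrapper, the post_upgrade_cleanup function, and the DRY_RUN variable are hypothetical helpers, not part of ryax-adm.

```shell
# Hypothetical wrapper: print the command instead of running it,
# unless DRY_RUN is explicitly set to 0.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    printf 'would run: %s\n' "$*"
  else
    "$@"
  fi
}

# The three cleanup commands from this section, stopping at the first failure.
post_upgrade_cleanup() {
  run helm uninstall -n ryaxns-monitoring loki &&
    run kubectl delete pvc -n ryaxns-monitoring storage-loki-0 &&
    run ryax-adm clean runner worker
}

# Usage:
#   post_upgrade_cleanup             # preview only
#   DRY_RUN=0 post_upgrade_cleanup   # actually run the commands
```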

18 Jun 12:41

mickours

24.06.0

51b9753

24.06.0

We are proud to announce the release of:

✨ ✨ ✨ ✨ ✨ ✨ ✨ ✨ ✨

Ryax 24.06.0

✨ ✨ ✨ ✨ ✨ ✨ ✨ ✨ ✨

Control and stability.

New features

Add a Kubernetes Addon to customize action deployment (label, nodeSelector, annotations, serviceAccount)

Bug fixes and Improvements

Fix dynamic output enum values that could not be added
Fix addon default values from ryax_metadata.yaml not available in UI
Better error handling for action deployments
Fix HPC addon support for files in custom scripts
Fix python-cuda build failing in some cases
Fix UID overlap when using NFS CSI Driver
Fix OutOfMemory during git scan leading to an inconsistent state

Upgrade to this version

Usual process: update the version in the config file and apply!

12 Jan 11:37

mickours

24.01.0

6a06182

24.01.0

We are proud to announce the release of

✨ ✨ ✨ ✨ ✨ ✨ ✨ ✨ ✨

Ryax 24.01.0

✨ ✨ ✨ ✨ ✨ ✨ ✨ ✨ ✨

This release focuses on Reliability and Security 💪

The changelog:

Bugfixes and Improvements

Fix connection issues on broker restart
Migrate Helm chart repository to an OCI standard repository
Fix SSH Slurm execution issue with files
Do not use root user inside the action builder container

Upgrade to this version

If you have set the chartRegistry in your configuration values file (you probably didn't), please change the chart repository URL to url: registry.ryax.org/release-charts.

09 Feb 10:12

mickours

24.02.0

f95ec96

24.02.0

We are proud to announce the release of

✨ ✨ ✨ ✨ ✨ ✨ ✨ ✨ ✨

Ryax 24.02.0

✨ ✨ ✨ ✨ ✨ ✨ ✨ ✨ ✨

This release brings better HPC offloading support!

The changelog:

Bug Fixes and Improvements

Add HPC offloading capability to run custom scripts directly on nodes for parallel jobs
Better error handling in HPC offloading deployment and execution
Fix HPC Offloading log capture
Runs can now be canceled and deleted from the UI
Fix dynamic outputs editing and improve display
Fix action not undeployed in some corner cases

Upgrade to this Version

HPC actions have to be deleted and recreated to make the custom script
parameters available.

28 Dec 13:29

mickours

23.12.0

f95ec96

23.12.0

We are proud to announce the release of

✨ ✨ ✨ ✨ ✨ ✨ ✨ ✨ ✨

Ryax 23.12.0

✨ ✨ ✨ ✨ ✨ ✨ ✨ ✨ ✨

This release focuses on Scaling and Performance 🚀

The changelog:

New features

Improve HPC offloading with optimized IO and image build
1 to N scaling of actions with better Kubernetes autoscale support
Show a clear error message on Action failure due to resource limits
Optional IO for Actions

Bug fixes and Improvements

Improve database query performance and Runner responsiveness
Fix actions undeploying during Runner restarts
Fix monitoring configuration for KubeProxy
Fix workflow deletion failing in some conditions
Fix RabbitMQ failure to respond to liveness probe

Upgrade to this version

The RabbitMQ deployment needs to be replaced. To do so, uninstall it before the
update (communication between services will stop during update):

helm uninstall -n ryaxns rabbitmq

Then, proceed with the normal upgrade process.

To avoid connection errors between services, restart them after the upgrade with:

kubectl delete pod -n ryaxns -l ryax.tech/resource-type=internal

12 Oct 13:04

mickours

23.10.0

6f7dcb3

23.10.0

We are proud to announce the release of

✨ ✨ ✨ ✨ ✨ ✨ ✨ ✨ ✨

Ryax 23.10.0

✨ ✨ ✨ ✨ ✨ ✨ ✨ ✨ ✨

The changelog:

New features and improvements

Trigger with python with CUDA support
Graceful stop for running workflows
Auto reload on UI update (PWA support)

Bug fixes

Better error message when scanning badly formatted action metadata
Fix add repository modal layout
Fix refresh error on OpenAPI UI in some cases

Upgrade to this version

No action needed!

19 Sep 08:46

mickours

23.09.0

e15f5d9

23.09.0

We are proud to announce the release of

✨ ✨ ✨ ✨ ✨ ✨ ✨ ✨ ✨

Ryax 23.09.0

✨ ✨ ✨ ✨ ✨ ✨ ✨ ✨ ✨

This release focuses on Observability and Performance, enjoy!

The changelog:

New features

Instant logs on Triggers
Better logs display for the Runs
Update of Prometheus to the latest version
Performance metrics are now exported and available in a dashboard in Grafana
Add internal tracing in the Runner with Tempo to query traces in Grafana
Run details panel rework

Bug fixes

Improve database query performance and Runner responsiveness
Fix errors on version change in some cases
Fix error when stored file size is too big

Upgrade to this version

Admins should take care of the following elements when upgrading to this version.

Instant log

To get instant logs, you have to rebuild the Actions. To do so, run
"Build All" in the Library on your repository; the next deployment will use
the updated version.

Prometheus update

The update of Prometheus requires the following manual operation, before running the update. This will update the CRD and remove the old version of Prometheus.

kubectl apply --server-side -f https://raw.githubusercontent.com/prometheus-operator/prometheus-operator/v0.68.0/example/prometheus-operator-crd/monitoring.coreos.com_alertmanagerconfigs.yaml --force-conflicts
kubectl apply --server-side -f https://raw.githubusercontent.com/prometheus-operator/prometheus-operator/v0.68.0/example/prometheus-operator-crd/monitoring.coreos.com_alertmanagers.yaml --force-conflicts
kubectl apply --server-side -f https://raw.githubusercontent.com/prometheus-operator/prometheus-operator/v0.68.0/example/prometheus-operator-crd/monitoring.coreos.com_podmonitors.yaml --force-conflicts
kubectl apply --server-side -f https://raw.githubusercontent.com/prometheus-operator/prometheus-operator/v0.68.0/example/prometheus-operator-crd/monitoring.coreos.com_probes.yaml --force-conflicts
kubectl apply --server-side -f https://raw.githubusercontent.com/prometheus-operator/prometheus-operator/v0.68.0/example/prometheus-operator-crd/monitoring.coreos.com_prometheusagents.yaml --force-conflicts
kubectl apply --server-side -f https://raw.githubusercontent.com/prometheus-operator/prometheus-operator/v0.68.0/example/prometheus-operator-crd/monitoring.coreos.com_prometheuses.yaml --force-conflicts
kubectl apply --server-side -f https://raw.githubusercontent.com/prometheus-operator/prometheus-operator/v0.68.0/example/prometheus-operator-crd/monitoring.coreos.com_prometheusrules.yaml --force-conflicts
kubectl apply --server-side -f https://raw.githubusercontent.com/prometheus-operator/prometheus-operator/v0.68.0/example/prometheus-operator-crd/monitoring.coreos.com_scrapeconfigs.yaml --force-conflicts
kubectl apply --server-side -f https://raw.githubusercontent.com/prometheus-operator/prometheus-operator/v0.68.0/example/prometheus-operator-crd/monitoring.coreos.com_servicemonitors.yaml --force-conflicts
kubectl apply --server-side -f https://raw.githubusercontent.com/prometheus-operator/prometheus-operator/v0.68.0/example/prometheus-operator-crd/monitoring.coreos.com_thanosrulers.yaml --force-conflicts

helm uninstall -n ryaxns-monitoring prometheus
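
The ten kubectl apply commands above differ only in the CRD name, so they can also be written as a loop. This is a sketch under that assumption: the version and CRD names are copied from the commands above, while the variable and function names are hypothetical.

```shell
# Version and CRD names copied from the commands above.
OPERATOR_VERSION="v0.68.0"
CRDS="alertmanagerconfigs alertmanagers podmonitors probes prometheusagents prometheuses prometheusrules scrapeconfigs servicemonitors thanosrulers"

# Print the manifest URL of each CRD, one per line.
crd_urls() {
  for crd in $CRDS; do
    printf 'https://raw.githubusercontent.com/prometheus-operator/prometheus-operator/%s/example/prometheus-operator-crd/monitoring.coreos.com_%s.yaml\n' \
      "$OPERATOR_VERSION" "$crd"
  done
}

# Apply every CRD server-side, stopping at the first failure.
apply_crds() {
  for url in $(crd_urls); do
    kubectl apply --server-side -f "$url" --force-conflicts || return 1
  done
}

# Usage (requires cluster access):
#   crd_urls     # review the URLs first
#   apply_crds
```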

Now you can run the update to reinstall the new Prometheus version with the
usual ryax-adm apply command.

Grafana's credentials are reset by this update: the user is ryax and the password can be obtained with:

kubectl get secret --namespace ryaxns-monitoring grafana-cedentials -o jsonpath="{.data.admin-password}" | base64 -d

04 Jul 12:20

mickours

23.07.0

0fd6e86

23.07.0

We are proud to announce the release of

✨ ✨ ✨ ✨ ✨ ✨ ✨ ✨ ✨

Ryax 23.07.0

✨ ✨ ✨ ✨ ✨ ✨ ✨ ✨ ✨

The changelog:

New features and improvements

Allows users to set a specific HTTP status code with its API result
Actions can send a user-defined error with custom HTTP status code
Update Loki (logs capture) and Cert Manager (SSL certificate manager) to the latest version

Bug fixes

Fix OpenAPI page not always in sync with deployed workflows

Upgrade to this version

This update requires uninstalling the old Loki version before installing the
new one.

Before the update, just remove the old Loki version with:

helm uninstall -n ryaxns-monitoring loki

Be aware that some logs might not be captured before the new version is up and
running.

More details on Loki upgrade: https://grafana.com/docs/loki/next/installation/helm/upgrade-from-2.x/

16 Jun 12:09

mickours

23.06.0

b3ec4c0

23.06.0

We are proud to announce the release of

✨ ✨ ✨ ✨ ✨ ✨ ✨ ✨ ✨

Ryax 23.06.0

✨ ✨ ✨ ✨ ✨ ✨ ✨ ✨ ✨

The changelog:

New features and improvements

HPC offloading using Singularity with multi user support
CUDA GPU support with the python3-cuda language
Resources request support (CPU, Memory, Time, GPU)
Keep the home .cache directory between runs
Allow users to rebuild already built actions
Better internal runs state management
Use the latest Minio version

Bug fixes

Show deployment error details when they happen
Always show a notification when an error happens
Fix certificate injection for our internal registry with docker daemon
UI now shows the deployment errors if any

Upgrade to this version

WARNING: This update requires a data migration, which implies a maintenance period to
copy the data from one store to another.

Minio migration (for production)

Upgrading the internal filestore, Minio, requires migrating the data from the old instance to the
new one. For more details, see https://min.io/docs/minio/linux/operations/install-deploy-manage/migrate-fs-gateway.html

Get old filestore credentials

echo OLD_FILESTORE_SRV
kubectl get secret --namespace "ryaxns" ryax-filestore-secret -o jsonpath="{.data.filestore}" | base64 -d
echo OLD_FILESTORE_ACCESS
kubectl get secret --namespace "ryaxns" ryax-filestore-secret -o jsonpath="{.data.filestore-access}" | base64 -d
echo OLD_FILESTORE_SECRET
kubectl get secret --namespace "ryaxns" ryax-filestore-secret -o jsonpath="{.data.filestore-secret}" | base64 -d
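
Since base64 -d prints the decoded value without a trailing newline, the output of the commands above can run together. The following sketch prints each value on its own labeled line instead. It assumes kubectl access to the ryaxns namespace; the function names are hypothetical, and the secret name and keys are the ones used above.

```shell
# Read one key of ryax-filestore-secret and decode it.
filestore_secret() {
  kubectl get secret --namespace "ryaxns" ryax-filestore-secret \
    -o jsonpath="{.data.$1}" | base64 -d
}

# Print the three old filestore values as NAME=value lines,
# ready to be copied into the commands run inside the Minio pod.
print_old_filestore_credentials() {
  printf 'OLD_FILESTORE_SRV=%s\n' "$(filestore_secret filestore)"
  printf 'OLD_FILESTORE_ACCESS=%s\n' "$(filestore_secret filestore-access)"
  printf 'OLD_FILESTORE_SECRET=%s\n' "$(filestore_secret filestore-secret)"
}

# Usage (requires cluster access):
#   print_old_filestore_credentials
```

The output contains credentials in clear text, so avoid running it where the terminal is logged or shared.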

Connect to the new Minio pod

MINIO_POD="$(kubectl -n ryaxns get pods --selector app.kubernetes.io/name=minio -o jsonpath='{.items[0].metadata.name}')"
kubectl -n ryaxns exec -ti "$MINIO_POD" -- bash

Now, inside the Minio pod (replace the variables with the values from the previous
steps):

mc alias set old http://OLD_FILESTORE_SRV OLD_FILESTORE_ACCESS OLD_FILESTORE_SECRET
mc alias set new http://localhost:9000 ryax $MINIO_ROOT_PASSWORD
mc mb new/ryax-filestore
mc mirror --preserve old/ryax-filestore new/ryax-filestore

You can now safely remove the old filestore deployment with:

helm uninstall -n ryaxns ryax-filestore

Clean (for dev)

Clean the internal state of the services to avoid missing-file errors when
upgrading Minio:

ryax-adm clean studio runner

05 May 13:06

mickours

23.05.0

c1bb357

23.05.0

We are proud to announce the release of

✨ ✨ ✨ ✨ ✨ ✨ ✨ ✨ ✨

Ryax 23.05.0

✨ ✨ ✨ ✨ ✨ ✨ ✨ ✨ ✨

The changelog:

New features and improvements

arm64 architecture support

Bug fixes

Updates of all services and dependencies

Upgrade to this version

The RabbitMQ password is needed for the update. You can get it with:

kubectl get secret --namespace "ryaxns" ryax-broker-secret -o jsonpath="{.data.rabbitmq-password}" | base64 -d

Then, add the password you just retrieved to the cluster configuration values:

rabbitmq:
  values:
    auth:
      password: <PASSWORD>
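
The two steps above can be combined so that the fragment is generated with the password already filled in. This is a minimal sketch, assuming kubectl access to the ryaxns namespace; the function name rabbitmq_values is hypothetical. The password is emitted as a single-quoted YAML string, which is an addition to the unquoted form shown above, so that special characters are preserved.

```shell
# Fetch the current RabbitMQ password and print the values fragment.
rabbitmq_values() {
  password="$(kubectl get secret --namespace "ryaxns" ryax-broker-secret \
    -o jsonpath="{.data.rabbitmq-password}" | base64 -d)"
  # In a single-quoted YAML scalar, a single quote is escaped by doubling it.
  escaped="$(printf '%s' "$password" | sed "s/'/''/g")"
  printf "rabbitmq:\n  values:\n    auth:\n      password: '%s'\n" "$escaped"
}

# Usage (requires cluster access); review the output, then merge it
# by hand into the cluster configuration values:
#   rabbitmq_values
```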

If you have an old version of the traefik Helm chart, you might get an upgrade
error. In that case, run the following commands and retry (WARNING: This will cause a
short downtime):

kubectl delete ingressroute -n kube-system traefik-dashboard
kubectl delete deployments.apps -n kube-system traefik

Releases: RyaxTech/ryax-engine