AMOS 2025 - Worker management and orchestration #256

Open: wants to merge 364 commits into main

ClProsser
Contributor

What kind of change does this PR introduce?
feature

What is the current behavior?
Currently, worker nodes are not supported by EMBArk.

What is the new behavior (if this is a feature change)?
In this PR, we provide a basic implementation of our project mission: Orchestration in EMBArk, including worker setup/configuration. As of now, it is possible to create workers, configure/update them, run an analysis on them and retrieve results.

This PR includes:

  • Settings app:
    • Allows settings to be changed dynamically through the EMBArk web interface
    • Currently only supports enabling orchestrators/workers. If enabled, a worker app is added to the sidebar (once the page is reloaded)
  • Worker management:
    • Offline workers can be created from configurations that specify an SSH username, a password and an IP range. EMBArk then creates every reachable worker in this IP range
    • Workers are then base-configured, i.e. a sudoers file is created if the user isn't root
    • Once a worker is base-configured, an offline installation can be triggered. EMBArk fetches all necessary artifacts, transfers them to the worker and initiates the installation
    • If something goes wrong, a worker can be soft-reset (stops running Docker containers, deletes analysis-related files) or hard-reset (uninstalls and removes EMBA and all its dependencies)
    • Every two minutes, worker-related system information is fetched (available CPU cores, RAM, available/total storage)
  • Update management:
    • EMBArk keeps track of which versions of the dependencies (EMBA repo, EMBA Docker image, APT dependencies, external dependencies) are installed on each worker
    • EMBArk can check if new dependency versions are available for download
    • If new versions are available, a diff can be shown (available vs installed) and individual dependencies can be updated. If a worker is free, this is done immediately; otherwise the update is delayed until the worker is free again (e.g. after the current update, or after the current analysis)
  • Orchestrator:
    • Schedules queued analysis tasks and assigns them to workers as soon as free workers are available
    • Keeps track of which workers are busy/free
    • Starts the analysis on the worker. To do this, the firmware is copied to the worker. Once this is done, EMBArk checks if the analysis is still running (e.g. if the Docker container is still up). In addition, EMBArk zips partial results and transfers them to the EMBArk host.
  • Async tasks: To reduce response times and thus waiting times in the UI, we added Celery
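
The orchestrator behaviour described in the list above (queue tasks, track busy/free workers, assign a task as soon as a worker is free) can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: the class name `Orchestrator`, its methods and the string identifiers are all hypothetical.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class Orchestrator:
    """Toy FIFO scheduler; all names are hypothetical, not EMBArk's classes."""
    free_workers: deque = field(default_factory=deque)
    busy_workers: dict = field(default_factory=dict)  # worker -> running task
    queued_tasks: deque = field(default_factory=deque)

    def add_worker(self, worker: str) -> None:
        self.free_workers.append(worker)
        self._assign()

    def queue_task(self, task: str) -> None:
        self.queued_tasks.append(task)
        self._assign()

    def release_worker(self, worker: str) -> None:
        """Called once the analysis on this worker has finished."""
        self.busy_workers.pop(worker)
        self.free_workers.append(worker)
        self._assign()

    def _assign(self) -> None:
        # Pair queued tasks with free workers until one side runs out.
        while self.queued_tasks and self.free_workers:
            worker = self.free_workers.popleft()
            self.busy_workers[worker] = self.queued_tasks.popleft()


orchestrator = Orchestrator()
orchestrator.add_worker("worker-1")
orchestrator.queue_task("analysis-a")    # assigned to worker-1 immediately
orchestrator.queue_task("analysis-b")    # stays queued: no free worker
orchestrator.release_worker("worker-1")  # analysis-a done, analysis-b starts
```

In this sketch every state change (new worker, new task, released worker) triggers the same assignment step, so a delayed update could be modelled as just another queued item that waits for the worker to be released.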

Does this PR introduce a breaking change?
No

Other information:
This is the third and main release of the AMOS 2025 EMBArk project.

As we are still testing the system as a whole, this is a draft PR. Prevention of overlapping IP ranges and SSH key auth are not included yet.
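
One possible way to detect overlapping IP ranges is Python's standard `ipaddress` module. The sketch below assumes that ranges are given in CIDR notation, which may not match how configurations store them; the function names `ranges_overlap` and `find_conflicts` are hypothetical.

```python
import ipaddress


def ranges_overlap(range_a: str, range_b: str) -> bool:
    """Return True if the two CIDR ranges share at least one address."""
    # strict=False tolerates host bits being set, e.g. "192.168.1.5/24".
    net_a = ipaddress.ip_network(range_a, strict=False)
    net_b = ipaddress.ip_network(range_b, strict=False)
    return net_a.overlaps(net_b)


def find_conflicts(new_range: str, existing_ranges: list) -> list:
    """Return the existing ranges that overlap with new_range."""
    return [r for r in existing_ranges if ranges_overlap(new_range, r)]


conflicts = find_conflicts("192.168.1.0/24", ["192.168.1.128/25", "192.168.2.0/24"])
```

A new configuration could then be rejected, or flagged, whenever `find_conflicts` returns a non-empty list.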

LukaDeka and others added 30 commits July 2, 2025 12:46
Signed-off-by: Luka Dekanozishvili <luka.dekanozishvili1@gmail.com>
Signed-off-by: ClProsser <clemens.prosser@gmail.com>
Signed-off-by: ashiven <nevisha@pm.me>
ClProsser and others added 25 commits July 15, 2025 21:25
Co-authored by Johannes Kunow <j.kunow@tu-berlin.de>

Signed-off-by: Fridtjof Damm <soenke.f.damm@campus.tu-berlin.de>
IS_UBUNTU=$(awk -F= '/^NAME/{print $2}' /etc/os-release)
[[ ${IS_UBUNTU} == "Ubuntu" ]] && IS_UBUNTU=true || IS_UBUNTU=false

function downloadPackage() {
Contributor Author

I could not find a word-splitting issue here, what did you find?

However, I noticed that the result of the awk command includes quotes, thus the comparison is always false. I updated the check.
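
To illustrate the quoting issue: on Ubuntu, `/etc/os-release` contains the line `NAME="Ubuntu"`, so splitting on `=` yields the value with its quotes. The following Python sketch only reproduces the effect; it is not the shell fix applied in this PR, and `parse_os_name` and the sample text are hypothetical.

```python
def parse_os_name(os_release_text: str) -> str:
    """Return the NAME value from os-release content, without quotes."""
    for line in os_release_text.splitlines():
        key, sep, value = line.partition("=")
        if key == "NAME" and sep:
            # Values in os-release may be wrapped in single or double quotes.
            return value.strip().strip("\"'")
    return ""


# Sample content (assumed for illustration).
sample = 'PRETTY_NAME="Ubuntu 22.04.4 LTS"\nNAME="Ubuntu"\nVERSION_ID="22.04"\n'

# Naive split, comparable to awk -F= '/^NAME/{print $2}': quotes are kept.
raw_value = [l.partition("=")[2] for l in sample.splitlines() if l.startswith("NAME")][0]

naive_is_ubuntu = raw_value == "Ubuntu"             # False: raw_value is '"Ubuntu"'
fixed_is_ubuntu = parse_os_name(sample) == "Ubuntu"  # True once quotes are stripped
```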

ClProsser marked this pull request as ready for review on July 21, 2025, 07:34
@ClProsser
Contributor Author

This PR is no longer a draft.

Three problems should be addressed soon:

  1. Overlapping configurations: Currently, IP ranges in configurations may overlap, so a worker can end up with multiple related configurations. In this case, features may break each other (e.g. transferring an SSH key, as password auth was deactivated after the first configuration scan, ...)
  2. Dependency state is set as available even if a dependency setup fails: EMBArk caches all worker dependencies for later use (repo, Docker image, external dir, APT dependencies). As of now, EMBArk has no strategy for dealing with a failing dependency setup on the EMBArk host. For example, if the EMBArk host has no internet connection, the Docker image can't be downloaded and the setup might fail. A possible solution is to retry and give up after three failed attempts.
  3. Pipfile.lock updates: As we added multiple Python packages, the Pipfile.lock no longer reflects the state in the main branch. We did not simply update all packages, as we are not familiar with your strategy for selecting and verifying packages.
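
The retry idea from point 2 could look roughly like the sketch below. It is only an illustration: `setup_with_retries` is a hypothetical helper, and the setup callable stands in for whatever fetches a dependency on the EMBArk host.

```python
import time


def setup_with_retries(setup, max_attempts: int = 3, delay_seconds: float = 0.0) -> bool:
    """Run setup() up to max_attempts times; return True on the first success."""
    for attempt in range(1, max_attempts + 1):
        try:
            setup()
            return True
        except Exception as error:  # broad on purpose: any failure counts as a failed attempt
            print(f"attempt {attempt}/{max_attempts} failed: {error}")
            if attempt < max_attempts:
                time.sleep(delay_seconds)
    return False


def download_docker_image():
    # Hypothetical stand-in that always fails, e.g. because the host is offline.
    raise ConnectionError("no internet connection")


# The dependency would only be marked as available if this is True.
image_available = setup_with_retries(download_docker_image)
```

With such a helper, the cached dependency state would stay unavailable after the third failed attempt instead of being set as available regardless of the outcome.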

Furthermore, it might be reasonable to refactor the BoundedExecutor class into Celery tasks and, if needed, benefit from multiple Celery processes.

If you have any change requests, feel free to comment on this PR; depending on their size, we'll fix them.

Member

BenediktMKuehne left a comment

Great work

Looking forward to merging this
...once I have tested it

Member

unsure why this file is called that

Contributor Author

In this file we define the function new_autoadd_client, which uses paramiko.AutoAddPolicy. However, CodeQL does not allow this (https://codeql.github.com/codeql-query-help/python/py-paramiko-missing-host-key-validation/). The only way to define exceptions is to exclude whole files. Thus a file codeql_ignore.py was created, and added to /codeql-config.yml.


8 participants