ALEPH-435 Fix DBUS error when enabling controller #815

Merged
nesitor merged 2 commits into main from ol-aleph-435-dbus-error
Jun 20, 2025

olethanh
Collaborator

Fix Jira ALEPH-435

Dbus error when enabling VM controller org.freedesktop.DBus.Error.ServiceUnknown: The name :1.612 was not provided by any .service files

It seems to occur in some cases after the dbus daemon reloads its config:

 dbus-daemon[1415]: [system] Reloaded configuration

This generally happens during an unattended-upgrade run.

A similar error, "Connection closed", happens if the dbus daemon is restarted.

dbus-daemon[1415]: [system] Reloaded configuration

Complete stack trace

    : Traceback (most recent call last):
    :   File "/opt/aleph-vm/aleph/vm/orchestrator/views/init.py", line 436, in update_allocations
    :     await start_persistent_vm(instance_item_hash, pubsub, pool)
    :   File "/opt/aleph-vm/aleph/vm/orchestrator/run.py", line 264, in start_persistent_vm
    :     execution = await create_vm_execution(vm_hash=vm_hash, pool=pool, persistent=True)
    :                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    :   File "/opt/aleph-vm/aleph/vm/orchestrator/run.py", line 60, in create_vm_execution
    :     execution = await pool.create_a_vm(
    :                 ^^^^^^^^^^^^^^^^^^^^^^^
    :   File "/opt/aleph-vm/aleph/vm/pool.py", line 147, in create_a_vm
    :     self.systemd_manager.enable_and_start(execution.controller_service)
    :   File "/opt/aleph-vm/aleph/vm/systemd.py", line 77, in enable_and_start
    :     self.enable(service)
    :   File "/opt/aleph-vm/aleph/vm/systemd.py", line 35, in enable
    :     self.manager.EnableUnitFiles([service], False, True)
    :   File "/usr/lib/python3/dist-packages/dbus/proxies.py", line 141, in call
    :     return self._connection.call_blocking(self._named_service,
    :            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    :   File "/usr/lib/python3/dist-packages/dbus/connection.py", line 634, in call_blocking
    :     reply_message = self.send_message_with_reply_and_block(
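
One possible way to tolerate the stale-connection failure shown in the trace is to rebuild the D-Bus proxy and retry the call once. The sketch below is illustrative only and is not the code merged in this PR: `ReconnectingManager`, `is_stale_error` and the `connect` factory are hypothetical names, and matching the restart case on the "Connection closed" message text is an assumption based on the description above.

```python
# Hypothetical sketch: retry a manager call once with a fresh proxy when the
# previous connection went stale (daemon reload or restart).

STALE_DBUS_NAMES = {"org.freedesktop.DBus.Error.ServiceUnknown"}


def is_stale_error(error):
    """Return True if the error looks like a stale D-Bus connection.

    dbus-python exceptions expose get_dbus_name(); the attribute is looked up
    defensively so this helper also works with other exception types.
    """
    get_name = getattr(error, "get_dbus_name", None)
    name = get_name() if callable(get_name) else None
    # Assumption: the restart case is recognised by its message text.
    return name in STALE_DBUS_NAMES or "Connection closed" in str(error)


class ReconnectingManager:
    """Wrap a proxy factory and retry a failed call once on a new proxy."""

    def __init__(self, connect, is_stale=is_stale_error):
        self._connect = connect  # callable returning a fresh proxy object
        self._is_stale = is_stale
        self._proxy = connect()

    def call(self, method_name, *args):
        try:
            return getattr(self._proxy, method_name)(*args)
        except Exception as error:
            if not self._is_stale(error):
                raise  # unrelated failures are not retried
            # Rebuild the proxy, then retry exactly once.
            self._proxy = self._connect()
            return getattr(self._proxy, method_name)(*args)
```

With such a wrapper, the failing call from the trace would be written as `manager.call("EnableUnitFiles", [service], False, True)`, where `connect` is whatever function builds the systemd manager proxy.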

Explain what problem this PR is resolving

Related ClickUp, GitHub or Jira tickets : ALEPH-XXX

Self proofreading checklist

  • The new code is clear, easy to read and well commented.
  • New code does not duplicate the functions of builtin or popular libraries.
  • An LLM was used to review the new code and look for simplifications.
  • New classes and functions contain docstrings explaining what they provide.
  • All new code is covered by relevant tests.
  • Documentation has been updated regarding these changes.
  • Dependencies update in the project.toml have been mirrored in the Debian package build script packaging/Makefile

Changes

How to test

  1. Start the supervisor
  2. Restart dbus: sudo systemctl restart dbus.service
  3. Launch an instance using the allocate endpoint,
    e.g. with the following request:
### Launch bad VM from hash control/allocations
POST http://localhost:4020/control/allocations
Content-Type: application/json
X-Auth-Signature: test
Accept: application/json


{
  "persistent_vms": [],
  "instances": [
    "decadecadecadecadecadecadecadecadecadecadecadecadecadecadecadddd"
  ]
}
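
For testers without an HTTP client file runner, the same request can be built in Python with the standard library. This is a minimal sketch: the URL, headers and payload are copied from the request above, while the function names `build_allocation_request` and `send_allocation_request` are hypothetical.

```python
import json
import urllib.request


def build_allocation_request(instances, base_url="http://localhost:4020"):
    """Build (without sending) the POST /control/allocations request."""
    payload = {"persistent_vms": [], "instances": list(instances)}
    return urllib.request.Request(
        url=f"{base_url}/control/allocations",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "X-Auth-Signature": "test",
            "Accept": "application/json",
        },
        method="POST",
    )


def send_allocation_request(request):
    """Send the request to a running supervisor; return (status, body)."""
    with urllib.request.urlopen(request) as response:
        return response.status, response.read().decode("utf-8")
```

After restarting dbus, calling `send_allocation_request(build_allocation_request(["decadecadecadecadecadecadecadecadecadecadecadecadecadecadecadddd"]))` against a running supervisor exercises the same code path as the request above.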

olethanh added 2 commits June 17, 2025 10:58
codecov bot commented Jun 17, 2025

Codecov Report

Attention: Patch coverage is 24.56140% with 43 lines in your changes missing coverage. Please review.

Project coverage is 65.01%. Comparing base (f6d5a24) to head (374be44).
Report is 2 commits behind head on main.

Files with missing lines | Patch % | Lines
src/aleph/vm/systemd.py | 24.56% | 41 Missing and 2 partials ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #815      +/-   ##
==========================================
- Coverage   65.27%   65.01%   -0.26%     
==========================================
  Files          85       85              
  Lines        7685     7729      +44     
  Branches      664      670       +6     
==========================================
+ Hits         5016     5025       +9     
- Misses       2460     2493      +33     
- Partials      209      211       +2     

@nesitor nesitor merged commit 18f2648 into main Jun 20, 2025
33 of 37 checks passed
@nesitor nesitor deleted the ol-aleph-435-dbus-error branch June 20, 2025 09:37