Fix start error in load_persistent_executions #759

Merged
nesitor merged 3 commits into main from ol-fix-start-on-load_persistent_executions on Feb 20, 2025

Conversation

olethanh
Collaborator

Fix a parse error that prevented the Supervisor from starting when loading persistent executions: the saved `gpus` field could be `None`, and `parse_raw_as` cannot parse `None` as JSON.

Traceback (most recent call last):
  File "/usr/lib/python3.10/runpy.py", line 196, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/usr/lib/python3.10/runpy.py", line 86, in _run_code
    exec(code, run_globals)
  File "/opt/aleph-vm/aleph/vm/orchestrator/__main__.py", line 4, in <module>
    main()
  File "/opt/aleph-vm/aleph/vm/orchestrator/cli.py", line 379, in main
    supervisor.run()
  File "/opt/aleph-vm/aleph/vm/orchestrator/supervisor.py", line 184, in run
    asyncio.run(pool.load_persistent_executions())
  File "/usr/lib/python3.10/asyncio/runners.py", line 44, in run
    return loop.run_until_complete(main)
  File "/usr/lib/python3.10/asyncio/base_events.py", line 649, in run_until_complete
    return future.result()
  File "/opt/aleph-vm/aleph/vm/pool.py", line 252, in load_persistent_executions
    execution.gpus = parse_raw_as(List[HostGPU], saved_execution.gpus)
  File "pydantic/tools.py", line 74, in pydantic.tools.parse_raw_as
    obj = load_str_bytes(
  File "pydantic/parse.py", line 37, in pydantic.parse.load_str_bytes
    return json_loads(b)
  File "/usr/lib/python3.10/json/__init__.py", line 339, in loads
    raise TypeError(f'the JSON object must be str, bytes or bytearray, '
TypeError: the JSON object must be str, bytes or bytearray, not NoneType
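
For illustration, here is a minimal sketch of the kind of guard that avoids this error. It is not the actual patch, which is only visible in the diff. It uses the stdlib `json` module as a stand-in for pydantic's `parse_raw_as`, and the `HostGPU` fields and the helper name `parse_saved_gpus` are assumptions made up for this example.

```python
import json
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class HostGPU:
    # Hypothetical fields for illustration only; the real model lives in aleph-vm.
    vendor: str
    device_name: str


def parse_saved_gpus(raw: Optional[str]) -> List[HostGPU]:
    """Parse a persisted GPU list, treating a missing value as 'no GPUs'."""
    if raw is None:
        # json.loads(None) raises the TypeError shown in the traceback,
        # so return early instead of handing None to the JSON parser.
        return []
    return [HostGPU(**item) for item in json.loads(raw)]
```

With a guard like this, an execution saved without GPUs loads as an empty list instead of aborting the Supervisor start-up.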

@olethanh olethanh force-pushed the ol-fix-start-on-load_persistent_executions branch from 16bca52 to 367cddc Compare February 20, 2025 10:02

codecov bot commented Feb 20, 2025

Codecov Report

Attention: Patch coverage is 0% with 1 line in your changes missing coverage. Please review.

Project coverage is 63.32%. Comparing base (07a7adb) to head (2840128).
Report is 1 commit behind head on main.

Files with missing lines    Patch %    Lines
src/aleph/vm/pool.py        0.00%      1 Missing ⚠️
Additional details and impacted files
@@           Coverage Diff           @@
##             main     #759   +/-   ##
=======================================
  Coverage   63.32%   63.32%           
=======================================
  Files          77       77           
  Lines        6844     6844           
  Branches      568      568           
=======================================
  Hits         4334     4334           
  Misses       2329     2329           
  Partials      181      181           


@nesitor nesitor merged commit 92249a9 into main Feb 20, 2025
21 of 22 checks passed
@nesitor nesitor deleted the ol-fix-start-on-load_persistent_executions branch February 20, 2025 16:09