service startup II #3208

Open
wants to merge 42 commits into
base: devel

andre-merzky
Member

@andre-merzky andre-merzky commented Jul 5, 2024

This improves agent service startup and allows service endpoints to be exposed to tasks.

This closes #2899 and #3062.

@andre-merzky andre-merzky marked this pull request as draft July 12, 2024 22:36
@andre-merzky andre-merzky marked this pull request as ready for review July 15, 2024 12:26
@andre-merzky andre-merzky changed the base branch from devel to feature/service_startup July 15, 2024 12:50
@andre-merzky
Member Author

This is ready for review.

@andre-merzky
Member Author

TODO AM: add example

codecov bot commented Sep 20, 2024

Codecov Report

Attention: Patch coverage is 46.66667% with 176 lines in your changes missing coverage. Please review.

Project coverage is 42.84%. Comparing base (8be5cd2) to head (ee76d0a).

| Files with missing lines | Patch % | Lines |
|---|---|---|
| src/radical/pilot/client.py | 21.84% | 93 Missing ⚠️ |
| src/radical/pilot/task_manager.py | 8.33% | 22 Missing ⚠️ |
| src/radical/pilot/agent/agent_0.py | 80.19% | 20 Missing ⚠️ |
| src/radical/pilot/session.py | 23.52% | 13 Missing ⚠️ |
| src/radical/pilot/task.py | 42.10% | 11 Missing ⚠️ |
| src/radical/pilot/agent/launch_method/flux.py | 0.00% | 6 Missing ⚠️ |
| src/radical/pilot/agent/executing/base.py | 83.33% | 4 Missing ⚠️ |
| src/radical/pilot/raptor/master.py | 0.00% | 2 Missing ⚠️ |
| src/radical/pilot/agent/executing/sleep.py | 0.00% | 1 Missing ⚠️ |
| src/radical/pilot/agent/scheduler/continuous.py | 66.66% | 1 Missing ⚠️ |

... and 3 more
Additional details and impacted files
@@            Coverage Diff             @@
##            devel    #3208      +/-   ##
==========================================
- Coverage   43.51%   42.84%   -0.67%     
==========================================
  Files          96       97       +1     
  Lines       10968    11167     +199     
==========================================
+ Hits         4773     4785      +12     
- Misses       6195     6382     +187     


@andre-merzky
Member Author

TODO AM: add example

Done

@andre-merzky
Member Author

andre-merzky commented Sep 26, 2024

Note: service specs on the pilot description still work?

@AymenFJA
Contributor

AymenFJA commented Sep 30, 2024

I tested this on two machines:

  • Local server: passed.
  • UVA Rivanna: the run completed, but the tasks printed no output (see below).

RADICAL-Stack:

(base) jovyan@34a75d890b57:~$ radical-stack

  python               : /opt/conda/bin/python3
  pythonpath           : 
  version              : 3.9.13
  virtualenv           : base

  radical.analytics    : 1.60.0
  radical.entk         : 1.60.0
  radical.gtod         : 1.81.0
  radical.pilot        : 1.82.0-v1.81.0-53-g4c5009f05@feature/service_startup_2
  radical.utils        : 1.81.0

(base) jovyan@34a75d890b57:~$ 

For Local Server:

(base) jovyan@34a75d890b57:~/work/radical.pilot$ python examples/misc/service_tasks.py 
{'0': '12345'}
found my_service: {'0': '12345'}
task.000000: b'foo'

task.000001: b'foo'

task.000002: b'foo'

task.000003: b'foo'


For Rivanna:

(rct2) -bash-4.4$cat rp-submit.slurm-64529125.*
{'0': '12345'}
found my_service: {'0': '12345'}
task.000000: 
task.000001: 
task.000002: 
task.000003: 
(rct2) -bash-4.4$hostname
udc-ba38-32c0

@andre-merzky
Member Author

andre-merzky commented Sep 30, 2024

Does Rivanna allow that port to be opened and used? Any indication in the server logs or task logs on what happened?
The example is very simplistic obviously...

Note from call: we don't expect that simple example to work universally.
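To help answer the port question above, here is a minimal, standard-library-only sketch of a reachability probe. It does not use radical.pilot, and it is not taken from `examples/misc/service_tasks.py`; the function names (`serve_once`, `probe`, `closed_port`, `demo`) and the loopback address are assumptions made for illustration. It shows that a client receives the service reply when the endpoint is reachable and an empty byte string when it is not, which is one possible (unconfirmed) explanation for the empty task output on Rivanna.

```python
import socket
import threading


def serve_once(server: socket.socket, reply: bytes) -> None:
    """Accept a single connection on `server` and send a fixed reply."""
    conn, _ = server.accept()
    with conn:
        conn.sendall(reply)


def probe(host: str, port: int, timeout: float = 2.0) -> bytes:
    """Return the first bytes sent by the service, or b'' if unreachable."""
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            sock.settimeout(timeout)
            return sock.recv(1024)
    except OSError:
        # covers refused connections, timeouts and unroutable hosts
        return b''


def closed_port() -> int:
    """Pick a local port with no listener by binding and closing a socket."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.bind(('127.0.0.1', 0))
        return sock.getsockname()[1]


def demo() -> tuple:
    """Probe one reachable and one unreachable endpoint on localhost."""
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind(('127.0.0.1', 0))       # port 0: let the OS pick a free port
    server.listen(1)
    port = server.getsockname()[1]

    thread = threading.Thread(target=serve_once, args=(server, b'foo'),
                              daemon=True)
    thread.start()

    reachable = probe('127.0.0.1', port)
    thread.join(timeout=5)
    server.close()

    unreachable = probe('127.0.0.1', closed_port())
    return reachable, unreachable


if __name__ == '__main__':
    print(demo())
```

Running `probe` from a task against the host and port reported by the service (here `12345` in the logs) would show whether the compute node can actually reach the endpoint. Note that on a cluster the service and the tasks may land on different nodes, so the loopback address used in this sketch would need to be replaced by the service node's hostname.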

Development

Successfully merging this pull request may close these issues.

Expand service tasks with capability to run on every node
3 participants