@israbbani israbbani commented Sep 11, 2025

This PR stacks on #56352.

For more details about the resource isolation project see #54703.

This PR adds the following functions to move a process into the system cgroup:

  • CgroupManagerInterface::AddProcessToSystemCgroup
  • CgroupDriverInterface::AddProcessToCgroup

I've also added integration tests for SysFsCgroupDriver and unit tests for CgroupManager.

Let me explain how these APIs will be used. In the next PR, the raylet will

  • be passed a list of pids of system processes that are started before the raylet starts and need to be moved into the system cgroup (e.g. gcs_server)
  • call CgroupManagerInterface::AddProcessToSystemCgroup for each of these pids to move them into the system cgroup.
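
To make the flow above concrete, here is a minimal sketch, not the code from this PR. It relies on one documented cgroup v2 behaviour: writing a PID to a cgroup's `cgroup.procs` file migrates that process into the cgroup. Whether `SysFsCgroupDriver` does exactly this is an assumption. The signatures below, and the plural helper `AddProcessesToSystemCgroup`, are hypothetical and only illustrate the "call once per pid" pattern described above.

```cpp
#include <filesystem>
#include <fstream>
#include <string>
#include <vector>

namespace fs = std::filesystem;

// Hypothetical stand-in for a driver-level call. Moves `pid` into the cgroup at
// `cgroup_dir` by writing it to <cgroup_dir>/cgroup.procs. Returns false if the
// file does not exist or cannot be opened or written.
bool AddProcessToCgroup(const fs::path &cgroup_dir, const std::string &pid) {
  const fs::path procs_file = cgroup_dir / "cgroup.procs";
  // The cgroup must already exist; never create cgroup.procs ourselves.
  if (!fs::exists(procs_file)) {
    return false;
  }
  // Append mode avoids truncating the file. On a real cgroup v2 mount each
  // write of a single PID migrates that one process.
  std::ofstream out(procs_file, std::ios::app);
  if (!out.is_open()) {
    return false;
  }
  out << pid << '\n';
  out.flush();
  return out.good();
}

// Hypothetical manager-level helper. Tries to move every pid into the system
// leaf cgroup and returns the pids that could not be moved.
std::vector<std::string> AddProcessesToSystemCgroup(
    const fs::path &system_leaf_cgroup, const std::vector<std::string> &pids) {
  std::vector<std::string> failed;
  for (const auto &pid : pids) {
    if (!AddProcessToCgroup(system_leaf_cgroup, pid)) {
      failed.push_back(pid);
    }
  }
  return failed;
}
```

Returning the failed pids, rather than stopping at the first error, is a design choice of this sketch. It lets a caller decide whether a process that has already exited (for example) should be treated as fatal.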

israbbani and others added 30 commits July 24, 2025 20:39
to perform cgroup operations.

instead of clone for older kernel headers < 5.7 (which is what we have
in CI)

fix CI.

Signed-off-by: irabbani <irabbani@anyscale.com>
Co-authored-by: Edward Oakes <ed.nmi.oakes@gmail.com>
Signed-off-by: Ibrahim Rabbani <israbbani@gmail.com>
Signed-off-by: Ibrahim Rabbani <irabbani@anyscale.com>
Base automatically changed from irabbani/cgroups-9 to master September 16, 2025 22:17
Signed-off-by: Ibrahim Rabbani <irabbani@anyscale.com>
/**
Moves the process into the system leaf cgroup (@see kLeafCgroupName).
To move the pid, the process must have read, write, and execute permissions for the

nit: Add a colon at the end of the line since it precedes a list

@edoakes edoakes merged commit abd40b3 into master Sep 17, 2025
4 of 5 checks passed
@edoakes edoakes deleted the irabbani/cgroups-10 branch September 17, 2025 13:23
edoakes pushed a commit that referenced this pull request Sep 18, 2025
…6626)

Broken in #56446. 

This should stop being possible once there's a single cgroups target
exported (as highlighted in #54703).

I've fixed the broken build and I've added a temporary test target that
builds the noop implementations as part of Linux CI so it gets caught in
premerge.

---------

Signed-off-by: irabbani <israbbani@gmail.com>
israbbani added a commit that referenced this pull request Sep 22, 2025
…oupDriver to move processes into system cgroup (#56446)"

This reverts commit abd40b3.
zma2 pushed a commit to zma2/ray that referenced this pull request Sep 23, 2025
…r to move processes into system cgroup (ray-project#56446)

zma2 pushed a commit to zma2/ray that referenced this pull request Sep 23, 2025
…y-project#56626)

ZacAttack pushed a commit to ZacAttack/ray that referenced this pull request Sep 24, 2025
…r to move processes into system cgroup (ray-project#56446)

ZacAttack pushed a commit to ZacAttack/ray that referenced this pull request Sep 24, 2025
…y-project#56626)

elliot-barn pushed a commit that referenced this pull request Sep 24, 2025
…r to move processes into system cgroup (#56446)

elliot-barn pushed a commit that referenced this pull request Sep 24, 2025
…6626)

marcostephan pushed a commit to marcostephan/ray that referenced this pull request Sep 24, 2025
…r to move processes into system cgroup (ray-project#56446)

marcostephan pushed a commit to marcostephan/ray that referenced this pull request Sep 24, 2025
…y-project#56626)

elliot-barn pushed a commit that referenced this pull request Sep 27, 2025
…r to move processes into system cgroup (#56446)

elliot-barn pushed a commit that referenced this pull request Sep 27, 2025
…6626)

dstrodtman pushed a commit that referenced this pull request Oct 6, 2025
…r to move processes into system cgroup (#56446)

dstrodtman pushed a commit to dstrodtman/ray that referenced this pull request Oct 6, 2025
…y-project#56626)

justinyeh1995 pushed a commit to justinyeh1995/ray that referenced this pull request Oct 20, 2025
…r to move processes into system cgroup (ray-project#56446)

justinyeh1995 pushed a commit to justinyeh1995/ray that referenced this pull request Oct 20, 2025
…y-project#56626)

landscapepainter pushed a commit to landscapepainter/ray that referenced this pull request Nov 17, 2025
…r to move processes into system cgroup (ray-project#56446)

landscapepainter pushed a commit to landscapepainter/ray that referenced this pull request Nov 17, 2025
…y-project#56626)

Labels

core: Issues that should be addressed in Ray Core
go: add ONLY when ready to merge, run all tests

4 participants