
os.cpu_count() should return a count assigned to a container or that the process is restricted to #80235

Closed
keirlawson mannequin opened this issue Feb 20, 2019 · 48 comments
Labels: 3.12 (bugs and security fixes), interpreter-core (Objects, Python, Grammar, and Parser dirs), performance (Performance or resource usage), stdlib (Python modules in the Lib dir)

Comments

@keirlawson (Mannequin) commented Feb 20, 2019

BPO 36054
Nosy @vstinner, @giampaolo, @tiran, @jab, @methane, @matrixise, @corona10, @Zheaoli, @mcnelsonphd, @NargiT, @Redeyed

Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.


GitHub fields:

assignee = None
closed_at = None
created_at = <Date 2019-02-20.16:56:22.384>
labels = ['interpreter-core', '3.8', '3.7', 'library', 'performance']
title = 'On Linux, os.count() should read cgroup cpu.shares and cpu.cfs (CPU count inside docker container)'
updated_at = <Date 2021-10-11.14:25:34.762>
user = 'https://github.com/keirlawson'

bugs.python.org fields:

activity = <Date 2021-10-11.14:25:34.762>
actor = 'corona10'
assignee = 'none'
closed = False
closed_date = None
closer = None
components = ['Interpreter Core', 'Library (Lib)']
creation = <Date 2019-02-20.16:56:22.384>
creator = 'keirlawson'
dependencies = []
files = []
hgrepos = []
issue_num = 36054
keywords = []
message_count = 24.0
messages = ['336117', '336126', '336146', '336148', '336149', '338724', '338778', '338780', '338783', '339401', '339404', '339429', '339439', '353690', '364894', '364898', '364901', '364902', '364916', '365071', '365075', '366811', '403622', '403654']
nosy_count = 12.0
nosy_names = ['vstinner', 'giampaolo.rodola', 'christian.heimes', 'jab', 'methane', 'matrixise', 'corona10', 'Manjusaka', 'mcnelsonphd', 'galen', 'nargit', 'RedEyed']
pr_nums = []
priority = 'normal'
resolution = None
stage = None
status = 'open'
superseder = None
type = 'performance'
url = 'https://bugs.python.org/issue36054'
versions = ['Python 3.7', 'Python 3.8']

@keirlawson (Mannequin, Author) commented Feb 20, 2019

There appears to be no way to detect the number of CPUs allotted to a Python program within a docker container. With the following script:

import os

print("os.cpu_count(): " + str(os.cpu_count()))
print("len(os.sched_getaffinity(0)): " + str(len(os.sched_getaffinity(0))))

when run in a container (from an Ubuntu 18.04 host) I get:

docker run -v "$PWD":/src/ -w /src/ --cpus=1 python:3.7 python detect_cpus.py
os.cpu_count(): 4
len(os.sched_getaffinity(0)): 4

Recent versions of Java are able to correctly detect the CPU allocation:

docker run -it --cpus 1 openjdk:10-jdk
Feb 20, 2019 4:20:29 PM java.util.prefs.FileSystemPreferences$1 run
INFO: Created user preferences directory.
| Welcome to JShell -- Version 10.0.2
| For an introduction type: /help intro

jshell> Runtime.getRuntime().availableProcessors()
$1 ==> 1

@keirlawson (Mannequin) added the 3.7 (EOL) end of life, stdlib (Python modules in the Lib dir), and performance (Performance or resource usage) labels Feb 20, 2019
@matrixise (Member)

I would like to work on this issue.

@matrixise (Member)

So, I have also tested with the latest Docker image of golang.

docker run --rm --cpus 1 -it golang /bin/bash

Here is the Go code:

package main

import "fmt"
import "runtime"

func main() {
    cores := runtime.NumCPU()
    fmt.Printf("This machine has %d CPU cores.\n", cores)
}

Here is the output:

./demo
This machine has 4 CPU cores.

When I try with grep on /proc/cpuinfo, I get this result:

grep processor /proc/cpuinfo -c
4

I will test with openjdk, since this is related to Java, and see if I can get the result of 1.

@matrixise (Member)

OK, I didn't see your test with openjdk:10, sorry.

@keirlawson (Mannequin, Author) commented Feb 20, 2019

I believe this is related to this ticket: https://bugs.python.org/issue26692

Looking at Java's implementation, it seems they check whether cgroups are enabled via /proc/self/cgroup and, if so, parse the cgroup information out of the filesystem.
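
For illustration, a minimal sketch of that first detection step, assuming the standard /proc layout (the parsing here is mine, not OpenJDK's):

import sys

# Each line of /proc/self/cgroup is "hierarchy-id:controllers:path", so a
# process can discover which cgroups it belongs to before reading their files.
if sys.platform == "linux":
    with open("/proc/self/cgroup") as f:
        for line in f:
            hierarchy_id, controllers, path = line.rstrip("\n").split(":", 2)
            print(hierarchy_id, controllers, path)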

@matrixise (Member)

I am really sorry: I had planned to work on this issue, but that's no longer the case. Feel free to submit a PR.

@matrixise added the interpreter-core (Objects, Python, Grammar, and Parser dirs) and 3.8 (EOL) end of life labels Mar 24, 2019
@Zheaoli (Mannequin) commented Mar 25, 2019

I think I may work on a PR for this issue. Has anybody worked on it yet?

@matrixise (Member)

Hi Manjusaka,

Could you explain your solution? I have read the OpenJDK code (C++) and, to be honest, it was not complex but not really clear.

Also, if you need my help with the review or with putting your PR together, I can help you.

Have a nice day,

Stéphane

@Zheaoli (Mannequin) commented Mar 25, 2019

Hi Stéphane

Thanks a lot!

In my opinion, I would like to make an independent library named cgroups. For ease of use and compatibility, I think that is better than adding the code to the os module.

Thanks for your work!

Manjusaka

@Zheaoli (Mannequin) commented Apr 3, 2019

Hi Stéphane:

I have checked the JVM implementation of the container improvements. I confirm that we may need a new library for container environments; I don't think combining it into the os module is a good idea. I will make a PR during this week.

@tiran (Member) commented Apr 3, 2019

@Zheaoli (Mannequin) commented Apr 4, 2019

Yes, and not only CPU: it should also support getting the real memory limit.

Look at https://blogs.oracle.com/java-platform-group/java-se-support-for-docker-cpu-and-memory-limits

@matrixise (Member)

Yes, and not only CPU: it should also support getting the real memory limit.
Look at https://blogs.oracle.com/java-platform-group/java-se-support-for-docker-cpu-and-memory-limits

Yep, but in this case, you have to create another issue for the memory limit.

@mcnelsonphd (Mannequin) commented Oct 1, 2019

Is this issue still being worked on as a core feature? I needed a solution for this with 2.7.11 to make some old code behave properly in a container environment on AWS Batch, so I had to figure out what OpenJDK was doing and came up with a solution. The OpenJDK approach seems to be: find where the cgroups for Docker are located in the filesystem, then, depending on the values in different files, determine the number of CPUs available.

The inelegant code below is what worked for me:

import os

def query_cpu():
    cpu_quota = -1
    if os.path.isfile('/sys/fs/cgroup/cpu/cpu.cfs_quota_us'):
        # Not useful for AWS Batch based jobs as the result is -1, but works on local Linux systems.
        cpu_quota = int(open('/sys/fs/cgroup/cpu/cpu.cfs_quota_us').read().rstrip())
    if cpu_quota != -1 and os.path.isfile('/sys/fs/cgroup/cpu/cpu.cfs_period_us'):
        cpu_period = int(open('/sys/fs/cgroup/cpu/cpu.cfs_period_us').read().rstrip())
        # Divide quota by period to get the number of CPUs allotted to the container,
        # rounded down if fractional.
        avail_cpu = int(cpu_quota / cpu_period)
    elif os.path.isfile('/sys/fs/cgroup/cpu/cpu.shares'):
        # For AWS, cpu.shares gives the correct value * 1024.
        cpu_shares = int(open('/sys/fs/cgroup/cpu/cpu.shares').read().rstrip())
        avail_cpu = int(cpu_shares / 1024)
    else:
        # Fallback added here so avail_cpu is never unbound when no cgroup files are present.
        avail_cpu = os.cpu_count()
    return avail_cpu

This solution makes several assumptions about the cgroup locations within the container, rather than dynamically finding where those files are located as OpenJDK does. I also haven't included the more robust method for the case where cpu.quota and cpu.shares are -1.

Hopefully this is a start for getting this implemented.

@vstinner changed the title from "Way to detect CPU count inside docker container" to "On Linux, os.count() should read cgroup cpu.shares and cpu.cfs (CPU count inside docker container)" Oct 1, 2019
@Zheaoli (Mannequin) commented Mar 23, 2020

Hello Mike, thanks for your code. I think it's a good approach.

I think if cpu.quota and cpu.shares are -1, just returning the original os.cpu_count() value is OK.

@Zheaoli (Mannequin) commented Mar 23, 2020

I will make a PR this week.

@vstinner (Member)

I'm not sure that it's a good idea to change os.cpu_count(). I suggest adding a new function instead.

os.cpu_count() is documented as:
"Return the number of CPUs in the system."
https://docs.python.org/dev/library/os.html#os.cpu_count

By the way, the documentation adds:

"This number is not equivalent to the number of CPUs the current process can use. The number of usable CPUs can be obtained with len(os.sched_getaffinity(0))"

@tiran (Member) commented Mar 23, 2020

I suggest that you provide a low-level solution that returns general information from cgroup v1 and the unified cgroup v2, rather than a solution that focuses on CPU only. Then you can provide a high-level interface that returns the effective CPU cores.

cgroup v2 (unified hierarchy) has been around for 6 years and is slowly gaining traction as container platforms start to support it.

@Zheaoli (Mannequin) commented Mar 24, 2020

Actually, we already have some third-party libs that support cgroups, but most of them have these problems:

  1. They are not in the stdlib.

  2. They only support cgroup v1.

But if we want to add a new stdlib module, should we create a PEP?

@Zheaoli (Mannequin) commented Mar 26, 2020

Hello guys, I have some ideas about this issue.

First, maybe we should add a new API named cpu_usable_count(). I think it's more meaningful than os.sched_getaffinity(0).

Second, more and more people use Docker to run their apps today, so people need an official way to get the environment info: not just CPU, but also memory and network traffic limits. Because Docker is based on Linux cgroups, maybe we can add a cgroup lib as an officially supported lib.

But I'm not sure about this idea, because there are some problems:

  1. cgroups are only supported on Linux. I'm not sure that adding a platform-specific lib is a good idea.

  2. Many languages have not added official cgroup support yet, though some runtimes are optimized for cgroups (such as Java in the JVM).

@vstinner (Member)

Hello guys,

Please try to find a more inclusive way to say hello: https://heyguys.cc/ ;-)

@NargiT (Mannequin) commented Apr 20, 2020

Do we have any news about this?

@RedEyed (Mannequin) commented Oct 11, 2021

Do we have any news about this?

There is an IBM effort to do this at the container level, so that os.cpu_count() will return the right result in a container:

https://www.phoronix.com/scan.php?page=news_item&px=Linux-CPU-Namespace

@corona10 (Member)

There is an IBM effort to do this at the container level, so that os.cpu_count() will return the right result in a container

Good news!

@ben-spiller

If we create a new API rather than putting this into cpu_count(), we would need to consider updating existing uses of cpu_count() in the stdlib, such as the way the default max_workers is set in ThreadPoolExecutor, to use the cgroups-aware value if available (since ignoring the cgroups settings is almost always a bad and dangerous default).
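
In the meantime, a caller can work around the default by hand; a minimal sketch (affinity-aware, though not quota-aware, which is exactly the gap discussed here):

from concurrent.futures import ThreadPoolExecutor
import os

# Pass an explicit worker count instead of relying on the default,
# which is derived from the machine-wide CPU count.
workers = len(os.sched_getaffinity(0))
with ThreadPoolExecutor(max_workers=workers) as pool:
    print(list(pool.map(abs, [-3, -2, -1])))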

@kosta commented Apr 4, 2023

Regarding @d1gl3's code:

I need this in containers running in Kubernetes. In my preliminary testing on Google Kubernetes Engine:

  • /sys/fs/cgroup/cpu/cpu.shares / 1024 corresponds to the pod's CPU request
  • /sys/fs/cgroup/cpu/cfs_quota_us / /sys/fs/cgroup/cpu/cfs_period_us corresponds to the CPU limit
  • /sys/fs/cgroup/cpu.max does not exist

However, outside of Kubernetes, /sys/fs/cgroup/cpu/cpu.shares contains 1024 even on systems where os.cpu_count() returns 2 or 10.

I will try to find documentation that clears this up.

@kosta commented Apr 12, 2023

Update:

About CPU requests (e.g. from Kubernetes)

I am primarily interested in the Kubernetes CPU request (not limit!) use case, which is based on CPU shares, but that is not easy to read in a standardized way. In the end, my solution was to inject an env var from Kubernetes and parse that from Python, like this:

...
spec:
  ...
  containers:
  - ...
    env:
    - name: KUBERNETES_CPU_REQUEST_MILLIS
      valueFrom:
        resourceFieldRef:
          resource: requests.cpu
          divisor: 1m

However, that is not suitable for the Python stdlib to read, at least until there is some standard to follow.
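
For what it's worth, a minimal sketch of the consuming side, assuming the KUBERNETES_CPU_REQUEST_MILLIS variable defined above (the helper name and the round-up are my choices):

import math
import os

def cpu_request():
    # divisor: 1m means the value is in millicores; 1000m == 1 CPU.
    millis = os.environ.get("KUBERNETES_CPU_REQUEST_MILLIS")
    if millis is None:
        return os.cpu_count() or 1  # not running under Kubernetes
    # Round up so a fractional request such as 500m still yields one worker.
    return max(1, math.ceil(int(millis) / 1000))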

OpenJDK used to read /sys/fs/cgroup/cpu/cpu.shares and use that (divided by 1024) as Runtime.availableProcessors().
However, that was removed in JDK-8281181 (commit, more discussion). The main reason is that while it's easy to read CPU shares, it's hard to get the total amount of CPU shares across all processes. And even then, that is not really a meaningful number. (This is why I use the env var; see above.)

About CPU limits

It probably makes sense for Python to respect a CPU limit if it is set. For that, using cfs_quota_us / cfs_period_us (rounded up) is probably the right call. OpenJDK does that as well.
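
A sketch of that heuristic, assuming the standard mount points and covering both cgroup v1 and the v2 cpu.max file mentioned earlier (None meaning "no limit configured"):

import math
import os

def cpu_limit():
    # cgroup v2: cpu.max contains "<quota> <period>", or "max" for no limit.
    if os.path.isfile("/sys/fs/cgroup/cpu.max"):
        quota, period = open("/sys/fs/cgroup/cpu.max").read().split()
        if quota == "max":
            return None
        return max(1, math.ceil(int(quota) / int(period)))
    # cgroup v1: a quota of -1 means no limit.
    quota_path = "/sys/fs/cgroup/cpu/cpu.cfs_quota_us"
    period_path = "/sys/fs/cgroup/cpu/cpu.cfs_period_us"
    if os.path.isfile(quota_path) and os.path.isfile(period_path):
        quota = int(open(quota_path).read())
        if quota == -1:
            return None
        return max(1, math.ceil(quota / int(open(period_path).read())))
    return None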

@kosta commented Apr 12, 2023

Is this related to / a duplicate of #70879?

@indygreg (Contributor)

The fundamental limitation with reading cpu.shares (or cpu.weight in cgroups v2) is that the number needs to be compared against sibling cgroups. While you are typically able to read your own process's cgroups, there's no guarantee you can read the peer cgroups. In the context of Kubernetes, a container may be able to read its own cgroup values but not values for other containers in the same Pod.

OpenJDK essentially said: since we can't reliably deduce CPU count from cgroups, we're going to force end-users to tell us what they want.

OpenJDK's position is defensible. But it neglects something very important: it is a very common convention for Linux containers to normalize the cgroup shares value so 1024 = 1 CPU. The last I looked, this is how both Docker and Kubernetes work. While an implementation detail, it is one that has existed for years and that spans container ecosystems.

Given how prevalent it is, I would argue there is value in supporting a mode where CPU count can be deduced from cpu.shares / cpu.weight using the common 1024 = 1 normalization convention. This allows end-users to automatically derive CPU counts when running in popular container environments without having to keep their container CPU requests and a Python argument in sync. For a few releases OpenJDK had the -XX:+UseContainerCpuShares flag to support this turnkey derivation. But they ripped it out, and newer versions of OpenJDK are more customer-hostile as a result.
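
To make the convention concrete, here is a sketch of that turnkey derivation under the 1024 = 1 CPU assumption (cgroup v1 only; v2's cpu.weight uses a different scale and would need converting first):

import os

def cpus_from_shares(path="/sys/fs/cgroup/cpu/cpu.shares"):
    if not os.path.isfile(path):
        return None  # not in a cgroup v1 CPU hierarchy
    # Docker's --cpus and Kubernetes CPU requests are written as N * 1024 shares.
    shares = int(open(path).read())
    return max(1, shares // 1024)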

Note: what we really want is for the Linux kernel to expose a virtual file with an absolute value for the number of CPUs available to a process. That would seemingly make this problem go away for everyone.

@vstinner (Member)

For now, it seems like parsing cgroups remains blurry and solving that problem with a PyPI project would be better, until the "container" topic is fully standardized in Linux. Hint: there is no such thing as "container" in the Linux kernel, only tons of "namespaces" :-) See: https://lwn.net/Articles/740621/

@vstinner (Member)

There is an IBM effort to do this at the container level, so that os.cpu_count() will return the right result in a container
https://www.phoronix.com/scan.php?page=news_item&px=Linux-CPU-Namespace

This work didn't get merged into the Linux kernel.

@vstinner (Member)

pylint parses different cgroup files:

  • /sys/fs/cgroup/cpu/cpu.cfs_quota_us
  • /sys/fs/cgroup/cpu/cpu.cfs_period_us
  • /sys/fs/cgroup/cpu/cpu.shares

Code: https://github.com/d1gl3/pylint/blob/2625c341e07f60276dcd2c23a8b47f49f80688a0/pylint/lint/run.py#L34-L80

@vstinner (Member)

I closed issue #70879 as a duplicate of this issue.

@gpshead (Member) commented Sep 22, 2023

Why do people want to care about cgroups shares? That isn't really counting anything concrete. What matters is how many possible cores could execute code for this process in parallel.

Whether or not the OS is actually going to schedule our process (tree) across that many cores at any given moment in the larger (often opaque to us) system is not what an os API should be attempting to guess.

The os API should be able to return the maximum potential parallelism we can get. Not predict the future. (#109652 is a potential step in that direction)

@vstinner (Member)

What matters is how many possible cores could execute code for this process in parallel.

Is there a reliable way to retrieve this information?

@indygreg (Contributor)

Yes, strictly speaking cgroups shares/weight has nothing to do with parallelism or number of CPUs. This cgroups mechanism is a way of specifying a proportion of the kernel scheduler's time slice that a cgroup can consume. A lot of people mistakenly believe that by specifying cpus=N to their container runtime they are asking for N CPUs. What they are actually asking for is N CPUs worth of time from the scheduler. They are free to consume those resources using as few or many threads as they want (subject to additional cgroups limitations) and the kernel will schedule work on as few or as many physical CPU cores as it chooses.

However, it is extremely popular to size thread/process pools to match the proportional CPU count given to the container runtime. And container runtimes make it easy to just divide the cgroups shares by 1024 to deduce that value. So everyone does that. It mostly just works for the purposes people use it for.

Now if container runtimes exposed the resource settings to containers (by injecting an environment variable perhaps?), that is something I may potentially use. But at the same time, cgroups feels more canonical. We also have to consider that dynamic resource sizing for containers/cgroups is a thing after all! I can't imagine the maintainers of container runtimes will want to try to propagate mutable state into the container. It is far easier for them to do nothing and just tell end-users to query cgroups.

@gpshead changed the title from "On Linux, os.count() should read cgroup cpu.shares and cpu.cfs (CPU count inside docker container)" to "os.cpu_count() should return a count assigned to a container or that the process is restricted to" Sep 22, 2023
@vstinner (Member)

If cpu_count() starts parsing cgroups, I would suggest using the minimum of what's announced in cgroups and what's seen by taking into account CPU affinity: that's what pylint does.

See #80235 (comment) and the pointed code:

return min(cpu_share, cpu_count)

where cpu_count is len(sched_getaffinity(0)).
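
Expanded into a runnable sketch (cgroup_count stands in for whatever a cgroup parser returned; the names are mine, not pylint's):

import os

def effective_cpu_count(cgroup_count):
    # Never report more CPUs than the scheduler affinity allows.
    affinity = len(os.sched_getaffinity(0))
    return affinity if cgroup_count is None else min(cgroup_count, affinity)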

@vstinner (Member)

What I dislike here is that the ratio converting cgroups CPU shares to a number of logical CPUs is not standardized.

@indygreg proposes announcing this constant via a command line option / env var to solve this issue: #109595 (comment).

Maybe 1024 can be used by default, but it's important that it's possible to configure this value.

@corona10 (Member)

FYI, we now provide a way to restrict CPU resources via #109595.
This can be an alternative way to handle this issue.

@vstinner (Member)

After reading a lot about Linux cgroups CPU shares, I don't think that they should be used in os.cpu_count() or os.process_cpu_count(). I am closing the issue.

The relationship between a CPU share and a number of CPUs is not clear and has caused a lot of headaches for many people.

Obviously, some people have a different opinion; I suggest they implement something similar to pylint or create a project on PyPI, get feedback from other users, and then come back with a strong rationale to convince us why it's very important to add it to Python :-)

Maybe later we can consider adding a "get_cgroups_cpu_share()" function, but I'm not convinced about using it to compute cpu_count() or process_cpu_count().

@robsonpeixoto commented Oct 12, 2023

@vstinner it's not possible to define os.cpu_count() based on CPU shares. IMHO it should only care about the quota (CPU limit), because that is the only information that can easily be used to define os.cpu_count().

This project can be useful: https://github.com/uber-go/automaxprocs

@vstinner (Member)

IMHO it should only care about the quota (CPU limit)

I mean that, in general, Linux cgroups CPU limits are not directly related to a whole number of CPUs, so it's hard to use such a limit to update os.cpu_count() or os.process_cpu_count().

Please create a PyPI project to parse Linux cgroups and see how it goes ;-)

So far, since 2019, I haven't seen any clear implementation of "cpu_count" based on Linux cgroups. The closest thing I have seen is the pylint code, but I don't think that we should do something like that in the Python stdlib.

@blackliner

Is there a TL;DR for anyone coming to this thread from Google?

I.e., what is the canonical way to find the (lower) CPU core count under cgroup/container resource limitations?

@vstinner (Member)

I.e., what is the canonical way to find the (lower) CPU core count under cgroup/container resource limitations?

It was decided to not handle this problem in Python. You should use a 3rd party project for that.

@blackliner

Can we provide some examples or follow-up links that suggest one way or another to solve this in our application code?

Or can the environment solve this with an approach similar to #109595 via an env variable? Will ask in that thread over there 👍

@vstinner (Member)

Or can the environment solve this with an approach similar to #109595 via an env variable? Will ask in that thread over there 👍

You can use PYTHON_CPU_COUNT env var to override os.cpu_count(): https://docs.python.org/dev/using/cmdline.html#envvar-PYTHON_CPU_COUNT
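
For example (assuming Python 3.13+, where this variable was added):

$ PYTHON_CPU_COUNT=2 python3 -c "import os; print(os.cpu_count())"
2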

Can we provide some examples or follow-up links that suggest one way or another to solve this in our application code?

I don't think that we should provide examples in the Python documentation. This problem is not solved; there are only heuristics, and it depends on your environment and other factors.
