Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

process.Percent returns incorrect value on multiple cores #755

Closed
codemanCn opened this issue Aug 28, 2019 · 3 comments · Fixed by #776
Closed

process.Percent returns incorrect value on multiple cores #755

codemanCn opened this issue Aug 28, 2019 · 3 comments · Fixed by #776

Comments

@codemanCn
Copy link

hello:
Some days before you push process.go, you change code

        delta_proc := t2.Total() - t1.Total()
	overall_percent := ((delta_proc / delta) * 100) * float64(numcpu)
	return overall_percent

        TO:

	delta_proc := t2.Total() - t1.Total()
	overall_percent := ((delta_proc / delta) * 100) * float64(numcpu)
	return math.Min(100, math.Max(0, overall_percent))

According to my observation ,this is probably wrong.
when the process useing 60% of a six core processor ,the process.Percent return 100 to me not 6 * 100 * 60%

@Lomanic
Copy link
Collaborator

Lomanic commented Aug 28, 2019

Looks like the percent calculation was wrong all along then, as I don't think having more than 100% is expectable (it's called "Percent" after all).

Or the other way, if we assume the calculation was right before, the fix is as simple as return math.Min(numcpu * 100, math.Max(0, overall_percent)).

The commit you mention is f4e2355 btw (way easier to find afterwards instead of "this commit from days ago").

@Lomanic Lomanic changed the title In windows process.Percent return incorrect value process.Percent returns incorrect value on multiple cores Aug 28, 2019
lrita added a commit to lrita/gopsutil that referenced this issue Sep 3, 2019
make it return percent of all cpu core usage which will bewteen 0 and 100
lrita added a commit to lrita/gopsutil that referenced this issue Sep 4, 2019
@sajoupa
Copy link
Contributor

sajoupa commented Sep 19, 2019

Hi,

In general, a "percent" can totally be greater than 100 ("inflation at 250%"), or lesser than 0 ("-30% discount !"). It's just a ratio, multiplied by 100.
In the context of CPU usage, the most common reference is 1 core, and percent representations often go over 100:

$ top -b -n 1
[...]
  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
33756 libvirt+  20   0 9979.1m 4.233g  23052 S 400.0  4.5   9:53.96 qemu-system-x86

CPU usage is taken from /proc/<pid>/stat, which gives the amount of time that a process has been scheduled, measured in clock ticks. Several processes can be scheduled during a same clock tick, or a single process can be executed at the same time on several cores. So, I can really see no reason to cap the value at 100.

lrita@9870316 would make things more consistent, but the reference would be the total capacity of the machine, not 1 core.

I strongly suggest f4e2355 should just be reverted.

Thanks !

(Adding a floor at 0 should have no consequence because a negative value is theoretically impossible. I'd suggest raising an error instead of forcing a 0 value, but no strong opinion)

@mlashley
Copy link

I've just spent the last $hours debugging this after 1.11.2 => 1.12.3 telegraf upgrade.
A multi-threaded process can very well consume e.g. 4s of CPU time during 1s wall-time on a 4-core machine.
Assume for the existing code, delta_proc for my 4-threaded process is 1000 jiffies, over a wall-clock interval =250 jiffies, running on my 6-cpu system

func (p *Process) PercentWithContext(ctx context.Context, interval time.Duration) (float64, error) {
...
  delta := (now.Sub(p.lastCPUTime).Seconds()) * float64(numcpu)
  ret := calculatePercent(p.lastCPUTimes, cpuTimes, delta, numcpu)
delta_proc := t2.Total() - t1.Total()
overall_percent := ((delta_proc / delta) * 100) * float64(numcpu)
return math.Min(100, math.Max(0, overall_percent))

((1000/ (6*250))*100) * 6 = (1000/1500) *100) * 6 = 400.

If you /were/ to constrain this to 0..100 for a 'process' - then 100 should mean that process is using all of the available CPU cores. Clearly that's not how it is functioning today, any process consuming more than a single-core's worth of CPU returns 100.

I think - where you already factor the number of cpu-cores into 'delta' - you can drop the * numcpu in the overall_percent to get the 'right' answer, if the intent was to have 100% mean 'every core doing work for this process'

Happy to send a pull-request if you agree.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants