Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

add support for cgroup CPUQuota and CPUPeriod prometheus limit #969

Merged
merged 1 commit into from
Feb 4, 2016
Merged

Conversation

f0
Copy link
Contributor

@f0 f0 commented Nov 22, 2015

This adds support for the CPUQuota cgroup limit.

This actually does not work , i miss something, any hints?

@cadvisorJenkinsBot
Copy link
Collaborator

Can one of the admins verify this patch?

@jimmidyson
Copy link
Collaborator

Thanks for this. What doesn't work? No metric or no values reported in the metric?

@f0
Copy link
Contributor Author

f0 commented Nov 22, 2015

cadvisor builds sucessfully, but in the prometheus output the quota is not exported , but is present in the specific cgroup

@jimmidyson
Copy link
Collaborator

Can you share the whole /metrics endpoint output please? Perhaps in a gist.

@f0
Copy link
Contributor Author

f0 commented Nov 22, 2015

@jimmidyson
Copy link
Collaborator

And you're sure you built & replaced cadvisor binary with the newly built binary? Everything looks fine to me but the output.

@f0
Copy link
Contributor Author

f0 commented Nov 22, 2015

hm, yes i am sure but i double check ...

@f0
Copy link
Contributor Author

f0 commented Nov 22, 2015

hm yes its the correct binary, but its not shown

@jimmidyson
Copy link
Collaborator

OK I'll build with your PR locally tomorrow & see if I can see what's going on.

@f0
Copy link
Contributor Author

f0 commented Nov 22, 2015

@jimmidyson thx

@jimmidyson
Copy link
Collaborator

This works perfectly fine for me locally... How are you building? Just run make & you should get a binary in the root of the project dir to use.

@jimmidyson
Copy link
Collaborator

Well when I say it works perfectly I mean I get output in the /metrics endpoint.

But in logs (running with --logtostderr):

E1123 09:15:09.954624 11428 handler.go:149] raw driver: Failed to parse int "-1" from file "/sys/fs/cgroup/cpu,cpuacct/cpu.cfs_quota_us": strconv.ParseUint: parsing "-1": invalid syntax

Need a test for this as well if possible.

@f0
Copy link
Contributor Author

f0 commented Nov 23, 2015

@jimmidyson yep this happens when no Quota is set, so it need to be coverd.... i run build/build.sh

@f0
Copy link
Contributor Author

f0 commented Nov 23, 2015

@jimmidyson hm is there a specific go version needed to build ? atm i have 1.4.3

@jimmidyson
Copy link
Collaborator

1.4 should be fine. Do you get errors when building with 1.4?

@f0
Copy link
Contributor Author

f0 commented Nov 23, 2015

@jimmidyson yes

fk@linux-3zvu:~/work/src/f0/cadvisor> make

formatting code
building binaries
cadvisor

github.com/google/cadvisor

usage: 6l [options] main.6
-1 use alternate profiling code
-8 assume 64-bit addresses
-B info
define ELF NT_GNU_BUILD_ID note
-C check Go calls to C code
-D addr
data address
-E sym
entry symbol
-I interp
set ELF interp
-L dir
add dir to library path
-H head
header type
-K add stack underflow checks
-O print pc-line tables
-Q debug byte-register code gen
-R rnd
address rounding
-S check type signatures
-T addr
text address
-V print version and exit
-W disassemble input
-X name value
define string data
-Z clear stack frame on entry
-a disassemble output
-c dump call graph
-d disable dynamic executable
-extld ld
linker to run in external mode
-extldflags ldflags
flags for external linker
-f ignore version mismatch
-g disable go package data checks
-installsuffix suffix
pkg directory suffix
-k sym
set field tracking symbol
-linkmode mode
set link mode (internal, external, auto)
-n dump symbol table
-o outfile
set output file
-r dir1:dir2:...
set ELF dynamic linker search path
-race
enable race detector
-s disable symbol table
-shared
generate shared object (implies -linkmode external)
-tmpdir dir
leave temporary files in this directory
-u reject unsafe packages
-v print link trace
-w disable DWARF generation
godep: go exit status 2
Makefile:32: recipe for target 'build' failed
make: *** [build] Error 1

@jimmidyson
Copy link
Collaborator

Just pushed a fix so you can build on 1.4 as well (ldflags was 1.5 stylee). Rebase against master & you should be able to build on 1.4 with make.

@f0
Copy link
Contributor Author

f0 commented Nov 23, 2015

ok now i can build, and i got these errors, but why , the type has a field Quota

fk@linux-3zvu:~/work/src/f0/cadvisor> make
 formatting code
 building binaries
   cadvisor
 running tests
f0/cadvisor/container/raw
container/raw/handler.go:203: spec.Cpu.Quota undefined (type v1.CpuSpec has no field or method Quota)
f0/cadvisor/metrics
metrics/prometheus.go:523: container.Spec.Cpu.Quota undefined (type v1.CpuSpec has no field or method Quota)

@f0
Copy link
Contributor Author

f0 commented Nov 23, 2015

@jimmidyson hm any Idea why spec.Cpu.Quota is not present, its in the files...

@f0 f0 closed this Nov 23, 2015
@cadvisorJenkinsBot
Copy link
Collaborator

Can one of the admins verify this patch?

@f0
Copy link
Contributor Author

f0 commented Nov 23, 2015

updated with check for -1 (quota not set), but still not work

@f0 f0 reopened this Nov 23, 2015
@f0
Copy link
Contributor Author

f0 commented Nov 24, 2015

@jimmidyson ok have fixed some things, now i get these errors while building, but why, the type has this field? And it does not work in my environment....

 running tests
 github.com/f0/cadvisor/container/raw
container/raw/handler.go:206: spec.Cpu.Quota undefined (type v1.CpuSpec has no field or method Quota)
 github.com/f0/cadvisor/metrics
metrics/prometheus.go:523: container.Spec.Cpu.Quota undefined (type v1.CpuSpec has no field or method Quota)

@jimmidyson
Copy link
Collaborator

Discussed on IRC: cadvisor needs to be cloned to $GOPATH/src/github.com/google/cadvisor & a remote set to your fork, rather than cloning to $GOPATH/src/github.com/f0/cadvisor.

@f0
Copy link
Contributor Author

f0 commented Nov 24, 2015

@jimmidyson after fixing gopath and some other orrors it finaly works THX

@f0 f0 changed the title add support for cgroup CPUQuota prometheus limit add support for cgroup CPUQuota and CPUPeriod prometheus limit Nov 24, 2015
@f0
Copy link
Contributor Author

f0 commented Nov 24, 2015

also adding support for the cpu period setting (with this we finally can calculate the percentual cpu usage )

@f0
Copy link
Contributor Author

f0 commented Nov 25, 2015

@jimmidyson when will this be merged?

@@ -13,6 +13,9 @@ container_cpu_usage_seconds_total{cpu="cpu03",id="testcontainer",image="test",na
# HELP container_cpu_user_seconds_total Cumulative user cpu time consumed in seconds.
# TYPE container_cpu_user_seconds_total counter
container_cpu_user_seconds_total{id="testcontainer",image="test",name="testcontaineralias"} 6e-09
# container_spec_cpu_quota HELP CPU quota of the container
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Check the format of the lines - HELP should be at the start of the comment. You can run tests locally btw with make test.

@f0
Copy link
Contributor Author

f0 commented Dec 9, 2015

@jimmidyson

hm i dont get it now i have this error....
=== RUN TestPrometheusCollector
--- FAIL: TestPrometheusCollector (0.02s)
prometheus_test.go:189: want # HELP container_spec_cpu_quota CPU quota of the container, got # HELP container_fs_io_current Number of I/Os currently in progress

@jimmidyson
Copy link
Collaborator

The order of the output is important. Think metrics are ordered alphabetically so you'll need to put the new metric in the right place in the test file.

@f0
Copy link
Contributor Author

f0 commented Dec 9, 2015

@jimmidyson ah ok so then the test should be after
container_start_time_seconds and before container_tasks_state, but this also gives me the above error(with another error text)

@jimmidyson
Copy link
Collaborator

Should be before container_start_time_seconds.

@f0
Copy link
Contributor Author

f0 commented Dec 10, 2015

@jimmidyson also does not work

@k8s-bot
Copy link
Collaborator

k8s-bot commented Dec 11, 2015

Can one of the admins verify that this patch is reasonable to test? (reply "ok to test", or if you trust the user, reply "add to whitelist")

If this message is too spammy, please complain to ixdy.

@googlebot
Copy link
Collaborator

We found a Contributor License Agreement for you (the sender of this pull request), but were unable to find agreements for the commit author(s). If you authored these, maybe you used a different email address in the git commits than was used to sign the CLA (login here to double check)? If these were authored by someone else, then they will need to sign a CLA as well, and confirm that they're okay with these being contributed to Google.

@f0
Copy link
Contributor Author

f0 commented Dec 12, 2015

hm sorry false positive..... whyever the tests does not work
@jimmidyson i am out of ideas

@googlebot
Copy link
Collaborator

We found a Contributor License Agreement for you (the sender of this pull request), but were unable to find agreements for the commit author(s). If you authored these, maybe you used a different email address in the git commits than was used to sign the CLA (login here to double check)? If these were authored by someone else, then they will need to sign a CLA as well, and confirm that they're okay with these being contributed to Google.

@k8s-bot
Copy link
Collaborator

k8s-bot commented Dec 14, 2015

EXPERIMENTAL - CAN IGNORE

Build/test failed for commit e4f9207.

@@ -55,7 +55,11 @@ func (p testSubcontainersInfoProvider) SubcontainersInfo(string, *info.Container
Aliases: []string{"testcontaineralias"},
},
Spec: info.ContainerSpec{
Image: "test",
Image: "test",
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You need to add in HasCpu: true, in here so that the CPU specs are output as metrics. You'll probably find then that there are a couple of other metrics which haven't been tested for that you'll need to insert into metrics/testdata/prometheus_metrics to get this to pass.

@k8s-bot
Copy link
Collaborator

k8s-bot commented Jan 28, 2016

Jenkins GCE e2e

Build/test failed for commit e4f9207.

@vishh
Copy link
Contributor

vishh commented Jan 28, 2016

Ping @f0!

@f0
Copy link
Contributor Author

f0 commented Jan 29, 2016

@vishh @jimmidyson sorry have not so much spare time atm, try to get this in the next days

@jimmidyson
Copy link
Collaborator

If you don't manage to get to it by Monday I can pull it into a separate PR & fix up the tests.

@f0
Copy link
Contributor Author

f0 commented Jan 29, 2016

@jimmidyson feel free, i have no chance to get this until monday, i am 90% offline at the weekend , sorry.

@k8s-bot
Copy link
Collaborator

k8s-bot commented Feb 1, 2016

Jenkins GCE e2e

Build/test failed for commit e4f9207.

@googlebot
Copy link
Collaborator

We found a Contributor License Agreement for you (the sender of this pull request), but were unable to find agreements for the commit author(s). If you authored these, maybe you used a different email address in the git commits than was used to sign the CLA (login here to double check)? If these were authored by someone else, then they will need to sign a CLA as well, and confirm that they're okay with these being contributed to Google.

@f0
Copy link
Contributor Author

f0 commented Feb 2, 2016

@jimmidyson @vishh finally done (and now i understand the test thing ....)

@k8s-bot
Copy link
Collaborator

k8s-bot commented Feb 2, 2016

Jenkins GCE e2e

Build/test passed for commit 3675a96.

@k8s-bot
Copy link
Collaborator

k8s-bot commented Feb 2, 2016

Jenkins GCE e2e

Build/test passed for commit 78592db.

@k8s-bot
Copy link
Collaborator

k8s-bot commented Feb 2, 2016

Jenkins GCE e2e

Build/test passed for commit 669bc4a.

@f0
Copy link
Contributor Author

f0 commented Feb 4, 2016

@jimmidyson @vishh any outstanding issues with this PR?

@jimmidyson
Copy link
Collaborator

no LGTM

jimmidyson added a commit that referenced this pull request Feb 4, 2016
add support for cgroup  CPUQuota  and CPUPeriod prometheus limit
@jimmidyson jimmidyson merged commit a894672 into google:master Feb 4, 2016
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants