On-Demand container metrics #1779
Conversation
Force-pushed from 6349de1 to a60138c
info/v2/container.go (outdated)

```diff
@@ -234,6 +234,8 @@ type RequestOptions struct {
 	Count int `json:"count"`
 	// Whether to include stats for child subcontainers.
 	Recursive bool `json:"recursive"`
+	// Whether to update stats before returning
+	UpdateStats bool `json:"update_stats"`
```
I wonder whether it would be more useful to specify this as MaxAge, and update the stats if they are older. 0 means always update, and something else (-1?) means never update.
I like that; it's clearer. UpdateStats seemed vague, but I couldn't think of anything clearer. I have no plans to do anything other than trigger an update, mainly for responding when memcg thresholds are crossed. Do you think it is still worth doing MaxAge if we won't use it in k8s?
Yeah, I think it's still preferable. It's definitely useful outside of k8s, but even for the k8s use case, there is probably some max age you could set for the memcg trigger (e.g. 10ms). It means that if a burst of requests comes in simultaneously (for whatever reason), they can reuse the cached results.
Ok, I changed it to MaxAgeMS *int32, although I was wondering if just using time.Duration directly would be better?
manager/container.go (outdated)

```go
for {
	houseKeepingTimer := time.NewTimer(c.nextHousekeepingInterval())
```
nit: create the timer outside the loop, and use Reset here.
done
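For reference, the pattern suggested above looks roughly like this (a minimal self-contained sketch, not the exact code in this PR):

```go
package main

import (
	"fmt"
	"time"
)

func main() {
	// Allocate the timer once, outside the loop.
	houseKeepingTimer := time.NewTimer(time.Second)
	defer houseKeepingTimer.Stop()
	for i := 0; i < 3; i++ {
		<-houseKeepingTimer.C
		fmt.Println("housekeeping tick", i)
		// Rearm the existing timer instead of allocating a new one.
		houseKeepingTimer.Reset(time.Second)
	}
}
```

Calling Reset after the timer has fired and its channel has been drained is the safe way to reuse it.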
manager/container.go (outdated)

```go
		// notify the calling function that housekeeping has completed
		close(finishedChan)
	case <-houseKeepingTimer.C:
		c.housekeepingTick(longHousekeeping)
```
Drain the finishedChan after this step. I.e.:

```go
drained:
	for {
		select {
		case finishedChan := <-c.onDemandChan:
			close(finishedChan)
		default:
			break drained
		}
	}
```
done, although I don't think we want a for loop around it.
Why not? If there are multiple routines waiting for new stats and we just computed new stats, we should unblock them?
done
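For context, here is a self-contained sketch of the pattern agreed on here (channel and function names are illustrative): closing each queued channel acts as a broadcast, so one round of housekeeping can wake every waiting routine.

```go
package main

import (
	"fmt"
	"sync"
)

func main() {
	// Each waiter queues a private channel; the housekeeping loop
	// closes it once fresh stats exist.
	onDemandChan := make(chan chan struct{}, 10)

	var queued, done sync.WaitGroup
	for i := 0; i < 3; i++ {
		queued.Add(1)
		done.Add(1)
		go func(id int) {
			defer done.Done()
			finished := make(chan struct{})
			onDemandChan <- finished // request an on-demand update
			queued.Done()
			<-finished // blocks until housekeeping closes the channel
			fmt.Println("waiter", id, "got fresh stats")
		}(i)
	}
	queued.Wait() // all requests are queued

	// One housekeeping pass computes stats, then drains the queue,
	// closing every channel: close unblocks all waiters at once.
drain:
	for {
		select {
		case finished := <-onDemandChan:
			close(finished)
		default:
			break drain
		}
	}
	done.Wait()
}
```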
Force-pushed from a60138c to 812a1cf
Force-pushed from 812a1cf to 38790d5
Force-pushed from 38790d5 to 3033ba8
Force-pushed from 3033ba8 to 6df39e9
Force-pushed from 6df39e9 to 473cc7d
FYI, I plan to cut cAdvisor v0.28.1, which will correspond to k8s v1.9.0, this Friday, 11/17.
Force-pushed from 473cc7d to fc516a2
```diff
@@ -236,6 +236,9 @@ type RequestOptions struct {
 	Count int `json:"count"`
 	// Whether to include stats for child subcontainers.
 	Recursive bool `json:"recursive"`
+	// Update stats if they are older than MaxAge
+	// nil indicates no update, and 0 will always trigger an update.
+	MaxAge *time.Duration `json:"max_age"`
```
Duration is just a wrapper around int64, so this should be safe for the API: https://golang.org/src/time/time.go?s=21095:21114#L610
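A sketch of the MaxAge semantics settled on above (the helper name and its signature are illustrative, not code from this PR):

```go
package main

import (
	"fmt"
	"time"
)

// shouldUpdateStats implements the agreed semantics: nil never triggers
// an update, and otherwise an update fires when the cached stats are
// older than MaxAge (so 0 triggers on any nonzero age, i.e. always in
// practice).
func shouldUpdateStats(maxAge *time.Duration, statsAge time.Duration) bool {
	if maxAge == nil {
		return false
	}
	return statsAge > *maxAge
}

func main() {
	never := (*time.Duration)(nil)
	always := 0 * time.Second
	tenMS := 10 * time.Millisecond

	fmt.Println(shouldUpdateStats(never, time.Second))          // false: never update
	fmt.Println(shouldUpdateStats(&always, time.Second))        // true: always update
	fmt.Println(shouldUpdateStats(&tenMS, 5*time.Millisecond))  // false: cache is fresh enough
}
```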
Force-pushed from fc516a2 to dfe268c
Force-pushed from dfe268c to 157b344
Force-pushed from 311c646 to a8d0d2a
Force-pushed from a8d0d2a to 98d9614
Unit testing is complete. I had to plumb through a clock. It should be fine to use apimachinery, right?
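Presumably the clock plumbing looks something like the following (a hedged sketch assuming the apimachinery fake clock; the field and method names here are illustrative):

```go
import (
	"time"

	"k8s.io/apimachinery/pkg/util/clock"
)

// containerData carries an injected clock so tests can control time.
type containerData struct {
	clock                clock.Clock // clock.RealClock{} in production
	statsLastUpdatedTime time.Time
}

func (cd *containerData) statsAge() time.Duration {
	return cd.clock.Since(cd.statsLastUpdatedTime)
}
```

In a test, a fake clock can then be stepped forward deterministically:

```go
fakeClock := clock.NewFakeClock(time.Now())
cd := &containerData{clock: fakeClock, statsLastUpdatedTime: fakeClock.Now()}
fakeClock.Step(2 * time.Second) // statsAge() now reports 2s
```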
manager/container_test.go (outdated)

```go
		}
		waitForHousekeeping.Done()
	}
}
```
Correct me if I'm wrong, but I think you can get away with not mocking this:

```go
onDemandCache := make([]chan struct{}, 0, numConcurrentCalls)
for i := 0; i < numConcurrentCalls; i++ {
	go func() {
		c.OnDemandHousekeeping(0 * time.Second)
		waitForHousekeeping.Done()
	}()
	// Wait for work to be queued
	onDemandCache = append(onDemandCache, <-c.onDemandChan)
}
// Requeue work:
for _, ch := range onDemandCache {
	c.onDemandChan <- ch
}
```
I can switch to that.
done
Force-pushed from d68882f to b1b6227
I tested this in kubernetes/kubernetes and verified that it passes the summary API test as well.
Thanks for the tests. We've tried to avoid dependencies on Kubernetes to avoid the circular dependency, but I think it's ok in this case since it's a small utility package, and only used for testing.
Force-pushed from b1b6227 to f54de49
Ok, it should be ready.
manager/container_test.go (outdated)

```go
		waitForHousekeeping.Done()
	}()
	// Wait for work to be queued
	onDemandCache <- <-cd.onDemandChan
```
nit: I'd prefer a slice for the cache. It makes the requeue more straightforward.
Yeah, but the double arrow looks really cool. Done.
Force-pushed from f54de49 to 3d6ad6d
/lgtm
thanks!
@dashpole, this is really awesome.
fyi @sjenning, xmas came early!
This is my take on implementing #1247.
This PR does the following:
- There is still only a single thread which calls housekeepingTick().
- The MaxAge field (originally UpdateStats) is part of the v2 API in cAdvisor, but will be used only for node-level metrics in kubernetes. See kubernetes/kubernetes@master...dashpole:on_demand_metrics for how this would be enabled in kubernetes; a hedged sketch of the cAdvisor-side call follows below.
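For illustration, requesting on-demand stats through the new option might look roughly like this (a hedged sketch against cAdvisor's v2 API; the wrapper function is hypothetical, not code from this PR):

```go
import (
	"time"

	v2 "github.com/google/cadvisor/info/v2"
	"github.com/google/cadvisor/manager"
)

// freshStats asks the manager for stats no older than maxAge; with a
// maxAge of 0 this always triggers an on-demand update, while a nil
// MaxAge would never trigger one.
func freshStats(m manager.Manager) error {
	maxAge := 0 * time.Second
	opts := v2.RequestOptions{
		IdType:    v2.TypeName,
		Count:     1,
		Recursive: true,
		MaxAge:    &maxAge,
	}
	_, err := m.GetRequestedContainersInfo("/", opts)
	return err
}
```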