-
Notifications
You must be signed in to change notification settings - Fork 18
Install the bpf program for all pods and collect/expose the data #13
base: master
Are you sure you want to change the base?
Conversation
98d10c0
to
3be4279
Compare
@@ -104,6 +104,12 @@ | |||
revision = "f3f9494671f93fcff853e3c6e9e948b3eb71e590" | |||
|
|||
[[projects]] | |||
name = "github.com/go-stack/stack" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why is this pkg mentioned here? it does not seem to be used anywhere (neither in cgnet or in the vendored pkgs), or vendored?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
AFAIU this file is generated by the package manager.
So to answer your question, one would have to look into the dep's behaviour.
Dockerfile
Outdated
@@ -5,5 +5,5 @@ RUN apk update && apk add --no-cache libc6-compat | |||
COPY cgnet /cgnet | |||
|
|||
EXPOSE 9101 | |||
ENTRYPOINT ["/cgnet", "serve"] | |||
ENTRYPOINT ["/cgnet", "--verbose", "2", "export" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the parameters could be on the CMD line
revision = "b105bd37f74e5d9dc7b6ad7806715c7a2b83fd3f" | ||
version = "v2.11" | ||
packages = [".","term"] | ||
revision = "74a0988b5f804e8ce9ff74fca4f16980776dff29" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
any reason for using master instead of a released version?
bpf/controller.go
Outdated
return | ||
case <-c.quit: | ||
return | ||
case <-time.After(1000 * time.Millisecond): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
could use time.Second
bpf/controller.go
Outdated
return | ||
case <-time.After(1000 * time.Millisecond): | ||
lookup(c.module, packetsKey, c.packetsHandler) | ||
lookup(c.module, bytesKey, c.bytesHandler) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why using a callback in parameter 3 instead of just having the lookup
function returning the value?
c.packetsHandle(lookup(c.module, packetsKey))
bpf/src/cgnet.c
Outdated
@@ -19,7 +19,7 @@ | |||
#include <uapi/linux/ip.h> | |||
#include "bpf_helpers.h" | |||
|
|||
struct bpf_map_def SEC("maps/count") count_map = { | |||
struct bpf_map_def SEC("maps/cgnet") count_map = { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you use the same name for the section and the variable name?
something like:
struct bpf_map_def SEC("maps/count_map") count_map = {
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please add a comment to document:
- what is the key: define the keys nearby the map definition, so it's clearer.
- what is the value: a counter
- how it is used to communicate: values initialized to zero by userspace and incremented by the ebpf program
Just before the map definition:
#define PACKETS_KEY 0
#define BYTES_KEY 1
In the count_packets function:
int packets_key = PACKETS_KEY, bytes_key = BYTES_KEY;
cmd/serve.go
Outdated
metrics.GlobalPodMetrics.TotalNumberPods.Add(1) | ||
podlog := log.New("pod", e.PodSelfLink) | ||
|
||
cgPath := buildCgroupPath(cgroupRoot, e.PodUID, e.PodQOSClass) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is it possible to get this from the Kubernetes API rather than assuming the path is built in a specific way?
/cc @asymmetric @nhlfr
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you could follow this and do something like client.Get().Namespace("foo").Resource("bar").URL()
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@asymmetric But that only gives me the resource URL I need the path to the pods cgroup here
@alban I looked though http://godoc.org/k8s.io/api/core/v1#Pod to find a way to get the cgroup for pods but building it ourselfs seems to be the only way. Maybe @nhlfr knows...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If k8s does not provide it, you might have to ask Docker the leader pid of the container and then check in /proc/$pid/cgroup
...
You could file an issue and add a // FIXME: ...
if you want to do that later.
cmd/serve.go
Outdated
case kube.DeletePodEvent: | ||
metrics.GlobalPodMetrics.TotalNumberPods.Sub(1) | ||
metrics.TotalNum().Sub(1) | ||
log.Debug("pod gone", "pod", e.PodSelfLink, "cgroup", buildCgroupPath(cgroupRoot, e.PodUID, e.PodQOSClass)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
cleanup the bpf module? Or at least. add a // FIXME: ...
to remember
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That was actually why I filed #7 because I didn't know if you had to do any cleanup work.
Will fix this now...
3be4279
to
e334fb8
Compare
This introduces a bpf controller that runs the lookup loop for every cgroup we attach a program to, reads the map and executes handlers that write the tracked data to Prometheus metrics. The top subcommand uses the same code but implements different handlers. Closes #4, closes #5, closes #6, closes #9
e334fb8
to
e193f74
Compare
Any more feedback here? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Other than that, LGTM.
"top" command works fine.
I didn't try out the prometheus part though.
See the License for the specific language governing permissions and | ||
limitations under the License. | ||
*/ | ||
|
||
package bpf | ||
|
||
import ( | ||
"bytes" | ||
"fmt" | ||
"log" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"log"
doesn't seem to be used anywhere, so we get build error.
func (c *Controller) Stop() { | ||
c.quit <- struct{}{} | ||
if err := c.module.Close(); err != nil { | ||
Log.Error("error closing bpf.Module", "err", err) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shouldn't it be log
instead of Log
?
And log
needs to be imported, otherwise we get build errors.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah that's left over from the logging I introduced but split out of this PR. thx will use the log
pkg here
This introduces a bpf controller that runs the lookup look for every cgroup we attach a program to and executes handlers that write the tracked data to Prometheus metrics that get exposed on port 9101. The top subcommand uses the same code but implements different handlers. It also improves overall logging and adds a flag to adjust the verbosity.
Closes #4, #5, #6, #9