ROX-30714: track cgroup ID to container ID mapping #83
Conversation
Do we have some numbers on the performance benefits of doing this? E.g. how much memory is saved per event, how many fewer events we're dropping, etc.? Have we encountered throughput issues already? I'm a little skeptical of the filesystem approach because of the race condition between an event happening and our ability to look up the cgroup. Presumably it means that if we're unable to resolve the cgroup for the event, the activity might not be correlated with a deployment and, as a result, might not be flagged as a violation?
@erthalion is looking into performance testing; we may have some numbers in the near future.
Yeah, I'm not super keen on this approach for the same reason. In theory, creating and destroying containers takes some time, and because cgroupfs is a virtual filesystem (i.e. everything is computed from memory when read, like /proc), we should be relatively fast at finding new container ID mappings, but I don't know if we can guarantee a 100% hit rate. On the other hand, the current approach only supports 16 levels of cgroup nesting; anything beyond that has a 100% miss ratio, and it is heavier on the processes that attempt to write or create files, so 🤷🏻♂️. Maybe we can leave both implementations in and make which one we use toggleable via configuration?
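If we do go the toggle route, a hypothetical configuration knob could look something like this (the enum and environment variable are made up for illustration, not part of this PR):

```rust
/// Hypothetical toggle between the two lookup strategies; the names and the
/// environment variable are illustrative only.
#[derive(Clone, Copy, Debug, PartialEq)]
enum ContainerIdLookup {
    /// Resolve the container ID inside the BPF hook by walking cgroup
    /// ancestors (bounded to 16 nesting levels).
    InKernel,
    /// Send only the cgroup ID and resolve it in userspace from cgroupfs.
    Userspace,
}

impl ContainerIdLookup {
    fn from_env() -> Self {
        match std::env::var("CONTAINER_ID_LOOKUP").as_deref() {
            Ok("userspace") => Self::Userspace,
            _ => Self::InKernel,
        }
    }
}

fn main() {
    println!("using {:?} container ID lookup", ContainerIdLookup::from_env());
}
```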
This change greatly reduces the amount of data sent through the ringbuffer and the effort the BPF hooks need to retrieve the cgroup of the current process. In exchange for these benefits, we now need to look up and keep track of the cgroups and container IDs that exist on the system ourselves by iterating over the cgroupfs. TODO: Add integration tests with containers.
Because capturing information for a process running in a container is not fully possible from `/proc`, some information is passed to the Process constructor directly to overwrite it. This should be fine, because we are using a specific container that should not change how it executes the command we are expecting.
    #[derive(Debug)]
    struct ContainerIdEntry {
        container_id: Option<Arc<String>>,
        pub last_seen: SystemTime,
Why make this public?
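For reference, keeping the field private and exposing it through a read-only accessor might look roughly like this (a sketch, not the PR's actual code):

```rust
use std::sync::Arc;
use std::time::SystemTime;

#[derive(Debug)]
struct ContainerIdEntry {
    container_id: Option<Arc<String>>,
    last_seen: SystemTime, // private again
}

impl ContainerIdEntry {
    /// Read-only access; only the cache itself refreshes the timestamp.
    fn last_seen(&self) -> SystemTime {
        self.last_seen
    }

    /// Called by the cache when the cgroup is observed during a scan.
    fn touch(&mut self) {
        self.last_seen = SystemTime::now();
    }
}
```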
        })
    }

    fn walk_cgroupfs(path: &PathBuf, map: &mut ContainerIdMap, parent_id: Option<Arc<String>>) {
Maybe rename to something like walk_cgroupfs_and_add_to_map. The current name doesn't explain what is done while we walk through the cgroupfs.
        ContainerIdCache(Arc::new(Mutex::new(map)))
    }

    fn update_unlocked(map: &mut ContainerIdMap) {
Maybe rename to update_no_lock or update_without_lock. The name implies that this updates something that is unlocked, rather than doing an update without locking.
    async fn prune(&mut self) {
        let now = SystemTime::now();
        self.0.lock().await.retain(|_, value| {
            now.duration_since(value.last_seen).unwrap() < Duration::from_secs(30)
Maybe set a variable to 30 seconds and pass that variable to this method to make it more configurable. Also maybe change the name to something like prune_old.
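As a sketch of that suggestion (types simplified to a plain `HashMap`; the real cache wraps the map in an async mutex), the threshold could be passed in instead of hard-coded:

```rust
use std::collections::HashMap;
use std::time::{Duration, SystemTime};

struct Entry {
    last_seen: SystemTime,
}

/// Drop entries not seen within `max_age`, instead of hard-coding 30 seconds.
fn prune_old(map: &mut HashMap<u64, Entry>, max_age: Duration) {
    let now = SystemTime::now();
    map.retain(|_, entry| {
        now.duration_since(entry.last_seen)
            .map(|age| age < max_age)
            .unwrap_or(true) // clock went backwards: keep the entry
    });
}
```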
    pub fn start_worker(mut self, mut running: Receiver<bool>) -> JoinHandle<()> {
        tokio::spawn(async move {
            let mut update_interval = time::interval(time::Duration::from_secs(30));
Maybe don't hard-code 30 here.
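Similarly, the worker could receive its interval from configuration; a rough sketch assuming tokio, with the refresh body elided:

```rust
use tokio::sync::watch::Receiver;
use tokio::task::JoinHandle;
use tokio::time::{self, Duration};

/// Spawn the refresh loop with the interval coming from configuration
/// rather than a literal 30 seconds.
fn start_worker(update_every: Duration, mut running: Receiver<bool>) -> JoinHandle<()> {
    tokio::spawn(async move {
        let mut update_interval = time::interval(update_every);
        loop {
            tokio::select! {
                _ = update_interval.tick() => {
                    // refresh the cgroup ID -> container ID map here
                }
                res = running.changed() => {
                    // stop when the flag flips to false or the sender is gone
                    if res.is_err() || !*running.borrow() {
                        break;
                    }
                }
            }
        }
    })
}
```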
Description
This change greatly reduces the amount of data sent through the ringbuffer and the effort the BPF hooks need to retrieve the cgroup of the current process. In exchange for these benefits, we now need to look up and keep track of the cgroups and container IDs that exist on the system ourselves by iterating over the cgroupfs.
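To make the mechanism concrete, here is a minimal sketch of the userspace side, assuming container IDs appear as 64-character hex strings inside cgroup directory names (e.g. `docker-<id>.scope`). The names, the path under `/sys/fs/cgroup`, and the path-keyed map are illustrative simplifications; the actual change keys the mapping by kernel cgroup ID.

```rust
use std::collections::HashMap;
use std::fs;
use std::path::Path;

/// Extract a container ID from a cgroup directory name, assuming runtimes
/// embed it as a 64-character hex string (e.g. "docker-<id>.scope").
fn container_id_from_name(name: &str) -> Option<String> {
    name.split(|c: char| c == '-' || c == '.')
        .find(|part| part.len() == 64 && part.chars().all(|c| c.is_ascii_hexdigit()))
        .map(|part| part.to_string())
}

/// Recursively walk the cgroupfs, recording which cgroups belong to a
/// container. Children inherit the parent's container ID unless their
/// own directory name carries one.
fn walk(dir: &Path, map: &mut HashMap<String, String>, parent: Option<String>) {
    let Ok(entries) = fs::read_dir(dir) else { return };
    for entry in entries.flatten() {
        let path = entry.path();
        if !path.is_dir() {
            continue;
        }
        let name = entry.file_name().to_string_lossy().into_owned();
        let id = container_id_from_name(&name).or_else(|| parent.clone());
        if let Some(id) = &id {
            map.insert(path.to_string_lossy().into_owned(), id.clone());
        }
        walk(&path, map, id);
    }
}

fn main() {
    let mut map = HashMap::new();
    walk(Path::new("/sys/fs/cgroup"), &mut map, None);
    println!("tracking {} container cgroups", map.len());
}
```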
Checklist
Automated testing
If any of these don't apply, please comment below.
Testing Performed
Manually tested that the generated events have the expected container ID, and added an integration test.