ROX-30437: refine host path algorithm #149
Draft
Description
Retrieve the device id of the dentry being accessed on the kernel side,
then use the mountinfo from /proc in userspace to adjust the path to the
one the user would see on the host node.
This new logic requires us to keep track of mountinfo in userspace, so a
new EventParser type is added to do this in a cached manner and generate
events with it.
In case a device id is received that is not found in the mountinfo
cache, the cache will be rebuilt. If the device id is still not
found, an empty entry will be added for that id and we will assume we
cannot get the required information to correct the host path gathered
from kernelspace.
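
The lookup, rebuild, and empty-entry behaviour described above can be sketched as follows. This is a minimal Python illustration, not the code in this PR: `MountEntry`, `MountInfoCache`, `parse_mountinfo` and the injectable `loader` are hypothetical names, and keying the cache by a `(major, minor)` pair is an assumption about how the device id from the kernel is matched against mountinfo.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional, Tuple

DevId = Tuple[int, int]  # (major, minor), as printed in mountinfo's third field


@dataclass(frozen=True)
class MountEntry:
    root: str         # path inside the filesystem that forms the mount's root
    mount_point: str  # where that root is attached in the mount namespace


def read_mountinfo() -> str:
    """Read the mount table of the current process from /proc."""
    with open("/proc/self/mountinfo") as f:
        return f.read()


def parse_mountinfo(text: str) -> Dict[DevId, Optional[MountEntry]]:
    """Parse mountinfo text into a map keyed by device id.

    Octal escapes such as \\040 in paths are left undecoded to keep the
    sketch short.
    """
    entries: Dict[DevId, Optional[MountEntry]] = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 5:
            continue
        major, minor = fields[2].split(":")
        # Several mounts can share a device (e.g. bind mounts); keeping the
        # first one listed is an arbitrary choice made for this sketch.
        entries.setdefault((int(major), int(minor)), MountEntry(fields[3], fields[4]))
    return entries


class MountInfoCache:
    def __init__(self, loader: Callable[[], str] = read_mountinfo):
        self._loader = loader
        self._entries: Dict[DevId, Optional[MountEntry]] = {}

    def lookup(self, dev: DevId) -> Optional[MountEntry]:
        if dev not in self._entries:
            # Cache miss: rebuild the whole cache from mountinfo.
            self._entries = parse_mountinfo(self._loader())
            # Still unknown: remember an empty entry so the same id does
            # not trigger another rebuild on the next lookup.
            self._entries.setdefault(dev, None)
        return self._entries[dev]
```

A `None` result stands for the "empty entry" case, where the host path gathered from kernelspace would be left uncorrected. In this sketch a rebuild discards earlier empty entries; whether they should survive a rebuild is a design choice the description does not settle.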
These changes also require some adjustments to work on k8s, so the
manifest is updated accordingly.
Checklist
Automated testing
If any of these don't apply, please comment below.
Testing Performed
The added tests validate that events generated on an overlayfs file
properly show the event on the upper layer and the access to the
underlying FS. They also validate that a path mounted in a container
resolves to the correct host path.
While developing these tests, it became painfully obvious getting the
information of the process running inside the container is not
straightforward. Because containers tend to be fairly static, we should
be able to manually create the information statically in the test and
still have everything work correctly. To minimize the number of
changes to existing tests, the default Process constructor now takes
fields directly and there is a from_proc class method that builds a new
Process object from /proc. Additionally, getting the pid of a process in
a container is virtually impossible, so we make the pid check optional.
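
The shape of that test helper can be sketched as below. The field names (`exe_path`, `args`, `name`) and the `matches` method are assumptions made for illustration; the description only confirms that the constructor takes fields directly, that `from_proc` builds the object from /proc, and that the pid check is optional.

```python
import os
from dataclasses import dataclass
from typing import Optional


@dataclass
class Process:
    # Hypothetical fields; pid=None means "do not check the pid".
    pid: Optional[int]
    exe_path: str
    args: str
    name: str

    @classmethod
    def from_proc(cls, pid: Optional[int] = None) -> "Process":
        """Build a Process from /proc, defaulting to the current process."""
        pid = os.getpid() if pid is None else pid
        proc = f"/proc/{pid}"
        exe_path = os.readlink(f"{proc}/exe")
        with open(f"{proc}/cmdline", "rb") as f:
            raw = f.read()
        # cmdline holds NUL-separated arguments.
        args = " ".join(a.decode(errors="replace") for a in raw.split(b"\0") if a)
        with open(f"{proc}/comm") as f:
            name = f.read().strip()
        return cls(pid=pid, exe_path=exe_path, args=args, name=name)

    def matches(self, other: "Process") -> bool:
        """Compare two processes, skipping the pid if either side lacks one."""
        if self.pid is not None and other.pid is not None and self.pid != other.pid:
            return False
        return (self.exe_path, self.args, self.name) == (
            other.exe_path,
            other.args,
            other.name,
        )
```

A test for a containerized process could then describe the expected process statically, for example `Process(pid=None, exe_path="/usr/bin/cat", args="cat /etc/hosts", name="cat")`, while existing tests keep using `Process.from_proc()`.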