-
Notifications
You must be signed in to change notification settings - Fork 25
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
lib/tar: Pre-filter tar archive in Rust #112
Conversation
e1c16da
to
2961a88
Compare
lib/src/tar/write.rs
Outdated
|
||
// Completely discard stuff outside of /usr | ||
if !(matches!(path.as_str(), "/" | "/usr") || path.starts_with("usr/")) { | ||
continue; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd throw a warn!()
in here. I'm pretty sure we'll discover a good bunch of nasty things.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point, though since this is a library we should probably allow the caller to collect this data and log/print it however it wants.
Will look at that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In the bigger picture I am currently thinking of this as a shortcut - longer term we'll strongly encourage the model in #12 (comment) where we have a required RUN ostree container-finalize-bootable
incantation that will clean this stuff up at build time.
Rather than trying to extend the ostree C code to support arbitrary transformations, let's do pre-parsing here in safe Rust. We keep the `/etc` → `/usr/etc` bits, but we also just completely drop everything not in `/usr` now. As noted in the comment, this pre-validation will hopefully also catch any corrupt tarballs that might be exploitable in the C libarchive codebase.
2961a88
to
520fc40
Compare
OK, that was a fun deep dive into path libraries in Rust, specifically the components method. We now parse, normalize and filter the input path in a much more robust way, and this updated code now also returns a btreemap from "prefix -> number of filtered paths" which should be sufficient to start. |
@@ -26,6 +26,35 @@ impl<T: AsyncRead> ReadBridge<T> { | |||
} | |||
} | |||
|
|||
/// A [`std::io::Write`] implementation backed by an asynchronous source. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
BTW I submitted tokio-rs/tokio#4146
Hmm, what we should really do here actually is store the metadata about filtered paths in the commit object. However, doing that would require splitting up the write/commit phases, which in turn really wants us to be doing all of this natively in Rust instead of forking off |
Any other comments on this, or good to go? |
All good from my side 👍. As (very subjective) feedback on your last comment, I don't personally feel the need to record that information persistently. It can be interesting to log and inspect it at runtime, but in the steady state I guess it's unlikely that we'd want to parse/analyze that from the commit object itself. |
Rather than trying to extend the ostree C code to support
arbitrary transformations, let's do pre-parsing here in safe Rust.
We keep the
/etc
→/usr/etc
bits, but we also just completelydrop everything not in
/usr
now.As noted in the comment, this pre-validation will hopefully also
catch any corrupt tarballs that might be exploitable in the C libarchive
codebase.