Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

add umoci insert command #237

Merged
merged 11 commits into from
May 16, 2018
Merged

add umoci insert command #237

merged 11 commits into from
May 16, 2018

Conversation

tych0
Copy link
Member

@tych0 tych0 commented Apr 17, 2018

There are cases where you only want to insert files into an image. Right
now the procedure is: unpack the image, add the files, and repack the
image. However, the OCI format lends itself exactly to this kind of
addition -- we can just add a new layer with the files we wish to insert.
Additional motivation is in the case of large layers, we can avoid a lot of
i/o, CPU time to compress, and disk space requirements.

So, let's add a command to generate this new import layer.

Signed-off-by: Tycho Andersen tycho@tycho.ws

@tych0 tych0 force-pushed the umoci-insert branch 3 times, most recently from 0207f91 to cc3fd50 Compare April 17, 2018 20:23

tg := newTarGenerator(writer, mapOptions)

err = filepath.Walk(root, func(curPath string, info os.FileInfo, err error) error {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this should be using unpriv.Walk (with the regular checks for which fs we should use). One of the things people want is for umoci to use it's magic CAP_DAC_OVERRIDE emulation to be able to handle annoying paths.

And once we add that, we'll need another integration test to make sure that the path modes are included correctly.

@tych0
Copy link
Member Author

tych0 commented Apr 23, 2018

Hey, sorry for the delay, I updated it and added a stat test.

Copy link
Member

@cyphar cyphar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This also needs documentation. I can deal with the site documentation, but you can just copy the existing man page templates and modify them to fit (after you've added the mapping options of course).


# ...and too many
umoci insert --image "${IMAGE}:${TAG}" asdf 123 456
[ "$status" -ne 0 ]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I prefer this to be in a separate test (to match the other "are the arguments handled correctly" tests).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, might be worth changing unpack/repack too, that's where I got this from, and they're just inlined there.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, then scratch that (I'll double-check why I thought that was the case...).

test/insert.bats Outdated
[ "$status" -eq 0 ]

umoci insert --image "${IMAGE}:${TAG}" "${ROOT}/umoci" /tester
[ "$status" -eq 0 ]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would also recommend having a rootfs with something like

mkdir -p $BUNDLE_A/rootfs/some/path
chmod 000 $BUNDLE_A/rootfs/some/path
umoci repack # whatever
umoci insert blah /some/path/blah

(This should make no difference to the actual generation, but it would ensure that rootless is "handled" in that case. Also a similar case should be handled from the host side of things to make sure rootless is properly obeyed and the added directory is actually reconstructed correctly.)

Also this test should do an unpack after the first insertion to ensure that the file actually changes between the two insertions IMO.

If all of that is a bit too much to change, I can push them on top of your branch once you're done developing.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nope, I'm happy to do that too.

func GenerateInsertLayer(root string, prefix string, opt *MapOptions) io.ReadCloser {
root = path.Clean(root)
rootPrefixLen := len(root) - len(path.Base(root))
prefix = strings.TrimPrefix(prefix, "/")
Copy link
Member

@cyphar cyphar Apr 23, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be some mix of utils.CleanPath or similar. I'm not sure which. I get that you want to strip leading '/' but you can do that in the Join call. In addition, I'm not really sure what you're doing with rootPrefixLen -- why not use path.Rel?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

rootPrefixLen is basically some fancy math so that if the user does: umoci insert --image ... /thing/to/insert /usr/bin, we end up with insert in /usr/bin, and not /usr/bin/thing/to/insert; path.Rel doesn't quite do the right thing here.

The TrimPrefix bit is because of this hunk: https://github.com/openSUSE/umoci/blob/master/oci/layer/tar_generate.go#L94 which causes a failure if there's an absolute path passed to the tar generator. It's not really clear why that's there, but I assume you had a reason, so I just trimmed the prefix. I'm not sure any path cleaning will do that, though.

Copy link
Member

@cyphar cyphar Apr 24, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

umoci insert --image ... /thing/to/insert /usr/bin, we end up with insert in /usr/bin, and not /usr/bin/thing/to/insert; path.Rel doesn't quite do the right thing here.

Shouldn't be able to just do filepath.Base?

The TrimPrefix bit is because of this hunk: https://github.com/openSUSE/umoci/blob/master/oci/layer/tar_generate.go#L94 which causes a failure if there's an absolute path passed to the tar generator. It's not really clear why that's there, but I assume you had a reason, so I just trimmed the prefix.

It's for security reasons, to double-check that the path lexically doesn't have any un-clean components (so something like /../../../../../../../blah). But you're right that I didn't consider the "/"+"/abc" != filepath.Join("/", "/abc") edge-case (because generally tar archives don't have leading / for their entries).

We could fix this by removing leading / when normalising the paths.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no, because filepath.Base("/thing/to/insert") == "insert", not "/thing/to", which is what we want to know the length of. I think filepath.Dir() is the right thing though, I can change it to that.

R.e. the tar issue, you do a CleanPath() before that check, so shouldn't all the ..s be cleaned out? I suppose extra paranoia is ok, but if you don't trust your own code...

I'm happy to just trim the leading / in normalise(), though.

Copy link
Member

@cyphar cyphar Apr 24, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, should've been clearer with filepath.Base -- I was suggesting that we remove all of the rootPrefixLen code and just use filepath.Base to get the basename and append it to the target with filepath.Join. Unless I'm misunderstanding something, that should do the same thing (and it's more clear what you're actually trying to do with the paths).

R.e. the tar issue, you do a CleanPath() before that check, so shouldn't all the ..s be cleaned out? I suppose extra paranoia is ok, but if you don't trust your own code...

Yeah, we do a CleanPath. I'll be honest, I'm more than a little bit paranoid about the path handling code in umoci because it's the most dangerous thing we're dealing with (and there have been Docker CVEs because of missing trivial checks like the above -- some of my first Docker contributions were fixing security bugs in path sanitisation).

I'm happy to just trim the leading / in normalise(), though.

👍 Just do something like if filepath.IsAbs(path) { ... } (personally I would use filepath.Rel for it for clarity, even though the / case is a bit too trivial to justify it -- but feel free to just use strings.TrimPrefix).

@@ -0,0 +1,140 @@
/*
* umoci: Umoci Modifies Open Containers' Images
* Copyright (C) 2018 Cisco
Copy link
Member

@cyphar cyphar Apr 23, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A lot of this file is copied from the other cmd/umoci/*.go files -- can you add an additional copyright line from the original?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, np

}

// TODO: add some way to specify these from the cli
reader := layer.GenerateInsertLayer(ctx.Args()[0], ctx.Args()[1], nil)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You always pass nil for MapOptions here. That doesn't look right to me -- I would copy the unpack code for map options (perhaps even refactor them out and use the same code). I'm also not a huge fan of the ctx.Args usage here -- I would use ctx.App.Metadata (as ugly of a hack it might be) to make it more consistent.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not only is it ugly, I think it's also unecessary. You can use https://godoc.org/github.com/urfave/cli#Context.GlobalBool and friends to access global stuff from a local context.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The way ctx.App.Metadata is used is so that we can do parsing and other modifications to the options in the BeforeFunc (so that the errors are done at the right time and so that we can have utils_ux refactor it). I don't think Global* would help for this case. Also unless I'm mistaken you cannot place global flags after the subcommand name (though I might be completely mistaken).

}
if ctx.Args()[1] == "" {
return errors.Errorf("path cannot be empty")
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would specify which path cannot be empty. Also, I'm not sure why the prefix cannot be empty -- an empty prefix just means it should be added to the root, no?

Where "<image-path>" is the path to the OCI image, "<tag>" is the name of the
tag that the content wil be inserted into (if not specified, defaults to
"latest"), "<file>" is the file or folder to insert, and "<path>" is the prefix
inside the image to insert it into.`,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would mention that the prefix is not like tar prefixes -- where you can make prefixes that aren't entire path components. I assume this decision was intentional, so I would mention it.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure what you mean by this, can you elaborate?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, I got confused. For a minute I though that tar cvf had an option to add a string prefix, and thought that it would be worthwhile to mention this is not the same semantics. tar has transform functions but the only relevant one is sed-like transforms.

But personally I would not use the word prefix in the documentation -- say that it's the target directory. Also I have a comment about the semantics of file insertion (you cannot change the filename when you insert at the moment, which doesn't feel ideal -- I would prefer if we match cp or rsync's semantics as close as we possibly can).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok. What do you want to see, then? I personally detest rsync's whole thing about trailing slash vs not, so I'd like to avoid that if possible.

@cyphar cyphar mentioned this pull request Apr 24, 2018
@cyphar
Copy link
Member

cyphar commented Apr 24, 2018

I have a concern that the semantics of umoci insert don't match cp (or rsync) when it comes to inserting files. As far as I can tell, you cannot insert a file into a path where the basename is different (the path is taken to be the directory the file will be placed into -- unlike the target semantics of most tools).

Of course, we cannot nicely just stat the target to figure out if it's a directory or a file, so we should do something like "if the target ends with a / we add it as a child of the target path with the same basename as the source, otherwise we use the target path verbatim".

Also, one other problem is how do we handle something like adding a directory over the top of another directory in the image (which could either result in a merged directory, or a directory only containing the inserted contents). I get the feeling you'd probably want to be able to delete any underlying directories on insertion (maybe as a --replace option or something) where we would add an opaque whiteout.

@tych0
Copy link
Member Author

tych0 commented Apr 24, 2018

IMO any weirdness about the user specifying a trailing / is basically terrible, and we should never do that. The semantics aren't the same as cp or rsync, but rsync's interface sucks so I'll ignore that. I could imagine something like umoci insert /path/to/insert /path/to/copy/to, which would be identical to cp, and thus require you to specify whatever prefix you want. FWIW, that's exactly what happens now, it's just that the documentation describes it in terms of prefix instead of destination path.

@cyphar
Copy link
Member

cyphar commented Apr 24, 2018

In short, there are two things that I think should be doable with umoci insert. If you like we can work on these separate to this PR or we can work on them now. Here they are:

% # inserts a file (or directory) *at* the given path -- not as a child of the directory
% umoci insert --image some_image my_new_etc_passwd /etc/passwd
% # inserts a directory, overwriting any pre-existing directory -- not just merging the new with the old
% umoci insert --image some_image --whiteout some_dir /var/existing_directory

The first is just a matter of the path we generate, and making it handle that case. The second requires generating an opaque whiteout for /var/existing_directory.

IMO any weirdness about the user specifying a trailing / is basically terrible, and we should never do that. The semantics aren't the same as cp or rsync, but rsync's interface sucks so I'll ignore that.

FWIW, I agree that rsync's interface is pretty annoying, but I can't think of a simpler way of disambiguating between "I want you to insert this file at this path" versus "I want you to insert this file under this path". If you have one that doesn't require replicating the lovely / confusion, I'd be happy to take it -- I just couldn't think of a better example. 😖

But in my view the above two operations are pretty important (though the first is mainly a quality-of-life thing) in order to be able to use this in all of the situations I can think of. But as I said above, we can merge these changes earlier and I can work on the other things before the next release -- whichever works for you.

@cyphar
Copy link
Member

cyphar commented Apr 24, 2018

FWIW, I agree that rsync's interface is pretty annoying, but I can't think of a simpler way of disambiguating between "I want you to insert this file at this path" versus "I want you to insert this file under this path".

Sorry, it should also be noted that we don't have to deal with the most confusing case with rsync -- "what if the target path already exists". Since we don't have the ability to (easily and efficiently) figure out if a path exists, that case is entirely ignored. Really then it boils down to

% umoci insert --image ABC /some/path /target/path  # /target/path = /same/path
% umoci insert --image ABC /some/path /target/path/ # /target/path/path = /same/path

Which is just a different way of making a boolean flag, like this:

% umoci insert --image ABC --verbatim /some/path /target/path # /target/path = /same/path
% umoci insert --image ABC            /some/path /target/path # /target/path/path = /same/path

Now, --verbatim is more explicit and less likely to be triggered by accident. But you could argue it's more cumbersome. But I don't really mind either way to be honest -- and I do share your opinion that changing semantics on trailing / is basically awful (though it should be noted that both cp and Linux do this in some situations -- not that it makes it right though).

@tych0
Copy link
Member Author

tych0 commented Apr 24, 2018 via email

@cyphar
Copy link
Member

cyphar commented Apr 24, 2018

I guess my argument would be that you basically never want /target/path/path, and if you do, you can just say that instead.

So it should always be: target := path.Join(targetPath, path.Base(sourcePath))

But the current implementation does /target/path/path AFAICS:

% umoci insert --image opensuse:latest COPYING /test # I would expect /test = COPYING
% sudo ./umoci unpack --image opensuse:latest bundle
% sudo ls bundle/rootfs/test # it's a directory -- /target/path would be that bundle/rootfs/test is COPYING
COPYING

Which is why I mentioned it. If you think we don't need to support /target/path/path (I think it would be okay to have that option personally -- but not supporting it definitely makes it simpler), then we need to change the current semantics to match that. Or maybe I didn't explain what I meant by /target/path/path properly?

@tych0
Copy link
Member Author

tych0 commented Apr 24, 2018 via email

@cyphar
Copy link
Member

cyphar commented Apr 27, 2018

This will now need a rebase on top of #240, to use the refactored --uid-map parsing code (it makes it easier).

@tych0
Copy link
Member Author

tych0 commented Apr 30, 2018

Thanks for the heads up. Also, I had another thought about the /target/path/path issue, which was that in fact the current version doesn't. Your contrived example repeats path, but in the real world you would not:

umoci insert myexecutable /usr/bin
umoci insert myconfig /etc

Whereas with what you're proposing, we need to do,

umoci insert myexecutable /usr/bin/myexecutable
umoci insert myconfig /etc/myconfig

Again I'm happy to switch it, but it does result in more typing for everyone, although it also allows you to rename things on the fly.

@tych0
Copy link
Member Author

tych0 commented Apr 30, 2018

Ok, I think I've addressed everything; let me know if I missed something.

@tych0
Copy link
Member Author

tych0 commented Apr 30, 2018

Ah yes, the best kind of bug, where it passes locally and fails on travis :)

@cyphar
Copy link
Member

cyphar commented Apr 30, 2018

I'll review this later today, or I can wait until you've fixed up the Travis failure (it's failing during rootless tests with bad argument errors, which is a bit odd).

Again I'm happy to switch it, but it does result in more typing for everyone, although it also allows you to rename things on the fly.

That's why I suggested having both options -- similar to how cp works (though with cp you can stat the target path to figure out if it's a directory or not -- doing that in an image would make it much more expensive). Personally I think it's necessary to have the ability to "rename things on the fly" because that's something that I think most people would expect to be able to do (we could add a -T or -t option that matches the cp semantics if we didn't want to do it using the trailing-/ semantics I proposed earlier).

@tych0
Copy link
Member Author

tych0 commented Apr 30, 2018 via email

@cyphar
Copy link
Member

cyphar commented May 1, 2018

Awesome. I'll play around with it today, and I can try to figure out why the tests are failing as well.

@tych0 tych0 force-pushed the umoci-insert branch 5 times, most recently from 7b237d9 to 77d57f8 Compare May 3, 2018 23:47
@tych0
Copy link
Member Author

tych0 commented May 4, 2018

Ok, I think I figured out what was wrong with the tests. This should be good to go now.

@tych0
Copy link
Member Author

tych0 commented May 8, 2018

Ping, any thoughts on this?

@cyphar
Copy link
Member

cyphar commented May 9, 2018

This is looking pretty good. I'm going to play around with this and do a proper review soon. This will need a rebase. Documentation would also be good, but if you prefer I can do that in a follow-up.

@tych0
Copy link
Member Author

tych0 commented May 9, 2018

I rebased it. Let's get a version in before we bikeshed on any docs.

@cyphar
Copy link
Member

cyphar commented May 9, 2018

Sure. 😉

@cyphar
Copy link
Member

cyphar commented May 9, 2018

One thing that I mentioned above is that we don't have a way of overwriting an entire directory -- so if you insert a directory over an existing directory you end up with a merge of the two (with no nice way of only getting the new directory).

% mkdir dir1 && touch dir1/should_not_exist
% mkdir dir2 && touch dir2/should_exist
% umoci insert --image image dir1 /path
% umoci insert --image image dir2 /path
% umoci unpack --image image bundle && ls bundle/rootfs/path
should_exist  should_not_exist

I will work on this feature once this is merged as well. I will probably just make it something like a --whiteout flag (since I imagine that people will be interested in both ways of doing it).

@cyphar
Copy link
Member

cyphar commented May 10, 2018

Tests are timing out on Travis (but only for rootless and only for Fedora). I can reproduce the test taking a long time -- it might even block indefinitely actually. Not sure why the test is taking so much longer than it used to..

@cyphar
Copy link
Member

cyphar commented May 10, 2018

Okay, there's now a test failure in rootless (you can't see it on Travis for some reason):

not ok 60 umoci repack [replace]
# (in test file test/repack.bats, line 204)
#   `ln -s "a \\really //weird _00..:=path " "$BUNDLE_A/rootfs/usr/bin/env"' failed
# 21M   /tmp/umoci-coverage.7NhRrl
# 0     /tmp/umoci.YCR12A
# 83M   /tmp/image/blobs
# 83M   /tmp/image
# 103M  /tmp
# oci-image-tool: reference "latest": OK
# /tmp/image: OK
# Validation succeeded
# umoci -test.coverprofile=/tmp/umoci-coverage.7NhRrl/umoci.cov.vtQiqE __DEVEL--i-heard-you-like-tests unpack --rootless --image /tmp/image:latest /tmp/umoci-integration-tmpdir.8q2KHYm5 (status=0)
# PASS
# coverage: 31.9% of statements in ./...
# Bundle validation succeeded.
# rm: remove write-protected regular file '/tmp/umoci-integration-tmpdir.8q2KHYm5/rootfs/usr/bin/env'? ln: failed to create symbolic link '/tmp/umoci-integration-tmpdir.8q2KHYm5/rootfs/usr/bin/env': File exists

@tych0
Copy link
Member Author

tych0 commented May 10, 2018

Ok, this seems unrelated -- presumably /usr/bin/env is not +w, and since you're not passing -f to the rm of it, it's blocking. Why it works on other platforms I have no idea :)

FWIW, I also see failures on current master on ubuntu of the same umoci repack test,

# /tmp/bats.8942.src: line 165: /tmp/umoci-integration-tmpdir.4dWrFtb7/rootfs/whiteout_test/.wh. THIS IS A TEST : Operation not permitted

I've been ignoring these as well.

cyphar added a commit to cyphar/umoci that referenced this pull request May 15, 2018
  use alternate directory for tests
  remove umoci.cov* files on clean
  use umoci.cover instead of umoci for stat test
  fix up TODO about supporting MapOptions for insert
  add rootless insertion check
  change the way inserted file paths are calculated
  use CleanPath instead of path.Clean
  move TrimPrefix call into normalise
  add additional copyright to insert header
  insert: use unpriv.Walk instead
  add `umoci insert` command

LGTMs: @cyphar
Closes opencontainers#237
@cyphar
Copy link
Member

cyphar commented May 15, 2018

#243 fixed the test failures. If you rebase I'll merge it, thanks!

tych0 added 11 commits May 15, 2018 07:42
There are cases where you only want to insert files into an image. Right
now the procedure is: unpack the image, add the files, and repack the
image. However, the OCI format lends itself exactly to this kind of
addition -- we can just add a new layer with the files we wish to insert.
Additional motivation is in the case of large layers, we can avoid a lot of
i/o, CPU time to compress, and disk space requirements.

So, let's add a command to generate this new import layer.

Signed-off-by: Tycho Andersen <tycho@tycho.ws>
This will make rootless insert potentially actually work?

Signed-off-by: Tycho Andersen <tycho@tycho.ws>
Signed-off-by: Tycho Andersen <tycho@tycho.ws>
We really don't need this IsAbs test, but it's what was recommended, so
here we are.

Signed-off-by: Tycho Andersen <tycho@tycho.ws>
Signed-off-by: Tycho Andersen <tycho@tycho.ws>
Signed-off-by: Tycho Andersen <tycho@tycho.ws>
Signed-off-by: Tycho Andersen <tycho@tycho.ws>
Signed-off-by: Tycho Andersen <tycho@tycho.ws>
The test suite only builds umoci.cover, so this only worked if you had done
`make` in pwd before running the test suite.

Signed-off-by: Tycho Andersen <tycho@tycho.ws>
umoci.cover and umoci.cov.* files are generated by the test suite, let's be
sure to clean all of those up as well.

Signed-off-by: Tycho Andersen <tycho@tycho.ws>
Travis seems to have trouble when inserting files from ${ROOT}, so let's
just make another directory with these files for testing purposes.

Signed-off-by: Tycho Andersen <tycho@tycho.ws>
@tych0
Copy link
Member Author

tych0 commented May 15, 2018

Cool, thanks. Looks like the tests passed this time.

@cyphar
Copy link
Member

cyphar commented May 16, 2018

LGTM. Thanks @tych0!

@cyphar cyphar merged commit 9ba2e57 into opencontainers:master May 16, 2018
cyphar added a commit that referenced this pull request May 16, 2018
  use alternate directory for tests
  remove umoci.cov* files on clean
  use umoci.cover instead of umoci for stat test
  fix up TODO about supporting MapOptions for insert
  add rootless insertion check
  change the way inserted file paths are calculated
  use CleanPath instead of path.Clean
  move TrimPrefix call into normalise
  add additional copyright to insert header
  insert: use unpriv.Walk instead
  add `umoci insert` command

LGTMs: @cyphar
Closes #237
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants