-
Notifications
You must be signed in to change notification settings - Fork 98
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
oci/cas/dir: Only Clean .umoci-* directories #198
Conversation
8aec391
to
df267b2
Compare
The thing is, in order for a What this patch will do is convert a case where Effectively, what I'm saying is NACK, but maybe if you explain why you feel this is necessary I'll change my mind. Changing removal errors to be non-fatal seems like a good fix though. |
Right. And it's also protecting temporary files/dirs used by non-flocking consumers who may be operating in parallel. This is the "balance cleanliness vs. robustness for those other consumers" from my initial comment.
All of this is "clean up unused cruft", the issue is determining whether something is used or not. Git uses reachability-from-known-refs with time-based backstops for blobs (with the time check allowing
It protects non-flocking coworkers operating within the same directory, because the "flock to preserve" semantics are umoci-specific and not in image-spec. Is you objection to potential tempoary dirty objects asthetic, or are you worried about disk consumption? Or something else? Having two weeks of dirt in the rare "umoci crashed" cases doesn't seem that terrible. Would you accept it with a configurable time like Git's |
IMO, having two different processes modifying an OCI image in parallel feels like it's never going to work as well as you would hope ( But I feel like the solution to this problem is to only clean up "garbage" directories that have names that look like
But those clearly aren't the same --
I'm worried about three use-cases:
However, a use-case I would understand is that |
That sounds good to me; I'll reroll this to take that approach.
Putting sensitive information in a public place will never be safe, temp-cleanup or no. Temp-cleanup helps only when folks are usumg rsync withought excludes, or whatever, to push straight from their working repo, and even then they might still be pushing sensetive orphan blobs. I would encourage folks to use something like Git's
In the unlikely umoci-crash case, yes. And both my previous
I think "umoci will clean up after my external tool" is a dangerous assumption (for example, it will probably be false once we only cleanup Do you want me to work up |
Right, that case is more about doing something less bad rather than trying to be perfect. 😉
The "external tool" in this context is not touching the image directly,
No, I plan to make this part of the "parcel export" story. But I'll open an issue to track it. |
df267b2
to
969eca0
Compare
This allows the use of other non-flocking consumers (e.g. other CAS implementations) in the same base directory. The '.umoci-*' approach was suggested by Aleksa [1] as a way to keep Clean from stepping on other consumers' toes. I'd suggesting using pruneExpire for ModTime-based backstopping, but Aleksa was concerned about leaking temporary directories from crashed umoci runs, primarily for disk-space concerns [1]. The .umoci-* approach allows such abandoned cruft to be immediately cleaned, without worrying about clobbering non-umoci consumers and extentions (unless they use .umoci-* filenames, in which case they deserve it ;). I've gone with .umoci-* instead of Aleksa's .umoci-tmpdir, because I think the . prefix is a sufficient hint that these directories are umoci-specific scratch space. Removal errors (because of insufficient permissions, etc.) seemed like they should be non-fatal so we could continue to remove other cruft. However, I didn't want to silently ignore the failed removal. In this commit, I log those errors with a "warning" level. The new cleanFile method gives us tighter scoping for the defer lock release / file close, because those fire on function-exit [2]. With the old code, we'd keep the removed directory file-descriptors and associated flocks while Clean processed further directories. With the new code, we release those resources immediately after we finish processing the directory in question. [1]: https://github.com/openSUSE/umoci/pull/198#issuecomment-337889528 [2]: https://golang.org/ref/spec#Defer_statements Signed-off-by: W. Trevor King <wking@tremily.us>
969eca0
to
6a3bb15
Compare
This allows the use of other non-flocking consumers (e.g. other CAS implementations) in the same base directory. The '.umoci-*' approach was suggested by Aleksa [1] as a way to keep Clean from stepping on other consumers' toes. I'd suggesting using pruneExpire for ModTime-based backstopping, but Aleksa was concerned about leaking temporary directories from crashed umoci runs, primarily for disk-space concerns [1]. The .umoci-* approach allows such abandoned cruft to be immediately cleaned, without worrying about clobbering non-umoci consumers and extentions (unless they use .umoci-* filenames, in which case they deserve it ;). Aleksa also mentioned that non-umoci consumers might currently be relying on umoci to clean up their abandoned cruft [1]. But we both agree that non-umoci consumers should be cleaning up after themselves and/or providing their own "cleanup anything abandoned by previous runs of my tool" functionality, and should not be relying on umoci to do that for them [2,3]. I've gone with .umoci-* instead of Aleksa's .umoci-tmpdir, because I think the . prefix is a sufficient hint that these directories are umoci-specific scratch space. Removal errors (because of insufficient permissions, etc.) seemed like they should be non-fatal so we could continue to remove other cruft. However, I didn't want to silently ignore the failed removal. In this commit, I log those errors with a "warning" level. The new cleanFile method gives us tighter scoping for the defer lock release / file close, because those fire on function-exit [4]. With the old code, we'd keep the removed directory file-descriptors and associated flocks while Clean processed further directories. With the new code, we release those resources immediately after we finish processing the directory in question. [1]: https://github.com/openSUSE/umoci/pull/198#issuecomment-337889528 [2]: https://github.com/openSUSE/umoci/pull/198#issuecomment-337928272 [3]: https://github.com/openSUSE/umoci/pull/198#issuecomment-337937858 [4]: https://golang.org/ref/spec#Defer_statements Signed-off-by: W. Trevor King <wking@tremily.us>
On Thu, Oct 19, 2017 at 08:06:54AM -0700, Aleksa Sarai wrote:
> Do you want me to work up bundle / push / export / whatever in a
> follow-up PR?
No, I plan to make this part of the "parcel export" story. But I'll
open an issue to track it.
Sounds good.
I've pushed a reroll to use .umoci-* with df267b2 → 6a3bb15. I've
attempted to motivate the change from scratch and also summarize this
discussion in the commit message. Here's the new commit message, so
you can reply inline if you see anything you want adjusted:
oci/cas/dir: Only Clean .umoci-* directories
This allows the use of other non-flocking consumers (e.g. other CAS
implementations) in the same base directory. The '.umoci-*' approach
was suggested by Aleksa [1] as a way to keep Clean from stepping on
other consumers' toes. I'd suggesting using pruneExpire for
ModTime-based backstopping, but Aleksa was concerned about leaking
temporary directories from crashed umoci runs, primarily for
disk-space concerns [1]. The .umoci-* approach allows such abandoned
cruft to be immediately cleaned, without worrying about clobbering
non-umoci consumers and extentions (unless they use .umoci-*
filenames, in which case they deserve it ;).
Aleksa also mentioned that non-umoci consumers might currently be
relying on umoci to clean up their abandoned cruft [1]. But we both
agree that non-umoci consumers should be cleaning up after themselves
and/or providing their own "cleanup anything abandoned by previous
runs of my tool" functionality, and should not be relying on umoci to
do that for them [2,3].
I've gone with .umoci-* instead of Aleksa's .umoci-tmpdir, because I
think the . prefix is a sufficient hint that these directories are
umoci-specific scratch space.
Removal errors (because of insufficient permissions, etc.) seemed like
they should be non-fatal so we could continue to remove other cruft.
However, I didn't want to silently ignore the failed removal. In this
commit, I log those errors with a "warning" level.
The new cleanFile method gives us tighter scoping for the defer lock
release / file close, because those fire on function-exit [4]. With
the old code, we'd keep the removed directory file-descriptors and
associated flocks while Clean processed further directories. With the
new code, we release those resources immediately after we finish
processing the directory in question.
[1]: https://github.com/openSUSE/umoci/pull/198#issuecomment-337889528
[2]: https://github.com/openSUSE/umoci/pull/198#issuecomment-337928272
[3]: https://github.com/openSUSE/umoci/pull/198#issuecomment-337937858
[4]: https://golang.org/ref/spec#Defer_statements
|
Tool-specific filenames make tool-specific cleanup easier. If a casengine dir handler crashes or is abandoned without the Close() cleanup, this lets you clean up after it with: $ rm -rf /path/to/base/.casengine-* or similar without having to worry about blowing away non-casengine content. More discussion about this use-case and alternative approaches in [1]. [1]: https://github.com/openSUSE/umoci/pull/198 Signed-off-by: W. Trevor King <wking@tremily.us>
Needs a rebase, and also a (short)
|
6a3bb15
to
dc79a46
Compare
This allows the use of other non-flocking consumers (e.g. other CAS implementations) in the same base directory. The '.umoci-*' approach was suggested by Aleksa [1] as a way to keep Clean from stepping on other consumers' toes. I'd suggesting using pruneExpire for ModTime-based backstopping, but Aleksa was concerned about leaking temporary directories from crashed umoci runs, primarily for disk-space concerns [1]. The .umoci-* approach allows such abandoned cruft to be immediately cleaned, without worrying about clobbering non-umoci consumers and extentions (unless they use .umoci-* filenames, in which case they deserve it ;). Aleksa also mentioned that non-umoci consumers might currently be relying on umoci to clean up their abandoned cruft [1]. But we both agree that non-umoci consumers should be cleaning up after themselves and/or providing their own "cleanup anything abandoned by previous runs of my tool" functionality, and should not be relying on umoci to do that for them [2,3]. I've gone with .umoci-* instead of Aleksa's .umoci-tmpdir, because I think the . prefix is a sufficient hint that these directories are umoci-specific scratch space. Removal errors (because of insufficient permissions, etc.) seemed like they should be non-fatal so we could continue to remove other cruft. However, I didn't want to silently ignore the failed removal. In this commit, I log those errors with a "warning" level. The new cleanFile method gives us tighter scoping for the defer lock release / file close, because those fire on function-exit [4]. With the old code, we'd keep the removed directory file-descriptors and associated flocks while Clean processed further directories. With the new code, we release those resources immediately after we finish processing the directory in question. [1]: https://github.com/openSUSE/umoci/pull/198#issuecomment-337889528 [2]: https://github.com/openSUSE/umoci/pull/198#issuecomment-337928272 [3]: https://github.com/openSUSE/umoci/pull/198#issuecomment-337937858 [4]: https://golang.org/ref/spec#Defer_statements Signed-off-by: W. Trevor King <wking@tremily.us>
oci/cas/dir/dir.go
Outdated
if err := os.RemoveAll(path); os.IsNotExist(err) { | ||
return nil // somebody else beat us to it | ||
} else if err != nil { | ||
log.Warnf("failed to remove %s (%s)", path, err) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One more thing, please format this warning like this:
log.Warnf("failed to remove %s: %v", path, err)
dc79a46
to
c1616af
Compare
This allows the use of other non-flocking consumers (e.g. other CAS implementations) in the same base directory. The '.umoci-*' approach was suggested by Aleksa [1] as a way to keep Clean from stepping on other consumers' toes. I'd suggesting using pruneExpire for ModTime-based backstopping, but Aleksa was concerned about leaking temporary directories from crashed umoci runs, primarily for disk-space concerns [1]. The .umoci-* approach allows such abandoned cruft to be immediately cleaned, without worrying about clobbering non-umoci consumers and extentions (unless they use .umoci-* filenames, in which case they deserve it ;). Aleksa also mentioned that non-umoci consumers might currently be relying on umoci to clean up their abandoned cruft [1]. But we both agree that non-umoci consumers should be cleaning up after themselves and/or providing their own "cleanup anything abandoned by previous runs of my tool" functionality, and should not be relying on umoci to do that for them [2,3]. I've gone with .umoci-* instead of Aleksa's .umoci-tmpdir, because I think the . prefix is a sufficient hint that these directories are umoci-specific scratch space. Removal errors (because of insufficient permissions, etc.) seemed like they should be non-fatal so we could continue to remove other cruft. However, I didn't want to silently ignore the failed removal. In this commit, I log those errors with a "warning" level. The new cleanFile method gives us tighter scoping for the defer lock release / file close, because those fire on function-exit [4]. With the old code, we'd keep the removed directory file-descriptors and associated flocks while Clean processed further directories. With the new code, we release those resources immediately after we finish processing the directory in question. [1]: https://github.com/openSUSE/umoci/pull/198#issuecomment-337889528 [2]: https://github.com/openSUSE/umoci/pull/198#issuecomment-337928272 [3]: https://github.com/openSUSE/umoci/pull/198#issuecomment-337937858 [4]: https://golang.org/ref/spec#Defer_statements Signed-off-by: W. Trevor King <wking@tremily.us>
This allows the use of other non-flocking consumers (e.g. other CAS implementations) in the same base directory. The '.umoci-*' approach was suggested by Aleksa [1] as a way to keep Clean from stepping on other consumers' toes. I'd suggesting using pruneExpire for ModTime-based backstopping, but Aleksa was concerned about leaking temporary directories from crashed umoci runs, primarily for disk-space concerns [1]. The .umoci-* approach allows such abandoned cruft to be immediately cleaned, without worrying about clobbering non-umoci consumers and extentions (unless they use .umoci-* filenames, in which case they deserve it ;). Aleksa also mentioned that non-umoci consumers might currently be relying on umoci to clean up their abandoned cruft [1]. But we both agree that non-umoci consumers should be cleaning up after themselves and/or providing their own "cleanup anything abandoned by previous runs of my tool" functionality, and should not be relying on umoci to do that for them [2,3]. I've gone with .umoci-* instead of Aleksa's .umoci-tmpdir, because I think the . prefix is a sufficient hint that these directories are umoci-specific scratch space. Removal errors (because of insufficient permissions, etc.) seemed like they should be non-fatal so we could continue to remove other cruft. However, I didn't want to silently ignore the failed removal. In this commit, I log those errors with a "warning" level. The new cleanFile method gives us tighter scoping for the defer lock release / file close, because those fire on function-exit [4]. With the old code, we'd keep the removed directory file-descriptors and associated flocks while Clean processed further directories. With the new code, we release those resources immediately after we finish processing the directory in question. [1]: https://github.com/openSUSE/umoci/pull/198#issuecomment-337889528 [2]: https://github.com/openSUSE/umoci/pull/198#issuecomment-337928272 [3]: https://github.com/openSUSE/umoci/pull/198#issuecomment-337937858 [4]: https://golang.org/ref/spec#Defer_statements Signed-off-by: W. Trevor King <wking@tremily.us>
c1616af
to
332db95
Compare
LGTM. |
Following Git and defaulting to two weeks, as discussed in #17. This allows the use of other non-flocking consumers (e.g. other CAS implementations) in the same base directory.
The added robustness comes with a price, though, since we now have to balance cleanliness vs. robustness for those other consumers.
The hard-coded two-week cutoff may be insufficient for some use cases. Maybe somebody will need to download a huge layer through a very slow pipe, and they'll bump into this limit. Or maybe they have a high-cruft workflow or a small disk and would like to shorten this limit. If that happens or we become concerned that it might, we should make
pruneExpire
configurable.Increasing
pruneExpire
protects more flock-less consumers fromClean
, but also means that cruft can survive for longer beforeClean
is confident enough to remove it. Setting an infitepruneExpire
would be very safe but would makeClean
a no-op. Two weeks seems like a safe choice, since well-behaved consumers will clean up after themselves when they are closed.Clean
is just handling poorly-behaved consumers (or well-behaved consumers that had a hard shutdown). I don't expect cruft to build up quickly enough for the two-week default to be an issue for most users.Consumers who do not wish to rely on
pruneExpire
(perhaps they have long-running operations or are locally settingpruneExpire
very short) can continue to flock their temporary files and directories to protect them fromClean
. Flocking a directory will also protect all content within that directory.Removal errors (because of insufficient permissions, etc.) seemed like they should be non-fatal so we could continue to remove other cruft. However, I didn't want to silently ignore the failed removal. In this commit, I log those errors with a "warning" level.
The order of walk means that an old directory containing an old file will not be removed in a single pass. The first
Clean
will fail to remove the old directory because it is not empty, but will remove the old file (assuming it has sufficient permissions, etc.). A secondClean
pass will remove the now-empty old directory. With cruft building slowly andClean
running fairly frequently, the delayed old-directory removal didn't seem like a big enough issue to warrant an immediate re-clean attempt.