-
Notifications
You must be signed in to change notification settings - Fork 458
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix resource leak in Maven plugin #571
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks! It's great to see the consequences of the "content hash" choice, and there's a good chance that it will be the only possibility. Turns out there are some problems with sorting on filename, and even some problems with sorting on content, which I described in my review below. The problems aren't show-stoppers though, and it might be our best path.
What do you think about FileLocator
first checking to see if it's a local config file, and only using copy-to-hashed-path for remote URLs? I pushed up a demo in #572. Whether we merge this (#571) or not, it seems like #572 speeds things up a bit, and I think it fixes #559 on its own.
In order to support the Gradle Build Cache, we might need to do the FileSignature change you've implemented here no matter what, so it's super useful to see it play out. But it hurts me to see how much extra IO it forces us to do. It's probably premature optimization to worry about it, but I think I have a way that we could keep the current FileSignature implementation, and only use content hashing when it's needed.
If we can solve #559 with #572 in the short-term, then I'd prefer to do that and buy some time to think about keeping FileSignature more IO-efficient.
lib/src/main/java/com/diffplug/spotless/npm/NpmFormatterStepStateBase.java
Outdated
Show resolved
Hide resolved
Hey @nedtwigg, I'm observing the same issue when building spotless as described in #328 (comment):
I usually have two git remotes:
and I clone the original repo first, then rename upstream from This rename seems to be what's causing the NPE. Steps to reproduce: $ git clone git@github.com:diffplug/spotless.git
$ cd spotless
$ gradle clean # works fine
$ git remote rename origin upstream
$ gradle clean # fails with NPE Not sure if it's a bug or a problem with my local setup. That's why I describe it here instead of creating an issue. The workaround for me is to not rename the remote :) I'll proceed with fixing this PR. |
Thanks for reporting! I bet part of this is the |
The steps to reproduce use a clean clone of |
553cffe
to
52a7a6c
Compare
@nedtwigg review comments should now be addressed. I fixed them and squashed everything into the same commit |
Awesome, thanks very much! I'm gonna break this commit into two (one for the npm part, another for the rest) and then add a few things. Do you mind if I force-push to your branch, to maintain this discussion history? |
… file instead of the whole node_modules directory.
The resource leak increased the number of threads with every execution of Exclipse-based formatter step in a Maven module. It happened because the `SpotlessCache` wasn't properly utilized. Every cache key was based on randomized file names because Maven plugin relocates config files into the target directory with a random file name. This resulted in all cache accesses being cache misses. This commit fixes the problem by: * making the Maven plugin use non-random file names for output files. `FileLocator` is changed to construct names based on a hash of the input path. Input path can be a URL, file in a JAR, or a regular local file * changing `FileSignature` (used for `SpotlessCache` keys) to use filenames instead of paths and file hashes instead of last modified timestamps These two changes allow `FileSignature` to uniquely identify a set of files by their content and not take file paths into account. As a result, created keys for `SpotlessCache` are properly comparable which results in lots of cache hits and decreased number of created threads. Changed `FileSignature` only accepts files, not directories.
Sure, no problem. Let me know if I can help with anything on this PR |
52a7a6c
to
823c2c3
Compare
…performance based on lastModified that we had prior to this PR.
0c96ee3
to
e4db705
Compare
…le, once a file has changed, we can discard the old one.
where a FileSignature is generated for multiple files with the same filename.
@lutovich I made two substantive changes. One is that I added a descriptive warning message that I don't expect anyone to ever see, but if they do, they'll be glad to see it rather than a silent failure (aaddff8). And the second thing I did was add a cache, so that when we're doing @simschla can you take a peek at b695622? I think it's a good change. We're changing FileSignature so that it only handles files (not folders). The upside is that caching will work better (remote build cache for Gradle, fixes other caching issues for Maven). |
This looks good to me. Thank you for the change. |
No rush @lutovich, but I'm waiting for a positive confirmation from you on the changes I made here before merging this, which allows other merges to go through. I'm only echoing because the GitHub UI doesn't allow me to ask for a code review from you (since it is your PR), so wanted to make sure it didn't get lost :) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good, made two minor comments.
File file = new File(p); | ||
// calculate the size and content hash of the file | ||
long size = 0; | ||
byte[] buf = new byte[1024]; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just curious, why do you prefer to read the file this way instead of Files#readAllBytes()
?
Also, maybe the code could be simplified a bit by using DigestInputStream
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some of the eclipse jars push into the tens of MB, and SHA-256 only works on 512 bytes at a time. It's a micro-optimization to worry about that memory pressure, very possible that it's slower than Files.readAllBytes
, and it wasn't my motivation. The real reason is that I wanted to protect against the race condition where you read a file, someone else modifies it, and then you check the timestamp (or the opposite order) and end up with a lying cache. By opening it to read, and checking the timestamp while it's open, you can be (more) sure that the content you're reading is the content for that timestamp. I think it's still not guaranteed on unix systems, so it's not perfect, but that was the motivation.
Released in |
The resource leak increased the number of threads with every execution of Exclipse-based formatter step in a Maven module. It happened because the
SpotlessCache
wasn't properly utilized. Every cache key was based on randomized file names because Maven plugin relocates config files into the target directory with a random file name. This resulted in all cache accesses being cache misses.This PR fixes the problem by:
making the Maven plugin use non-random file names for output files.
FileLocator
is changed to construct names based on a hash of the input path. Input path can be a URL, file in a JAR, or a regular local filechanging
FileSignature
(used forSpotlessCache
keys) to use filenames instead of paths and file hashes instead of last modified timestampsThese two changes allow
FileSignature
to uniquely identify a set of files by their content and not take file paths into account. As a result, created keys forSpotlessCache
are properly comparable which results in lots of cache hits and decreased number of created threads.Changed
FileCignature
only accepts files, not directories. NPM formatters changed to have their signatures based on package.json file instead of the whole node_modules directory.Fixes #559