-
Notifications
You must be signed in to change notification settings - Fork 1.4k
[SR-12851] System-wide cache of SwiftPM dependencies #2835
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[SR-12851] System-wide cache of SwiftPM dependencies #2835
Conversation
864ed0f
to
7fd45ee
Compare
This refers to SR-12581, but the description of that issue seems unrelated to me, could the reference be wrong? |
Thanks for the heads up @MaxDesiatov I had the numbers twisted 😅. I updated the description accordingly. |
7fd45ee
to
f969e7a
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This change looks useful to me, at least in theory. I have no impact on the actual roadmap of SwiftPM and I don't know if the development team is interested in this feature though. I've left some coding style suggestions and nitpicks, but I don't think you should address them until you're sure that the SwiftPM team is interested in an implementation of this feature in principle.
If it is merged at some point, one thing I'd like to be included is configurable cache size, otherwise 20 GB seems like too big as a default, I don't remember having that much free space on any of my devices and would be frustrated to discover that all that free space was consumed by SwiftPM without my knowledge or control.
} catch { | ||
// The cache seems to be broken. Lets remove everything. | ||
try? localFileSystem.removeFileTree(cachePath) | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the error should still be reported somehow, maybe just printing it would be an improvement.
@MaxDesiatov thanks for your feedback. I've already fixed some of the formatting issues.
As far as I know there is no swiftpm config file yet. I can think of 3 possible solutions without setting up a config file:
I prefer the last approach. |
First of all, thanks a lot for this contribution! I think this is a great idea, and I know it's been discussed as something desirable (though I don't think there was a JIRA for it already, and certainly not a PR). I'll want to look closer at the diffs, and in particular to see how it would be used from an IDE (though |
@swift-ci please test |
Actually: since RepositoryProvider already assumes Git semantics, and the GitRepositoryProvider is just the implementation that uses the 'git' command line tool, a more scalable approach might be to do this outside the RepositoryProvider. This would allow the repository provider to do its thing to fill the cache, and then a shared implementation would then clone the repo out of the cache into the desired location. That would make the cache work for all repository providers. |
@TG908, it’s probably best to ask
|
I like @SDGGiesbrecht's suggestion of using |
I've addressed some of the changes suggested by @aciidb0mb3r and @SDGGiesbrecht. Additionally I've fixed the build issue on Linux (The old git version. on ubuntu 16.0.4 didn't support |
Thanks for your contribution, this is a great idea! One thing I'm a little concerned about is how this will handle concurrent invocations of |
Thanks for the PR, @TG908 I haven't had time to dig into the details, but two high level comments:
|
5361806
to
f39d431
Compare
I wasn't aware that the "repositories" folders only purpose is caching. I like the idea of replacing it with a global cache. @neonichu @jakepetroules concurrent cache accesses will be protected by a
To make this more performant I would need to add shared locks to swift-tools-support-core, which I already did in my fork. I could use some help testing this on Windows though. |
f39d431
to
ba7a1d8
Compare
How should we handle some of the existing swift package subcommands and options like:
I went ahead and split the option Should locking of the repository directories be an implementation detail of I also decided that repositories checked out with Adding a subcommand for explicitly cleaning the cache would be useful as well. |
Thanks again for the PR @TG908 and sorry that it took a while to get back to you. I'll review this soon and get back you with suggestions on your questions. |
@@ -243,7 +209,14 @@ public class RepositoryManager { | |||
self.delegate?.handleWillUpdate(handle: handle) | |||
} | |||
|
|||
try repo.fetch() | |||
// FIXME: Workaround for `FileLock` only working on `LocaFileSystem` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should look into FileLock
essentially being a no op when using the in-memory fs instead of doing this workaround.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can see two possible ways of achieving this:
- make
FileLock
part ofFileSystem
--> More extensible - make
FileLock
aware of theFileSystem
being used --> Fewer changes to the API required
Shouldn't lock()
behave like a normal lock when using the in-memory fs?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
#1 here seems like the best solution, although it would of course involve more changes, as you point out. It makes sense to me that for an in-memory file system, locking would be just normal locking (not a no-op). For this case, it seems that maybe all that would be needed is a new function on FileSystem
that lets you execute a block while holding a lock on a file, e.g. withLock(on: path) { /* do something */ }
. The local-file-system implementation is straightforward, but the in-memory file system implementation would involve a locked map of paths to FileLocks (though that's not technically sufficient either because of symlinks, but the implementation there could be refined over time).
// FIXME: Workaround for `FileLock` only working on `LocaFileSystem` | ||
if localFileSystem.exists(self.path) { | ||
try lock.withLock { | ||
try lock.withLock { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure I understand why we are doing this nesting
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When I added the workaround to only lock when we are on the local filesystem, I probably copy pasted the block including the lock and added another lock.
I think this generally looks good, but I need to think through the locking behaviour a bit more. I'm actually not super familiar with the access patterns of SwiftPM w.r.t. the repository cache here, e.g. one scenario I'm worried about is a second client doing a fetch during the resolution process of a first client. The locking as its implemented in the PR avoids direct conflicts, but I want to make sure we're not opening ourselves up to side effects over the course of the whole resolution process. For example a scenario like this:
I am not sure this scenario actually happens in practice, but I'd like to spend a bit more time figuring out if it could cause problems down the line. |
I have thought about this a bit more and I think we should actually go back to an approach which keeps the per-project cache as well to avoid some of the side effects I was describing here. How about this:
We could potentially record the commit hash instead of making a copy, but I'm not 100% sure this would prevent us from seeing issues when that commit gets deleted remotely and we fetch. I think we could explore something like that as a later optimization to copying clones after fetching. |
Apparently not having looked closely enough at the code of this PR earlier, I actually just wrote up a reply which was essentially "why not simplify things and use I think using So I think the essential approach should be:
The no-checkout thing both saves time (to avoid an unnecessary materialization of files in the checkout) and also prevents errors in case of things like incorrectly set default-branch references in remote repos (happening more often recently with renaming to Git does locking at the repository level, though that might manifest as a failure to perform the operation if it's already locked rather than waiting for it to finish. So a SwiftPM-specific file lock around the repo in the cache might still be warranted. I think the key here will be to keep the approach simple, and I think the approach of using |
Boris, I think you're right that there is a risk of a problem if we don't have an intermediate per-package repository. The package resolution does
So after thinking this through I've changed my mind, and now agree with you Boris. I think it would make things safer and simpler to see the global cache as strictly a way to quickly populate the local one, so I think the changes here would be to the |
… optional to enable/disable caching. Replaced git clone --reference-if-able with --reference since the git version provided by ubuntu 16.04 does not support --reference-if-able
… and setting its remote to the original url
a6e3cbd
to
e0e5c87
Compare
We moved our default branch to |
I created a draft for the implementation of a system wide cache for swift packages. I would love to get some of your feedback.
In particular I would like some feedback about the
choice of the directory for the cache:.swiftpm/cache/repositories
and its maximum size of 20 GB.
SR-12851