-
Notifications
You must be signed in to change notification settings - Fork 1.4k
Async git repository opening #8721
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Async git repository opening #8721
Conversation
The `RepositoryProvider` protocol's `open` method is defined as synchronous. Convert this to `async` to avoid concurrency workarounds in clients that might be prone to deadlocks. This involved porting `RepositoryManager.lookup` to structured concurrency. The method has some concurrency constraints that are preserved: - There is a limit (`maxConcurrentOperations`) on how many simultaneous lookups can be performed concurrently. - If a lookup is requested and one is in flight for the same `RepositorySpecifier`, the in flight request completes before the new one is started. This PR also moves away from using a `DispatchQueue` to make delegate calls, instead opting to use a `Task` that calls through an `actor` based proxy to the underlying delegate. Gating these calls through an `actor` allows us to ensure the calls are processed sequentially as they were with the `DispatchQueue`.
|
||
private func waitIfNeeded() async { | ||
if activeTasks >= concurrentTasks { | ||
await withCheckedContinuation { continuation in |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should add a withTaskCancellationHandler
here and potentially a withTaskPriorityEscalationHandler
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@FranzBusch would it be sufficient to propagate cancellation by wrapping the waitingTasks.append in a Task.isCancelled check?
if !Task.isCancelled {
waitingTasks.append(continuation)
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No that's not enough since the task can be cancelled after the continuation has been appended. In general, to be structured concurrency and cancellation compliant whenever you create a continuation you need to wrap it into a withTaskCancellationHandler
. In this case you probably want to remove that continuation from the waitingTasks
and resume it with a CancellationError
.
Since you are currently using an actor
this won't work easily without an unstructured task. So I would recommend you to switch over to a class
and protect the state with a Mutex
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, makes sense. Unfortunately SwiftPM's supported platforms
is set to .macOS(.v13)
and Mutex
is only available on >= .v15, so I'll have to use an NSLock as we do elsewhere in SwiftPM.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a reason why we can't bump to macOS 15.0?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes. We cannot require macOS 15 at this time.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've updated the AsyncOperationQueue
to respect cancellation of the parent task. Incorporating a Mutex
polyfill can come later and bulk replace the state: T
/stateLock: NSLock
pattern used throughout SwiftPM.
self.waitingTasksLock.withLock { | ||
if let taskIndex = self.waitingTasks.firstIndex(where: { $0.0 == taskId }) { | ||
let task = self.waitingTasks.remove(at: taskIndex) | ||
task.1.resume(throwing: CancellationError()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
resuming continuations while a lock is held is dangerous and can lead to deadlocks. Note withTaskCancellationHandler's documentation:
Cancellation handlers which acquire locks must take care to avoid deadlock. The cancellation handler may be invoked while holding internal locks associated with the task or other tasks. Other operations on the task, such as resuming a continuation, may acquire these same internal locks. Therefore, if a cancellation handler must acquire a lock, other code should not cancel tasks or resume continuations while holding that lock.
I think here you can return the continuation out of the lock and then resume it after releasing the lock.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good catch, I've fixed this up in two places
try await withTaskCancellationHandler { | ||
try await withCheckedThrowingContinuation { (continuation: CheckedContinuation<Void, any Error>) in | ||
if !Task.isCancelled { | ||
waitingTasksLock.withLock { | ||
waitingTasks.append((taskId, continuation)) | ||
} | ||
} else { | ||
continuation.resume(throwing: CancellationError()) | ||
} | ||
} | ||
} onCancel: { | ||
// If the parent task is cancelled then we need to manually handle resuming the | ||
// continuation for the waiting task with a `CancellationError`. | ||
self.waitingTasksLock.withLock { | ||
if let taskIndex = self.waitingTasks.firstIndex(where: { $0.0 == taskId }) { | ||
let task = self.waitingTasks.remove(at: taskIndex) | ||
task.1.resume(throwing: CancellationError()) | ||
} | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jakepetroules raised a good point with resuming
while a lock is held. In general, never do any outcalls while a lock is held. FWIW, we have run into exactly this deadlock before.
Now the first part of this code is also not 100% correct. There is subtle race that can happen in this code in this scenario:
- Task is running not cancelled
- You are checking
Task.isCancelled
returnsfalse
- Something is canceling your task
- The task cancellation handler runs and acquires the lock. Tries to remove the continuation but it hasn't been stored yet so resumes nothing
- The code after the
Task.isCancelled
runs and stores the continuation
In this case you now stored a continuation of a cancelled task and the task cancellation handler already ran. What you have to do instead is to model a tri-state for a waiter:
enum Waiter {
case creating
case waiting(Continuation)
case cancelled
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This makes sense, thanks. I've adopted a tri-state and made sure that the continuation always resumes with a cancellation error when appropriate.
@FranzBusch @jakepetroules I've addressed comments and added some tests for |
|
||
deinit { | ||
waitingTasksLock.withLock { | ||
if !waitingTasks.filter({ $0.continuation != nil }).isEmpty { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Isn't it unexpected if there is anything at all in waitingTasks, not just those with a non-nil continuation?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yep, you're right, I've updated to revert this check.
let taskId = ID() | ||
waitingTasksLock.withLock { | ||
waitingTasks.append(.creating(taskId)) | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We could move this into the above lock and return an optional UUID
. This way we don't need to acquire the lock more than once for the initial setup
fileprivate typealias WaitingContinuation = CheckedContinuation<Void, any Error> | ||
|
||
private let concurrentTasks: Int | ||
private var activeTasks: Int = 0 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we need this counter at all or can we just use waitingTasks.count
? Keeping the two in sync can be tricky and I had to triple check that the code does it correctly. That requires us to change the WaitingTask
to be just WorkTask
with a new case but IMO that would make this code a lot clearer.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can, but it that would mean the waitingTasks would no longer be able to be modelled as a queue since the actively running tasks would be contained within it.
I think an OrderedDictionary could work instead, but runs in to the swift-collections in swift-build question.
|
||
// If the task was cancelled in between creating the task cancellation handler and aquiring the lock, | ||
// we should resume the continuation with a `CancellationError`. | ||
if case .cancelled = waitingTasks[index] { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Instead of writing an if case
I would prefer if we exhaustively switch over the returned task here similar to how you did it in the onCancel
case. I always recommend doing that since it makes sure every case is properly handled.
case .waiting: | ||
activeTasks -= 1 | ||
// Begin the next waiting task | ||
return waitingTasks.remove(at: 0).continuation |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since we are operating under FIFO here it is worth using a Deque
instead of an array since this remove(at:)
is going to reallocate the entire array.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was trying to keep in mind this will make its way back in to swift-build, which has no dependency on swift-collections and looks to be trying to keep its dependencies minimal. @jakepetroules is that accurate, or would swift-build accept a swift-collections dependency for this?
try await withCheckedThrowingContinuation { (continuation: WaitingContinuation) -> Void in | ||
let action: TaskAction? = waitingTasksLock.withLock { | ||
guard let index = waitingTasks.firstIndex(where: { $0.id == taskId }) else { | ||
// If the task was cancelled in onCancelled it will have been removed from the waiting tasks list. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This comment is not true right? We are always going to get an index by just looking at the code in onCancel
. Either we find a .cancelled
case or we have run before. There is no way for onCancel
and this check to interleave in the way you describe here. Now, what can interleave is the signalCompletion
that also appears to remove cancelled()
tasks. Can we update this comment?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, I've tightend up this comment to reflect how this guard code would actually get called.
@FranzBusch I've tried to simplify the code as suggested by removing the counter state. It did complicate the code in |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. Just one small nit
Thank you for your tireless reviews @FranzBusch! |
The `RepositoryProvider` protocol's `open` method is defined as synchronous. Convert this to `async` to avoid concurrency workarounds in clients that might be prone to deadlocks. This involved porting `RepositoryManager.lookup` to structured concurrency. The method has some concurrency constraints that are preserved: - There is a limit (`maxConcurrentOperations`) on how many simultaneous lookups can be performed concurrently. - If a lookup is requested and one is in flight for the same `RepositorySpecifier`, the in flight request completes before the new one is started.
The `RepositoryProvider` protocol's `open` method is defined as synchronous. Convert this to `async` to avoid concurrency workarounds in clients that might be prone to deadlocks. This involved porting `RepositoryManager.lookup` to structured concurrency. The method has some concurrency constraints that are preserved: - There is a limit (`maxConcurrentOperations`) on how many simultaneous lookups can be performed concurrently. - If a lookup is requested and one is in flight for the same `RepositorySpecifier`, the in flight request completes before the new one is started.
The
RepositoryProvider
protocol'sopen
method is defined as synchronous. Convert this toasync
to avoid concurrency workarounds in clients that might be prone to deadlocks.This involved porting
RepositoryManager.lookup
to structured concurrency. The method has some concurrency constraints that are preserved:maxConcurrentOperations
) on how many simultaneous lookups can be performed concurrently.RepositorySpecifier
, the in flight request completes before the new one is started.