Skip to content

TaskGroup needs a better data structure #71212

Open
@za-creature

Description

@za-creature

Motivation

Currently, child tasks are stored in a linked list which scales poorly if tasks complete out of order. In this pathological example, tasks complete in reverse order, which causes quadratic behavior in the runtime and thus discourages the use of structured concurrency:

import Foundation

typealias Continuation = CheckedContinuation<Void, Never>
class NoWarnings: @unchecked Sendable {
    private var mut: UnsafeMutablePointer<pthread_mutex_t>
    init() { mut = .allocate(capacity: 1); pthread_mutex_init(mut, nil) }
    deinit { pthread_mutex_destroy(mut); mut.deallocate() }

    var data = [Continuation]()
    func test(_ limit: Int) async {
        await withTaskGroup(of: Void.self) { group in
            for _ in 0..<limit {
                group.addTask {
                    await withCheckedContinuation { task in
                        pthread_mutex_lock(self.mut)
                        defer { pthread_mutex_unlock(self.mut) }

                        self.data.append(task)
                        guard self.data.count == limit else { return }

                        // last task to run wakes everybody up in reverse order
                        for i in (0..<limit).reversed() {
                            self.data[i].resume()
                        }
                    }
                }
            }
        }
    }
}

func measure<T>(_ label: String, block: @Sendable () async throws -> T) async rethrows {
    func now() -> UInt64 { clock_gettime_nsec_np(CLOCK_MONOTONIC_RAW) }
    let start = now()
    let res = try await block()
    print("\(label): \(res) in \((now() - start) / 1_000_000)ms")
}


await measure("1_000") { await NoWarnings().test(1_000) }
await measure("10_000") { await NoWarnings().test(10_000) }
await measure("25_000") { await NoWarnings().test(25_000) }
await measure("50_000") { await NoWarnings().test(50_000) }
await measure("100_000") { await NoWarnings().test(100_000) }

Proposed solution

A balanced tree should do it, or maybe a hash table?

The balanced tree is presumably the easiest way forward as it's not that different from the a linked list.

Alternatives considered

Currently task group seems to use a lock to prevent mutual access when editing its children.

Presumably, this could be made concurrent with the use of a fixed-size atomic Freelist, but it would add additional constraints to the API (e.g. max number of concurrent tasks), so it would be best implemented as a separate type of TaskGroup. At the same time, practical constraints currently cap the number of tasks to ~10000 so maybe such a change wouldn't break too much existing code, especially if the concurrency is user-configurable.

Additional information

https://forums.swift.org/t/review-requested-asynclock/69682/4

Metadata

Metadata

Assignees

No one assigned

    Labels

    ConcurrencуArea → standard library: The `Concurrency` module under the standard library umbrellaconcurrency runtimeFeature: umbrella label for concurrency runtime featuresfeatureA feature request or implementationperformance

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions