Skip to content

Change default for global vs child coroutines by scoping coroutine builder (structured concurrency) #410

Closed
@elizarov

Description

@elizarov

Background and definitions

Currently coroutine builders like launch { ... } and async { ... } start a global coroutine by default. By global we mean that this coroutine's lifetime is completely standalone just like a lifetime of a daemon thread and outlives the lifetime of the job that had started it. It is terminated only explicitly or on shutdown of the VM, so the invoker had to make extra steps (like invoking join/await/cancel) to ensure that it does live indefinitely.

In order to start a child coroutine a more explicit and lengthly invocation is needed. Something like async(coroutineContext) { ... } and async(coroutineContext) { ... } or async(parent=job) { ... }, etc. Child coroutine is different from a global coroutine in how its lifetime is scoped. The lifetime of child coroutine is strictly subordinate to the lifetime of its parent job (coroutine). A parent job cannot complete until all its children are complete, thus preventing accidental leaks of running children coroutines outside of parent's scope.

Problem

This seems to be a wrong default. Global coroutines are error-prone. They are easy to leak and they do not represent a typical use-case where some kind of parallel decomposition of work is needed. It is easy to miss the requirement of adding an explicit coroutineContext or parent=job parameter to start a child coroutine, introducing subtle and hard to debug problems in the code.

Consider the following code that performs parallel loading of two images and returns a combined result (a typical example of parallel decomposition):

suspend fun loadAndCombineImage(name1: String, name2: String): Image {
    val image1 = async { loadImage(name1) }
    val image2 = async { loadImage(name2) }
    return combineImages(image1.await(), image2.await())
}

This code has a subtle bug in that if loading of the first image fails, then the loading of the second one still proceeds and is not cancelled. Moreover, any error that would occur in the loading of the second image in this case would be lost. Note, that changing async to async(coroutineContext) does not fully solve the problem as it binds async loading of images to the scope of the larger (enclosing) coroutine which is wrong in this case. In this case we want these async operations to be children of loadAndCombineImage operation.

For some additional background reading explaining the problem please see Notes on structured concurrency, or: Go statement considered harmful

Solution

The proposed solution is to deprecate top-level async, launch, and other coroutine builders and redefine them as extension functions on CoroutineScope interface instead. A dedicated top-level GlobalScope instance of CoroutineScope is going to be defined.

Starting a global coroutine would become more explicit and lengthly, like GlobalScope.async { ... } and GlobalScope.launch { ... }, giving an explicit indication to the reader of the code that a global resource was just created and extra care needs to be taken about its potentially unlimited lifetime.

Starting a child coroutine would become less explicit and less verbose. Just using async { ... } or launch { ... } when CoroutineScope is in scope (pun intended) would do it. In particular, it means that the following slide-ware code would not need to use join anymore:

fun main(args: Array<String>) = runBlocking { // this: CoroutineScope 
    val jobs = List(100_000) { 
        launch {
            delay(1000)
            print(".")
        }
    }
    // no need to join here, as all launched coroutines are children of runBlocking automatically
}

For the case of parallel decomposition like loadAndCombineImage we would define a separate builder function to capture and encapsulate the current coroutine scope, so that the following code will work properly in all kind of error condition and will properly cancel the loading of the second image when loading of the first one fails:

suspend fun loadAndCombineImage(name1: String, name2: String): Image = coroutineScope { // this: CoroutineScope 
    val image1 = async { loadImage(name1) }
    val image2 = async { loadImage(name2) }
    combineImages(image1.await(), image2.await())
}

Additional goodies

Another idea behind this design is that it should be easy to turn any entity with life-cycle into an entity that you could start coroutines from. Consider, for example, some kind of an application-specific activity that is launch some coroutines but all of those coroutines should be cancelled when the activity is destroyed (for example). Now it looks like this:

class MyActivity {
    val job = Job() // create a job as a parent for coroutines
    val backgroundContext = ... // somehow inject some context to launch coroutines
    val ctx = backgroundContext + job // actual context to use with coroutines

    fun doSomethingInBackground() = launch(ctx) { ... }
    fun onDestroy() { job.cancel() }    
}

The proposal is to simply this pattern, by allowing an easy implementation of CoroutineScope interface by any business entities like the above activity:

class MyActivity : CoroutineScope {
    val job = Job() // create a job as a parent for coroutines
    val backgroundContext = ... // somehow inject some context to launch coroutines
    override val scopeContext = backgroundContext + job // actual context to use with coroutines

    fun doSomethingInBackground() = launch { ... } // !!! 
    fun onDestroy() { job.cancel() }    
}

Now we don't need to remember to specify the proper context when using launch anywhere in the body of MyActivity class and all launched coroutines will get cancelled when lifecycle of MyActivity terminates.

Metadata

Metadata

Assignees

Type

No type

Projects

No projects

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions