This repository has been archived by the owner on Apr 26, 2024. It is now read-only.
Add documentation for cancellation of request processing
Signed-off-by: Sean Quah <seanq@matrix.org>
Sean Quah committed May 17, 2022 (1 parent 6d8d121, commit c09f777)
Showing 3 changed files with 389 additions and 0 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1 @@ | ||
Add documentation for cancellation of request processing. |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,387 @@ | ||
# Cancellation | ||
Sometimes, requests take a long time to service and clients disconnect | ||
before Synapse produces a response. To avoid wasting resources, Synapse | ||
can cancel request processing for select endpoints marked with the | ||
`@cancellable` decorator. | ||
|
||
Synapse makes use of Twisted's `Deferred.cancel()` feature to make | ||
cancellation work. | ||
|
||
## Enabling cancellation for an endpoint | ||
1. Check that the endpoint method, and any `async` functions in its call | ||
tree, handle cancellation correctly. See | ||
[Handling cancellation correctly](#handling-cancellation-correctly) | ||
for a list of things to look out for. | ||
2. Apply the `@cancellable` decorator to the `on_GET/POST/PUT/DELETE` | ||
method. It's not recommended to make non-`GET` methods cancellable, | ||
since cancellation midway through some database updates is less | ||
likely to be handled correctly. | ||
|
||
## Mechanics | ||
There are two stages to cancellation: downward propagation of a | ||
`cancel()` call, followed by upwards propagation of a `CancelledError` | ||
out of a blocked `await`. | ||
Both Twisted and asyncio have a cancellation mechanism. | ||
|
||
| | Method | Exception | Exception inherits from | | ||
|---------------|---------------------|-----------------------------------------|-------------------------| | ||
| Twisted | `Deferred.cancel()` | `twisted.internet.defer.CancelledError` | `Exception` (!) | | ||
| asyncio | `Task.cancel()` | `asyncio.CancelledError` | `BaseException` | | ||
|
||
### Deferred.cancel() | ||
When Synapse starts handling a request, it runs the async method | ||
responsible for handling it using `defer.ensureDeferred`, which returns | ||
a `Deferred`. | ||
|
||
```python | ||
def do_something() -> Deferred[None]: | ||
    ... | ||
|
||
async def on_GET() -> Tuple[int, JsonDict]: | ||
    d = make_deferred_yieldable(do_something()) | ||
    await d | ||
    return 200, {} | ||
|
||
request = defer.ensureDeferred(on_GET()) | ||
``` | ||
|
||
During cancellation, `Deferred.cancel()` is called on the `Deferred` | ||
from `defer.ensureDeferred`, `request`. Twisted knows which `Deferred` | ||
`request` is waiting on and passes the `cancel()` call on to `d`. | ||
|
||
The `Deferred` being waited on, `d`, may have its own handling for | ||
`cancel()` and pass the call on to other `Deferred`s. | ||
|
||
Eventually, a `Deferred` handles the `cancel()` call by resolving itself | ||
with a `CancelledError`. | ||
|
||
### CancelledError | ||
The `CancelledError` gets raised out of the `await` and bubbles up, as | ||
per normal Python exception handling. | ||
|
||
## Handling cancellation correctly | ||
In general, when writing code that might be subject to cancellation, two | ||
things must be considered: | ||
* The effect of `CancelledError`s raised out of `await`s. | ||
* The effect of `Deferred`s being `cancel()`ed. | ||
|
||
Examples of code that handles cancellation incorrectly include: | ||
* `try-except` blocks which swallow `CancelledError`s. | ||
* Code that shares the same `Deferred`, which may be cancelled, between | ||
multiple requests. | ||
* Code that starts some processing that's exempt from cancellation, but | ||
uses a logging context from cancellable code. The logging context | ||
will be finished upon cancellation, while the uncancelled processing | ||
is still using it. | ||
|
||
Some common patterns are listed below in more detail. | ||
|
||
### `async` function calls | ||
Most functions in Synapse are relatively straightforward from a | ||
cancellation standpoint: they don't do anything with `Deferred`s and | ||
purely call and `await` other `async` functions. | ||
|
||
An `async` function handles cancellation correctly if its own code | ||
handles cancellation correctly and all the `async` functions it calls | ||
handle cancellation correctly. For example: | ||
```python | ||
async def do_two_things() -> None: | ||
    check_something() | ||
    await do_something() | ||
    await do_something_else() | ||
``` | ||
`do_two_things` handles cancellation correctly if `do_something` and | ||
`do_something_else` handle cancellation correctly. | ||
|
||
That is, when checking whether a function handles cancellation | ||
correctly, its implementation and all its `async` function calls need to | ||
be checked, recursively. | ||
|
||
As `check_something` is not `async`, it does not need to be checked. | ||
|
||
### CancelledErrors | ||
Because Twisted's `CancelledError`s are `Exception`s, it's easy to | ||
accidentally catch and suppress them. Care must be taken to ensure that | ||
`CancelledError`s are allowed to propagate upwards. | ||
|
||
<table width="100%"> | ||
<tr> | ||
<td width="50%" valign="top"> | ||
|
||
**Bad**: | ||
```python | ||
try: | ||
    await do_something() | ||
except Exception: | ||
    # `CancelledError` gets swallowed here. | ||
    logger.info(...) | ||
``` | ||
</td> | ||
<td width="50%" valign="top"> | ||
|
||
**Good**: | ||
```python | ||
try: | ||
    await do_something() | ||
except CancelledError: | ||
    raise | ||
except Exception: | ||
    logger.info(...) | ||
``` | ||
</td> | ||
</tr> | ||
<tr> | ||
<td width="50%" valign="top"> | ||
|
||
**OK**: | ||
```python | ||
try: | ||
    check_something() | ||
    # A `CancelledError` won't ever be raised here. | ||
except Exception: | ||
    logger.info(...) | ||
``` | ||
</td> | ||
<td width="50%" valign="top"> | ||
|
||
**Good**: | ||
```python | ||
try: | ||
    await do_something() | ||
except ValueError: | ||
    logger.info(...) | ||
``` | ||
</td> | ||
</tr> | ||
</table> | ||
|
||
#### defer.gatherResults | ||
`defer.gatherResults` produces a `Deferred` which: | ||
* broadcasts `cancel()` calls to every `Deferred` being waited on. | ||
* wraps the first exception it sees in a `FirstError`. | ||
|
||
Together, this means that `CancelledError`s will be wrapped in | ||
a `FirstError` unless unwrapped. Such `FirstError`s are liable to be | ||
swallowed, so they must be unwrapped. | ||
|
||
<table width="100%"> | ||
<tr> | ||
<td width="50%" valign="top"> | ||
|
||
**Bad**: | ||
```python | ||
async def do_something() -> None: | ||
    await make_deferred_yieldable( | ||
        defer.gatherResults([...], consumeErrors=True) | ||
    ) | ||
|
||
try: | ||
    await do_something() | ||
except CancelledError: | ||
    raise | ||
except Exception: | ||
    # `FirstError(CancelledError)` gets swallowed here. | ||
    logger.info(...) | ||
``` | ||
|
||
</td> | ||
<td width="50%" valign="top"> | ||
|
||
**Good**: | ||
```python | ||
async def do_something() -> None: | ||
    await make_deferred_yieldable( | ||
        defer.gatherResults([...], consumeErrors=True) | ||
    ).addErrback(unwrapFirstError) | ||
|
||
try: | ||
    await do_something() | ||
except CancelledError: | ||
    raise | ||
except Exception: | ||
    logger.info(...) | ||
``` | ||
</td> | ||
</tr> | ||
</table> | ||
|
||
### Creation of `Deferred`s | ||
If a function creates a `Deferred`, the effect of cancelling it must be considered. `Deferred`s that get shared are likely to have unintended behaviour when cancelled. | ||
|
||
<table width="100%"> | ||
<tr> | ||
<td width="50%" valign="top"> | ||
|
||
**Bad**: | ||
```python | ||
cache: Dict[str, Deferred[None]] = {} | ||
|
||
def wait_for_room(room_id: str) -> Deferred[None]: | ||
    deferred = cache.get(room_id) | ||
    if deferred is None: | ||
        deferred = Deferred() | ||
        cache[room_id] = deferred | ||
    # `deferred` can have multiple waiters. | ||
    # All of them will observe a `CancelledError` | ||
    # if any one of them is cancelled. | ||
    return make_deferred_yieldable(deferred) | ||
|
||
# Request 1 | ||
await wait_for_room("!aAAaaAaaaAAAaAaAA:matrix.org") | ||
# Request 2 | ||
await wait_for_room("!aAAaaAaaaAAAaAaAA:matrix.org") | ||
``` | ||
</td> | ||
<td width="50%" valign="top"> | ||
|
||
**Good**: | ||
```python | ||
cache: Dict[str, Deferred[None]] = {} | ||
|
||
def wait_for_room(room_id: str) -> Deferred[None]: | ||
    deferred = cache.get(room_id) | ||
    if deferred is None: | ||
        deferred = Deferred() | ||
        cache[room_id] = deferred | ||
    # `deferred` will never be cancelled now. | ||
    # A `CancelledError` will still come out of | ||
    # the `await`. | ||
    # `delay_cancellation` may also be used. | ||
    return make_deferred_yieldable(stop_cancellation(deferred)) | ||
|
||
# Request 1 | ||
await wait_for_room("!aAAaaAaaaAAAaAaAA:matrix.org") | ||
# Request 2 | ||
await wait_for_room("!aAAaaAaaaAAAaAaAA:matrix.org") | ||
``` | ||
</td> | ||
</tr> | ||
<tr> | ||
<td width="50%" valign="top"> | ||
</td> | ||
<td width="50%" valign="top"> | ||
|
||
**Good**: | ||
```python | ||
cache: Dict[str, List[Deferred[None]]] = {} | ||
|
||
def wait_for_room(room_id: str) -> Deferred[None]: | ||
    if room_id not in cache: | ||
        cache[room_id] = [] | ||
    # Each request gets its own `Deferred` to wait on. | ||
    deferred = Deferred() | ||
    cache[room_id].append(deferred) | ||
    return make_deferred_yieldable(deferred) | ||
|
||
# Request 1 | ||
await wait_for_room("!aAAaaAaaaAAAaAaAA:matrix.org") | ||
# Request 2 | ||
await wait_for_room("!aAAaaAaaaAAAaAaAA:matrix.org") | ||
``` | ||
</td> | ||
</tr> | ||
</table> | ||
|
||
### Uncancelled processing | ||
Some `async` functions may kick off some `async` processing which is | ||
intentionally protected from cancellation, by `stop_cancellation` or | ||
other means. If the `async` processing inherits the logcontext of the | ||
request which initiated it, care must be taken to ensure that the | ||
logcontext is not finished before the `async` processing completes. | ||
|
||
<table width="100%"> | ||
<tr> | ||
<td width="50%" valign="top"> | ||
|
||
**Bad**: | ||
```python | ||
cache: Optional[ObservableDeferred[None]] = None | ||
|
||
async def do_something_else( | ||
    to_resolve: Deferred[None] | ||
) -> None: | ||
    await ... | ||
    logger.info("done!") | ||
    to_resolve.callback(None) | ||
|
||
async def do_something() -> None: | ||
    global cache | ||
    if not cache: | ||
        to_resolve = Deferred() | ||
        cache = ObservableDeferred(to_resolve) | ||
        # `do_something_else` will never be cancelled and | ||
        # can outlive the `request-1` logging context. | ||
        run_in_background(do_something_else, to_resolve) | ||
|
||
    await make_deferred_yieldable(cache.observe()) | ||
|
||
with LoggingContext("request-1"): | ||
    await do_something() | ||
``` | ||
</td> | ||
<td width="50%" valign="top"> | ||
|
||
**Good**: | ||
```python | ||
cache: Optional[ObservableDeferred[None]] = None | ||
|
||
async def do_something_else( | ||
    to_resolve: Deferred[None] | ||
) -> None: | ||
    await ... | ||
    logger.info("done!") | ||
    to_resolve.callback(None) | ||
|
||
async def do_something() -> None: | ||
    global cache | ||
    if not cache: | ||
        to_resolve = Deferred() | ||
        cache = ObservableDeferred(to_resolve) | ||
        run_in_background(do_something_else, to_resolve) | ||
        # We'll wait until `do_something_else` is | ||
        # done before raising a `CancelledError`. | ||
        await make_deferred_yieldable( | ||
            delay_cancellation(cache.observe()) | ||
        ) | ||
    else: | ||
        await make_deferred_yieldable(cache.observe()) | ||
|
||
with LoggingContext("request-1"): | ||
    await do_something() | ||
``` | ||
</td> | ||
</tr> | ||
<tr> | ||
<td width="50%"> | ||
|
||
**OK**: | ||
```python | ||
cache: Optional[ObservableDeferred[None]] = None | ||
|
||
async def do_something_else( | ||
    to_resolve: Deferred[None] | ||
) -> None: | ||
    await ... | ||
    logger.info("done!") | ||
    to_resolve.callback(None) | ||
|
||
async def do_something() -> None: | ||
    global cache | ||
    if not cache: | ||
        to_resolve = Deferred() | ||
        cache = ObservableDeferred(to_resolve) | ||
        # `do_something_else` will get its own independent | ||
        # logging context. `request-1` will not count any | ||
        # metrics from `do_something_else`. | ||
        run_as_background_process( | ||
            "do_something_else", | ||
            do_something_else, | ||
            to_resolve, | ||
        ) | ||
|
||
    await make_deferred_yieldable(cache.observe()) | ||
|
||
with LoggingContext("request-1"): | ||
    await do_something() | ||
``` | ||
</td> | ||
<td width="50%"> | ||
</td> | ||
</tr> | ||
</table> |