Description
I'd like to propose adding a timeout option to the Fetch API.
Prior issue: I know this was in #20, but that discussion was long and winding. One of the blockers was aborting requests, which was a big rabbit hole, but is now solved with AbortController! So I'd like to open this fresh issue to discuss the importance and ergonomics of timeouts.
Please bear with me! I just want a blank slate to argue the case…
Timeouts are critical!
Before talking implementation… it's important to reiterate just how critical timeouts are. Since TCP/HTTP is inherently unpredictable, you can't actually guarantee that any call to fetch will ever resolve or reject. It can very easily just hang forever waiting for a response. You have no guarantee of a response. To reiterate:
Any HTTP request without a timeout might hang your code forever.
That should scare you if you're trying to write stable code, because this applies at all levels of the call stack. If you call into an async function that calls fetch without a timeout, you can still hang forever.
The tried-and-true solution for this uncertainty is timeouts. Pretty much every HTTP-requesting library has a way to use them. As long as you've set a timeout for a response, you've returned to a state of certainty. Your request might fail, sure, but at least you're now guaranteed not to be left in an infinitely hanging state.
That is the critical piece: timeouts eliminate uncertainty.
I think Python's requests documentation sums up the severity nicely (emphasis mine):
You can tell Requests to stop waiting for a response after a given number of seconds with the timeout parameter. Nearly all production code should use this parameter in nearly all requests. Failure to do so can cause your program to hang indefinitely.
It underscores just how widespread the use case is. If your code is striving to be stable and bug-free, then every call to fetch should have an associated timeout.
But to make that goal possible, specifying a timeout needs to be easy. Right now it's not.
Prior concern: In #20, people brought up other types of timeouts, like "idle timeouts", which ensure that at least some data is received for a request every interval of time. These can be useful, but they are definitely not the 90% case. And importantly, they don't actually eliminate the infinite hanging uncertainty. Other types of timeouts can either be implemented in userland or taken up in a separate issue if there's a need.
AbortController is great, but not enough.
The AbortController and AbortSignal features are great! And they seem really well designed for use across different APIs and for low-level use cases. I have no qualms with them at all, and I think they are perfect to use to make a timeout property available.
But I don't think that just adding them has solved "timeouts" for Fetch, because they aren't ergonomic enough to offer a good solution for the common case.
Right now you'd need to do...
const controller = new AbortController()
const timeoutId = setTimeout(() => controller.abort(), 5000)
const res = await fetch('https://example.com', { signal: controller.signal })
const body = await res.json()
clearTimeout(timeoutId)
This is a lot of code.
It's too much code for a use case that is so common. (Remember, this is something that should be done for almost every call to fetch.) Not only is it a lot, it requires learning and mastering a new AbortController concept, just to get a guarantee of certainty for fetch. For most users this is unnecessary.
And it's easy to get wrong too!
Most people want to wait for the entire body to be received (notice how the timeout is cleared after res.json()), but most examples of using AbortController in the wild right now do not properly handle this case, leaving them in an uncertain state.
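One way to close that gap is to keep the body read inside the same try block as the request, so a single timer covers both. This is only a sketch: fetchJsonWithTimeout is a made-up name, and the optional fetchImpl parameter is an assumption added so the helper can be exercised without a network.

```javascript
// Hypothetical helper (not part of any spec): one timer covers both the
// wait for the headers and the read of the body.
async function fetchJsonWithTimeout(url, ms, fetchImpl = globalThis.fetch) {
  const controller = new AbortController()
  const timeoutId = setTimeout(() => controller.abort(), ms)
  try {
    const res = await fetchImpl(url, { signal: controller.signal })
    // Still inside the try block, so the timer is live while the body
    // is being received and parsed.
    return await res.json()
  } finally {
    // Runs on success and on failure, so the timer never outlives the call.
    clearTimeout(timeoutId)
  }
}
```

Even in this form it is roughly ten lines of boilerplate per call site, which is the ergonomic problem being described.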
Not only that, but prior to AbortController (and still now) people used libraries like p-timeout and almost certainly introduced unexpected bugs, because it is common to see recommendations like:
const res = await pTimeout(fetch('https://example.com'), 5000)
const body = await res.json()
That example also has the potential to hang forever!
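To see why, here is a small self-contained sketch. withTimeout is a hand-rolled stand-in for a promise-timeout helper (it is not p-timeout's actual implementation), and stalledBodyFetch is a fake fetch whose headers arrive immediately but whose body never does. Both names are hypothetical.

```javascript
// Hypothetical stand-in for a promise-timeout helper: rejects if `promise`
// has not settled within `ms`. It guards only the promise it is given.
function withTimeout(promise, ms) {
  let timeoutId
  const timer = new Promise((_, reject) => {
    timeoutId = setTimeout(() => reject(new Error('timed out')), ms)
  })
  return Promise.race([promise, timer]).finally(() => clearTimeout(timeoutId))
}

// Fake fetch for illustration: headers arrive at once, the body never does.
const stalledBodyFetch = async () => ({
  ok: true,
  json: () => new Promise(() => {}), // never settles
})

async function demo() {
  const res = await withTimeout(stalledBodyFetch('https://example.com'), 50)
  // The timeout was already satisfied by the headers, so nothing limits
  // how long the next line waits.
  return res.json()
}
```

The timeout wraps the promise for the headers only. Once that promise resolves, the timer is gone, and the body read is unguarded.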
What's the ideal UX?
Most people are currently using either no timeouts, or incorrectly written timeouts that still leave their code in a potentially infinitely hanging state. And these subtle bugs are only getting more common as more and more people switch to using the isomorphic-fetch or node-fetch packages. (And remember, this bubbles up the call stack!)
I think this is something that fetch should solve. And to do that, we really need something as simple as:
fetch('https://example.com', {
  timeout: 5000
})
It needs to be simple, because it needs to be something you can add to any call to fetch and be guaranteed that it will no longer hang forever. Simple enough that it "just works" as expected. Which means that if people are reading the body (and they often are), the timeout should cover you when reading the body too.
Prior concern: In #20, people brought up that because fetch breaks responses down into "headers" and "body" across two promises, it's unclear what a timeout property should apply to. I think this is actually not a problem, and there's a good solution. (Keep reading! It's a solution that is used in Go for their Timeout parameter.)
A potential solution…
For the timeout option to match user expectations, it needs to cover reading the full body whenever the user reads it.
This is just how people think about HTTP responses; it's the 90% use case. They will absolutely expect that timeout encompasses the time to read the entire response they're working with, not just the headers. (This mental model is also why people are incorrectly using libraries like p-timeout to add timeouts right now.)
However! Keep reading before you assume things…
The decision does not have to be the black-and-white "either the timeout only applies to the headers, or it only applies to the body" that was dismissed in #20. It can just as easily apply to either just the headers, or both the headers and the body, in a way that ensures it always meets user expectations and gives them the guarantee of certainty.
This is similar to how Go handles their default Timeout parameter:
Timeout specifies a time limit for requests made by this Client. The timeout includes connection time, any redirects, and reading the response body. The timer remains running after Get, Head, Post, or Do return and will interrupt reading of the Response.Body. (Source: Go net/http documentation.)
This is smart, because it allows timeout to adapt to how the user's code handles the response. If you read the body, the timeout covers it. If you don't, you don't need to worry either.
Here's what that looks like in terms of where errors are thrown…
const res = await fetch('https://example.com', { timeout: 5000 })
// Right now different types of network failures are thrown here.
//
// With a timeout property, timeout failures will also be thrown here
// *if* the headers of the response have not been received before the
// timeout elapses.
const json = await res.json()
// Right now network failures *and* JSON parsing failures are thrown here.
//
// With a timeout property, timeout failures will also be thrown here
// *if* the headers of the response were received, but the body was not
// also received within the timeout.
To summarize what's going on there in English...
The timeout property can be passed into fetch calls, where it specifies a total duration for the read operations on a response. If only the headers are read, the timeout only applies to the headers. If both the headers and body are read, the timeout applies to the headers and body.
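For comparison, these semantics can be roughly approximated in userland. The sketch below assumes a runtime that provides AbortSignal.timeout(), which is not part of this proposal; fetchWithDeadline and its fetchImpl parameter are hypothetical names used only for illustration.

```javascript
// Hypothetical wrapper: `timeout` is a single deadline for the whole
// exchange. Assumes the runtime provides AbortSignal.timeout().
function fetchWithDeadline(url, { timeout, ...init } = {}, fetchImpl = globalThis.fetch) {
  // No timeout requested: behave exactly like the underlying fetch.
  if (timeout === undefined) return fetchImpl(url, init)
  // The signal is attached to the request, not to one promise, so the
  // countdown keeps running after the headers arrive.
  return fetchImpl(url, { ...init, signal: AbortSignal.timeout(timeout) })
}
```

Because the signal belongs to the request rather than to a single promise, a later res.json() or res.text() should be interrupted by the same deadline. Note that this sketch overwrites any signal the caller passed in, which a real implementation would need to handle.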
A real example.
So you can do:
const example = async () => {
  const res = await fetch('https://example.com', { timeout: 10000 })
  const json = await res.json()
  return json
}
Which guarantees you either receive the full JSON body within 10 seconds or get a timeout error.
Or, if you don't care about the body...
const example = async () => {
  const res = await fetch('https://example.com', { timeout: 1000 })
  return res.ok
}
Which guarantees you either receive the headers within 1 second or get a timeout error.
This aligns well with user expectations and use cases. As far as a user is concerned, they can set a timeout property, and be covered from network uncertainty regardless of how much of the response they read. By setting timeout, they are saved from the infinitely hanging bug.
What about $my_edge_case?
To be clear, I'm not saying that this timeout option handles every single need for timeouts under the sun. That's impossible.
There are concepts like "idle timeouts" that are unrelated to the uncertainty of TCP. And there are always going to be advanced use cases where you want a timeout for only the initial connection, or only for reading the headers, etc.
This proposal does not try to address those advanced cases.
It's about improving the ergonomics for the 90% of cases that should have a timeout already today, but likely don't.
The timeout option is… optional. It's opt-in. And when you do opt in to it, it gives you the guarantee of certainty: that your calls to fetch() and res.json() will no longer hang indefinitely, no matter how much of the response you choose to read or not. That's its job. And it does it in a way that matches user expectations for the 90% use case.
Anyone who needs more explicit/complex timeout logic can always use the existing AbortController pattern, which is designed well enough to handle any use case you can throw at it.
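For instance, the "idle timeout" mentioned above could be built on that pattern roughly like this. It is a sketch under assumptions: fetchTextWithIdleTimeout is a hypothetical name, the optional fetchImpl parameter exists only so the helper can be tried without a network, and the response is assumed to expose a readable body stream.

```javascript
// Hypothetical helper: abort if no data arrives for `idleMs` milliseconds.
async function fetchTextWithIdleTimeout(url, idleMs, fetchImpl = globalThis.fetch) {
  const controller = new AbortController()
  let timeoutId
  const resetTimer = () => {
    clearTimeout(timeoutId)
    timeoutId = setTimeout(() => controller.abort(), idleMs)
  }
  try {
    resetTimer() // covers the wait for the headers
    const res = await fetchImpl(url, { signal: controller.signal })
    const reader = res.body.getReader()
    const decoder = new TextDecoder()
    let text = ''
    while (true) {
      resetTimer() // each chunk received buys another idleMs
      const { done, value } = await reader.read()
      if (done) break
      text += decoder.decode(value, { stream: true })
    }
    return text + decoder.decode() // flush any buffered bytes
  } finally {
    clearTimeout(timeoutId)
  }
}
```

Unlike the proposed timeout option, this puts no upper bound on the total duration: a response that trickles in one chunk at a time can keep resetting the timer indefinitely.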
Why not do it in userland?
It's already possible to do in userland. But people aren't doing it correctly because right now the API doesn't have good ergonomics. Most of the popular code samples that show adding timeouts to fetch do it incorrectly, and leave open the infinite hanging failure mode. And that's not even counting the code that just doesn't add timeouts because fetch's current ergonomics don't make it easy.
It's extremely hard to debug these infinite hangs when they happen, because they can happen at any layer of the call stack above where fetch is called. Any async function that has a dependency that calls a buggy fetch can hang indefinitely.
It's critical that people use timeouts. And to make that happen it's critical that it be easy.
Asking people to master entirely new AbortController and AbortSignal concepts for the 99% use case of timing out a request is not a smart thing to do if you're looking to help people write stable code. (Nothing wrong with those two concepts, they just shouldn't be involved in the common case because they are super low-level.)
And the argument that "fetch is a low-level API" also misses the point. People are increasingly using fetch, isomorphic-fetch, node-fetch, etc. in non-low-level places. They are using it as their only HTTP dependency for their apps, because bundle size is important in JavaScript.
And because timeouts are not handled nicely in the spec, those polyfill libraries are incapable of adding timeout options themselves.
In summary…
- Any call to fetch without a timeout is not guaranteed to resolve or reject.
- This has been brought up in many, many issues in fetch polyfills.
- The existing ways to add timeouts to a fetch aren't ergonomic enough.
- This leads to subtle bugs and widespread incorrect example code.
- There's a good solution for a timeout option that meets user expectations.
- The solution doesn't preclude future solutions for other timeout needs. (Or userland.)
- This solution is guaranteed to eliminate network uncertainty; that's the goal.
- It explicitly doesn't try to solve for 100% of timeouts, just for the common 99%.
Thanks for reading!