The patched fetch function should not buffer a streamed response
When our patched `fetch` function determines that it should cache a
response, it currently buffers the full response body before returning
a pseudo-cloned `Response` instance.
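For illustration, here is a minimal sketch of that buffering pattern (the `bufferedFetch` name and the `Map`-based cache are hypothetical stand-ins, not the actual `createPatchedFetcher` code): `arrayBuffer()` only resolves once the stream has ended, so the caller cannot start reading until the entire body has arrived.

```ts
// Hypothetical in-memory cache standing in for the real fetch cache.
const cache = new Map<string, ArrayBuffer>()

// Sketch of the old behavior: the caller only receives a Response
// after the entire body has been buffered.
async function bufferedFetch(
  url: string,
  init?: RequestInit
): Promise<Response> {
  const response = await fetch(url, init)
  // arrayBuffer() resolves only once the stream has ended, so a
  // streamed body (e.g. an LLM token stream) is fully buffered here.
  const body = await response.arrayBuffer()
  cache.set(url, body)
  // Return a pseudo-cloned Response built from the buffered body.
  return new Response(body, {
    status: response.status,
    statusText: response.statusText,
    headers: response.headers,
  })
}
```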
This is especially a problem in chat applications, where LLM responses
need to be streamed to the client immediately, without being buffered.
Since those chat requests are usually POST requests, the buffering in
`createPatchedFetcher` did not cause a problem for them, as it was only
applied to GET requests. However, use cases where GET responses are
streamed do exist as well, most prominently RSC requests; those were
already affected by the buffering.
With the introduction of the Server Components HMR cache in #67527
(enabled by default in #67800), the patched `fetch` function started
buffering POST response bodies as well, so that they could be stored in
the HMR cache. This made the buffering behaviour obvious, because
Next.js applications using the AI SDK to stream responses were now
affected; see vercel/ai#2480 for example.
With this PR, we now return the original response immediately, allowing
streaming again, and cache a cloned response in the background.
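A minimal sketch of the new approach, again with hypothetical names rather than the actual implementation: the original response is handed back right away, while a clone is buffered in the background.

```ts
// Hypothetical in-memory cache standing in for the real fetch cache.
const cache = new Map<string, ArrayBuffer>()

// Sketch of the fix: return the original (still streaming) Response
// immediately and buffer a clone in the background.
async function streamingFetch(
  url: string,
  init?: RequestInit
): Promise<Response> {
  const response = await fetch(url, init)
  // clone() tees the underlying body stream, so the clone can be read
  // independently of the Response returned to the caller.
  const clone = response.clone()
  // Deliberately not awaited: caching proceeds in the background while
  // the caller consumes the original stream.
  clone
    .arrayBuffer()
    .then((body) => cache.set(url, body))
    .catch(() => {
      // A failed background read must not affect the caller's stream.
    })
  return response
}
```

Note that `clone()` tees the body stream, so the background read never blocks the caller; chunks the clone pulls ahead of the caller are simply held in memory until the caller reads them.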
As an alternative, I considered not caching POST requests in the Server
Components HMR cache. I dismissed this solution, however, because I
still think that caching those requests is useful when editing server
components. In addition, it would not have addressed the buffering
issue for GET requests.