fix(fetch): avoid buffering streamed uploads in memory #5455
Codecov Report
✅ All modified and coverable lines are covered by tests.
@@           Coverage Diff           @@
##             main    #5455   +/-   ##
=======================================
  Coverage   93.46%   93.47%
=======================================
  Files         110      110
  Lines       37106    37122    +16
=======================================
+ Hits        34682    34698    +16
  Misses       2424     2424
force-pushed from e6235db to c235156
fetch clones the request so it can follow redirects, and cloning tees the body's stream: the original keeps one branch and the wire sends the other. A stream body has a null source and can never be replayed across a redirect (http-redirect-fetch returns a network error for a non-303 redirect with a null source, and otherwise re-extracts from the source, not from the teed branch), so the branch kept on the original request is never read. The tee still buffers every chunk for it, so a streamed upload ends up fully held in memory. Skip the tee for null-source bodies in the internal redirect clone and reuse the stream directly. Request.clone() and Response.clone() still tee, so their two branches stay independently readable. Fixes nodejs#4058
force-pushed from c235156 to 44f9885
mcollina
left a comment
LGTM. I checked this against the Fetch Standard: the HTTP-network-or-cache fetch note explicitly encourages avoiding teeing when the request body source is null, and redirects/auth retries for such bodies either fail or do not need the retained tee branch. Request.clone()/Response.clone() behavior remains unchanged.
KhafraDev
left a comment
If that is what the spec says, this is not the way to implement it
You're right. The note is on http-network-or-cache fetch, not on the clone algorithm, and that placement is the point: the shortcut is only safe for that one clone, so it has no business on cloneRequest/cloneBody. I put it in the wrong layer. #5456 special-cases it at the single call site and leaves the clone primitives alone, which is clearly the right shape. Closing this in favor of #5456.
This relates to...
Fixes #4058.
Rationale
Streaming a large upload with fetch (e.g. body: fs.createReadStream(...), duplex: 'half') ends up holding the whole file in memory.
fetch clones the request so it can follow a redirect, and cloning tees the body's stream: the original request keeps one branch and the wire sends the other. A stream body has a null source, so it can't be replayed across a redirect anyway. http-redirect-fetch returns a network error for a non-303 redirect with a null source, and on any other redirect the body is re-extracted from its source rather than from the teed branch. So the branch kept on the original request is never read, but the tee still buffers every chunk for it, and those chunks add up to the entire upload.
The clone happens on every fetch with a body, not only when a redirect actually occurs, so this affected all streamed uploads.
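The buffering described above can be seen with plain web streams, independent of undici. The following is a minimal sketch, assuming Node.js 18+ (global ReadableStream); the function name and the 1 KiB chunk size are made up for illustration. One tee branch is drained the way the network layer would drain it, and only afterwards is the other branch read. The source is already exhausted at that point, so every chunk read from the second branch must have been held in its queue.

```javascript
// Illustration only: counts the chunks tee() keeps for a branch nobody reads.
// countChunksHeldForIdleBranch is a hypothetical name, not an undici API.
async function countChunksHeldForIdleBranch (totalChunks) {
  let sent = 0
  const source = new ReadableStream({
    pull (controller) {
      if (sent < totalChunks) {
        controller.enqueue(new Uint8Array(1024)) // 1 KiB stand-in for upload data
        sent++
      } else {
        controller.close()
      }
    }
  })

  // "wire" plays the branch sent over the network, "kept" the branch that
  // stays on the original request.
  const [wire, kept] = source.tee()

  // Drain the wire branch completely.
  const wireReader = wire.getReader()
  let wireChunks = 0
  while (!(await wireReader.read()).done) wireChunks++

  // The source is exhausted now, so anything read here was queued by the tee.
  const keptReader = kept.getReader()
  let keptChunks = 0
  while (!(await keptReader.read()).done) keptChunks++

  return { wireChunks, keptChunks }
}
```

Calling countChunksHeldForIdleBranch(64) resolves to 64 chunks on each branch: the idle branch held all of them until it was finally read. In fetch that branch is never read, so the queue only grows.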
Changes
Skip the tee for null-source bodies in the internal redirect clone and reuse the stream directly.
cloneBody takes a forFetch flag that only the clone in http-network-or-cache fetch sets. Request.clone() and Response.clone() don't pass it, so they still tee and keep two independently readable branches. Bodies with a non-null source are unchanged.
Features
N/A
Bug Fixes
A streamed request body is no longer buffered in memory.
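The shape of the change listed under Changes can be sketched roughly as follows. This is a simplified illustration, not undici's actual cloneBody: the { stream, source, length } record, the cloneBodySketch and streamBody names, and the options-object form of the flag are assumptions; only the forFetch name and the null-source condition come from this description. It assumes Node.js 18+ for the global ReadableStream and TextEncoder.

```javascript
// Hypothetical, simplified body record: { stream, source, length }.
// Not undici's real implementation; forFetch is the flag named in this PR.
function cloneBodySketch (body, { forFetch = false } = {}) {
  if (forFetch && body.source === null) {
    // A null-source body cannot be replayed on a redirect, so a retained
    // branch would never be read: hand over the same stream, no tee.
    return { stream: body.stream, source: null, length: body.length }
  }

  // Default path, as for Request.clone()/Response.clone(): tee so that both
  // copies stay independently readable.
  const [kept, cloned] = body.stream.tee()
  body.stream = kept
  return { stream: cloned, source: body.source, length: body.length }
}

// Helper that builds a one-chunk stream body with a null source.
function streamBody () {
  const stream = new ReadableStream({
    start (controller) {
      controller.enqueue(new TextEncoder().encode('chunk'))
      controller.close()
    }
  })
  return { stream, source: null, length: null }
}

// Internal fetch clone: the stream is reused, nothing is buffered on the side.
const uploaded = streamBody()
const forWire = cloneBodySketch(uploaded, { forFetch: true })

// Public clone: two distinct tee branches.
const original = streamBody()
const copy = cloneBodySketch(original)
```

Here forWire.stream is the very same object as uploaded.stream, while copy.stream and original.stream are two separate tee branches.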
Breaking Changes and Deprecations
N/A
Status
test/fetch/issue-4058.js: a 128 MiB streamed PUT peaks at a few MiB with the fix and ~128 MiB without it; the clone and redirect suites pass; lint clean.
Notes
cloneBody/cloneRequest are also touched by #5415 (lazy Request/Response internals), so a textual conflict is expected and whichever lands second will need a small rebase. This is an independent change: it only affects the redirect-following clone of a null-source body, not the lazy-init refactor.