Document that JSON::ResumableParser does not bound its buffer size #1023

Merged: byroot merged 1 commit into ruby:master from mame:document-resumable-buffer-limit on Jun 18, 2026

mame commented Jun 18, 2026

Member

JSON::ResumableParser buffers the input until a document completes, so careless use can lead to a DoS.

require "json"
parser = JSON::ResumableParser.new
parser << '"'                  # start a string, never close it
parser << "a" * 5_000_000      # the whole unterminated string is buffered
parser.parse                   # => false (nothing consumed or released)
parser.rest.bytesize           # => 5000001 (keeps growing; OOM if you keep feeding)

This PR documents in the rdoc that bounding the input is the caller's responsibility.

Maybe an option like ResumableParser.new(buffer_size_limit: 1024) would also be worth having?

byroot commented Jun 18, 2026

Member

Well, your example shows an incomplete scalar, not a document.

But it's actually worse with a document, because already-parsed hashes and arrays use more RAM than their JSON representation. I don't mind adding a note, but I think I'll reword it.
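A rough way to see the point about parsed values costing more memory than their JSON text, sketched with plain `JSON.parse` and `ObjectSpace.memsize_of`. The `retained_bytes` helper is a hypothetical name introduced here for illustration; its numbers are approximate, CRuby-specific, and ignore allocator overhead and shared strings.

```ruby
require "json"
require "objspace"

# Hypothetical helper: approximate the bytes retained by a parsed JSON value
# by walking arrays and hashes and summing ObjectSpace.memsize_of per object.
def retained_bytes(value)
  total = ObjectSpace.memsize_of(value)
  case value
  when Array
    value.each { |element| total += retained_bytes(element) }
  when Hash
    value.each { |key, val| total += retained_bytes(key) + retained_bytes(val) }
  end
  total
end

# A complete array of 100,000 empty objects, similar in shape to the payload
# discussed in this thread.
json = "[" + (["{}"] * 100_000).join(",") + "]"
tree = JSON.parse(json)

puts "JSON text:   #{json.bytesize} bytes"
puts "Parsed tree: #{retained_bytes(tree)} bytes (approximate)"
```

On CRuby each empty Hash occupies at least one heap slot, while its JSON form `{},` is three bytes, so the parsed tree should come out several times larger than the text.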

byroot commented Jun 18, 2026

Member

> Maybe an option like ResumableParser.new(buffer_size_limit: 1024) would also be worth having?

I don't think so because of the reason given above:

require "json"
parser = JSON::ResumableParser.new
parser << '['                  # start an array, never close it
parser << "{}," * 5_000_000    # the 5M empty hashes are buffered
parser.parse                   # => false (nothing consumed or released)
parser.partial_value           # => (keeps growing; OOM if you keep feeding)

So I don't think there's a way around either trusting the input or sanitizing it yourself. But it's kinda the same with JSON.parse.

mame commented Jun 18, 2026

Member Author

Ah, buffer_size_limit was a bad name. What I had in mind was the number of bytes since the start of the JSON fragment (including bytes that have already been consumed and shrunk away). So parser << ("{}," * 5_000_000) would raise an exception at that point.
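A minimal sketch of that byte-budget idea, done on the caller's side rather than inside the parser. `ByteBudgetFeeder`, `BudgetExceeded`, `fed_bytes`, and `reset` are hypothetical names and not part of the json gem; the only assumption about the wrapped parser is that it accepts chunks via `<<`, as in the examples in this thread. A plain String stands in for the parser so the sketch runs on its own.

```ruby
# Hypothetical wrapper, not part of the json gem: counts every byte fed since
# the current document started and raises once a caller-chosen budget is hit.
class ByteBudgetFeeder
  class BudgetExceeded < StandardError; end

  attr_reader :fed_bytes

  # parser: any object that accepts chunks via #<< (assumed interface).
  # limit:  maximum number of bytes allowed per document.
  def initialize(parser, limit:)
    @parser = parser
    @limit = limit
    @fed_bytes = 0
  end

  # Check the budget before forwarding, so an oversized chunk never reaches
  # the wrapped parser.
  def <<(chunk)
    if @fed_bytes + chunk.bytesize > @limit
      raise BudgetExceeded, "document exceeds #{@limit} bytes"
    end
    @fed_bytes += chunk.bytesize
    @parser << chunk
    self
  end

  # Call after taking a complete document, so the next one starts from zero.
  def reset
    @fed_bytes = 0
  end
end

# A mutable String stands in for the real parser here.
feeder = ByteBudgetFeeder.new(+"", limit: 1024)
feeder << "[" << "{}," * 100          # 301 bytes, within budget

begin
  feeder << "{}," * 1_000             # 3,000 more bytes, over budget
rescue ByteBudgetFeeder::BudgetExceeded => e
  puts "rejected: #{e.message}"
end
```

Because the count covers bytes already consumed by the parser as well as bytes still buffered, it also bounds the array-of-empty-hashes case. The caller has to decide when a document is complete and call `reset`; this sketch does not detect document boundaries itself.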

Anyway, I'm not familiar with the background that motivated the streaming parser, so if it isn't a problem for the use case, I don't feel strongly about adding limit support either. That's good enough for me if we can make sure that it is the caller's responsibility.

TBH, I doubt this will be a real problem in practice. But I'm fairly sure some security researcher will report it unless it is documented (or even after it is :-)

byroot commented Jun 18, 2026

Member

> What I had in mind was the number of bytes since the start of the JSON fragment (including bytes that have already been consumed and shrunk away)

Right, that could work, even though it would be a bit annoying to track, and arguably you'd need the same for JSON.parse.

> I'm not familiar with the background that motivated the streaming parser,

It's definitely intended for trusted payloads. It was started for fluentd (logging, essentially), but it was also suggested that it could be useful for LLM APIs, as they tend to drip-feed JSON.

> But I'm fairly sure some security researcher will report it unless it is documented (or even after it is :-)

Oh yes... BTW feel free to add some statement that security issues involving the JSON gem should be sent directly here though: https://github.com/ruby/json/security/advisories/new

An incomplete document is buffered in full with no size limit, so reading from an
untrusted source can grow memory without bound. Note in the rdoc that bounding the
input is the caller's responsibility.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

byroot force-pushed the document-resumable-buffer-limit branch from bcf1819 to a7e173a on June 18, 2026 19:35
byroot merged commit 3650e2b into ruby:master on Jun 18, 2026
42 checks passed
byroot commented Jun 19, 2026

Member

#1029 adds ResumableParser#parsed_bytes, which makes it easier to stop parsing after a certain number of bytes.
