Document that JSON::ResumableParser does not bound its buffer size #1023
|
Well, your example shows an incomplete scalar, not a document. But it's actually worse with a document, because already parsed hashes and arrays will use more RAM than their JSON representation. I don't mind adding a note, but I'll reword it I think. |
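To make the point about parsed values outweighing their JSON text concrete, here is a rough sketch, assuming CRuby (where `ObjectSpace.memsize_of` reports shallow object sizes). The helper `approx_parsed_size` is a hypothetical name used only for illustration, and the figure it returns is an estimate, not an exact accounting.

```ruby
require "json"
require "objspace"

# Hypothetical helper: shallow size of the array plus the shallow size of
# each element. This is only an approximation of the real footprint.
def approx_parsed_size(array)
  ObjectSpace.memsize_of(array) + array.sum { |el| ObjectSpace.memsize_of(el) }
end

count = 10_000
json = "[" + Array.new(count, "{}").join(",") + "]"
parsed = JSON.parse(json)

json_bytes = json.bytesize
parsed_bytes = approx_parsed_size(parsed)

puts "JSON text: #{json_bytes} bytes, parsed objects: ~#{parsed_bytes} bytes"
```

Each `{}` costs about three bytes of input including its comma, while every parsed Hash occupies at least one heap slot plus a pointer in the enclosing Array, so a byte limit on the input alone understates the memory a document can pin.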
I don't think so because of the reason given above:
require "json"
parser = JSON::ResumableParser.new
parser << '[' # start an array, never close it
parser << "{}," * 5_000_000 # the 5M empty hashes are buffered
parser.parse # => false (nothing consumed or released)
parser.partial_value # => (keeps growing; OOM if you keep feeding)
So I don't think there's a way around trusting the input or sanitizing it yourself. But it's kinda the same with |
|
Ah, Anyway, I'm not familiar with the background that motivated the streaming parser, so if it isn't a problem for the use case, I don't feel strongly about adding limit support either. That's good enough for me if we can make sure that it is the caller's responsibility. TBH, I doubt this will be a real problem in practice. But I'm fairly sure some security researcher will report it unless it is documented (or even after it is :-) |
Right, that could work. Even though it would be a bit annoying to track, and that arguably you'd need the same for
It's definitely intended for trusted payload. This was started for
Oh yes... BTW feel free to add some statement that security issues involving the JSON gem should be sent directly here though: https://github.com/ruby/json/security/advisories/new |
An incomplete document is buffered in full with no size limit, so reading from an untrusted source can grow memory without bound. Note in the rdoc that bounding the input is the caller's responsibility.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
bcf1819 to a7e173a
|
#1029 adds |
JSON::ResumableParser buffers the input until a document completes, so careless use can lead to a DoS. This PR documents in the rdoc that bounding the input is the caller's responsibility.
Maybe an option like ResumableParser.new(buffer_size_limit: 1024) would also be worth having?
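For illustration, caller-side bounding could look roughly like the sketch below. `BoundedFeeder`, `LimitExceeded`, `fed_bytes`, `reset` and the `limit:` keyword are all hypothetical names, not part of the json gem. The wrapper only assumes that the wrapped object accepts chunks via `<<`; a plain String stands in for the parser here so the example runs on its own.

```ruby
# Hypothetical caller-side guard; NOT part of the json gem.
class BoundedFeeder
  class LimitExceeded < StandardError; end

  attr_reader :fed_bytes

  # sink:  any object that accepts chunks via #<< (e.g. a parser instance)
  # limit: maximum number of bytes accepted for one document
  def initialize(sink, limit:)
    @sink = sink
    @limit = limit
    @fed_bytes = 0
  end

  # Forward the chunk only if it keeps the running total within the limit.
  def <<(chunk)
    if @fed_bytes + chunk.bytesize > @limit
      raise LimitExceeded, "document exceeds #{@limit} bytes"
    end
    @fed_bytes += chunk.bytesize
    @sink << chunk
    self
  end

  # Call once a complete document has been consumed, so that the next
  # document starts counting from zero.
  def reset
    @fed_bytes = 0
  end
end

sink = String.new # stand-in for the parser (assumption for this sketch)
feeder = BoundedFeeder.new(sink, limit: 16)
feeder << "[" << "{},{},"
puts feeder.fed_bytes
```

The counter has to be reset by the caller whenever a document completes, which is the tracking burden mentioned in the thread. Counting input bytes also only caps the buffered text; the parsed values built from it can be larger.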