Skip to content

fix(WebRequest): CWE-190/DoS fix and boundary-parsing refactor#447

Open
mathieucarbou wants to merge 3 commits into
mainfrom
CWE-190
Open

fix(WebRequest): CWE-190/DoS fix and boundary-parsing refactor#447
mathieucarbou wants to merge 3 commits into
mainfrom
CWE-190

Conversation

@mathieucarbou

Copy link
Copy Markdown
Member

Tighten multipart boundary parsing in _parseReqHeader():

  • Replace String::charAt() inner loop with a raw C-string pointer and pre-computed length for faster, allocation-free scanning.
  • Add an early-exit upper bound on the scan loop so that positions where 'boundary=' cannot possibly fit are skipped entirely.
  • Replace the three-pass quoted-string extraction (find close-quote, substring, then unescape) with a single-pass approach that writes directly into _boundary using a std::string_view window into the original header value — no intermediate heap allocations.
  • Enforce the RFC 2046 §5.1 70-character limit on boundary length in both token and quoted-string paths; abort with PARSE_REQ_FAIL on violation to prevent CWE-190 integer-overflow / DoS.
  • Replace strncmp length guard with a position-based early break that is clearer and avoids the redundant cast.

Ref: #445

Copilot AI left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR hardens and refactors multipart boundary= parsing in AsyncWebServerRequest::_parseReqHeader() to reduce allocation overhead during header scanning and to enforce RFC 2046’s 70-character boundary limit as a mitigation for CWE-190/DoS scenarios.

Changes:

  • Refactors the boundary= parameter scan to use raw C-string access and earlier exit conditions.
  • Reworks boundary value extraction to a single-pass unescape/write path (token or quoted-string) and enforces the 70-char limit early.
  • Introduces std::string_view-based slicing to avoid intermediate String::substring() allocations during parsing.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread src/WebRequest.cpp Outdated
Comment thread src/WebRequest.cpp Outdated
@mathieucarbou

Copy link
Copy Markdown
Member Author

@willmmiles : fyi we cannot use string_view if we want to keep supporting arduino 2 and libretiny.

Copilot AI left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 1 out of 1 changed files in this pull request and generated 1 comment.

Comment thread src/WebRequest.cpp

Copilot AI left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 1 out of 1 changed files in this pull request and generated no new comments.

@mathieucarbou

Copy link
Copy Markdown
Member Author

@me-no-dev @willmmiles : can you guys review / approve this pr do that we can merge it ? It is sitting there since a while now. Thanks !

Comment thread src/WebRequest.cpp
Comment thread src/WebRequest.cpp
Comment thread src/WebRequest.cpp Outdated
Comment thread src/WebRequest.cpp Outdated
Comment on lines +517 to +521
// If there is not enough room left for "boundary=" plus at least
// one value byte, no later ';' can match either — stop scanning.
if (j + (int)T_BOUNDARY_LEN + 1 > llen) {
break;
}

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We shouldn't have to check this here - the test at the top of the loop should be enough. The mistake is that lscan needs to deduct the semicolon too.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The code is valid and this check is necessary.
lscan already accounts for semicolon.

I have doubled checked this one with GLM 5.2 to be sure and have a more in depth explanation of why your proposal would fail.

the loop would stop one iteration early and miss valid boundaries like ;boundary=X at the very end of the string.

image

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're right - this function does require a duplicate check as it is currently arranged. I'm sorry for the confusion - I really couldn't put my finger on what it was that was bugging me with the logic here, and I didn't do a very good job of articulating that unease.

I did a more thorough re-review with Claude and it was very helpful for clarifying my intuition. The deeper issue IMO is that the function tries to mix token analysis in a byte loop. A clearer formulation is to make the outer loop a loop over parameters instead, with "skip logic" over the quoted values inside.

The cleaner version is here: willmmiles@65e128e

It's substantially shorter and clearer IMO. (It's also dropped the toLowerCase call in favor of strncasecmp which should be much faster and avoids the extra allocation.)

@mathieucarbou mathieucarbou Jun 23, 2026

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! I will have a look!

@mathieucarbou mathieucarbou Jun 24, 2026

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@willmmiles I cherry picked your commit and verified it. I added a commit on top of it to fix an issue in your commit that re-introduced the CVE, and updated the MultiPart.ino example. The issue was an non-handled quoted empty boundary parameter. All the test cases in MultiPart.ino are working fine now.

@mathieucarbou

Copy link
Copy Markdown
Member Author

@willmmiles : I have addressed your comments. Could you please have a look ? Thanks!

mathieucarbou added a commit that referenced this pull request Jun 24, 2026
willmmiles' parameter-loop refactor (65e128e) removed the final
_boundary length safety net that existed after both extraction
branches, relying on per-branch checks.  However, the per-branch
checks do not cover the empty quoted-string case: boundary=""
closes immediately with _boundary still empty and falls through to
_isMultipart = true without rejection.

Downstream in EXPECT_BOUNDARY, an empty _boundary would cause
--\r\n to be accepted as a valid boundary delimiter, reintroducing
the CWE-190 / DoS condition this PR is meant to fix.

Restore the final safety check (length == 0 || length > 70) with a
comment explaining the empty-quoted-string gap that motivates it.

Ref: #447 (comment)
mathieucarbou and others added 3 commits June 24, 2026 09:34
Tighten multipart boundary parsing in _parseReqHeader():

- Replace String::charAt() inner loop with a raw C-string pointer and
  pre-computed length for faster, allocation-free scanning.
- Add an early-exit upper bound on the scan loop so that positions
  where 'boundary=' cannot possibly fit are skipped entirely.
- Replace the three-pass quoted-string extraction (find close-quote,
  substring, then unescape) with a single-pass approach that writes
  directly into _boundary using a raw pointer+length pair — no
  intermediate heap allocations and no C++17 dependency.
- Enforce the RFC 2046 §5.1 70-character limit on boundary length in
  both token and quoted-string paths; abort with PARSE_REQ_FAIL on
  violation to prevent CWE-190 integer-overflow / DoS.
- Replace strncmp length guard with a position-based early break that
  is clearer and avoids the redundant cast.
- Fix ESP8266 build: String(const char*, size_t) is unavailable; use
  a 71-byte stack buffer + String(const char*) instead (#ifdef ESP8266).
- Fix LibreTiny/C++14 build: remove std::string_view (C++17 only);
  replaced with plain const char* + size_t throughout.

Ref: #445
Use a token-based loop instead of a byte loop as the primary parser for
multipart options.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
willmmiles' parameter-loop refactor (65e128e) removed the final
_boundary length safety net that existed after both extraction
branches, relying on per-branch checks.  However, the per-branch
checks do not cover the empty quoted-string case: boundary=""
closes immediately with _boundary still empty and falls through to
_isMultipart = true without rejection.

Downstream in EXPECT_BOUNDARY, an empty _boundary would cause
--\r\n to be accepted as a valid boundary delimiter, reintroducing
the CWE-190 / DoS condition this PR is meant to fix.

Restore the final safety check (length == 0 || length > 70) with a
comment explaining the empty-quoted-string gap that motivates it.

Ref: #447 (comment)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants