fix(kms-keyring-node): coalesce concurrent branch key cache misses#1664

Open
yosefbs wants to merge 1 commit into aws:master from yosefbs:fix/hkeyring-coalesce-branch-key-fetches

yosefbs commented Jun 17, 2026

Issue #, if available: #1663

Description of changes:

If a lot of encrypt/decrypt operations run at once against a cold cache for the same branch key, they all miss the cache together and each one queries the keystore on its own — so I get N DynamoDB GetItem + N KMS Decrypt calls instead of one.

getBranchKeyMaterials now shares a single in-flight request per cache entry id: the first caller on a miss starts the fetch and stores the promise, and every other caller for the same key awaits that promise. The in-flight entry is dropped once the promise settles, so the materials cache still owns caching and TTL. A failed request is dropped the same way, so the failure is only seen by the callers already waiting on it and the next call retries.
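For readers who want the shape of the idea without opening the diff, here is a minimal sketch of per-key promise coalescing. It is not the code in this PR: `InFlightRequests`, `run`, and `Fetcher` are hypothetical names chosen for illustration, and the real helper also consults the cryptographic materials cache before reaching this point.

```typescript
// Hypothetical illustration only; names do not come from the keyring source.
type Fetcher<T> = () => Promise<T>

class InFlightRequests<T> {
  // At most one pending promise per cache entry id.
  private readonly pending = new Map<string, Promise<T>>()

  // Number of requests currently in flight (handy for tests).
  get size(): number {
    return this.pending.size
  }

  run(cacheEntryId: string, fetch: Fetcher<T>): Promise<T> {
    // Later callers for the same id join the request that is already running.
    const existing = this.pending.get(cacheEntryId)
    if (existing) return existing

    // First caller on a miss: start the fetch and publish its promise.
    // The entry is removed when the promise settles, whether it fulfilled
    // or rejected, so a failure is never handed to a later caller.
    const request = fetch().finally(() => {
      this.pending.delete(cacheEntryId)
    })
    this.pending.set(cacheEntryId, request)
    return request
  }
}
```

Because the map only holds promises that have not settled yet, it never acts as a second cache: once the fetch completes, the next lookup either hits the materials cache or starts a fresh request.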

This covers encrypt and decrypt, since both go through the same helper. Nothing changes about cache eviction, TTL, or the cache entry id.

I also added a test that fires 3000 concurrent lookups for the same key and checks the keystore is hit once — it fails without the fix (expected 3000 to equal 1) and passes with it.
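The shape of that test can be sketched roughly as follows. This is a standalone illustration rather than the test in the PR: the counting keystore, the `Map` standing in for the materials cache, and the names `makeCountingKeystore`, `makeLookup`, and `countKeystoreHits` are all assumptions made for the example.

```typescript
// Hypothetical stand-ins; none of these names come from the keyring tests.
type Keystore = { getBranchKey(id: string): Promise<string> }

// Fake keystore that counts how many times it is queried.
function makeCountingKeystore() {
  let hits = 0
  return {
    get hits(): number {
      return hits
    },
    async getBranchKey(id: string): Promise<string> {
      hits += 1
      await Promise.resolve() // yield once so the call is genuinely asynchronous
      return `materials-for-${id}`
    },
  }
}

// Builds a lookup function with or without in-flight coalescing.
function makeLookup(keystore: Keystore, coalesce: boolean) {
  const cache = new Map<string, string>() // stand-in for the materials cache
  const pending = new Map<string, Promise<string>>()

  return function lookup(id: string): Promise<string> {
    const cached = cache.get(id)
    if (cached !== undefined) return Promise.resolve(cached)

    const inFlight = coalesce ? pending.get(id) : undefined
    if (inFlight) return inFlight

    const request = keystore
      .getBranchKey(id)
      .then((materials) => {
        cache.set(id, materials) // the cache, not `pending`, keeps the result
        return materials
      })
      .finally(() => pending.delete(id))
    if (coalesce) pending.set(id, request)
    return request
  }
}

// Fires `callers` concurrent lookups for one key and reports keystore hits.
async function countKeystoreHits(coalesce: boolean, callers = 3000): Promise<number> {
  const keystore = makeCountingKeystore()
  const lookup = makeLookup(keystore, coalesce)
  await Promise.all(Array.from({ length: callers }, () => lookup('branch-key-1')))
  return keystore.hits
}
```

In this sketch `countKeystoreHits(false)` resolves to 3000 and `countKeystoreHits(true)` resolves to 1, because every lookup is issued synchronously before the first fetch has had a chance to populate the cache.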

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

Check any applicable:

  • Were any files moved? Moving files changes their URL, which breaks all hyperlinks to the files.

When many encrypt/decrypt operations run concurrently against a cold
cache for the same branch key, each cache miss independently queried the
keystore, firing N DynamoDB GetItem and N KMS Decrypt calls instead of
one. getBranchKeyMaterials now shares a single in-flight request per
cache entry id, evicting it on settle so the cryptographic materials
cache keeps ownership of caching and TTL. A rejected request is evicted
too, so the next call retries rather than sharing the failure.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

yosefbs requested a review from a team as a code owner on June 17, 2026 19:25