Skip to content

Migrate JWT/JWK auto-refresh from RPCSessions to curl multi singleton#7989

Draft
Copilot wants to merge 11 commits into
copilot/jwk-refresh-curl-multi-planfrom
copilot/copilotjwk-refresh-curl-multi-plan
Draft

Migrate JWT/JWK auto-refresh from RPCSessions to curl multi singleton#7989
Copilot wants to merge 11 commits into
copilot/jwk-refresh-curl-multi-planfrom
copilot/copilotjwk-refresh-curl-multi-plan

Conversation

Copilot AI commented Jun 25, 2026

Copy link
Copy Markdown
Contributor

Migrates JWT/JWK auto-refresh outbound HTTP fetches away from the legacy RPCSessions::create_client() path and onto ccf::curl::CurlRequest attached to CurlmLibuvContextSingleton, following the curl multi singleton direction introduced in #7102.

Core changes

  • src/node/jwt_key_auto_refresh.h

    • Fetches both OpenID metadata and JWKS via curl multi singleton requests.
    • Removes the RPCSessions dependency from JwtKeyAutoRefresh.
    • Bounds metadata/JWKS response bodies to 1 MB and sets connect/transfer timeouts.
    • Restricts curl protocols to HTTPS.
    • Verifies peer and hostname using the configured governance CA bundle.
    • Clears libcurl's default CA path so validation is pinned to the configured CA bundle rather than also trusting system roots.
    • Reports curl connection/TLS failures through send_refresh_jwt_keys_error(), so refresh failure metrics include network-level fetch failures.
    • Uses weak ownership plus a stopped flag for scheduled tasks and curl callbacks so late async responses are ignored during shutdown instead of extending auto-refresh lifetime.
  • src/node/node_state.h

    • Constructs JwtKeyAutoRefresh without passing rpcsessions.

Metadata and JWKS hardening

  • Parses optional OpenID metadata issuer constraints inside the metadata parse error path, so malformed/non-string values are recorded as refresh failures rather than escaping the task.
  • Applies metadata issuer constraints to JWKS keys before constructing the internal SetJwtPublicSigningKeys request.
  • Validates jwks_uri before handing it to curl:
    • scheme must be HTTPS
    • host and port must match the configured issuer authority, preventing metadata-controlled cross-authority JWKS fetches

Tests

Adds/updates JWT auto-refresh coverage for:

  • curl connection failure metrics
  • TLS trust failure metrics
  • malformed OpenID metadata issuer values
  • cross-authority jwks_uri rejection
  • refreshed key issuer-constraint propagation

Validated locally:

  • scripts/cpp-format-checks.sh
  • scripts/python-format-checks.sh
  • scripts/python-lint-checks.sh
  • cd build && ninja jwt_auth_test js_generic curl_test logging
  • cd build && ./tests.sh -VV -R 'jwt_auth_test|programmability_and_jwt'
  • cd build && ninja

Copilot AI and others added 3 commits June 25, 2026 20:50
Replace rpcsessions->create_client() HTTP fetches in JwtKeyAutoRefresh
with ccf::curl::CurlRequest attached to CurlmLibuvContextSingleton,
following the pattern established in #7102 for QuoteEndorsementsClient.

Changes:
- jwt_key_auto_refresh.h: inherit enable_shared_from_this, add
  send_curl_get() helper with CURLOPT_CAINFO_BLOB CA bundle support,
  migrate both metadata and JWKS fetches to curl with task-deferred
  response callbacks
- node_state.h: remove rpcsessions parameter from JwtKeyAutoRefresh
  constructor call
- CHANGELOG.md: add [Unreleased] entry for this change

Co-authored-by: achamayou <4016369+achamayou@users.noreply.github.com>
Co-authored-by: achamayou <4016369+achamayou@users.noreply.github.com>
Co-authored-by: achamayou <4016369+achamayou@users.noreply.github.com>
Copilot AI changed the title [WIP] Migrate JWT/JWK auto-refresh to curl multi singleton client Migrate JWT/JWK auto-refresh from RPCSessions to curl multi singleton Jun 25, 2026
Copilot AI requested a review from achamayou June 25, 2026 20:54
The JWT issuer test server creates TLS certs using generate_cert which
hashes the CN (e.g. sha256("localhost")), so libcurl's hostname
verification fails when connecting to the server because neither the CN
nor any SAN matches the actual hostname.

Add an optional `san` parameter to `generate_cert` that, when provided,
adds a DNS Subject Alternative Name extension to the certificate. Update
JwtIssuer._generate_auth_data to pass the hostname as the SAN so that
libcurl can successfully verify the server certificate.

Co-authored-by: achamayou <4016369+achamayou@users.noreply.github.com>
@achamayou

Copy link
Copy Markdown
Member

@copilot look at the test failures to improve your change.

…etrics

Co-authored-by: achamayou <4016369+achamayou@users.noreply.github.com>

Copilot AI commented Jun 26, 2026

Copy link
Copy Markdown
Contributor Author

@copilot look at the test failures to improve your change.

Fixed three CI failures in commit Fix CI failures: clang-format, CHANGELOG version, JWT test baseline metrics:

  1. C/C++ format: Applied clang-format to src/node/jwt_key_auto_refresh.h (long lambda capture lists were not wrapped per style rules).

  2. Release notes: Moved the changelog entry from ## [Unreleased] into the existing ## [7.0.6] section under a new ### Changed heading, matching the version in python/pyproject.toml.

  3. programmability_and_jwt test failure: The new code calls send_refresh_jwt_keys_error() for curl-level connection failures, which is the observability improvement described in the PR. However, when the test runs a second time after a primary failover (line 794 in jwt_test.py), the new primary fires the auto-refresh while the issuer server is momentarily down — generating a failure count before the test's server starts. The fix captures a baseline_m snapshot right before start_openid_server and changes the assertions to delta comparisons (m["failures"] == baseline_m["failures"]) so pre-existing connection failures from between test invocations are excluded.

achamayou and others added 5 commits June 26, 2026 09:18
Use HTTPS-only curl requests with bounded timeouts for JWT/JWK auto-refresh, and add coverage for connection and TLS refresh failures.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Normalize the parsed jwks_uri scheme before enforcing HTTPS-only refreshes.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@achamayou achamayou added the run-long-test Run Long Test job label Jun 26, 2026
@achamayou achamayou requested a review from cjen1-msft June 26, 2026 19:51
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

run-long-test Run Long Test job

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants