Rate limiting observability (metrics and tracing) PR A#5701
Conversation
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #5701 +/- ##
==========================================
+ Coverage 70.60% 70.65% +0.04%
==========================================
Files 681 682 +1
Lines 68804 68921 +117
==========================================
+ Hits 48579 48696 +117
+ Misses 16675 16669 -6
- Partials 3550 3556 +6 ☔ View full report in Codecov by Harness. 🚀 New features to boost your workflow:
|
|
This PR's value is entirely in service of the follow-up metrics/tracing PRs — but everything those PRs need is already right here, at the one point both enforcement paths funnel through. At the rejection branch the rejected bucket is if rejectedIdx >= 0 {
src := checks[rejectedIdx].rejectedBy
l.telemetry.recordRejection(ctx, src) // meter + span left to the telemetry layer
return &Decision{Allowed: false, RetryAfter: buckets[rejectedIdx].RetryAfter()}, nil
}How the span is handled is open — annotate the ambient request span from Two reasons this seems preferable to the metadata approach:
Since fail-open is out of scope for these signals, the one argument I can see for the current shape is keeping |
|
Thanks, agreed. Keeping pkg/ratelimit telemetry-free was not a deliberate constraint; I was optimizing for a small preparatory PR and ended up introducing public plumbing that has no independent domain value. I’ll rework this PR to remove RejectionSource and both RejectedBy fields, keep the bucket source private in limitCheck, and fold the core metrics into this PR. Decision, Redis-error, and Lua-latency signals will be emitted directly from the concrete limiter where the rejected index, error, timing boundary, and context are already available. For tracing, I’ll annotate the ambient request span in a focused follow-up rather than introduce a dedicated rate-limit span. MCPServer also needs its telemetry middleware moved outside rate limiting so that span is active. Fail-open will remain a separate edge-level change because the raw limiter cannot know whether a caller actually proceeds after an infrastructure error. I’ll update the implementation plan and PR description accordingly. |
733dc33 to
437d289
Compare
Signed-off-by: Sanskarzz <sanskar.gur@gmail.com>
437d289 to
9f421bb
Compare
|
@jerm-dro I've updated this PR with the suggested changes. |
Summary
limiter used by both MCPServer and VirtualMCPServer.
DecisionorRateLimitedError.only for the first bucket rejected by the atomic Redis check.
reference documentation.
Part of #4553
Type of change
Test plan
task test-e2e)API Compatibility
v1beta1API, OR theapi-break-allowedlabel is applied and the migration guidance is described above.Changes
pkg/ratelimit/limiter.gopkg/ratelimit/observability.gopkg/ratelimit/observability_test.gotest/e2e/thv-operator/virtualmcp/virtualmcp_rate_limiting_test.godocs/observability.mdgo.modDoes this introduce a user-facing change?
Yes. Operators with telemetry enabled can observe rate-limit decisions, Redis errors, and Redis Lua check latency. Rate-limit enforcement, responses, Redis keys, retry timing, and fail-open behavior are unchanged.
Implementation plan
Approved implementation plan
first rejected bucket.
endpoint.
Follow-up PRs
This is the first PR in the rate-limit observability work. To keep each review
focused, the remaining changes will be opened sequentially after the preceding
PR merges:
Request-span attributes
Annotate the existing request span rather than creating a second span.
Allowed requests will report
rate_limit.decision=allowed,rate_limit.rejected_by=none, andrate_limit.fail_open=false. Rejectedrequests will add a private rejection identifier to
limitCheckat itspoint of use. MCPServer middleware ordering will move telemetry outside rate
limiting so the ambient request span is active.
Fail-open observability
Add
toolhive_rate_limit_fail_openand set the fail-open request-spanattributes after a Redis error reaches the shared enforcement adapter.
Focused tests will verify one increment per affected request and continued
delegation through both HTTP and vMCP paths.
Each follow-up will contain the tests and documentation as needed.
Special notes for reviewers
The private metadata uses the rejected index already returned by
bucket.ConsumeAll; it does not perform another Redis check or introduceparallel HTTP/vMCP instrumentation.
Decision,RateLimitedError, and the vMCPdecorator public behavior remain unchanged.