Instrument both workercache and valkey store with metrics. #347
Open
Julian Gutierrez Oschmann (juli4n) wants to merge 1 commit into
Add metrics to track the caching layer. Emit metrics to count pub/sub `PUBLISH` events (and failures), number of cached workers, amount of time the cache is disconnected, total resyncs (caused by a watch disconnection) and total relists.
- persistence := ateredis.NewPersistence(rdb)
+ persistence, err := ateredis.NewPersistence(rdb)
+ if err != nil {
+ 	mr.Close()
Collaborator
Is there a way to do testing.Defer or something similar so we don't need to remember to do this every time?
Collaborator
Author
That sounds like a good idea. We already use this pattern throughout this file, so I would probably tackle that in a separate PR?
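For reference, the standard library's `t.Cleanup` (available on `testing.TB` since Go 1.14) is the closest thing to the `testing.Defer` asked about above. The following is a minimal sketch of how a helper could register the shutdown once, not the code from this PR: `fakeServer`, `startServer`, and `recordingTB` are hypothetical names, and `fakeServer` only stands in for whatever `mr` is in the test file.

```go
package main

import (
	"fmt"
	"testing"
)

// fakeServer is a hypothetical stand-in for the test server that the diff
// above shuts down by hand with mr.Close().
type fakeServer struct{ closed bool }

func (s *fakeServer) Close() { s.closed = true }

// startServer creates the server and registers Close with t.Cleanup, so the
// shutdown runs when the test finishes, including after t.Fatal.
func startServer(t testing.TB) *fakeServer {
	t.Helper()
	srv := &fakeServer{}
	t.Cleanup(srv.Close)
	return srv
}

// recordingTB exists only so this sketch runs outside `go test`. A real test
// would pass its *testing.T straight to startServer.
type recordingTB struct {
	testing.TB // left nil; only the two methods below are ever called
	cleanups   []func()
}

func (r *recordingTB) Helper()          {}
func (r *recordingTB) Cleanup(f func()) { r.cleanups = append(r.cleanups, f) }

// finish runs the registered cleanups last-in, first-out, which is the order
// the testing package uses.
func (r *recordingTB) finish() {
	for i := len(r.cleanups) - 1; i >= 0; i-- {
		r.cleanups[i]()
	}
}

func main() {
	tb := &recordingTB{}
	srv := startServer(tb)
	fmt.Println("closed during test:", srv.closed)
	tb.finish()
	fmt.Println("closed after test:", srv.closed)
}
```

With a helper along these lines, each early-return branch in a test would no longer need its own `mr.Close()` call.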
}
if err := s.rdb.Publish(ctx, workerPubSubChannel, payload).Err(); err != nil {
	slog.ErrorContext(ctx, "worker event publish failed", slog.Any("err", err))
	s.metricWorkerPubsubMessages.Add(ctx, 1, metric.WithAttributes(attribute.String("error.type", "_OTHER")))
Collaborator
I'm a little confused on this. Is it a typical pattern to increment the same metric with an error attribute, or have a separate metric for failures?
Collaborator
Author
I'm not an OTel expert, but I think this is the recommended way of reporting errors (https://opentelemetry.io/docs/specs/semconv/general/recording-errors/#recording-errors-on-metrics).
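To make the trade-off in this thread concrete, here is a toy, standard-library-only model of a single counter split into series by an `error.type` attribute. It is not the OpenTelemetry SDK, and the `counter` type and its methods are hypothetical. It also assumes that successful publishes increment the same counter without `error.type`, which the excerpt above does not show.

```go
package main

import "fmt"

// counter is a toy stand-in for one metric instrument whose value is split
// into series by the "error.type" attribute.
type counter struct {
	series map[string]int64 // key "" means error.type is not set (success)
}

func newCounter() *counter { return &counter{series: map[string]int64{}} }

// add records n events; errorType is "" for a successful operation.
func (c *counter) add(n int64, errorType string) { c.series[errorType] += n }

// total sums every series: all publish attempts, failed or not.
func (c *counter) total() int64 {
	var sum int64
	for _, v := range c.series {
		sum += v
	}
	return sum
}

// failures counts only the series that carry an error.type value.
func (c *counter) failures() int64 { return c.total() - c.series[""] }

func main() {
	publishes := newCounter()
	publishes.add(1, "")       // successful PUBLISH (assumed to use the same counter)
	publishes.add(1, "")       // another successful PUBLISH
	publishes.add(1, "_OTHER") // failed PUBLISH, tagged as in the diff above
	fmt.Println(publishes.total(), publishes.failures()) // prints "3 1"
}
```

Under that assumption, one metric yields both the attempt count and the failure count, so an error rate can be derived by filtering on the attribute rather than by joining two separate metrics.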