Add Valkey operations handbook #357

Open
Grant McCloskey (MushuEE) wants to merge 1 commit into agent-substrate:main from MushuEE:design/valkey-operations-eval
@MushuEE (Collaborator)

Operational reference for the Valkey persistence tier and the in-process worker cache that sits in front of it. Written for operators, on-call engineers, and contributors who need to reason about the storage tier without first reading the code. It will also serve as a reference for further investigation into whether Valkey will really stand up.

NOTE: We chose Valkey for its speed, which the critical scheduling requirements demand. Now that WorkerCache is in memory, we get that speed for scheduling and still rely on eventual consistency. Do we now want more durability? We also still need to answer: what data can we absolutely NOT lose in these failure situations?
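
To make the trade-off in the note concrete, here is a minimal sketch of an in-process read cache in front of a slower persistence tier. It is an illustration only, not the handbook's or the project's implementation: `BackingStore` is a plain in-memory stand-in (not a Valkey client), and `WorkerCacheSketch`, its TTL refresh policy, and the key names are all assumptions made for this example.

```python
import time
from typing import Callable, Dict, Optional, Tuple


class BackingStore:
    """Stand-in for the persistence tier. NOT a real Valkey client."""

    def __init__(self) -> None:
        self._data: Dict[str, str] = {}

    def get(self, key: str) -> Optional[str]:
        return self._data.get(key)

    def set(self, key: str, value: str) -> None:
        self._data[key] = value


class WorkerCacheSketch:
    """Hypothetical in-process read cache; name and TTL policy are assumptions."""

    def __init__(
        self,
        store: BackingStore,
        ttl_seconds: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._store = store
        self._ttl = ttl_seconds
        self._clock = clock
        # key -> (cached value, time it was fetched from the store)
        self._entries: Dict[str, Tuple[Optional[str], float]] = {}

    def get(self, key: str) -> Optional[str]:
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and now - entry[1] < self._ttl:
            # Fast path: served from memory, possibly stale.
            return entry[0]
        # Slow path: entry missing or expired, refresh from the store.
        value = self._store.get(key)
        self._entries[key] = (value, now)
        return value
```

In this sketch a write to the store becomes visible to readers only after the cached entry expires, which is one simple way an in-memory cache ends up eventually consistent with the tier behind it. Whether the real WorkerCache refreshes by TTL, by invalidation, or by some other mechanism is not stated here.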

README.md — entry point, scope, conventions
topology.md — what's deployed, sizing math, configuration knobs
lifecycle.md — actor state machine, four workflows, worker cache
and eligibility model
operations.md — failure modes with inline recovery, common admin
operations, open risks

#12

  • Tests pass
  • Appropriate changes to documentation are included in the PR
