ateredis: encode Actor/Worker records as binary protobuf (#307) by fkautz · Pull Request #356 · agent-substrate/substrate

Frederick F. Kautz IV (fkautz) · 2026-06-30T08:07:31Z

Summary

Switch the on-the-wire encoding of Actor and Worker records in the Valkey
state store from protojson to binary protobuf. This roughly halves per-record
RAM and is about 4x faster to encode, which extends runway toward the
1-billion-record target. The change is scoped to the resident Actor/Worker
records, as described in #307.

This is a clean cutover with no migration path. Since this is pre-alpha with no
production data, the keyspace is flushed on deploy. protojson and binary
protobuf cannot decode each other, so it is not safe as a rolling update across
the cutover. Deploy by stopping the replicas, flushing the keyspace, then
starting the new version.

Changes

Write path: proto.Marshal for Actor and Worker (Create and Update).
Read path: proto.Unmarshal (Get, Update, Delete, List).
The optimistic-concurrency version check stays in Go and compares the decoded
version field rather than the raw bytes, so the WATCH/MULTI paths are
unaffected and the encoding is never inspected from Redis Lua.
Added key-identity guards to the list paths (fetchActors and ListWorkers).
Unlike protojson, proto.Unmarshal accepts an empty byte slice and returns a
zero-valued message without an error. GetActor and GetWorker already
rejected that case via their identity checks, and the list paths now do too.

Out of scope (intentionally still protojson)

Atespace records, which are low cardinality and use negligible RAM at scale.
The worker-change pub/sub payload, which is ephemeral and never persisted.

Testing

Unit (miniredis): a binary-on-the-wire assertion for both record types, which
catches an accidental revert to protojson, plus rejection of empty, corrupt,
and identity-mismatched values across both the Get and List paths.
Integration (-tags integration): real-cluster round-trip for both types,
an optimistic-concurrency conflict for both types, and a binary-safe lock test
using a token with a NUL byte. These skip unless ATE_TEST_REDIS_ADDR points
at a dedicated, empty cluster, and they refuse to run against a non-empty or
non-cluster server.

Verified locally against a single-node Valkey 9.1.0 cluster:

ATE_TEST_REDIS_ADDR=127.0.0.1:7300 go test -tags integration \
    ./cmd/ateapi/internal/store/ateredis/

All unit and integration tests pass, and go vet and gofmt are clean. There
is no change to store.Interface, and existing store tests pass unmodified.

Closes #307.

…rate#307) Switch the on-the-wire encoding of Actor and Worker records in the Valkey state store from protojson to binary protobuf. This roughly halves per-record RAM and is ~4x faster to encode, extending runway toward the 1B-record target. Clean cutover, no migration path (pre-alpha, no production data): the keyspace is flushed on deploy. - Write path: proto.Marshal for Actor and Worker (Create/Update). - Read path: proto.Unmarshal (Get/Update/Delete/List). - The optimistic-concurrency version check stays in Go and compares the decoded version field, never the raw bytes, so WATCH/MULTI is unaffected and the encoding is never inspected from Redis Lua. - Unlike protojson, proto.Unmarshal accepts an empty byte slice and yields a zero-valued message without error. GetActor/GetWorker already reject that via their key-identity checks; this adds the same guard to the list paths (fetchActors, ListWorkers), which previously relied on protojson failing. - Atespace records and the worker-change pub/sub stay on protojson (low cardinality / ephemeral, not RAM-resident at scale); the cutover is scoped to the resident Actor/Worker records per agent-substrate#307. Tests (miniredis): binary-on-the-wire encoding for both record types (so a silent revert to protojson is caught, since the store API is otherwise encoding-agnostic), and rejection of empty/identity-mismatched values across the Get and List paths. Real-cluster round-trip was verified manually against a single-node Valkey 9.1.0 cluster. Signed-off-by: Frederick F. Kautz IV <fkautz@alumni.cmu.edu>

Julian Gutierrez Oschmann (juli4n) · 2026-07-01T14:57:03Z

+// `actor:<atespace>:<actor-id>`, holding a binary protobuf-serialized Actor
+// record. Binary protobuf roughly halves per-record memory versus protojson and
+// is ~4x faster to encode; the optimistic-concurrency version check stays in Go
+// (comparing the decoded version field, never the raw bytes), so the encoding is


I don't know how relevant this comment is:

"Binary protobuf roughly halves per-record memory versus protojson and
// is ~4x faster to encode; the optimistic-concurrency version check stays in Go
Comment view// (comparing the decoded version field, never the raw bytes), so the encoding is
// never inspected from Redis lua."

Looks more like a point in time commit message than package documentation (e.g. 4x faster than what?).

I would just remove it or turn it into a message that justifies why we use binary proto encoding instead of JSON / textproto.

Julian Gutierrez Oschmann (juli4n) · 2026-07-01T15:00:17Z

+// `worker:<namespace>:<pool-name>:<pod-name>`, holding a binary
+// protobuf-serialized Worker record.
+//
+// Only the resident Actor/Worker records use binary protobuf. Atespace records


I think we should be consistent and use the same encoding across all resources. Over time we will add more resources, and having to think about which ones are in which encoding will add significant cognitive burden.

Not a problem, I'll move the others over too.

Frederick F. Kautz IV (fkautz) force-pushed the issue-307-binary-protobuf branch from 4c2a6ca to 37deaf5 Compare June 30, 2026 15:04

Julian Gutierrez Oschmann (juli4n) reviewed Jul 1, 2026

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

ateredis: encode Actor/Worker records as binary protobuf (#307)#356

ateredis: encode Actor/Worker records as binary protobuf (#307)#356
Frederick F. Kautz IV (fkautz) wants to merge 1 commit into
agent-substrate:mainfrom
fkautz:issue-307-binary-protobuf

Frederick F. Kautz IV (fkautz) commented Jun 30, 2026

Uh oh!

Julian Gutierrez Oschmann (juli4n) Jul 1, 2026

Uh oh!

Julian Gutierrez Oschmann (juli4n) Jul 1, 2026

Uh oh!

Frederick F. Kautz IV (fkautz) Jul 2, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Uh oh!

Conversation

Frederick F. Kautz IV (fkautz) commented Jun 30, 2026

Summary

Changes

Out of scope (intentionally still protojson)

Testing

Uh oh!

Julian Gutierrez Oschmann (juli4n) Jul 1, 2026

Choose a reason for hiding this comment

Uh oh!

Julian Gutierrez Oschmann (juli4n) Jul 1, 2026

Choose a reason for hiding this comment

Uh oh!

Frederick F. Kautz IV (fkautz) Jul 2, 2026

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants