Add opt-in nerve.db retention: message compaction + telemetry pruning #116
Open
alex-fedotyev wants to merge 1 commit into
nerve.db grows unbounded. On a heavily-used install it reached 767MB, 714MB of which is the messages table, dominated by the machine-facing blocks (523MB) and thinking (54MB) JSON. That JSON is only needed while a message is rendered live or being indexed into memU: memU reads content (gated by the per-session last_memorized_at watermark) and SDK resume restores from the .jsonl transcript, not DB blocks. So old, already-memorized messages can drop blocks/thinking while keeping content (the UI falls back to content text when blocks is NULL).

This adds an opt-in retention subsystem, disabled by default:

- MaintenanceStore mixin. compact_messages nulls blocks/thinking for messages older than retention_full_days that are past their session's memorize watermark, in a non-starred, non-active session; content is kept. prune_telemetry and prune_file_snapshots delete append-only rows older than retention_days. checkpoint truncates the WAL; vacuum rewrites the file to reclaim freed pages.
- RetentionConfig (enabled=False, retention_days=90, retention_full_days=30, interval_hours=24).
- A background lifespan task that no-ops unless enabled.
- nerve db prune [--dry-run] and nerve db vacuum CLI commands.

Shipping disabled by default so a merge mutates no existing data. The operator opts in, previews with --dry-run, then runs prune + vacuum to reclaim the file. No schema change.
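As a rough illustration of the "no-ops unless enabled" behaviour described above, the config and background task could look something like the sketch below. The field names and defaults are the ones listed in this description; the `retention_loop` function, its `run_once` hook, and the run-then-sleep ordering are assumptions made for the example, not the code in this PR.

```python
import asyncio
from dataclasses import dataclass
from typing import Awaitable, Callable


@dataclass(frozen=True)
class RetentionConfig:
    # Field names and defaults as listed in the PR description.
    enabled: bool = False
    retention_days: int = 90
    retention_full_days: int = 30
    interval_hours: int = 24


async def retention_loop(
    cfg: RetentionConfig,
    run_once: Callable[[RetentionConfig], Awaitable[None]],  # hypothetical hook
    sleep: Callable[[float], Awaitable[None]] = asyncio.sleep,
) -> None:
    """Call `run_once` every `interval_hours`; return at once when disabled."""
    if not cfg.enabled:
        return  # disabled by default, so no data is touched
    while True:
        await run_once(cfg)  # assumed order: run first, then wait
        await sleep(cfg.interval_hours * 3600)
```

Passing `sleep` as a parameter is only there so the loop can be driven by a fake clock in tests; a real lifespan task would more likely be cancelled on shutdown.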
Summary
`nerve.db` grows without bound. On a heavily-used install it reached 767MB, of which the `messages` table is 714MB, dominated by the machine-facing `blocks` JSON (523MB) and `thinking` (54MB). That JSON is only needed while a message is rendered live or being indexed into memU. memU extraction reads `content` (gated by the per-session `last_memorized_at` watermark), and SDK resume restores context from the `.jsonl` transcript rather than DB blocks, so old already-memorized messages can drop `blocks`/`thinking` while keeping `content` (the UI falls back to `content` text when `blocks` is NULL).

This adds an opt-in retention subsystem, disabled by default so merging it mutates no existing data:

- `MaintenanceStore` DB mixin:
  - `compact_messages` nulls `blocks`/`thinking` for messages older than `retention_full_days` that are past their session's memorize watermark, in a non-starred, non-active session. Keeps `content`. Idempotent.
  - `prune_telemetry` / `prune_file_snapshots` delete append-only rows older than `retention_days`.
  - `checkpoint` truncates the WAL; `vacuum` rewrites the file to reclaim freed pages.
- `RetentionConfig` (`enabled=False`, `retention_days=90`, `retention_full_days=30`, `interval_hours=24`).
- `nerve db prune [--dry-run]` and `nerve db vacuum` commands.
- `docs/config.md`.

No schema change.
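The eligibility rules for compaction can be sketched against a toy schema. This is a minimal sketch, not the PR's implementation: the description only names the `messages` table, its `content`/`blocks`/`thinking` columns, and a per-session `last_memorized_at`; the `sessions` table name, the `created_at`, `starred`, and `active` columns, and the ISO-8601 text timestamps are assumptions for illustration.

```python
import sqlite3
from datetime import datetime, timedelta

# Assumed schema: `sessions`, `created_at`, `starred` and `active` are
# hypothetical stand-ins; only the names listed in the PR text are confirmed.
SCHEMA = """
CREATE TABLE sessions (
    id TEXT PRIMARY KEY,
    starred INTEGER NOT NULL DEFAULT 0,
    active INTEGER NOT NULL DEFAULT 0,
    last_memorized_at TEXT  -- ISO-8601 text; NULL means never memorized
);
CREATE TABLE messages (
    id INTEGER PRIMARY KEY,
    session_id TEXT NOT NULL REFERENCES sessions(id),
    created_at TEXT NOT NULL,  -- ISO-8601 text
    content TEXT,
    blocks TEXT,
    thinking TEXT
);
"""

# Rows that still carry JSON and satisfy every eligibility rule.
ELIGIBLE_IDS = """
SELECT m.id
FROM messages AS m
JOIN sessions AS s ON s.id = m.session_id
WHERE m.created_at < :cutoff                  -- older than retention_full_days
  AND s.last_memorized_at IS NOT NULL         -- session has been memorized
  AND m.created_at <= s.last_memorized_at     -- message is past the watermark
  AND s.starred = 0
  AND s.active = 0
  AND (m.blocks IS NOT NULL OR m.thinking IS NOT NULL)
"""


def compact_messages(conn, now, retention_full_days=30, dry_run=False):
    """Null blocks/thinking on eligible rows; return how many rows qualify."""
    cutoff = (now - timedelta(days=retention_full_days)).isoformat()
    ids = [row[0] for row in conn.execute(ELIGIBLE_IDS, {"cutoff": cutoff})]
    if ids and not dry_run:
        conn.executemany(
            "UPDATE messages SET blocks = NULL, thinking = NULL WHERE id = ?",
            [(i,) for i in ids],
        )
        conn.commit()
    return len(ids)
```

Because the query skips rows whose `blocks` and `thinking` are already NULL, a second run finds nothing to do, which is one way to get the idempotency the description calls for. `content` is never part of the `UPDATE`.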
Reclaim model
Nulling columns and deleting rows frees pages to the SQLite freelist (reused by later writes) but does not shrink the file. `PRAGMA wal_checkpoint(TRUNCATE)` truncates only the WAL. Only `VACUUM` shrinks the file, so it is an explicit operator step (`nerve db vacuum`), never on the background loop.

Operator reclaim sequence

1. Set `retention.enabled: true` (optionally tune the windows) in the local config; restart to pick it up.
2. Run `nerve db prune --dry-run` to preview.
3. [...] `nerve.db`, then `nerve db prune` followed by `nerve db vacuum` (daemon stopped) to shrink the file.
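The reclaim model can be observed directly with Python's `sqlite3` module on a scratch database. The sketch below is illustrative only: the one-column `messages` table and the `reclaim_demo` helper are made up for the example and say nothing about how `nerve db prune` or `nerve db vacuum` are implemented.

```python
import os
import sqlite3
import tempfile


def reclaim_demo():
    """Return file sizes observed after each step on a scratch WAL database."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "demo.db")
        # Autocommit mode, so VACUUM never runs inside an open transaction.
        conn = sqlite3.connect(path, isolation_level=None)
        conn.execute("PRAGMA auto_vacuum = NONE")
        conn.execute("PRAGMA journal_mode = WAL").fetchall()
        conn.execute("CREATE TABLE messages (id INTEGER PRIMARY KEY, blocks TEXT)")
        conn.execute("BEGIN")
        conn.executemany(
            "INSERT INTO messages (blocks) VALUES (?)",
            [("x" * 4096,) for _ in range(500)],
        )
        conn.execute("COMMIT")
        conn.execute("PRAGMA wal_checkpoint(TRUNCATE)").fetchall()
        sizes = {"full": os.path.getsize(path)}

        # Nulling the column frees pages to the freelist; the file keeps its size.
        conn.execute("UPDATE messages SET blocks = NULL")
        conn.execute("PRAGMA wal_checkpoint(TRUNCATE)").fetchall()
        sizes["after_null"] = os.path.getsize(path)
        sizes["wal_after_checkpoint"] = os.path.getsize(path + "-wal")
        sizes["freelist_pages"] = conn.execute("PRAGMA freelist_count").fetchone()[0]

        # Only VACUUM rewrites the file without the free pages.
        conn.execute("VACUUM")
        conn.execute("PRAGMA wal_checkpoint(TRUNCATE)").fetchall()
        sizes["after_vacuum"] = os.path.getsize(path)
        conn.close()
        return sizes
```

On a typical build, `after_null` stays at the `full` size with a non-zero `freelist_pages`, the WAL is empty after the checkpoint, and only `after_vacuum` is smaller.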
Test plan
- `tests/test_db_retention.py` (15 tests): compaction eligibility (old starred, active, never-memorized, message-newer-than-watermark); `content` always kept; idempotency; dry-run mutates nothing; telemetry/snapshot pruning deletes old and keeps new while leaving core tables untouched; combined `run_retention`.
- [...] failures unrelated to this change (docker-mode detection under a container env var; codex tests that need untracked local fixtures).
- `nerve db prune --dry-run`, `nerve db prune`, and `nerve db vacuum` exercised end-to-end against a throwaway database.
- [...] >= 1, defaults to disabled, and raises no unknown-key warnings.
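For readers who want a feel for the pruning cases listed above (dry-run mutates nothing; old rows deleted, new rows and core tables kept), here is a minimal, self-contained sketch. It is not taken from `tests/test_db_retention.py`: the `telemetry` table, its `created_at` column, and this `prune_telemetry` signature are assumptions for illustration.

```python
import sqlite3
from datetime import datetime, timedelta


def prune_telemetry(conn, now, retention_days=90, dry_run=False):
    """Delete telemetry rows older than the window; return the affected count."""
    cutoff = (now - timedelta(days=retention_days)).isoformat()
    where = "FROM telemetry WHERE created_at < ?"  # hypothetical table/column
    count = conn.execute("SELECT COUNT(*) " + where, (cutoff,)).fetchone()[0]
    if not dry_run:
        conn.execute("DELETE " + where, (cutoff,))
        conn.commit()
    return count


def test_prune_keeps_new_rows_and_core_tables():
    now = datetime(2025, 1, 1)
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        "CREATE TABLE telemetry (id INTEGER PRIMARY KEY, created_at TEXT);"
        "CREATE TABLE messages (id INTEGER PRIMARY KEY, content TEXT);"
    )
    old = (now - timedelta(days=120)).isoformat()
    new = (now - timedelta(days=1)).isoformat()
    conn.executemany(
        "INSERT INTO telemetry (created_at) VALUES (?)", [(old,), (old,), (new,)]
    )
    conn.execute("INSERT INTO messages (content) VALUES ('hello')")
    conn.commit()

    def counts():
        return tuple(
            conn.execute(f"SELECT COUNT(*) FROM {table}").fetchone()[0]
            for table in ("telemetry", "messages")
        )

    assert prune_telemetry(conn, now, dry_run=True) == 2
    assert counts() == (3, 1)  # dry run reports the count but mutates nothing
    assert prune_telemetry(conn, now) == 2
    assert counts() == (1, 1)  # old rows gone; new row and messages untouched
```

The function is written to run under pytest or to be called directly.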