[Harbor 4/4] architecture docs, tutorial, and the GAIA example #6
varunursekar wants to merge 3 commits into
- `docs/harbor/architecture.md`: what the integration is, the compiled-task topology, the two evaluation modes, the component map, and the leaderboard-integrity model.
- `docs/harbor/tutorial.md`: build and run an optimization task end to end (both modes, the agent-side protocol), and a Harbor section in the README.
- `examples/gaia-optimization`: a Mode-B example optimizing a `GaiaAgent` (a thin `Terminus2` subclass with an editable prompt) on `gaia/gaia` via a nested `harbor run` on Modal.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
This lets anyone optimize a coding agent with plain `harbor run`, and makes the result
leaderboard-gradeable: the optimizer cannot read hidden labels, modify the scorer, or
bypass its budget.
This reads as a hard guarantee ('the optimizer cannot read hidden labels, modify the scorer, or bypass its budget'), but the code makes each best-effort and the shipped GAIA example undercuts the first one (see the build.yaml comment). Suggest softening to something like: 'vero never writes per-sample labels to the agent's volume and meters every agent evaluation; OS-level mechanisms (read-only paths, a root:600 finalize token) keep the scorer and test split out of the agent's reach on a best-effort basis.'
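To make "meters every agent evaluation" concrete, here is a minimal sketch of what a budget ledger could look like. It is illustrative only: `BudgetLedger`, `BudgetExceeded`, and the per-sample accounting are assumptions for this comment, not vero's actual classes or units.

```python
from dataclasses import dataclass, field


class BudgetExceeded(Exception):
    """Raised when an evaluation would exceed the remaining budget."""


@dataclass
class BudgetLedger:
    # Hypothetical ledger: names and units are illustrative, not vero's API.
    total: int
    spent: int = 0
    history: list = field(default_factory=list)

    def charge(self, split: str, n_samples: int) -> int:
        """Record one evaluation and return the remaining budget."""
        if n_samples <= 0:
            raise ValueError("n_samples must be positive")
        if self.spent + n_samples > self.total:
            # Refuse before running anything, so a rejected call costs nothing.
            raise BudgetExceeded(
                f"need {n_samples}, only {self.total - self.spent} left"
            )
        self.spent += n_samples
        self.history.append((split, n_samples))
        return self.total - self.spent
```

The point for the docs is that a ledger like this only meters calls that go through it. It says nothing about what the agent can do outside that path, which is why "best-effort" is the more accurate wording.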
splits:
- { split: train, access: non_viewable } # optimizer sees aggregate scores only
- { split: validation, access: no_access } # hidden; never reaches the optimizer
'hidden; never reaches the optimizer' isn't true for this config: `build.yaml` is git-tracked and `agent_repo` is `.`, so `vero harbor build` seeds this whole file, including these validation task IDs, into `/work/agent` via `git archive HEAD`. The optimizer can read the held-out task IDs, and GAIA answers are public. Move the partition out of the `agent_repo` subtree, or caveat that for public benchmarks the held-out identity is visible (only per-sample scores are withheld). This is the example that backs the headline 'cannot read hidden labels' claim, so it is worth getting airtight.
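A pre-build check along these lines might catch this class of leak. This is a sketch under assumptions: `find_leaked_ids` is a hypothetical helper, not part of vero, and `seeded_files` stands in for whatever ends up in the agent's repo (path mapped to file text). The task IDs in the usage example are made up.

```python
def find_leaked_ids(
    seeded_files: dict[str, str], held_out_ids: list[str]
) -> dict[str, list[str]]:
    """Map each held-out ID to the seeded file paths whose text contains it.

    Hypothetical helper: `seeded_files` is {path: text} for every file that
    would be copied into the agent's repo.
    """
    leaks: dict[str, list[str]] = {}
    for task_id in held_out_ids:
        paths = sorted(
            path for path, text in seeded_files.items() if task_id in text
        )
        if paths:
            leaks[task_id] = paths
    return leaks


# Usage with made-up IDs: the build config itself mentions a held-out task.
seeded = {
    "build.yaml": "splits:\n  validation: [task-val-001]\n",
    "agent.py": "class GaiaAgent: ...\n",
}
print(find_leaked_ids(seeded, ["task-val-001", "task-val-002"]))
```

A plain substring match is deliberately crude. It will miss IDs that are encoded or split across lines, so an empty result is not proof that nothing leaks.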
- [`docs/harbor/architecture.md`](docs/harbor/architecture.md): what it is, the topology, and the leaderboard-integrity model.
- [`docs/harbor/tutorial.md`](docs/harbor/tutorial.md): build and run a task end to end.
- [`examples/gsm8k-agent`](examples/gsm8k-agent) (Mode A) and [`examples/gaia-optimization`](examples/gaia-optimization) (Mode B).
`examples/gsm8k-agent` is cited as the Mode A example but it has no `build.yaml` (it's the older Policy-API example). The Harbor Mode A example that ships a `build.yaml` is `examples/doubler-agent`. Repoint here, or add a `build.yaml` to `gsm8k-agent`.
The optimizer is untrusted. Integrity rests on a few mechanisms, all best-effort at
the OS/process level (a container escape is out of scope):

- **3-tier split visibility** (`SplitAccessLevel`): `visible` (aggregate + per-sample
Worth one explicit line here: `tier_for_split` defaults any split not listed to `viewable` (full per-sample results), so omission fails open. Tell authors to list every split explicitly. (Pairs with the `protocol.py` fail-open comment on #4.)
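The difference between the two defaults is small in code but large in effect. The sketch below is an assumption-laden illustration, not the real implementation: `Access` stands in for `SplitAccessLevel` using the tier names seen in this PR, and `tier_fail_open`, `tier_fail_closed`, and `unlisted_splits` are hypothetical names.

```python
from enum import Enum


class Access(Enum):
    # Stand-in for SplitAccessLevel; values mirror the names used in this PR.
    VIEWABLE = "viewable"
    NON_VIEWABLE = "non_viewable"
    NO_ACCESS = "no_access"


def tier_fail_open(declared: dict[str, Access], split: str) -> Access:
    """Current behaviour as described above: unlisted splits are fully viewable."""
    return declared.get(split, Access.VIEWABLE)


def tier_fail_closed(declared: dict[str, Access], split: str) -> Access:
    """Safer default: an unlisted split is treated as hidden."""
    return declared.get(split, Access.NO_ACCESS)


def unlisted_splits(declared: dict[str, Access], dataset_splits: list[str]) -> list[str]:
    """Splits present in the dataset but missing from the config."""
    return sorted(set(dataset_splits) - set(declared))
```

Until the default changes, a build-time check like `unlisted_splits` that refuses to proceed when the result is non-empty would give authors the same protection.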
- **Commit transfer**: the sidecar `git fetch`es the agent's commit from the mounted
repo into its *own* repo with hooks disabled and `file://` (object copy, no
alternates), so the evaluated tree is fully owned by the sidecar and tamper-evident.
- **Protected scorer / write-access**: the scorer is sidecar-only; `read_only_paths`
'the scorer is sidecar-only' holds for Mode B but not Mode A, where the scorer lives in the agent's editable repo, protected only by `chown root:root` + `chmod -R a-w` on `read_only_paths` (which isn't a real tamper control; see #5). Recommend splitting this claim by mode.
…xample, by-mode scorer

Documentation accuracy fixes (review findings on PR #6):

- architecture: soften the intro from a hard guarantee ("the optimizer cannot read hidden labels, modify the scorer, or bypass its budget") to best-effort, OS/process-level language describing what is actually enforced.
- gaia build.yaml: correct "never reaches the optimizer". Because agent_repo is "." and build.yaml is git-tracked, the validation task ids ARE seeded into the optimizer's repo; only the per-sample scores are withheld. Acceptable for a public benchmark, with a caveat + mitigations for secret-identity benchmarks.
- examples: gsm8k-agent is cited as the Mode A example but ships no build.yaml; repoint to gaia-optimization as the complete runnable example and pair gsm8k-agent with the tutorial's Mode A snippet.
- architecture: document the current fail-open default for unlisted splits (and that it becomes fail-closed once the protocol fix lands), and split the "scorer is sidecar-only" claim by mode (true for Mode B; Mode A keeps the scorer in the agent's editable repo until the serve.py fix).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
docs(harbor): honest integrity guarantees, GAIA leak caveat, Mode A example [fixes 4/4 docs]
Draft · Stack 4 of 4, targets `harbor-3-compiler`. Additive, low-risk.

- `docs/harbor/architecture.md`: what it is, the compiled-task topology, the two modes, the component map, and the leaderboard-integrity model.
- `docs/harbor/tutorial.md`: build + run end to end (both modes, the agent-side protocol); README Harbor section.
- `examples/gaia-optimization`: a Mode-B example optimizing a `GaiaAgent` (thin `Terminus2` subclass with an editable prompt) on `gaia/gaia` via a nested `harbor run` on Modal.

Start your reading here for the big picture, then dive into [1/4]–[3/4].
Stack: [1/4] core → [2/4] sidecar → [3/4] compiler → this.
🤖 Generated with Claude Code
Greptile Summary
This PR adds the final layer of the Harbor integration stack: architecture and tutorial documentation plus a complete Mode B runnable example (`gaia-optimization`) that optimizes a `GaiaAgent` prompt on the GAIA benchmark via a nested `harbor run` on Modal.

- `docs/harbor/architecture.md`, `tutorial.md`: cover the compiled-task topology, both evaluation modes, the leaderboard-integrity trust boundary (including the documented fail-open default for unlisted splits), and end-to-end build/run instructions for both modes.
- `examples/gaia-optimization`: a self-contained `build.yaml` + thin `Terminus2` subclass that redirects prompt-template resolution to an editable `prompts/` directory, which is the optimization surface. The `build.yaml` correctly sets `no_access` on the held-out validation split and includes a clear caveat about held-out task IDs being readable from the git-tracked `agent_repo` (acceptable for this public benchmark, with guidance for private benchmarks).

Confidence Score: 5/5
Purely additive: documentation files and a self-contained example with no changes to library code or existing paths.
All changed files are new (docs and example); no existing code is modified. The one Python file (agent.py) is a minimal 38-line Terminus2 subclass with straightforward logic and no side effects on the rest of the codebase.
No files require special attention.
Flowchart
%%{init: {'theme': 'neutral'}}%%
flowchart TD
A["vero harbor build -c build.yaml"] --> B["Harbor task dir\n(environment/, instruction.md, tests/test.sh)"]
B --> C["harbor run -p task -a optimizer -e docker"]
C --> D["main container\n(optimizer agent)\nedits prompts/, commits"]
C --> E["eval-sidecar container\nvero harbor serve\n(budget ledger, admin token)"]
D -- "vero harbor eval --split train" --> E
E -- "nested harbor run (Modal)" --> F["GaiaAgent runs GAIA tasks\n(inner harbor environment)"]
F -- "verifier rewards collate" --> E
E -- "aggregate score only\n(no per-sample labels)" --> D
D -- "trial ends" --> G["tests/test.sh\n(shared verifier, root)"]
G -- "vero harbor finalize\n(admin token, root:600)" --> E
E --> H["Select best train commit\nScore on hidden validation split"]
H --> I["reward.json\naccuracy on validation"]

Reviews (2): Last reviewed commit: "Merge pull request #10 from scaleapi/har..."