Releases: EntityProcess/agentv
v4.38.1
v4.38.1-next.1
What's Changed
- ci(release): gate finalize before bump and use GH_MODELS_TOKEN by @christso in #1394
- fix(cli): make required_version advisory, warn on eval failure instead of prompting/exiting by @christso in #1396
Full Changelog: v4.38.0...v4.38.1-next.1
v4.38.0
What's Changed
- feat(evaluation): final-answer output with trace artifacts by @christso in #1364
- feat(results): support configured storage branch by @christso in #1365
- fix(pi-cli): set AZURE_OPENAI_BASE_URL for full URLs, AZURE_OPENAI_RESOURCE_NAME for bare names by @christso in #1367
- feat(results): periodic WIP checkpoints for in-progress eval runs by @christso in #1369
- docs(results): document WIP checkpoints by @christso in #1371
- feat(dashboard): render structured transcript artifacts by @christso in #1370
- docs(ops): remove Agent Mail coordination instructions by @christso in #1372
- fix(results): harden static report publishing by @christso in #1373
- feat(replay): record and replay target outputs by @christso in #1374
- docs(agents): remove local override file guidance by @christso in #1378
- docs(trace): specify trace envelope contract by @christso in #1379
- feat(copilot): support custom provider config by @christso in #1382
- feat(trace): write trace envelope sidecars by @christso in #1376
- feat(replay): support trace envelope replay sources by @christso in #1381
- feat(dashboard): improve sidebar navigation by @christso in #1383
- feat(dashboard): move project context to top bar by @christso in #1384
- refactor(dashboard): make project context read-only by @christso in #1386
- refactor(dashboard): move run context into breadcrumbs by @christso in #1387
- feat(workspace): split repo provenance from acquisition by @christso in #1390
- ci(release): add contract eval gate before latest publish by @christso in #1393
Full Changelog: v4.35.1...v4.38.0
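The pi-cli fix in #1367 above distinguishes full URLs from bare resource names. A minimal sketch of that kind of routing follows. The function name `azure_env_for` and the "http(s) scheme plus host" detection rule are assumptions made for illustration, not agentv's actual implementation; only the two environment variable names come from the changelog entry.

```python
from urllib.parse import urlparse


def azure_env_for(endpoint: str) -> dict[str, str]:
    """Pick the environment variable for a configured Azure endpoint value.

    Hypothetical helper: the detection rule below is an assumption.
    """
    value = endpoint.strip()
    parsed = urlparse(value)
    # Treat anything with an http(s) scheme and a host as a full URL.
    if parsed.scheme in ("http", "https") and parsed.netloc:
        return {"AZURE_OPENAI_BASE_URL": value}
    # Otherwise assume a bare resource name such as "my-resource".
    return {"AZURE_OPENAI_RESOURCE_NAME": value}
```

With this sketch, a value such as `https://my-resource.openai.azure.com/openai/v1` would map to `AZURE_OPENAI_BASE_URL`, while `my-resource` would map to `AZURE_OPENAI_RESOURCE_NAME`.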
v4.38.0-next.1
What's Changed
- feat(dashboard): move project context to top bar by @christso in #1384
- refactor(dashboard): make project context read-only by @christso in #1386
- refactor(dashboard): move run context into breadcrumbs by @christso in #1387
- feat(workspace): split repo provenance from acquisition by @christso in #1390
Full Changelog: v4.37.0-next.1...v4.38.0-next.1
v4.37.0-next.1
What's Changed
- docs(agents): remove local override file guidance by @christso in #1378
- docs(trace): specify trace envelope contract by @christso in #1379
- feat(copilot): support custom provider config by @christso in #1382
- feat(trace): write trace envelope sidecars by @christso in #1376
- feat(replay): support trace envelope replay sources by @christso in #1381
- feat(dashboard): improve sidebar navigation by @christso in #1383
Full Changelog: v4.36.0-next.1...v4.37.0-next.1
v4.36.0-next.1
What's Changed
- feat(evaluation): final-answer output with trace artifacts by @christso in #1364
- feat(results): support configured storage branch by @christso in #1365
- fix(pi-cli): set AZURE_OPENAI_BASE_URL for full URLs, AZURE_OPENAI_RESOURCE_NAME for bare names by @christso in #1367
- feat(results): periodic WIP checkpoints for in-progress eval runs by @christso in #1369
- docs(results): document WIP checkpoints by @christso in #1371
- feat(dashboard): render structured transcript artifacts by @christso in #1370
- docs(ops): remove Agent Mail coordination instructions by @christso in #1372
- fix(results): harden static report publishing by @christso in #1373
- feat(replay): record and replay target outputs by @christso in #1374
Full Changelog: v4.35.1...v4.36.0-next.1
v4.35.1
v4.35.1-next.1
What's Changed
- fix(eval): accept structured code grader content by @christso in #1360
- fix(targets): add Pi SDK OpenAI target by @christso in #1361
Full Changelog: v4.35.0...v4.35.1-next.1
v4.35.0
What's Changed
- feat(scripts): deploy example eval results repos by @christso in #1267
- fix(docker): run compose service as host user by @christso in #1268
- [codex] Ignore Gas Town runtime files by @christso in #1269
- feat(docker): add dashboard deployment script by @christso in #1271
- chore(dashboard): rename app directory by @christso in #1273
- docs(agents): document AO-first workflow by @christso in #1275
- docs(agents): align AO task graph guidance by @christso in #1276
- feat(config): split AgentV home and data directories by @christso in #1278
- feat(phoenix): add AgentV eval adapter by @christso in #1279
- test(cli): simplify churny local coverage by @christso in #1289
- docs(orchestration): use br-first Beads workflow by @christso in #1288
- docs(agents): clarify regression-focused tests by @christso in #1290
- fix(dashboard): support white-label app branding by @christso in #1291
- chore(beads): track shared bead state by @christso in #1292
- docs(agents): remove redundant bead guidance by @christso in #1293
- feat(codex): support reasoning effort targets config by @christso in #1294
- fix(codex): use stream_log for target logging by @christso in #1295
- feat(results): support global per-project results config by @christso in #1296
- fix(config): complete layered home config migration by @christso in #1299
- fix(targets): preserve azure deployment base urls by @christso in #1301
- docs(pi): document thinking level config by @christso in #1302
- feat(results): delete local run workspaces by @christso in #1300
- feat(dashboard): add project results sync UX by @christso in #1303
- docs(evals): document benchmark provenance patterns by @christso in #1304
- fix(core): allow templated use_target validation by @christso in #1305
- fix(cache): honor configured response cache paths by @christso in #1307
- feat(showcase): add replay-first trace evaluation fixtures by @christso in #1308
- feat(evals): preserve rubric criterion operators by @christso in #1309
- docs(dashboard): document project results sync workflow by @christso in #1306
- fix(dashboard): use registry project display names by @christso in #1310
- fix(dashboard): reconcile remote run counts by @christso in #1311
- fix(dashboard): preserve remote run detail context by @christso in #1312
- feat(dashboard): clarify remote sync outcome by @christso in #1313
- feat(results): publish selected local runs by @christso in #1314
- fix(results): remove selected run publish workflow by @christso in #1315
- docs(evals): plan AgentV eval authoring extensibility by @christso in #1316
- docs(integrations): document Phoenix adapter by @christso in #1317
- fix(dashboard): clarify remote run actions by @christso in #1318
- fix(dashboard): make run tables mobile scrollable by @christso in #1319
- style(brand): present AgentV wordmark by @christso in #1320
- fix(results): separate execution errors from quality failures by @christso in #1321
- chore(beads): record AgentV demo gap progress by @christso in #1322
- style(brand): uppercase visible AGENTV wordmark by @christso in #1323
- fix(web): dedupe landing wordmark by @christso in #1324
- feat(dashboard): filter score distribution by @christso in #1327
- fix(dashboard): remove redundant quality label prefixes by @christso in #1328
- docs(agents): clarify beads ownership for research by @christso in #1333
- feat(results): materialize per-test task bundles by @christso in #1330
- docs(agents): remove local orchestration guidance by @christso in #1334
- fix(results): harden remote sync dogfood by @christso in #1332
- feat(trace): add normalized trajectory contract by @christso in #1331
- feat(cli): rerun captured task bundles by @christso in #1335
- feat(cli): simplify eval output surface by @christso in #1336
- fix(results): supersede stale remote sync errors by @christso in #1338
- docs(agents): externalize task tracker guidance by @christso in #1339
- chore(tracking): remove repo-local beads state by @christso in #1340
- fix(dashboard): compact run labels by @christso in #1337
- feat(core): support structured llm-grader context by @christso in #1342
- fix(dashboard): settle remote sync button state by @christso in #1343
- fix(results): retry interrupted direct result pushes by @christso in #1344
- docs(results): clarify remote sync CLI contract by @christso in #1345
- docs(run): document generated task bundles by @christso in #1350
- fix(results): ignore unrelated files during remote sync by @christso in #1347
- fix(cli): remove eval benchmark json flag by @christso in #1348
- fix(dashboard): make targets table scroll on mobile by @christso in #1346
- fix(cli): remove eval export option by @christso in #1349
- fix(dashboard): prevent mobile experiment table clipping by @christso in #1352
- docs(adr): keep Phoenix observability out of core by @christso in #1353
- test(core): tolerate grader timing jitter by @christso in #1355
- feat(config): hard-break dashboard project schema by @christso in #1356
- feat(config): use succinct repo_url project remotes by @christso in #1357
- fix(dashboard): sort runs by timestamp by @christso in #1359
Full Changelog: v4.31.3...v4.35.0
v4.35.0-next.1
What's Changed
- test(core): tolerate grader timing jitter by @christso in #1355
- feat(config): hard-break dashboard project schema by @christso in #1356
- feat(config): use succinct repo_url project remotes by @christso in #1357
- fix(dashboard): sort runs by timestamp by @christso in #1359
Full Changelog: v4.34.1-next.1...v4.35.0-next.1