
fix: page ready agents during health-check startup#345

Open
scale-sf wants to merge 1 commit into main from fix/page-ready-healthchecks

scale-sf commented Jul 2, 2026

Summary

  • page through READY agents when starting health-check workflows at startup
  • avoid loading non-ready agents just to filter them in Python
  • add a focused unit test covering pagination beyond one page

Tests

  • uv run ruff check src/temporal/run_healthcheck_workflow.py tests/unit/temporal/test_run_healthcheck_workflow.py
  • uv run --group test pytest tests/unit/temporal/test_run_healthcheck_workflow.py -q

Greptile Summary

This PR fixes startup health-check workflow scheduling to page through only READY agents rather than loading the entire agent table into Python memory and filtering client-side. A stable order_by="id" sort is included, and a new unit test file covers the two critical pagination boundary conditions.

  • run_healthcheck_workflow.py: adds a while True pagination loop that fetches up to 200 READY agents per page, breaks on an empty page or a partial page, and removes the old in-Python status filter.
  • test_run_healthcheck_workflow.py: two async unit tests — one exercising multi-page traversal and one verifying the empty-page break when total count is an exact multiple of page size.
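The loop described above can be sketched as follows. This is a minimal illustration, not the code from the PR: `InMemoryAgentRepo`, its `list` signature, the `filters` shape, and the helper `collect_ready_agent_ids` are hypothetical stand-ins, and the real script starts a workflow per agent rather than collecting ids.

```python
import asyncio
from dataclasses import dataclass

PAGE_SIZE = 200  # page size named in the summary; the constant name is an assumption


@dataclass
class Agent:
    id: int
    status: str


class InMemoryAgentRepo:
    """Hypothetical stand-in for the real repository, used only for illustration."""

    def __init__(self, agents):
        self._agents = list(agents)
        self.calls = 0  # number of list() queries issued

    async def list(self, *, filters, order_by, limit, page_number):
        self.calls += 1
        # Filter and sort on the "server" side, then slice out the requested page.
        matching = [a for a in self._agents if a.status == filters["status"]]
        matching.sort(key=lambda a: getattr(a, order_by))
        start = (page_number - 1) * limit
        return matching[start:start + limit]


async def collect_ready_agent_ids(repo, page_size=PAGE_SIZE):
    """Page through READY agents in a stable order and return their ids."""
    ids = []
    page_number = 1
    while True:
        agents = await repo.list(
            filters={"status": "READY"},
            order_by="id",
            limit=page_size,
            page_number=page_number,
        )
        if not agents:  # empty page: total was an exact multiple of page_size (or zero)
            break
        ids.extend(agent.id for agent in agents)  # the real script would start workflows here
        if len(agents) < page_size:  # partial page: nothing left to fetch
            break
        page_number += 1
    return ids
```

With 250 READY agents and a page size of 200, this sketch issues two queries (one full page, one partial page); non-READY agents are never returned to the caller.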

Confidence Score: 5/5

Safe to merge — the pagination loop is logically sound, both break conditions are correct, and the stable sort by id was included from the start.

The change is narrowly scoped to startup agent enumeration, eliminates a full table scan, and the pagination logic correctly handles empty pages, partial pages, and exact-multiple page counts. The previous review concerns (stable sort key, boundary test) are both addressed in this PR.
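To make the three termination cases concrete, the sketch below counts how many page queries a loop with these two break conditions would issue for a given number of READY agents. The function `count_queries` is hypothetical and models only page lengths, not the repository or Temporal calls.

```python
def count_queries(total_ready: int, page_size: int) -> int:
    """Count page queries issued by a loop that stops on an empty or partial page."""
    queries = 0
    remaining = total_ready
    while True:
        queries += 1
        page_len = min(remaining, page_size)  # rows the next query would return
        if page_len == 0:  # empty page
            break
        remaining -= page_len
        if page_len < page_size:  # partial page
            break
    return queries


# Under this model the count is always total_ready // page_size + 1.
for total in (0, 199, 200, 250, 400):
    assert count_queries(total, 200) == total // 200 + 1
```

An exact multiple of the page size (200 or 400 here) costs one extra query that returns an empty page, which is the case the second unit test is described as covering.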

No files require special attention.

Important Files Changed

| Filename | Overview |
| --- | --- |
| agentex/src/temporal/run_healthcheck_workflow.py | Replaces bulk load + Python-side filter with server-side paginated query for READY agents; adds stable order_by="id" sort, correct dual-break termination (empty page + partial page), and module-level page-size constant. |
| agentex/tests/unit/temporal/test_run_healthcheck_workflow.py | New unit test file with two focused cases: multi-page traversal (page 1 full + page 2 partial) and exact-multiple boundary (page 1 full + page 2 empty), exercising both break conditions in the pagination loop. |

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A[start main] --> B[load GlobalDependencies]
    B --> C{env configured?}
    C -- no --> Z1[return]
    C -- yes --> D{health-check enabled?}
    D -- no --> Z2[return]
    D -- yes --> E{Temporal configured?}
    E -- no --> Z3[return]
    E -- yes --> F[init AgentRepository + TemporalAdapter]
    F --> G[page_number = 1]
    G --> H[agent_repo.list\nfilters=READY\norder_by=id\npage=page_number]
    H --> I{agents empty?}
    I -- yes --> Z4[done]
    I -- no --> J[for each agent:\nstart_workflow or log skip]
    J --> K{len agents < PAGE_SIZE?}
    K -- yes, last page --> Z4
    K -- no --> L[page_number += 1]
    L --> H

Reviews (3): Last reviewed commit: "fix: page ready agents for health checks"

scale-sf requested a review from a team as a code owner July 2, 2026 02:21
scale-sf changed the title from "Fix health-check startup pagination" to "fix: page ready agents during health-check startup" Jul 2, 2026
Comment thread agentex/src/temporal/run_healthcheck_workflow.py
Comment thread agentex/tests/unit/temporal/test_run_healthcheck_workflow.py Outdated
scale-sf force-pushed the fix/page-ready-healthchecks branch from 8586b34 to dd0e73d July 2, 2026 02:32

scale-sf commented Jul 2, 2026

@greptile
