Skip to content

TAP test updates#515

Merged
mason-sharp merged 3 commits into
mainfrom
tap-nonregular-test-updates
Jul 1, 2026
Merged

TAP test updates#515
mason-sharp merged 3 commits into
mainfrom
tap-nonregular-test-updates

Conversation

@mason-sharp

@mason-sharp mason-sharp commented Jun 30, 2026

Copy link
Copy Markdown
Member

Fix non-regularly scheduled TAP tests + harden teardown

Some TAP tests not in the standard schedule failed on PG18. This fixes the genuine test bugs and hardens the shared teardown path so one test's failure can't cascade into others.

Test fixes (26d9df5)

  • 026_no_double_wal_flush — PG18 removed wal_sync/wal_write from pg_stat_wal (relocated to pg_stat_io). The test now reads WAL fsyncs from pg_stat_io (object='wal') on PG18+ and falls back to pg_stat_wal.wal_sync on PG15–17. The delta < N regression guard is unchanged.
  • 029_remote_commit_ts_recovery_clean_shutdown — the test grepped $datadir/logs, but the harness routes server logs to TESTLOGDIR (log_directory/log_filename in SpockTest.pm). Now greps the harness log dir. The Spock feature itself was already correct.
  • 800_standby_hot_standby_feedback — Spock disables its own failover-slots worker on PG18+ (relies on native sync_replication_slots; see #if PG_VERSION_NUM >= 180000 in spock_failover_slots.c). The test now skip_alls on PG18+, matching the source gate, and still runs fully on PG15–17.

Teardown resilience (83eb14c)

All TAP tests share fixed datadirs (/tmp/tmp_spock_node_*) and ports (544x), so a leaked postmaster from one test poisons the next. Previously a failed node_drop in destroy_cluster (via system_or_bail) could die before pg_ctl stop/rm -rf ran, orphaning servers. Logical cleanup is now best-effort (system_maybe) and the server-stop + datadir-removal always run.

pg_stat_wal lost wal_sync/wal_write in PG18 (moved to pg_stat_io), the
logging collector writes server logs to TESTLOGDIR rather than the data
directory, and Spock disables its failover-slots worker on PG18+ in favor
of native sync_replication_slots.

- 026: read WAL fsyncs from pg_stat_io on PG18+, pg_stat_wal.wal_sync below
- 029: grep the harness log dir (TESTLOGDIR) instead of the data directory
- 800: skip on PG18+, matching the #if PG_VERSION_NUM >= 180000 source gate
All TAP tests share fixed datadirs (/tmp/tmp_spock_node_*) and ports
(544x), so a leaked postmaster from one test poisons the next. When a
test died with node_drop failing (e.g. a node that was never fully set
up), system_or_bail aborted destroy_cluster before the pg_ctl stop and
rm -rf ran, orphaning servers on those fixed ports.

Make logical cleanup best-effort (node_drop via system_maybe) and ensure
the server stop and datadir removal always run (plain sleep/system), so
one test's failure can no longer cascade into the next.
@coderabbitai

coderabbitai Bot commented Jun 30, 2026

Copy link
Copy Markdown

Review Change Stack

📝 Walkthrough

Walkthrough

This PR updates four TAP test files: one selects WAL fsync counters dynamically by PostgreSQL version, one updates log directory resolution, one skips a test suite on PostgreSQL 18+, and one makes cluster teardown more resilient to failures.

Changes

TAP Test Fixes

Layer / File(s) Summary
Version-aware WAL fsync counter selection
tests/tap/t/026_no_double_wal_flush.pl
Queries server_version_num to choose between pg_stat_io (PG18+) and pg_stat_wal (PG15-17) for baseline and after measurements, then updates the delta computation and assertion text.
Log directory path fix
tests/tap/t/029_remote_commit_ts_recovery_clean_shutdown.pl
Uses \$conf->{log_dir} instead of a hardcoded $sub_dir/logs path to locate server logs, with comments clarifying the harness log directory layout.
Version-gated test skipping
tests/tap/t/800_standby_hot_standby_feedback.pl
Detects PostgreSQL major version via pg_config --version and skips the suite on version 18 or higher; otherwise it runs the fixed 9-test plan.
Resilient cluster teardown
tests/tap/t/SpockTest.pm
Changes destroy_cluster to use best-effort node dropping, non-bailing shutdown handling, an unconditional sleep, and unguarded datadir removal so teardown continues after partial failures.

Poem

I’m a rabbit with a TAP-tap beat,
Counting WAL and logs so neat.
New versions hop, old ones stay in line,
Teardown nibbles failure into twine.
ʕ•ᴥ•ʔ 🥕

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 inconclusive)

Check name Status Explanation Resolution
Title check ❓ Inconclusive The title is too generic and doesn't identify the main PG18 TAP test and teardown changes. Use a concise title that names the main change, such as PG18 TAP test fixes and teardown hardening.
✅ Passed checks (4 passed)
Check name Status Explanation
Docstring Coverage ✅ Passed No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.
Description check ✅ Passed The description clearly matches the TAP test fixes and teardown hardening reflected in the changed files.
✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch tap-nonregular-test-updates

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands.

@mason-sharp mason-sharp changed the title Tap nonregular test updates TAP test updates Jun 30, 2026
@codacy-production

Copy link
Copy Markdown

Up to standards ✅

🟢 Issues 0 issues

Results:
0 new issues

View in Codacy

NEW Get contextual insights on your PRs based on Codacy's metrics, along with PR and Jira context, without leaving GitHub. Enable AI reviewer
TIP This summary will be updated as you push new changes.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@tests/tap/t/026_no_double_wal_flush.pl`:
- Around line 61-63: Add context = 'normal' to the PG18 branch of the wal_sync
query in the 026_no_double_wal_flush.pl test so the pg_stat_io lookup matches
pg_stat_wal semantics. Update the $wal_sync_query logic in the conditional
around $pg_version_num to filter pg_stat_io by object = 'wal' and context =
'normal', preventing IOCONTEXT_INIT fsyncs from affecting $before, $after, and
$delta.

In `@tests/tap/t/800_standby_hot_standby_feedback.pl`:
- Around line 3-17: The test file is mixing two test-ending strategies, so keep
only one of `plan tests => 9` or `done_testing()` in the top-level script.
Update the `Test::More` setup near the existing `plan skip_all`/`plan tests`
block so the file uses a single completion style, and ensure any final test
count matches the actual assertions in this script.

In `@tests/tap/t/SpockTest.pm`:
- Around line 387-397: The cleanup logic in SpockTest’s shutdown path
backgrounds pg_ctl stop and then uses a fixed sleep before rm -rf, which can
race with still-running PostgreSQL processes. Update the stop-and-cleanup block
to run pg_ctl stop synchronously using the existing $PG_BIN/pg_ctl invocation so
it waits for each node to fully exit, or replace the fixed sleep with a real
wait/poll for shutdown completion before deleting each datadir. Keep the cleanup
flow in the loop over `@node_datadirs` and preserve the “must always run”
behavior.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 73c47e71-fb65-4de8-ab39-04235647217c

📥 Commits

Reviewing files that changed from the base of the PR and between 804a10b and 83eb14c.

📒 Files selected for processing (4)
  • tests/tap/t/026_no_double_wal_flush.pl
  • tests/tap/t/029_remote_commit_ts_recovery_clean_shutdown.pl
  • tests/tap/t/800_standby_hot_standby_feedback.pl
  • tests/tap/t/SpockTest.pm

Comment thread tests/tap/t/026_no_double_wal_flush.pl
Comment thread tests/tap/t/800_standby_hot_standby_feedback.pl Outdated
Comment thread tests/tap/t/SpockTest.pm Outdated
@mason-sharp mason-sharp requested a review from rasifr July 1, 2026 00:10
@mason-sharp mason-sharp merged commit 4fcf13f into main Jul 1, 2026
15 checks passed
@mason-sharp mason-sharp deleted the tap-nonregular-test-updates branch July 1, 2026 12:39
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants