
Single-page arena ToT for all size-determinable construction (init_tiles_nested, retile) #570

Merged: evaleev merged 4 commits into master from zhihao/fix/single_arena_page on Jun 30, 2026

@zhihao-deng
Contributor

Summary

Route the size-determinable arena ToT construction paths through the up-front
arena_outer_init allocator instead of the incremental, multi-page
ArenaToTBuilder, so every such outer tile is a single contiguous arena page.

Multi-page outer tiles silently disqualify the strided-BLAS outer-contract +
inner-AXPY fast path (it reverts to per-cell AXPY). make_nested_tile
(init_tiles_nested) and make_with_new_trange (TA::retile) had regressed to
the multi-page builder in 176df8a2, though their range/fill callbacks are
random-access — so the up-front allocator applies with no peak-memory penalty.
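
For readers unfamiliar with the two strategies, the toy sketch below contrasts them. It is not TiledArray code: ToyArena, build_incremental, build_up_front and the fixed page capacity are hypothetical names invented for illustration, and the real arena_outer_init and ArenaToTBuilder differ in detail. It only shows why knowing every cell size in advance allows one exactly-sized page, while an incremental builder may spill onto a second page.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <vector>

// Hypothetical toy arena: a list of pages, each one contiguous block.
struct ToyArena {
  std::vector<std::vector<double>> pages;
  std::size_t page_count() const { return pages.size(); }
};

// Incremental builder: cell sizes arrive one at a time, so it fills
// fixed-capacity pages and opens a new page when the next cell does not fit.
inline ToyArena build_incremental(const std::vector<std::size_t>& cell_sizes,
                                  std::size_t page_capacity) {
  ToyArena arena;
  for (std::size_t n : cell_sizes) {
    if (arena.pages.empty() ||
        arena.pages.back().size() + n > page_capacity) {
      arena.pages.emplace_back();
      arena.pages.back().reserve(page_capacity);
    }
    // Append n elements for this cell to the current page.
    arena.pages.back().insert(arena.pages.back().end(), n, 0.0);
  }
  return arena;
}

// Up-front builder: size_fn is random-access, so all cell sizes can be
// summed before allocating, and one page of the exact total size suffices.
inline ToyArena build_up_front(
    std::size_t ncells,
    const std::function<std::size_t(std::size_t)>& size_fn) {
  std::size_t total = 0;
  for (std::size_t i = 0; i < ncells; ++i) total += size_fn(i);
  ToyArena arena;
  arena.pages.emplace_back(total, 0.0);  // the only allocation
  return arena;
}
```

With four cells of 6 elements and a page capacity of 16, the incremental toy builder spills onto a second page, while the up-front builder produces one 24-element page.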

Why it matters

Fixing this at the construction layer makes the single-page guarantee universal:
all arena ToT built via these paths are single-page — not just one hand-patched
category. Downstream (e.g. MPQC CSV-CCk/PNO) every operand — amplitudes,
coefficients, energies, PNO/OSV-domain tensors — is single-page, instead of relying
on a post-hoc compaction of a single variable (which also costs an extra deep-copy
and a peak-memory spike). ArenaToTBuilder is kept for the one genuinely
single-pass path (init_tiles), preserving the 176df8a2 optimization where it
still applies.

Companion MPQC change (expected)

This TA change covers ToT built through init_tiles_nested/retile. MPQC also
constructs some operands via direct ArenaToTBuilder calls, which need a
matching conversion to arena_outer_init to be single-page — landing separately
in MPQC:

  • cc/solvers.h: the PNO Jacobi amplitude update and the coefficient/energy
    accessors (_pno_coeffs_tot, _osv_coeffs_tot, f_pno_diag, osv_energy).
  • mbpt/pao_to_pno_mp2.ipp: the PAO→PNO contribution-slice builders.
  • mbpt/csv.ipp: drops the now-redundant compact_csv_coeffs post-hoc compaction
    (coefficients are now born single-page).

Validation

With both the TA and MPQC changes applied: the new make_nested_tile/retile
single-page regression tests and the full tot_construction suite pass. Downstream
MPQC CSV-CCk (C4H10) with TA_ASSERT_SINGLE_PAGE=1: 0 violations across 490,124
ToT tiles, ~10% faster.

zhihao-deng and others added 4 commits June 30, 2026 07:02
init_tiles_nested's range/fill callbacks are random-access, so the tile
size is known up front. Build via arena_outer_init (one contiguous page)
instead of the incremental ArenaToTBuilder, whose multi-page spill reverts
the strided-BLAS fast path to per-cell AXPY. Reverts the 176df8a switch;
ArenaToTBuilder stays the single-pass init_tiles fall-back.

make_with_new_trange rebuilt each retiled arena-ToT tile incrementally,
though the source cell ranges are known up front. Use arena_outer_init so
TA::retile keeps arena ToT tiles single-page.

TA_ASSERT_SINGLE_PAGE (env, off by default) makes arena_outer_init and
ArenaToTBuilder::finish check Arena::page_count() <= 1, throw a
TiledArray::Exception on a multi-page tile, and print a checked/violations
summary at exit. Uses page_count() rather than classify_run so it stays
valid for tiles with null or non-uniform inner cells. Zero overhead when
unset.

The up-front arena_outer_init path pre-walks every cell's range via
range_fn; cache the source cell it finds per ordinal so the fill loop
reuses it instead of repeating the (cached, but non-trivial) source
lookup -- restoring the single source-lookup-per-cell property the
comment claimed. Also pass zero_init=false, since every non-null cell
is fully overwritten by the copy loop, matching the old builder default.
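
As a rough illustration of the TA_ASSERT_SINGLE_PAGE check described in the commit messages above, an environment-gated assertion with a checked/violations summary could be sketched as follows. This is an assumption-laden sketch, not the code in this PR: SinglePageStats, single_page_check_enabled, check_single_page and the rule that any non-empty value other than one starting with '0' enables the check are all hypothetical, and std::runtime_error stands in for TiledArray::Exception.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <cstdlib>
#include <stdexcept>

// Hypothetical counters behind the checked/violations summary.
struct SinglePageStats {
  std::atomic<std::size_t> checked{0};
  std::atomic<std::size_t> violations{0};
};

inline SinglePageStats& single_page_stats() {
  static SinglePageStats stats;
  return stats;
}

inline void print_single_page_summary() {
  const SinglePageStats& s = single_page_stats();
  std::fprintf(stderr, "single-page check: %zu checked, %zu violations\n",
               s.checked.load(), s.violations.load());
}

// Reads the environment variable once and caches the answer.
// Assumption: unset, empty, or a value starting with '0' means "off".
inline bool single_page_check_enabled() {
  static const bool enabled = [] {
    const char* v = std::getenv("TA_ASSERT_SINGLE_PAGE");
    const bool on = v != nullptr && v[0] != '\0' && v[0] != '0';
    if (on) {
      (void)single_page_stats();  // construct stats before registering
      std::atexit(print_single_page_summary);
    }
    return on;
  }();
  return enabled;
}

// Counts the tile and throws if it spans more than one page.
// When the check is disabled the cost is one cached-bool test.
inline void check_single_page(std::size_t page_count,
                              bool enabled = single_page_check_enabled()) {
  if (!enabled) return;
  SinglePageStats& s = single_page_stats();
  ++s.checked;
  if (page_count > 1) {
    ++s.violations;
    throw std::runtime_error("arena tile spans more than one page");
  }
}
```

In this sketch a construction routine would call check_single_page(arena.page_count()) just before returning the finished tile; the explicit enabled parameter exists only so the logic can be exercised without touching the environment.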
evaleev force-pushed the zhihao/fix/single_arena_page branch from 3ce639c to 35e96b5 on June 30, 2026 11:11

evaleev commented Jun 30, 2026

Member

Rebased onto current master (now includes the merged #571) and pushed a cleanup commit (35e96b527) addressing two review findings in the retile path (make_with_new_trange):

  1. Duplicate cell walkthrough: arena_outer_init's range pre-walk now caches the source cell it finds per ordinal in src_cells, so the fill loop reuses it instead of re-running the (cached, but non-trivial) source_cell_at lookup. This restores the single-source-lookup-per-cell property the comment claimed; the stale comment is updated to describe the actual two-phase up-front build.
  2. Redundant zero-init — pass zero_init=false to arena_outer_init, since every non-null cell is fully overwritten by the copy loop (matching the old ArenaToTBuilder default).
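
The two findings above can be pictured with a small stand-alone sketch of a two-phase build. It is illustrative only and makes several assumptions: CellView, BuiltTile and build_two_phase are hypothetical names, only source_cell_at and src_cells echo identifiers mentioned in this comment, and the real make_with_new_trange works on TiledArray ranges and tiles rather than plain buffers. Phase 1 looks each source cell up once and records it; phase 2 copies from the cached views into a page that is allocated without zero-filling.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical view of one source cell; data == nullptr marks a null cell.
struct CellView {
  const double* data = nullptr;
  std::size_t size = 0;
};

struct BuiltTile {
  std::unique_ptr<double[]> page;    // single contiguous page
  std::vector<std::size_t> offsets;  // offsets[i] = start of cell i
  std::size_t total = 0;             // number of elements in the page
  std::size_t lookups = 0;           // source lookups performed
};

template <typename Lookup>
BuiltTile build_two_phase(std::size_t ncells, Lookup&& source_cell_at) {
  BuiltTile out;
  out.offsets.assign(ncells + 1, 0);
  std::vector<CellView> src_cells(ncells);

  // Phase 1: pre-walk. One lookup per cell; cache the result per ordinal.
  for (std::size_t i = 0; i < ncells; ++i) {
    src_cells[i] = source_cell_at(i);
    ++out.lookups;
    out.offsets[i + 1] = out.offsets[i] + src_cells[i].size;
  }
  out.total = out.offsets[ncells];

  // One allocation, deliberately not zero-initialized: every element
  // belongs to a non-null cell and is overwritten by the copy below.
  out.page.reset(new double[out.total]);

  // Phase 2: fill from the cached views, with no second lookup.
  for (std::size_t i = 0; i < ncells; ++i) {
    const CellView& c = src_cells[i];
    if (c.data != nullptr)
      std::copy_n(c.data, c.size, out.page.get() + out.offsets[i]);
  }
  return out;
}
```

Skipping the zero-fill is safe in this sketch because null cells contribute zero elements, so the page holds no element that the copy loop leaves unwritten.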

Verified: ta_test builds clean; tot_construction_suite passes at np=1. (The np=2 tot_construction_dist_suite SEGV in the MADNESS RMI worker thread reproduces identically on the un-modified branch — pre-existing ASan/comm-thread teardown artifact, not from this change.)

evaleev merged commit 84411a6 into master on Jun 30, 2026 (8 of 9 checks passed)
evaleev deleted the zhihao/fix/single_arena_page branch on June 30, 2026 11:48