Skip to content

Ngmix v2.0 (CI mirror of #740)#741

Open
cailmdaley wants to merge 107 commits into
developfrom
ngmix_v2.0
Open

Ngmix v2.0 (CI mirror of #740)#741
cailmdaley wants to merge 107 commits into
developfrom
ngmix_v2.0

Conversation

@cailmdaley

Copy link
Copy Markdown
Contributor

What this is

A same-repo mirror of #740 (@martinkilbinger's "Ngmix v2.0"), pushed to a branch on CosmoStat/shapepipe so that CI actually runs. All 57 commits are authored by Martin (and carry Lucy Baumont's and Axel Guinot's work) — pushing the branch preserves that authorship unchanged; the only thing this PR adds is a same-repo head so GitHub Actions fires the pull_request workflow without the fork-PR approval gate. #740 received no CI runs at all for this reason.

Substance is identical to #740 — see that PR for the full description. In short: upgrade ngmix to 2.4.0 and adopt Lucy's new ngmix classes/interface; overhaul the shape-measurement module; centroid-bias fix + validation; v2.0 production-run plumbing.

Going forward, opening PRs directly on CosmoStat/shapepipe (rather than from a fork) avoids this — fork PRs don't trigger our Docker-image CI without a maintainer approval that wasn't happening.

Closes/supersedes #740 once CI is green (leaving that call to Martin).

Review

A detailed review is on its way (read against Martin's checklist plus a science-quality pass). Headline from exercising the new fitter against real ngmix 2.4.0 on candide: the metacal path runs end-to-end and shear recovery is unbiased at the few×10⁻⁴ level in m. Full notes to follow.

— Claude on behalf of Cail

Lucie Baumont and others added 30 commits January 9, 2023 14:07
- bin/ scripts were untracked, causing Docker build to fail
- Fix license field to use SPDX string format (MIT) to resolve
  SetuptoolsDeprecationWarning

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
cailmdaley and others added 3 commits June 11, 2026 15:09
The rms branch drew the metacal noise image at a scalar median(rms)
while the weights claim per-pixel variance; under a strong RMS gradient
(factor 8 across a 48px stamp) fixnoise then floods the low-noise half
and the ivar weights measure WORSE than binary weights through metacal
(noshear shape scatter 0.116 vs 0.078 over 50 paired realisations).
Drawing the noise image from the rms map itself restores the advantage
(0.053 vs 0.078); constant maps are bit-identical to the old behaviour.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Known Gaussian galaxies with noise of known per-pixel variance run
through the module's own observation/weight building and fitter stack
(do_ngmix_metacal's runner construction is factored into make_runners so
the test fits with literally the module's configuration). Teeth: mean
reduced chi2 = 1.00 only if the weights are the true inverse variance
(probed breakages land at 3.0-1.5e5; ngmix's chi2-rescaled covariance
makes pulls blind to this), and under a factor-8 RMS gradient the ivar
weights must beat MegaPipe-style binary weights on paired realisations
(scatter ratio 0.59 direct, 0.68 through metacal; inverted or transposed
maps push it past 1.6). Accuracy and pull calibration asserted per case.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
…pe, pull-tol derivation)

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
@cailmdaley

Copy link
Copy Markdown
Contributor Author

The GalSim noise-injection validation has landed (c5a9cc4d, tightened in 6bae2a41) — and it earned its keep before arriving: it caught a real defect in the rms-weight path as previously merged (fixed in bb93c288, please review that one specifically).

The finding. In do_ngmix_metacal, the fixnoise noise image was drawn at a scalar median(rms). Under a strong noise gradient (factor-8 across the stamp) that floods the low-noise half with excess noise, and the ivar weights were measurably counterproductive end-to-end: shape scatter 0.116 (ivar) vs 0.078 (binary), ratio 1.48. Drawing the noise image per-pixel from the rms map restores the expected advantage: 0.053 vs 0.078 (ratio 0.684). Constant rms maps produce bit-identical draws to the old path, and bkg_rms=None is untouched.

What the test pins (N=50 seeded realizations per case, true g=(+0.03,−0.02), Gaussian galaxy × Gaussian PSF, observations built by the module's own make_ngmix_observation path):

  • Direct fits, ivar weights: mean g = (+0.0329, −0.0194) flat / (+0.0326, −0.0221) ramp — consistent with truth; pull std 1.08–1.12; χ²/dof = 1.0027 under both flat and factor-8 ramp noise.
  • Binary weights under the ramp (old MegaPipe-style): χ²/dof = 4.83, paired shape-scatter ratio ivar/binary = 0.586 (asserted < 0.8).
  • Metacal end-to-end (noshear): ivar/binary scatter ratio 0.684 (asserted < 0.9) after bb93c288.

The teeth — injected wiring bugs, all caught: ×4 normalization → χ²/dof 4.0; 1/rms instead of 1/rms² → 13.2; inverted weights → 1.5×10⁵ (and scatter ratio > 1.6); transposed rms map → 13.9. One subtlety worth knowing: pull distributions alone can be laundered (ngmix rescales the parameter covariance by reduced χ²), so the absolute-normalization assertion rides on χ²/dof — the one statistic that can't hide a mis-scaled weight map.

Honest boundaries: the test enters at make_ngmix_observation, so vignetmaker's upstream extraction/orientation of the BACKGROUND_RMS vignettes is not exercised (the transposed-map probe shows χ² would catch a misalignment injected at the module boundary). Metacal pull std sits at ~1.4–1.5 even with perfect weights and flat noise — fixnoise/deconvolution correlated-noise bookkeeping inside ngmix, independent of the weights — so pull-calibration assertions live on the direct-fit layer.

11 tests, ~35 s; full suite 274 passed, no regressions. This file is also the natural home for the r50/T semantics regression checks going forward.

— Claude (Fable), on behalf of Cail

@cailmdaley

Copy link
Copy Markdown
Contributor Author

One more finding from tonight's star-validation investigation (prep for tomorrow's session with Fabian), orthogonal to the weight work but real: the module's PSF-model fit is prior-dominated. PSF observations are built with unit-flux stamps and no weight map, so the likelihood is much weaker than the prior — a PSF drawn with g1 = 0.025 fits to g1 = 9×10⁻⁵ (HSM confirms the stamp itself carries g1 = 0.0241; removing the prior recovers g = (0.0241, −0.0144)). Consequences: the g_psfo/g_psf catalogue columns are biased round, and the deconvolution kernel is rounder than the true PSF — a PSF-leakage risk. Present on both the current tip and the pre-June branch state, so it is not from the recent fixes. Happy to open this as its own issue with the reproduction script (/n17data/cdaley/unions/scratch-wf/star-experiment/ on candide).

Possibly also relevant to Monday's m-bias discussion: the 2023 star-test failure (#656, still open) links to esheldon/ngmix#72 — metacal m ≈ −0.2 when the jacobian departs from a simple pixel scale. The WCS-passing question and the m-bias question may be the same thread; we're checking the WCS-delivery hypothesis against Fabian's run artifacts tomorrow.

— Claude (Fable), on behalf of Cail

Gate mpi4py import on MPI launcher environment so bare shapepipe_run
inside an srun shell no longer aborts in MPI_Init. Same commits as
PR #747; folded here so the fix ships with the ngmix v2.0 line.
The exclusive input-ID flag and the NUMBER_LIST config option converged
in FileHandler._format_process_list and did the same thing for a single
ID; NUMBER_LIST is now the one mechanism.

Pipeline:
- remove -e/--exclusive from args.py and its plumbing through run.py,
  FileHandler, and JobHandler (where it was stored but never used)
- NUMBER_LIST entries are now validated against the input file numbers
  found on disk, preserving -e's early failure on a wrong ID: the run
  aborts at start-up instead of when a module first opens files
- unit tests for the validation (subset passes, typo raises, no-list
  scan path unchanged)

Canfar chain (script-level -e options are unchanged; one ID per
headless job remains the interface):
- job_sp_canfar.bash, job_sp_canfar_v2.0.bash, and
  init_run_exclusive_canfar.sh write NUMBER_LIST into a per-job config
  copy (set_config_number_list: insert-or-replace under [FILE], ID in
  numbering-scheme form: leading dash, dots->dashes) instead of passing
  -e to shapepipe_run. Side benefit: the processed ID is recorded in
  the config copied to the run's log directory.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
cailmdaley and others added 2 commits June 11, 2026 23:09
Review hardening: set_config_number_list now verifies the key landed in
the config copy and aborts otherwise — a config with no NUMBER_LIST line
and no bare [FILE] header would previously install an unmodified copy
and silently process every image. Also fix init_run_exclusive_canfar.sh's
missing-ID guard, which tested an unset variable and never fired.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
…exclusive

Replace -e/--exclusive flag with NUMBER_LIST (#746)
@cailmdaley cailmdaley mentioned this pull request Jun 11, 2026
@martinkilbinger

Copy link
Copy Markdown
Contributor

One more finding from tonight's star-validation investigation (prep for tomorrow's session with Fabian), orthogonal to the weight work but real: the module's PSF-model fit is prior-dominated. PSF observations are built with unit-flux stamps and no weight map, so the likelihood is much weaker than the prior — a PSF drawn with g1 = 0.025 fits to g1 = 9×10⁻⁵ (HSM confirms the stamp itself carries g1 = 0.0241; removing the prior recovers g = (0.0241, −0.0144)). Consequences: the g_psfo/g_psf catalogue columns are biased round, and the deconvolution kernel is rounder than the true PSF — a PSF-leakage risk. Present on both the current tip and the pre-June branch state, so it is not from the recent fixes. Happy to open this as its own issue with the reproduction script (/n17data/cdaley/unions/scratch-wf/star-experiment/ on candide).

Possibly also relevant to Monday's m-bias discussion: the 2023 star-test failure (#656, still open) links to esheldon/ngmix#72 — metacal m ≈ −0.2 when the jacobian departs from a simple pixel scale. The WCS-passing question and the m-bias question may be the same thread; we're checking the WCS-delivery hypothesis against Fabian's run artifacts tomorrow.

— Claude (Fable), on behalf of Cail

This could be a feature - with fitgauss an isotropic function is fitted (@fabianhervaspeters). To check.

cailmdaley and others added 7 commits June 12, 2026 20:39
Consolidate the test suite around one discovery root with three suite-level
tiers under tests/, keeping module-unit tests next to the code in
src/shapepipe/tests/. The root conftest.py is the single source of markers,
candide detection, and the off-cluster skip policy, so the same suite is green
on a laptop, in CI, and on the cluster.

Markers (declared in pyproject.toml, --strict-markers on):
  slow     heavy compute; excluded from the fast inner loop
  candide  needs the cluster and/or its real data; auto-skipped elsewhere

Guardrails:
- tests/science/test_mbias.py (fast, local): inject g1=0.02 through
  make_ngmix_observation / do_ngmix_metacal, build the per-object metacal
  response R11, assert |m| < 5e-3 in the ideal limit. Recovers g1=0.019964
  (m=-1.8e-3).
- tests/cluster/test_star_shear_response.py (slow + candide): faithful
  reproduction of Fabian's R-function. Reads ngmix metacal catalogs
  (HDU [2]=1p [1]=1m [4]=2p [3]=2m), mask 20<mag<26,
  R1=(g1_1p-g1_1m)/0.02, R2=(g2_2p-g2_2m)/0.02, 100-group jackknife;
  asserts <R1>,<R2> within 0 +/- 0.03. Parametrized outputs base_dir;
  the SLURM regeneration chain is wired (tests/helpers/cluster.py) but
  opt-in only, never launched by the suite.

Helpers: tests/helpers/{star_response,cluster,artifacts}.py.

Artifacts seam: the star-response test emits a plot + status JSON + markdown
to tests/_artifacts/ (on pass and fail) for a later GitHub Pages step. Emit
only; the deploy is intentionally not built here.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
The suite lived in two trees — top-level `tests/` (unit/science/cluster/
helpers) and `src/shapepipe/tests/` (the per-module tests). Fold the module
tests into `tests/module/` so there is a single canonical location, discovered
from one `testpaths` root.

- Relocate the 12 `src/shapepipe/tests/*.py` modules to `tests/module/`
  (pure moves — their imports are already absolute `from shapepipe...`, so no
  source edits and no packaging change were needed; nothing referenced
  `shapepipe.tests`).
- Make `tests/` a package (`tests/__init__.py`, `tests/module/__init__.py`).
  The cluster guardrail imports shared code as `tests.helpers.*`; that only
  resolves once `tests` itself is importable, which also removes a latent
  collection error in the harness branch.
- `pyproject.toml`: `testpaths = ["tests"]` (was `["tests", "src/shapepipe/tests"]`).
- Document the single layout in `tests/README.md`; fix the stale two-tree
  description in `docs/source/testing.md` and the project `CLAUDE.md`.

Full suite collects from one root with no errors; the fast tier
(`-m "not slow and not candide"`) is green (276 passed, 5 skipped).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
The galaxy Jacobian origin sets where the centroid prior sits, so it must land
on the object. Two ways to place it, now selectable per run:

- `hsm` (DEFAULT, unchanged behavior): re-center on the HSM adaptive-moment
  centroid measured from the stamp. Robust for galaxies — it follows the light,
  so the centroid prior does not bias an object offset from the stamp center.
- `wcs` (Fabian's): place the origin at the catalog sky position projected
  through the WCS to a sub-pixel offset, with no shape measurement. Better for
  stars, whose HSM moments are too noisy to re-center on.

Wiring:
- `make_ngmix_observation` gains `centroid_source` (+ `wcs_full`/`ra`/`dec` for
  the wcs branch); the hsm branch is the prior code verbatim.
- `do_ngmix_metacal`, `Ngmix.__init__`, and `Ngmix.process` thread it through;
  `Postage_stamp` carries the per-epoch WCS + ra/dec, filled in
  `prepare_postage_stamps` (used only by the wcs branch).
- `ngmix_runner` reads the optional `CENTROID_SOURCE` ini key (default `hsm`);
  documented in `example/cfis/config_tile_Ng_template.ini`.

The star-response guardrail declares `CENTROID_SOURCE = wcs` and threads it
into the regeneration chain (`SP_CENTROID_SOURCE`), so the star grid is built
with the star-appropriate centroid.

Default path verified unchanged: 26 ngmix/science tests green; the wcs branch
reproduces the WCS-projected sub-pixel offset on a synthetic exposure.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Add `.github/workflows/fast-tests.yml`: the quick inner-loop gate that runs the
`tests/` suite minus the heavy and cluster tiers — `pytest -m "not slow and not
candide"` — on every push and pull request. It complements `deploy-image.yml`
(which builds the dev image and runs the *full* suite inside it) by skipping the
Docker build, so it returns in minutes.

Matches the repo's conventions: SHA-pinned actions (covered by the existing
github-actions dependabot policy), Python 3.12, and the environment reproduced
from `uv.lock` via `uv sync --frozen --extra test` — the same pinned set a
developer's `.venv` carries, including the pinned ngmix git source. Tests that
need the astromatic binaries self-skip via `shutil.which`, so they don't run
here (the dev-image CI exercises them). Concurrency-cancels superseded runs.

Verified the exact invocation locally: 276 passed, 5 skipped, 2 deselected.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
The fork pin it described (aguinot/ngmix@stable_version) is removed by this
PR's move to upstream ngmix 2.4.0, so the rule no longer applies.
The v1 `moments_fail` column counted moments-initial-guess (get_guess)
failures. get_guess no longer exists in the v2.0 fitter; the column now
holds the count (0-5) of metacal fit types with nonzero fit flags, and it
reaches the catalogue (NGMIX_MOM_FAIL). The old name no longer matches the
meaning, so rename through ngmix + make_cat to mcal_types_fail /
NGMIX_MCAL_TYPES_FAIL. Tests updated in lockstep.
@aguinot

aguinot commented Jun 16, 2026

Copy link
Copy Markdown
Contributor

Hello,
I just happened to look quickly over the changes in ngmix.py and I found a potential issue in the changes.
The Tpsf and g1/2_psf has been replaced by the psfo quantities (o stand for "original"). This is wrong and it is probably a lot better to keep both.
I would agree that g1/2_psf do not contain a lot of information as it is measured on the reconvolved PSF which in this case is an isotropic gaussian but it would be good to have as a sanitycheck. Also, the reconvolve PSF does not have to be isotropic so it could be useful to have if you plan on changing later.
Now Tpsf is the most problematic one. The reconvolved PSF is larger than the orignal one and you want to use that for the selection Tgal/Tpsf not the original PSF size. If you use Tpsf_o instead you will include smaller galaxies that would (should) be cuted off.
On a separate note, it is very dangerous to have a quantity that is not what it is suppose to be. Given that g1_psf is suppose to be the g1 of the PSF measured on the metacalibrated image it is expected to be 0. If you replace it by g1_psf_o then it will be =/= 0 and that could be very confusing.

I also saw a discussion on the prior for the PSF fit. I do agree that psf and galaxies should use different prior but it has never been an issue and I don't really see why it is an issue now. The size and flux are uniform over a range that cover the expected value for the PSF. The centroid should be fine too. So only the shape could be bias but it "roughly" a gaussian centered around 0 with an over estimated sigma for the PSF, which shouldn't be a big problem.
I hope this can help!

claude added 3 commits June 17, 2026 12:13
- make_cat: read ngmix_mcal_flags/ngmix_id outside the moments branch;
  they were assigned only in the non-moments (else) branch but used
  unconditionally, so _save_ngmix_data(moments=True) raised NameError.
- make_cat: remove a leftover debug print loop over the ELL columns.
- runners (mccd_interp, read_ext_sexcat, psfex_interp, vignetmaker):
  the WCS log / exp-numbers files are positional inputs supplied via the
  config FILE_PATTERN in MULTI-EPOCH / post-process modes, but the
  @module_runner decorators only declare the base inputs (the optional
  ones cannot be encoded statically without breaking CLASSIC/base mode).
  Replace the bare input_file_list[1]/[2] reads with a clear ValueError
  naming the missing FILE_PATTERN entry instead of a cryptic IndexError.
- ngmix compile_results: tie the hardcoded metacal-type list to
  METACAL_TYPES via a divergence check (order kept stable for
  byte-reproducible output).
- ngmix: drop a stray marker comment and a dead commented-out vignet-path
  block; fold two debug prints into the ValueError message and fix a
  missing f-string prefix in an adjacent error message.
The fast-tests workflow runs 'uv sync --frozen --extra test' then pytest
on the GitHub runner (outside the dev image). Without a [build-system]
table, uv resolved shapepipe as a virtual project (uv.lock recorded
source = { virtual = "." }), so 'uv sync' installed only the
dependencies and not shapepipe itself — every test module failed
collection with 'No module named shapepipe'.

Declare the setuptools build backend (already the de-facto backend via
[tool.setuptools]); uv.lock now records source = { editable = "." } and
'uv sync' puts shapepipe on the path. Verified: 'uv sync --dry-run' now
plans the editable install, and the package builds cleanly with
setuptools.build_meta (wheel contains shapepipe/ + entry points).
test_mpirun_launch_imports_mpi forces an mpirun-style env, which makes
shapepipe.run import mpi4py.MPI. On the bare fast-tests runner mpi4py the
package imports but the MPI runtime library (libmpi.so, provided by the
dev image) is missing, so mpi4py.MPI raises RuntimeError — not the
ImportError that pytest.importorskip('mpi4py') guards against. Guard on
importing mpi4py.MPI directly and skip on any failure, matching the
workflow's self-skip pattern for system-dependent tests. The dev-image CI
still exercises the real MPI path. (run.py's launcher gate is unchanged
and correct; test_bare_launch_skips_mpi already passes everywhere.)

Copy link
Copy Markdown
Contributor Author

CI is green — we believe this PR is ready to merge.

fast-tests, build-test-publish, and the docs deploy all pass on e57544be.

A final pass since the last review round:

Cleanups & review fixes

  • Renamed the moments_fail column → mcal_types_fail / NGMIX_MCAL_TYPES_FAIL through ngmix + make_cat. The v1 name counted get_guess (moments-guess) failures that no longer exist in the v2 fitter; it now counts the metacal types with nonzero fit flags, so the name now matches the meaning.
  • A /code-review pass surfaced and fixed: a latent NameError in make_cat._save_ngmix_data(moments=True) (two arrays were read only in the non-moments branch but used unconditionally); a leftover debug print loop in make_cat (plus two stray prints in ngmix folded into a ValueError, and a missing f-string prefix); clear ValueErrors in the mccd_interp / read_ext_sexcat / psfex_interp / vignetmaker runners when a MULTI-EPOCH/post-process config omits the positional WCS / exp-numbers input (previously a cryptic IndexError — no behaviour change for valid configs); and a divergence guard tying compile_results' metacal-type list to METACAL_TYPES.

CI fix (fast-tests was red, unrelated to the science changes)

  • pyproject.toml had no [build-system] table, so uv resolved shapepipe as a virtual project (uv.lock recorded source = { virtual = "." }) and uv sync installed only the dependencies — every test module failed collection with No module named shapepipe. Declared the setuptools backend (lockfile now editable).
  • test_mpirun_launch_imports_mpi now skips when the MPI runtime library is absent (the bare fast-tests runner; the dev-image CI still exercises the real MPI path), matching the workflow's self-skip pattern for system-dependent tests. run.py's launcher gate is unchanged and correct.

Deferred, intentionally out of scope for this PR

— Claude, on behalf of Cail


Generated by Claude Code

@aguinot

aguinot commented Jun 17, 2026

Copy link
Copy Markdown
Contributor

Hello again Claude/Cail,

I am sorry to insist but the changes to Tpsf are wrong and not necessary. I totally understand that you want to keep the psf prior in a separate issue but this change is introducing a bug and make it "impossible" to properly compute the selection response. I really don't understand why you are introducing it.
I will also answer to #749 with an example but as I suspected, the prior is not an issue.

The test-suite constitution is explicit: tests run inside the dev image
via deploy-image.yml's pytest step — 'the image is the environment, so the
suite exercises exactly what ships. There is no second test workflow.' The
fast-tests job ran a subset (-m 'not slow and not candide') in a bare uv
venv with no system tools, so the binary/MPI tests only self-skipped
there; the in-image 'pytest -rX' run is a strict superset and is the real
gate. Removing the second workflow eliminates the environment divergence
(and the skips that only existed to paper over it).

The [build-system] declaration stays — it is correct independent of CI
(makes 'uv sync' / 'pip install -e .' install the package, per the
documented dev loop); only the comment's fast-tests reference is dropped.

Copy link
Copy Markdown
Contributor Author

Update — consolidated to a single in-image CI path.

Following a quick discussion: rather than keep the separate fast-tests job (which ran a subset in a bare uv venv, with the binary/MPI tests only self-skipping there), I've removed fast-tests.yml. This realigns with the test-suite constitution — "the image is the environment… there is no second test workflow" — and eliminates the environment divergence. deploy-image.yml's in-container pytest -rX is a strict superset of what fast-tests ran, so no coverage is lost; it remains the gate.

Kept from the CI work: the [build-system] declaration (correct independent of CI — it makes uv sync / pip install -e . install the package, per the documented dev loop), and the MPI-launcher test's skip-when-unavailable guard (harmless in-image, where MPI is present).

All checks green on 216f407a; PR is mergeable_state: clean.

— Claude, on behalf of Cail


Generated by Claude Code

@martinkilbinger

Copy link
Copy Markdown
Contributor

@cailmdaley we should maybe propagate both original and reconvolved PSF quantities Tpsf and Tpsf_o, following @aguinot's comment.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants