Summary
pl.show() computes the figure's axis bounds via get_extent(sdata, coordinate_system=cs, exact=True) (basic.py:1831). For shapes/points, spatialdata's exact=True path transforms every geometry into the coordinate system (per-geometry shapely affine, O(N)) just to take a bounding box. For large shape collections this dominates the render and is unrelated to what is being drawn.
Measured on the real Visium HD dataset (render_shapes):
| shapes | get_extent exact=True | exact=False | result |
|---|---|---|---|
| 351,817 | 6.9 s | 0.97 s | identical |
| 5,479,660 | 109 s | 14.6 s | identical |
At 5.5M shapes this get_extent call is ~85% of the whole render (~109 s of ~128 s), independent of method/as_points.
The heuristic
exact=False transforms only the bounding-box corners instead of all geometries. Mathematically:
exact=True = bbox({T(g) for every geometry g})
exact=False = bbox(T(corners of the intrinsic bbox))
Since every geometry lies inside the intrinsic bbox, the exact=False extent always contains the exact=True extent (exact=False ⊇ exact=True). The two are guaranteed to coincide when the transform maps axis-aligned boxes to axis-aligned boxes, i.e. when the 2×2 linear part is a monomial matrix (exactly one non-zero per row and per column): scale, axis flips, 90°/180°/270° rotations, axis swaps (+ any translation). Only shear or a rotation by a non-multiple of 90° can make exact=False over-estimate. spatialdata's own get_extent docstring is consistent with this: "the exact and approximate extent are the same if the transformation does not contain any rotation or shear."
This covers essentially all real Visium/Xenium/MERFISH data (readers produce scale + translation).
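The containment and equality claims above can be checked numerically. The following is a minimal NumPy sketch, not spatialdata's implementation: points stand in for geometries, and `bbox`, `exact_extent` and `approx_extent` are hypothetical helper names introduced only for this illustration.

```python
import numpy as np


def bbox(points):
    """Axis-aligned bounding box of an (N, 2) array as (min_xy, max_xy)."""
    return points.min(axis=0), points.max(axis=0)


def exact_extent(points, linear, offset):
    """Analogue of exact=True: transform every point, then take the bbox."""
    return bbox(points @ linear.T + offset)


def approx_extent(points, linear, offset):
    """Analogue of exact=False: transform only the 4 corners of the intrinsic bbox."""
    (x0, y0), (x1, y1) = bbox(points)
    corners = np.array([[x0, y0], [x0, y1], [x1, y0], [x1, y1]])
    return bbox(corners @ linear.T + offset)


rng = np.random.default_rng(0)
pts = rng.uniform(0.0, 100.0, size=(1000, 2))
offset = np.array([10.0, -5.0])

# Monomial linear part: axis swap combined with a scale and a flip.
monomial = np.array([[0.0, 2.0], [-0.5, 0.0]])

# Genuine rotation by 30 degrees (not a multiple of 90).
theta = np.deg2rad(30.0)
rotation = np.array([[np.cos(theta), -np.sin(theta)],
                     [np.sin(theta), np.cos(theta)]])

mono_exact = exact_extent(pts, monomial, offset)
mono_approx = approx_extent(pts, monomial, offset)
rot_exact = exact_extent(pts, rotation, offset)
rot_approx = approx_extent(pts, rotation, offset)

print("monomial:", mono_exact, mono_approx)
print("rotation:", rot_exact, rot_approx)
```

For the monomial matrix the two extents agree; for the 30° rotation the corner-based extent is a strict superset, which is the case the proposal keeps on the exact=True path.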
Proposed change (spatialdata-plot)
In show(), before calling get_extent, inspect the transforms of the wanted shapes/points elements to the coordinate system; if all are axis-aligned, pass exact=False, else keep exact=True.
    import numpy as np

    def _is_axis_aligned(linear2x2, *, rtol=1e-9):
        """Sends axis-aligned boxes to axis-aligned boxes (scale/flip/90deg/swap) -> exact==approx."""
        m = np.asarray(linear2x2, dtype=float)
        nz = np.abs(m) > rtol * (np.abs(m).max() or 1.0)  # relative tol: ignore float noise
        # monomial: at most one non-zero per column and per row, and no all-zero row
        return bool((nz.sum(0) <= 1).all() and (nz.sum(1) <= 1).all() and nz.sum() == m.shape[0])
- Conservative: any rotated/sheared element among the rendered ones → keep exact=True for the whole call, so output is never wrong.
- Validated end-to-end: render_shapes(as_points=True) on 351k shapes went 8.5 s → 2.6 s (3.2×) by forcing exact=False; 5.5M projected ~128 s → ~33 s (~4×). Extents identical for non-rotation transforms.
- Speeds up the non-as_points shapes render too, since the extent cost is shared.
- Use a relative tolerance in the monomial check so float noise in to_affine_matrix isn't misread as shear.
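One way the per-call decision described above could look, as a rough sketch rather than the actual spatialdata-plot code. It assumes each rendered shapes/points element has already been reduced to a 3×3 homogeneous (x, y) affine matrix (in spatialdata-plot these would presumably come from to_affine_matrix); `choose_exact` is a hypothetical name, and the monomial check is restated only so the block runs on its own.

```python
import numpy as np


def is_axis_aligned(linear2x2, *, rtol=1e-9):
    """Monomial check, restated from the proposal so this sketch is standalone."""
    m = np.abs(np.asarray(linear2x2, dtype=float))
    nz = m > rtol * (m.max() or 1.0)  # relative tolerance: ignore float noise
    return bool((nz.sum(0) <= 1).all() and (nz.sum(1) <= 1).all() and nz.sum() == m.shape[0])


def choose_exact(affines):
    """Return the `exact` flag to pass to get_extent.

    Hypothetical helper: `affines` is an iterable of 3x3 homogeneous matrices,
    one per rendered shapes/points element, mapping it to the coordinate system.
    """
    for affine in affines:
        a = np.asarray(affine, dtype=float)
        if not is_axis_aligned(a[:2, :2]):
            return True  # any rotation/shear: keep the exact path for the whole call
    return False  # all axis-aligned: the corner-only path yields the same extent


scale_translate = np.array([[2.0, 0.0, 10.0],
                            [0.0, 3.0, -5.0],
                            [0.0, 0.0, 1.0]])

theta = np.deg2rad(30.0)
rotated = np.array([[np.cos(theta), -np.sin(theta), 0.0],
                    [np.sin(theta), np.cos(theta), 0.0],
                    [0.0, 0.0, 1.0]])

# Scale with off-diagonal float noise, which the relative tolerance should ignore.
noisy_scale = scale_translate.copy()
noisy_scale[0, 1] = 1e-15

print(choose_exact([scale_translate, noisy_scale]))
print(choose_exact([scale_translate, rotated]))
```

The translation column is deliberately ignored: only the 2×2 linear part decides whether boxes stay axis-aligned.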
Acceptance / tests
- exact=False is chosen for a pure scale/translation element; exact=True for a rotated one.
- A render benchmark / assertion that extents are unchanged for scale transforms.
Note: upstream follow-up (spatialdata)
Even get_extent(exact=False) still has an O(N) Python-level floor: _get_extent_of_shapes filters out empty geometries with e["geometry"].apply(lambda g: not g.is_empty) (~14.6 s of the 5.5M number); the vectorized ~e.geometry.is_empty is 128× faster. The deeper fix is in spatialdata itself: (1) auto-detect axis-aligned transforms inside get_extent so even exact=True uses the cheap corner-transform path (fixes squidpy/napari too), and (2) vectorize that empty filter. The spatialdata-plot heuristic above is the immediate, self-contained mitigation.
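To illustrate point (2), here is a small sketch comparing the two filters on a toy GeoDataFrame. It assumes geopandas and shapely 2.x are installed; the four-row frame is made up for illustration and says nothing about timings.

```python
import geopandas as gpd
from shapely.geometry import Point, Polygon

# Toy data: two real points plus an empty polygon and an empty point.
gdf = gpd.GeoDataFrame({"geometry": [Point(0, 0), Polygon(), Point(1, 2), Point()]})

# Per-row Python callback, as quoted from _get_extent_of_shapes above.
mask_apply = gdf["geometry"].apply(lambda g: not g.is_empty)

# Vectorized equivalent using the GeoSeries.is_empty property.
mask_vectorized = ~gdf.geometry.is_empty

non_empty = gdf[mask_vectorized]
print(mask_apply.tolist(), mask_vectorized.tolist(), len(non_empty))
```

Both masks select the same rows, so swapping one for the other should not change which geometries feed into the extent.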
Context: surfaced while profiling render_shapes/render_labels(as_points=True) on Visium HD; the geometry-transform in get_extent, not the scatter, is the bottleneck.