
feat: add Mosaic row-group predicate pruning #393

Open
QuakeWang wants to merge 1 commit into apache:main from QuakeWang:mosaic-pruning

@QuakeWang

Member

Purpose

Linked issue: Related to #378

Mosaic reads previously treated data predicates as residual-only filters, so every row group was read even when its Mosaic row-group statistics could prove that none of its rows could match.

This PR adds conservative row-group predicate pruning for Mosaic files while preserving residual filtering semantics.
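As a rough illustration of what "conservative pruning while preserving residual filtering" means, here is a minimal sketch for a single `value > threshold` predicate. The types `IntStats` and `RowGroup` and the functions `can_skip_gt` and `scan_gt` are hypothetical names invented for this example; they are not the types or functions introduced by this PR.

```rust
/// Hypothetical min/max statistics for one integer column in one row group.
#[derive(Debug, Clone, Copy)]
pub struct IntStats {
    pub min: i64,
    pub max: i64,
}

/// Hypothetical row group: optional statistics plus the stored values.
pub struct RowGroup {
    pub stats: Option<IntStats>,
    pub values: Vec<i64>,
}

/// Returns true only when the statistics prove that no row can satisfy
/// `value > threshold`. Missing statistics never allow a skip.
pub fn can_skip_gt(stats: Option<IntStats>, threshold: i64) -> bool {
    match stats {
        Some(s) => s.max <= threshold,
        None => false, // no stats: keep the row group
    }
}

/// Skips row groups that provably cannot match, then still applies the
/// predicate row by row as a residual filter on the surviving groups.
pub fn scan_gt(groups: &[RowGroup], threshold: i64) -> Vec<i64> {
    groups
        .iter()
        .filter(|g| !can_skip_gt(g.stats, threshold))
        .flat_map(|g| g.values.iter().copied())
        .filter(|v| *v > threshold)
        .collect()
}
```

Because the residual filter still runs on every surviving row, pruning in this sketch can only reduce the amount of data read; it cannot change the result set.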

Brief change log

  • Add a Mosaic StatsAccessor over row-group ColumnStats.
  • Map file-level Paimon fields to Mosaic column indices by name.
  • Convert supported Mosaic stats values into Paimon Datum for shared stats pruning.
  • Fail open when stats are missing or cannot be interpreted safely.
  • Update ReadBuilder docs to describe generic format-level reader pruning.
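The name-based mapping and fail-open behaviour listed above could look roughly like the following sketch. It is not the code from this PR: `RawStat`, the simplified `Datum` enum, `to_datum`, and this `StatsAccessor` shape are all assumptions made for illustration, and the real Paimon and Mosaic types will differ.

```rust
use std::collections::HashMap;

/// Hypothetical raw statistics value as stored by the file format.
#[derive(Debug, Clone, PartialEq)]
pub enum RawStat {
    Int(i64),
    Utf8(String),
    Opaque(Vec<u8>), // a type this sketch cannot interpret safely
}

/// Simplified stand-in for the engine-side value used in comparisons.
#[derive(Debug, Clone, PartialEq)]
pub enum Datum {
    Long(i64),
    Str(String),
}

/// Converts a supported raw value; unsupported values yield `None`.
pub fn to_datum(raw: &RawStat) -> Option<Datum> {
    match raw {
        RawStat::Int(v) => Some(Datum::Long(*v)),
        RawStat::Utf8(s) => Some(Datum::Str(s.clone())),
        RawStat::Opaque(_) => None,
    }
}

/// Hypothetical accessor over one row group's per-column (min, max) stats.
pub struct StatsAccessor<'a> {
    index_by_name: HashMap<&'a str, usize>,
    stats: &'a [Option<(RawStat, RawStat)>],
}

impl<'a> StatsAccessor<'a> {
    /// Builds the field-name to column-index map from the file's column order.
    pub fn new(file_columns: &'a [&'a str], stats: &'a [Option<(RawStat, RawStat)>]) -> Self {
        let index_by_name: HashMap<&'a str, usize> = file_columns
            .iter()
            .enumerate()
            .map(|(i, name)| (*name, i))
            .collect();
        Self { index_by_name, stats }
    }

    /// Returns (min, max) for a field, or `None` when the field is unknown,
    /// stats are missing, or a value cannot be converted. A caller treats
    /// `None` as "cannot prune" and keeps the row group.
    pub fn min_max(&self, field: &str) -> Option<(Datum, Datum)> {
        let idx = *self.index_by_name.get(field)?;
        let (min, max) = self.stats.get(idx)?.as_ref()?;
        Some((to_datum(min)?, to_datum(max)?))
    }
}
```

Returning `Option` at every step keeps the failure path uniform: any gap in the chain collapses to `None`, and the pruner in this sketch would then read the row group rather than risk dropping matching rows.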

Tests

  • cargo test -p paimon --features mosaic arrow::format::mosaic::tests -- --nocapture
  • cargo fmt --check

API and Format

Documentation

Signed-off-by: QuakeWang <wangfuzheng0814@foxmail.com>