fix: support missing columns in Mosaic reads #389

Merged: JingsongLi merged 1 commit into apache:main from QuakeWang:mosaic-missing-columns on Jun 16, 2026

@QuakeWang

Member

Purpose

Linked issue: #378

Mosaic reads used to validate the full requested schema before checking the physical Mosaic file schema. When a requested column was missing from an older Mosaic file, an unsupported type in that missing column could fail the read before DataFileReader had a chance to fill it with nulls.
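The ordering problem described above can be illustrated with a minimal, self-contained sketch. It is not the code from this PR: `ColType`, `Field`, `validate`, `filter_present`, and the two `plan_*` functions are hypothetical stand-ins for the requested schema, the physical Mosaic file schema, and the Mosaic type check.

```rust
use std::collections::HashSet;

/// Hypothetical, simplified column types (not paimon's real type system).
#[derive(Clone, Debug, PartialEq)]
enum ColType {
    Int,
    /// Stands in for any type the Mosaic reader cannot decode.
    Unsupported,
}

/// Hypothetical schema field.
#[derive(Clone, Debug, PartialEq)]
struct Field {
    name: String,
    ty: ColType,
}

fn field(name: &str, ty: ColType) -> Field {
    Field { name: name.to_string(), ty }
}

/// Rejects any field whose type the reader cannot decode.
fn validate(fields: &[Field]) -> Result<(), String> {
    match fields.iter().find(|f| f.ty == ColType::Unsupported) {
        Some(f) => Err(format!("unsupported type in column `{}`", f.name)),
        None => Ok(()),
    }
}

/// Keeps only the requested fields that exist in the physical file schema.
fn filter_present(requested: &[Field], physical: &[Field]) -> Vec<Field> {
    let names: HashSet<&str> = physical.iter().map(|f| f.name.as_str()).collect();
    requested
        .iter()
        .filter(|f| names.contains(f.name.as_str()))
        .cloned()
        .collect()
}

/// Old order: validate the full requested schema, then filter.
/// A missing column with an unsupported type fails the whole read.
fn plan_validate_first(requested: &[Field], physical: &[Field]) -> Result<Vec<Field>, String> {
    validate(requested)?;
    Ok(filter_present(requested, physical))
}

/// New order: filter by the physical schema, then validate only
/// the columns that will actually be decoded.
fn plan_filter_first(requested: &[Field], physical: &[Field]) -> Result<Vec<Field>, String> {
    let present = filter_present(requested, physical);
    validate(&present)?;
    Ok(present)
}
```

With a physical schema containing only `id` and a request for `id` plus a missing column of an unsupported type, `plan_validate_first` returns an error while `plan_filter_first` returns just `id`, leaving the missing column for a later null-fill stage.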

Brief change log

  • Filter Mosaic projections by the physical file schema before validating supported Mosaic types.
  • Return empty batches with the correct row count when all requested Mosaic columns are physically missing.
  • Keep missing column null filling in DataFileReader.
  • Add Mosaic reader tests for missing columns, missing unsupported columns, all-missing projections, and row selection.
  • Add a DataFileReader test covering null fill for a physically missing Mosaic column.
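The division of labor in the change log (the format reader returns only physically present columns with the correct row count, and a later stage fills missing columns with nulls) might look roughly like the sketch below. All names here (`MosaicFile`, `Batch`, `read_present`, `fill_missing`) are hypothetical, every column is a nullable `i64` for brevity, and the real implementation works on Arrow record batches rather than these toy types.

```rust
/// Hypothetical stand-in for a physical file: a row count plus named columns.
struct MosaicFile {
    num_rows: usize,
    columns: Vec<(String, Vec<i64>)>,
}

/// Hypothetical batch; the row count is tracked separately so that
/// a batch with zero columns still knows how many rows it covers.
#[derive(Debug, PartialEq)]
struct Batch {
    num_rows: usize,
    columns: Vec<(String, Vec<Option<i64>>)>,
}

/// Format-reader stage: decode only the projected columns that physically
/// exist. If none exist, the result has no columns but keeps `num_rows`.
fn read_present(file: &MosaicFile, projection: &[&str]) -> Batch {
    let columns = projection
        .iter()
        .filter_map(|name| {
            file.columns
                .iter()
                .find(|(n, _)| n.as_str() == *name)
                .map(|(n, v)| (n.clone(), v.iter().copied().map(Some).collect()))
        })
        .collect();
    Batch { num_rows: file.num_rows, columns }
}

/// Later stage (the role DataFileReader plays in the PR description):
/// rebuild the requested column order, null-filling anything missing.
fn fill_missing(batch: Batch, requested: &[&str]) -> Batch {
    let num_rows = batch.num_rows;
    let mut present = batch.columns;
    let columns = requested
        .iter()
        .map(|name| match present.iter().position(|(n, _)| n.as_str() == *name) {
            Some(i) => present.swap_remove(i),
            None => ((*name).to_string(), vec![None; num_rows]),
        })
        .collect();
    Batch { num_rows, columns }
}
```

Carrying `num_rows` on the batch is what makes the all-missing case work in this sketch: `fill_missing` can still produce null columns of the right length even though `read_present` decoded nothing.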

Tests

  • cargo fmt --all -- --check
  • cargo test -p paimon --features mosaic arrow::format::mosaic -- --nocapture
  • cargo test -p paimon --features mosaic table::data_file_reader -- --nocapture
  • cargo check -p paimon-datafusion --features mosaic
  • cargo clippy -p paimon --features mosaic --tests -- -D warnings

API and Format

Documentation

Signed-off-by: QuakeWang <wangfuzheng0814@foxmail.com>

@JingsongLi JingsongLi left a comment

Contributor

+1

@JingsongLi JingsongLi merged commit 1491ee0 into apache:main Jun 16, 2026
8 checks passed