Fix Iceberg read optimization returning NULLs for stats-less manifests by il9ue · Pull Request #1895 · Altinity/ClickHouse

il9ue · 2026-06-08T14:12:34Z

Re-opened from Altinity/ClickHouse (instead of my fork) so CI publishes direct .deb package URLs for clickhouse-regression. Same commit (8b597ed) as #1814. Fork PRs don't receive repo secrets, so the S3 upload step was skipped on #1814 and CI only emitted the DEB_ARM_RELEASE artifact zip.

Changelog category (leave one):

Bug Fix

Changelog entry (a user-readable short description of the changes that goes to CHANGELOG.md):

Fix Iceberg read optimization returning NULL for every column when reading from manifests written without per-file column statistics (typical of non-Spark writers like pyiceberg with default settings). Affects icebergLocal, icebergS3, icebergAzure, icebergHDFS, and all *Cluster variants. Antalya 26.3 fix for #1545.

Documentation entry for user-facing changes

Documentation is written (mandatory for new features)

Antalya-specific bug fix on antalya-26.3. No upstream cherry-pick — this bug exists only on Antalya, introduced by #1069 ("Read optimization using Iceberg metadata"). Mirror of the 25.8 fix in #1688.

Why this fires

When reading an Iceberg table written by a non-Spark writer that omits per-file column statistics from the manifest's Avro schema (pyiceberg with default settings, format v1 writers, and others), the allow_experimental_iceberg_read_optimization path produces silent data loss: correct row counts, every column value NULL. Confirmed on icebergLocal; the same code path fires for icebergS3, icebergAzure, icebergHDFS, and all *Cluster variants.

Root cause

IcebergIterator always populates file_meta_info before yielding objects, so the file_meta_data.has_value() check in the optimization passes. The problem is what's inside the populated DataFileMetaInfo: when the manifest's data_file.value_counts / column_sizes / null_value_counts Avro fields are all absent (all three are optional per the Iceberg spec), DataFileMetaInfo::columns_info stays empty.

The optimization's second loop in StorageObjectStorageSource::createReader then iterates every requested column, finds none in the empty columns_info map, and adds them all to constant_columns_with_values with Field() (NULL). requested_columns_copy is cleared, need_only_count = true, the Parquet reader returns row count only, and generate() injects every column as a constant-NULL column at the correct row count. The optimization conflates "no stats were written" with "all columns are absent" — but absent stats tell us nothing about which columns are physically present.

The fix

Add any_stats_field_present (bool) to DataFileMetaInfo, populated during manifest parsing in AvroForIcebergDeserializer.cpp — true if any of value_counts, column_sizes, or null_value_counts were emitted. Gate the optimization's absent-NULL loop on this flag: when no stats were emitted, skip the loop and fall through to the Parquet reader, which correctly handles both physically-present columns (read normally) and schema-evolved-absent columns (handled upstream by IcebergMetadata::getInitialSchemaByPath setting the file's own schema as initial_header).

A per-column presence set was considered but is unnecessary — schema evolution is already handled upstream of the optimization, so the boolean is sufficient. JSON serialization (cluster reads via toJson() / JSON-ptr constructor) round-trips the new field; missing-on-deserialization defaults to false, matching pre-fix behavior.

Files changed

src/Storages/ObjectStorage/DataLakes/IDataLakeMetadata.h — added any_stats_field_present to DataFileMetaInfo; constructor signature updated.
src/Storages/ObjectStorage/DataLakes/IDataLakeMetadata.cpp — JSON serde round-trips the new field; defaults to false on missing.
src/Storages/ObjectStorage/DataLakes/Iceberg/ManifestFile.h — header updates for ParsedManifestFileEntry.
src/Storages/ObjectStorage/DataLakes/Common/AvroForIcebergDeserializer.cpp — tracks whether any stats Avro field was present during manifest parsing on 26.3.
src/Storages/ObjectStorage/DataLakes/Iceberg/IcebergIterator.cpp — forwards the new bool when constructing DataFileMetaInfo.
src/Storages/ObjectStorage/StorageObjectStorageSource.cpp — the absent-NULL loop now skips when any_stats_field_present is false.

Note: 26.3 uses AvroForIcebergDeserializer.cpp for manifest parsing where 25.8 / 26.1 use ManifestFile.cpp (file was split upstream). Same logic, different file.

Tested

Integration test tests/integration/test_storage_iceberg_no_spark/test_iceberg_read_optimization_empty_stats.py, ported from the 25.8 PR. Four scenarios:

test_iceberg_local_returns_actual_rows_with_stats_less_manifest — reproducer, fails without the fix.
test_iceberg_local_returns_correct_rows_when_optimization_disabled — control.
test_iceberg_local_partial_stats_manifest_reads_correctly — manifest with value_counts only.
test_iceberg_local_full_stats_manifest_reads_correctly — full Spark-style stats regression guard.

Closes #1545
Mirror of #1688 (antalya-25.8 fix).

CI/CD Options

Exclude tests:

Regression jobs to run:

When an Iceberg manifest's per-file column statistics are absent or empty (common for non-Spark writers like pyiceberg with default settings), DataFileMetaInfo::columns_info is empty. The optimization in StorageObjectStorageSource::createReader misread this as "all columns are absent from the file" and returned constant NULLs for every row while still returning the correct row count. Result: silent data loss on icebergLocal, icebergS3, icebergAzure, icebergHDFS, and all *Cluster variants. Gate the optimization's absent-NULL loop directly on columns_info.empty() instead of introducing a separate stats-presence flag. When no usable per-column stats were parsed -- whether the manifest omitted the stats fields entirely or declared them but left them empty -- fall through to the Parquet reader, which correctly handles physically-present columns (read normally) and schema-evolved-absent columns (handled by IcebergMetadata::getInitialSchemaByPath setting the file's own schema as initial_header). columns_info is already serialized to workers in the cluster JSON path, so this changes no serialization format and keeps the fork's DataFileMetaInfo serde identical to upstream. Closes #1545. Mirror of #1688 (antalya-25.8 fix). Signed-off-by: Daniel Q. Kim <daniel.kim@altinity.com>

github-actions · 2026-06-08T14:13:54Z

Workflow [PR], commit [5ec03a9]

alsugiliazova · 2026-06-11T12:33:17Z

Audit update for PR #1895

AI audit note: This review comment was generated by AI (claude-opus-4.7).

Mirror of: Altinity/ClickHouse#1688 (antalya-25.8 fix). Closes Altinity/ClickHouse#1545. Re-open of the closed fork-based PR Altinity/ClickHouse#1814 (same commit 8b597ed / 5ec03a9), re-filed from a repo branch so CI secrets are available and the .deb package URLs are published for clickhouse-regression.

Confirmed defects

No confirmed defects in reviewed scope.

alsugiliazova · 2026-06-11T12:35:48Z

PR #1895 CI Verification — Fix Iceberg read optimization returning NULLs for stats-less manifests

Head branch: fix/iceberg-empty-stats-26.3
Head SHA: 5ec03a9700f77ba6dd465e84b4775d4a7e4323e9
Verification date: 2026-06-11

Verdict

No PR-caused regressions identified. All red checks are infrastructure timeouts, pre-existing flaky tests, or a pre-existing UBSan bug in unrelated code (ToStartOfInterval integer overflow). The PR's own 4 new integration tests pass on every job that ran them across 5 build variants.

Job status summary (current SHA)

Status	Count
SUCCESS	564
FAILURE	13 (jobs + rollup rows)
ERROR	3
SKIPPED	199 (build matrix exclusions)

Failing checks on current SHA `5ec03a97` (latest run `27143685017`)

#	Check	Failure	Category
1	`Stress test (arm_asan, s3)`	`Cannot start clickhouse-server`	Pre-existing flaky
2	`AST fuzzer (amd_ubsan)`	UBSan integer overflow in `ToStartOfInterval<Year>`	Pre-existing upstream bug, unrelated
3	`Integration tests (amd_msan, 4/6)`	Job-level `error`, 0 test fails	Harness timeout
4	`Integration tests (arm_binary, distributed plan, 2/4)`	Job-level `error`, 0 test fails	Harness timeout
5	`SQLLogic test`	Job-level `error`, 0 test fails	Harness timeout (3h limit)
6	`RegressionTestsRelease / S3Export (part) / s3_export_part`	`could not bring up docker-compose cluster`	Infrastructure

1. `Stress test (arm_asan, s3)` — pre-existing flake

Test name Cannot start clickhouse-server: 88 failures across 34 unrelated PRs in the last 60 days (since 2026-04-30). Recurring infrastructure / startup flake.

2. `AST fuzzer (amd_ubsan)` — pre-existing upstream UBSan bug

Stack trace from stderr.log:

/ClickHouse/src/Functions/DateTimeTransforms.h:817:79: runtime error:
  signed integer overflow: 9223372036854775806 * 12 cannot be represented in type 'Int64'
  #0  DB::ToStartOfInterval<(IntervalKind::Kind)10>::execute(...)  Functions/DateTimeTransforms.h:817
  #1  DB::FunctionToStartOfInterval::execute<DataTypeDate32, ...>  Functions/toStartOfInterval.cpp:437
  ...
SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior

The bug is in Functions/DateTimeTransforms.h and Functions/toStartOfInterval.cpp — completely unrelated to Iceberg / object storage code touched by this PR.
AST fuzzer (amd_ubsan) "Unknown error" hits in last 60 days: PR=0 (master) ×4, plus PRs 1645, 1568, 1803, 1895 — generic recurring fuzzer bucket.
This is a fuzzer-discovered upstream UBSan issue, not a regression introduced by PR Fix Iceberg read optimization returning NULLs for stats-less manifests #1895.

3–5. Integration tests (amd_msan 4/6, arm_binary distributed plan 2/4) and SQLLogic test — harness timeouts

All three jobs are reported as error in the database with 0 test failures — meaning every test that completed passed; the job itself was killed by the harness timeout.

From Integration tests (amd_msan, 4/6) job log:

[2026-06-08 22:32:51] PASSED test_multiple_disks/test.py::test_concurrent_alter_modify[mt]
[2026-06-08 22:36:10] PASSED test_multiple_disks/test.py::test_concurrent_alter_modify[replicated]
[2026-06-08 22:37:34] WARNING: Timeout exceeded [11400], sending SIGTERM to process group
[2026-06-08 22:49:53] PASSED test_ttl_replicated/test.py::test_ttl_drop_parts_limit
[2026-06-08 22:49:53] === 3 passed in 1218.94s ===
[2026-06-08 22:49:54] Test execution was interrupted (exit status: 2)

From SQLLogic test job log:

[2026-06-08 22:51:52] WARNING: Timeout exceeded [10800] for [3952]
[2026-06-08 22:51:55] WARNING: Job timed out: [SQLLogic test], timeout [10800], exit code [137]

All three are runner-level timeouts (3h / 11400s limits), not regressions. None of these jobs touch the Iceberg/StorageObjectStorageSource code path.

6. `S3Export (part)` regression suite — infrastructure failure

From the report:

/s3 Fail 52s could not bring up docker-compose cluster
/s3/minio Fail 52s could not bring up docker-compose cluster

The test environment failed to start in 52 seconds (no test ever ran). Pure infrastructure issue — completely independent of the PR's source change to Iceberg read optimization.

PR's own new tests — all passing

tests/integration/test_storage_iceberg_no_spark/test_iceberg_read_optimization_empty_stats.py includes 4 test cases:

test_iceberg_local_full_stats_manifest_reads_correctly
test_iceberg_local_partial_stats_manifest_reads_correctly
test_iceberg_local_returns_correct_rows_when_optimization_disabled
test_iceberg_local_returns_actual_rows_with_stats_less_manifest

All 4 tests passed in all 5 integration jobs that scheduled them on this SHA:

Check	Status (4/4 cases)
`Integration tests (amd_tsan, 5/6)`	OK
`Integration tests (amd_msan, 5/6)`	OK
`Integration tests (amd_binary, 5/5)`	OK
`Integration tests (amd_asan, db disk, old analyzer, 5/6)`	OK
`Integration tests (arm_binary, distributed plan, 4/4)`	OK

That is 20 / 20 PASS for the PR's own coverage — the fix's behavior is consistently exercised across debug, release, sanitizer, alternative-analyzer, and ARM builds.

Recommendations

Rerun the failing jobs to clear the harness-timeout reds:
- Integration tests (amd_msan, 4/6)
- Integration tests (arm_binary, distributed plan, 2/4)
- SQLLogic test
Rerun Stress test (arm_asan, s3) to clear the known startup flake.
Rerun RegressionTestsRelease / S3Export (part) (infra docker-compose failure).
The AST fuzzer (amd_ubsan) UBSan finding in ToStartOfInterval is an upstream bug worth tracking separately, but does not block this PR — the touched code is unrelated.
Merge once jobs are green; no PR-introduced regressions detected.

il9ue mentioned this pull request Jun 8, 2026

Fix Iceberg read optimization returning NULLs for stats-less manifests (#1545) — antalya-26.3 #1814

Closed

25 tasks

ianton-ru approved these changes Jun 8, 2026

View reviewed changes

alsugiliazova added the verified Approved for release label Jun 11, 2026

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Fix Iceberg read optimization returning NULLs for stats-less manifests#1895

Fix Iceberg read optimization returning NULLs for stats-less manifests#1895
il9ue wants to merge 1 commit into
antalya-26.3from
fix/iceberg-empty-stats-26.3

il9ue commented Jun 8, 2026

Uh oh!

github-actions Bot commented Jun 8, 2026

Uh oh!

alsugiliazova commented Jun 11, 2026

Uh oh!

alsugiliazova commented Jun 11, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

Conversation

il9ue commented Jun 8, 2026

Changelog category (leave one):

Changelog entry (a user-readable short description of the changes that goes to CHANGELOG.md):

Documentation entry for user-facing changes

Why this fires

Root cause

The fix

Files changed

Tested

CI/CD Options

Exclude tests:

Regression jobs to run:

Uh oh!

github-actions Bot commented Jun 8, 2026

Uh oh!

alsugiliazova commented Jun 11, 2026

Audit update for PR #1895

Confirmed defects

Uh oh!

alsugiliazova commented Jun 11, 2026

PR #1895 CI Verification — Fix Iceberg read optimization returning NULLs for stats-less manifests

Verdict

Job status summary (current SHA)

Failing checks on current SHA 5ec03a97 (latest run 27143685017)

1. Stress test (arm_asan, s3) — pre-existing flake

2. AST fuzzer (amd_ubsan) — pre-existing upstream UBSan bug

3–5. Integration tests (amd_msan 4/6, arm_binary distributed plan 2/4) and SQLLogic test — harness timeouts

6. S3Export (part) regression suite — infrastructure failure

PR's own new tests — all passing

Recommendations

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

Failing checks on current SHA `5ec03a97` (latest run `27143685017`)

1. `Stress test (arm_asan, s3)` — pre-existing flake

2. `AST fuzzer (amd_ubsan)` — pre-existing upstream UBSan bug

6. `S3Export (part)` regression suite — infrastructure failure