GH-3558: Bump hadoop to 3.4.3 and properly close buffers for vectored IO by nastra · Pull Request #3620 · apache/parquet-java

nastra · 2026-06-19T13:09:29Z

Rationale for this change

This closes the buffers allocated for vectored IO. After updating to Hadoop > 3.4.x, a number of tests fail because these buffers are not properly closed.

Closes #3558
Closes #3356

What changes are included in this PR?

Are these changes tested?

existing tests

Are there any user-facing changes?

nastra · 2026-06-19T13:35:38Z

/cc @Fokko @wgtmac

iemejia · 2026-06-20T20:15:56Z

Wondering if my approach to the same issue in https://github.com/apache/parquet-java/pull/3579/changes#diff-8da24c84aef62e6e836d073938f7843d289785baaeddf446f3afeae6d4ef4b10R1368 is slightly more robust. Also, for reference: in my change, freeing the resources does not actually take effect, because it is up to Hadoop to do it :(
apache/hadoop#8511

If anyone here knows someone who can help us get this reviewed/merged on the Hadoop side that would be great.

Fokko · 2026-06-21T18:49:31Z

+    // allocated. We track the originals here to ensure they are properly released.
+    ByteBufferAllocator baseAllocator = options.getAllocator();
+    List<ByteBuffer> allocatedBuffers = new ArrayList<>();
+    ByteBufferAllocator trackingAllocator = new ByteBufferAllocator() {
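
The hunk above is cut off at the opening of the anonymous class, so the actual implementation is not visible here. As a rough illustration of the idea only, a tracking wrapper could look something like the sketch below. Everything in it is an assumption: `Allocator` is a hypothetical stand-in interface (not Parquet's `ByteBufferAllocator`), and `HeapAllocator`, `TrackingAllocator` and `outstandingCount` are names invented for the example. The wrapper remembers every buffer it hands out and releases whatever is still outstanding when it is closed.

```java
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.List;

final class TrackingAllocatorSketch {

  // Hypothetical stand-in for an allocator with allocate/release semantics.
  interface Allocator {
    ByteBuffer allocate(int size);

    void release(ByteBuffer buffer);
  }

  // Trivial base allocator that only counts releases, so the effect is observable.
  static final class HeapAllocator implements Allocator {
    int released = 0;

    @Override
    public ByteBuffer allocate(int size) {
      return ByteBuffer.allocate(size);
    }

    @Override
    public void release(ByteBuffer buffer) {
      released++;
    }
  }

  // Wrapper that records each buffer it allocates and releases leftovers on close().
  static final class TrackingAllocator implements Allocator, AutoCloseable {
    private final Allocator delegate;
    private final List<ByteBuffer> outstanding = new ArrayList<>();

    TrackingAllocator(Allocator delegate) {
      this.delegate = delegate;
    }

    @Override
    public synchronized ByteBuffer allocate(int size) {
      ByteBuffer buffer = delegate.allocate(size);
      outstanding.add(buffer);
      return buffer;
    }

    @Override
    public synchronized void release(ByteBuffer buffer) {
      // Match by identity: ByteBuffer.equals compares contents, so two
      // distinct buffers with the same remaining bytes would compare equal.
      for (int i = 0; i < outstanding.size(); i++) {
        if (outstanding.get(i) == buffer) {
          outstanding.remove(i);
          delegate.release(buffer);
          return;
        }
      }
      // Buffers this wrapper did not allocate (for example slices) are ignored.
    }

    synchronized int outstandingCount() {
      return outstanding.size();
    }

    @Override
    public synchronized void close() {
      // Release every original buffer that the caller never handed back.
      for (ByteBuffer buffer : outstanding) {
        delegate.release(buffer);
      }
      outstanding.clear();
    }
  }

  public static void main(String[] args) {
    HeapAllocator base = new HeapAllocator();
    try (TrackingAllocator tracking = new TrackingAllocator(base)) {
      ByteBuffer first = tracking.allocate(16);
      tracking.allocate(32); // never released explicitly
      tracking.release(first);
      System.out.println("outstanding before close: " + tracking.outstandingCount());
    }
    System.out.println("released by base allocator: " + base.released);
  }
}
```

Tracking the original buffers, rather than whatever the reader later hands back, matters in a design like this because a slice or duplicate of a buffer is a different object from the one that was allocated.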


I don't like adding a wrapper here just to fix something in Hadoop. @steveloughran suggested setting the config to avoid checksums on the Hadoop side: #3356 (comment). This doesn't add any value to Parquet anyway. This is more or less what I suggested in #3559.

nastra · Jun 22, 2026

Makes sense, let's go with your PR then.

GH-3558: Properly close buffers for vectored IO

b2456b7

nastra changed the title ~~GH-3558: Properly close buffers for vectored IO~~ GH-3558: Bump hadoop to 3.5.0 and properly close buffers for vectored IO Jun 19, 2026

Update pom.xml

676f1be

nastra changed the title ~~GH-3558: Bump hadoop to 3.5.0 and properly close buffers for vectored IO~~ GH-3558: Bump hadoop to 3.4.3 and properly close buffers for vectored IO Jun 19, 2026

Fokko reviewed Jun 21, 2026


nastra closed this Jun 22, 2026

nastra deleted the close-buffers branch June 22, 2026 10:55

nastra wants to merge 2 commits into apache:master from nastra:close-buffers
