GH-3558: Bump hadoop to 3.4.3 and properly close buffers for vectored IO #3620
nastra wants to merge 2 commits
I'm wondering about my approach to the same issue in https://github.com/apache/parquet-java/pull/3579/changes#diff-8da24c84aef62e6e836d073938f7843d289785baaeddf446f3afeae6d4ef4b10R1368. If anyone here knows someone who can help us get this reviewed and merged on the Hadoop side, that would be great.
// allocated. We track the originals here to ensure they are properly released.
ByteBufferAllocator baseAllocator = options.getAllocator();
List<ByteBuffer> allocatedBuffers = new ArrayList<>();
ByteBufferAllocator trackingAllocator = new ByteBufferAllocator() {
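For readers unfamiliar with the pattern in the snippet above, here is a minimal, self-contained sketch of a tracking allocator. It is not the code from this PR: `BufferAllocator`, `CountingAllocator`, and `TrackingAllocator` are hypothetical stand-ins (the real Parquet `ByteBufferAllocator` is not used here), and the assumption is that the reader may hand back a view of a buffer rather than the original, so the originals are remembered and released when the wrapper is closed.

```java
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.List;

public class TrackingAllocatorSketch {

    /** Hypothetical stand-in for an allocator interface; not Parquet's real API. */
    interface BufferAllocator {
        ByteBuffer allocate(int size);

        void release(ByteBuffer buffer);
    }

    /** Base allocator that counts buffers handed out but not yet released. */
    static final class CountingAllocator implements BufferAllocator {
        private int outstanding = 0;

        @Override
        public ByteBuffer allocate(int size) {
            outstanding++;
            return ByteBuffer.allocate(size);
        }

        @Override
        public void release(ByteBuffer buffer) {
            outstanding--;
        }

        int outstanding() {
            return outstanding;
        }
    }

    /** Wrapper that remembers every original buffer and releases leftovers on close. */
    static final class TrackingAllocator implements BufferAllocator, AutoCloseable {
        private final BufferAllocator delegate;
        private final List<ByteBuffer> allocated = new ArrayList<>();

        TrackingAllocator(BufferAllocator delegate) {
            this.delegate = delegate;
        }

        @Override
        public ByteBuffer allocate(int size) {
            ByteBuffer buffer = delegate.allocate(size);
            allocated.add(buffer);
            return buffer;
        }

        @Override
        public void release(ByteBuffer buffer) {
            // Compare by identity: ByteBuffer.equals compares contents, not instances.
            for (int i = 0; i < allocated.size(); i++) {
                if (allocated.get(i) == buffer) {
                    allocated.remove(i);
                    delegate.release(buffer);
                    return;
                }
            }
            // Not an original (for example a slice): ignore it, close() releases the original.
        }

        @Override
        public void close() {
            for (ByteBuffer buffer : allocated) {
                delegate.release(buffer);
            }
            allocated.clear();
        }
    }

    public static void main(String[] args) {
        CountingAllocator base = new CountingAllocator();
        try (TrackingAllocator tracking = new TrackingAllocator(base)) {
            ByteBuffer original = tracking.allocate(16);
            tracking.allocate(32);
            // Releasing a view does not match any tracked original, so nothing is released yet.
            tracking.release(original.slice());
            System.out.println(base.outstanding()); // prints 2
        }
        System.out.println(base.outstanding()); // prints 0
    }
}
```

The trade-off raised in the review still applies to a wrapper like this: it adds bookkeeping on the caller's side to compensate for behavior in the underlying reader.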
I don't like adding a wrapper here just to fix something in Hadoop. @steveloughran suggested setting the config to avoid checksums on the Hadoop side: #3356 (comment). This doesn't add any value to Parquet anyway. This is more or less what I suggested in #3559.
makes sense, let's go with your PR then.
Rationale for this change
This closes the buffers allocated for vectored IO: after updating to Hadoop > 3.4.x, a number of tests fail because those buffers are not properly closed.
Closes #3558
Closes #3356
What changes are included in this PR?
Are these changes tested?
Covered by existing tests.
Are there any user-facing changes?