Skip to content

Fix partial insert handling for null measurements#17879

Open
Caideyipi wants to merge 2 commits into
masterfrom
fix-partial-insert-null-columns
Open

Fix partial insert handling for null measurements#17879
Caideyipi wants to merge 2 commits into
masterfrom
fix-partial-insert-null-columns

Conversation

@Caideyipi

Copy link
Copy Markdown
Collaborator

Summary:

  • Skip failed/null measurements and null tablet columns consistently in insert serialization, split, WAL, memtable, memory estimation, and pipe tablet parsing/conversion paths.
  • Keep null row values during pipe row type transfer and avoid composing last-cache values for null retained row/tablet values.
  • Add regression coverage for partial insert, null column/value, table non-FIELD columns, pipe parser/converter, and memory estimation paths.

Tests:

  • mvn -pl iotdb-core/datanode spotless:apply
  • git diff --check
  • Narrow datanode test lifecycle could not complete locally after recreating the branch on master: single-module compile first hit missing generated upstream classes, then -am compile rebuilt upstream modules but failed on Windows pagefile/native memory while compiling thrift-datanode. Before recreating the branch, the same narrow Surefire regression suite passed with 99 tests.

@jt2594838 jt2594838 left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use a unified entry point to make null-related judgments instead of scattering them around.

Comment on lines +441 to +449
protected Object[] initTabletValuesForSplit(int columnSize, int rowSize, TSDataType[] dataTypes) {
Object[] values = initTabletValues(columnSize, rowSize, dataTypes);
for (int i = 0; i < values.length; i++) {
if (!hasColumnForSplit(i)) {
values[i] = null;
}
}
return values;
}

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

May init nulls directly in initTabletValues, avoid creating unnecessary objects


public boolean isValidMeasurement(int i) {
return measurementSchemas != null
return measurements != null

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How could measurements be null?

Comment on lines +340 to +370
private static int getValidMeasurementNumber(InsertRowNode insertNode, boolean countFieldOnly) {
int count = 0;
for (int i = 0; i < insertNode.getMeasurements().length; i++) {
if (isValidMeasurement(insertNode, i, countFieldOnly)) {
count++;
}
}
return count;
}

private static int getValidMeasurementNumber(
InsertTabletNode insertNode, boolean countFieldOnly) {
int count = 0;
for (int i = 0; i < insertNode.getMeasurements().length; i++) {
if (isValidMeasurement(insertNode, i, countFieldOnly) && insertNode.getColumns()[i] != null) {
count++;
}
}
return count;
}

private static boolean isValidMeasurement(
InsertNode insertNode, int index, boolean countFieldOnly) {
return insertNode.getMeasurements()[index] != null
&& (!countFieldOnly || isFieldMeasurement(insertNode, index));
}

private static boolean isFieldMeasurement(InsertNode insertNode, int index) {
return insertNode.getColumnCategories() == null
|| insertNode.getColumnCategories()[index] == TsTableColumnCategory.FIELD;
}

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

May encapsulate these methods into InsertNode to reduce redundancy and improve clarity.

Comment on lines 69 to 84
public static long getRowRecordSize(List<TSDataType> dataTypes, Object[] value) {
int emptyRecordCount = 0;
return getRowRecordSize(dataTypes, value, null);
}

public static long getRowRecordSize(
List<TSDataType> dataTypes, Object[] value, TsTableColumnCategory[] columnCategories) {
int dataTypeIndex = 0;
long memSize = 0L;
for (int i = 0; i < value.length; i++) {
if (value[i] == null) {
emptyRecordCount++;
if (value[i] == null || !isFieldColumn(columnCategories, i)) {
continue;
}
memSize += getRecordSize(dataTypes.get(i - emptyRecordCount), value[i], false);
memSize += getRecordSize(dataTypes.get(dataTypeIndex++), value[i], false);
}
return memSize;
}

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

dataTypes is null-free?
If you enforce some restrictions on it now, you should add some comment to mention it.

@sonarqubecloud

sonarqubecloud Bot commented Jun 9, 2026

Copy link
Copy Markdown

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants