Skip to content

Fix/aligned tv page seal#734

Open
hongzhi-gao wants to merge 11 commits intoapache:developfrom
hongzhi-gao:fix/aligned_tv_page_seal
Open

Fix/aligned tv page seal#734
hongzhi-gao wants to merge 11 commits intoapache:developfrom
hongzhi-gao:fix/aligned_tv_page_seal

Conversation

@hongzhi-gao
Copy link
Contributor

@hongzhi-gao hongzhi-gao commented Mar 2, 2026

Summary

Implement aligned-model synchronized page sealing (matching Java semantics) and fix null-handling bugs in both writer and reader paths.

Changes

  1. Aligned seal sync: When any aligned page hits its capacity threshold (point count or memory), seal all time/value pages together (tsfile_writer.cc).
  2. All-null value pages: Allow zero-sized value buffers in ValuePageData::init; fix end_encode_chunk to correctly handle pre-sealed pages and avoid writing redundant all-null pages in the table model (value_page_writer.cc, value_chunk_writer.cc, time_chunk_writer.cc).
  3. Table-model seal guard: Use get_point_numer() > 0 instead of has_current_page_data() in the table write loop to prevent excessive sealing of all-null pages, which caused heap corruption (0xC0000374) in TestNullInTable4.
  4. Reader null handling: Use null-aware read() in QDSWithoutTimeGenerator::next() to avoid reading past the value buffer for null rows.

Testing

  • New tests: AlignedSealSync_PointCountWithNulls, AlignedSealSync_TimeMemoryFirst, AlignedSealSync_ValueMemoryFirst.
  • Full TsFile_Test passes in both Release and Debug.

Comment on lines -719 to +775
time_write_column(time_chunk_writer, tablet);
ASSERT(value_chunk_writers.size() == tablet.get_column_count());
for (uint32_t c = 0; c < value_chunk_writers.size(); c++) {
ValueChunkWriter* value_chunk_writer = value_chunk_writers[c];
if (IS_NULL(value_chunk_writer)) {
continue;
for (uint32_t row = 0; row < tablet.get_cur_row_size(); row++) {
int32_t time_pages_before = time_chunk_writer->num_of_pages();
std::vector<int32_t> value_pages_before(value_chunk_writers.size(), 0);
for (uint32_t c = 0; c < value_chunk_writers.size(); c++) {
ValueChunkWriter* value_chunk_writer = value_chunk_writers[c];
if (!IS_NULL(value_chunk_writer)) {
value_pages_before[c] = value_chunk_writer->num_of_pages();
}
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changing from a column-based order to a row-based order may cause a performance downgrade.
May add a configuration to optionally sacrifice the page size limit:
If strict_page_size=true, use the row-based insertion with checking the page size each time.
Otherwise, insert each column wholly without the page size check, and perform it at the very end.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Or, when there is no string/blob/text column, the memory size of each column can be directly computed with its length. Thus, we can divide the incoming column into splits that will cause a page seal, and write them one by one in a column-based order.

Comment on lines +898 to 908
}
int32_t time_pages_after = time_chunk_writer->num_of_pages();
if (time_pages_after > time_pages_before) {
for (uint32_t k = 0; k < value_chunk_writers.size(); k++) {
if (!IS_NULL(value_chunk_writers[k]) &&
value_chunk_writers[k]->get_point_numer() > 0 &&
RET_FAIL(
value_chunk_writers[k]->seal_current_page())) {
return ret;
}
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is the number of value pages not checked.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants