Conversation
…eader handling of null-only value pages so that Debug builds pass.
| time_write_column(time_chunk_writer, tablet); | ||
| ASSERT(value_chunk_writers.size() == tablet.get_column_count()); | ||
| for (uint32_t c = 0; c < value_chunk_writers.size(); c++) { | ||
| ValueChunkWriter* value_chunk_writer = value_chunk_writers[c]; | ||
| if (IS_NULL(value_chunk_writer)) { | ||
| continue; | ||
| for (uint32_t row = 0; row < tablet.get_cur_row_size(); row++) { | ||
| int32_t time_pages_before = time_chunk_writer->num_of_pages(); | ||
| std::vector<int32_t> value_pages_before(value_chunk_writers.size(), 0); | ||
| for (uint32_t c = 0; c < value_chunk_writers.size(); c++) { | ||
| ValueChunkWriter* value_chunk_writer = value_chunk_writers[c]; | ||
| if (!IS_NULL(value_chunk_writer)) { | ||
| value_pages_before[c] = value_chunk_writer->num_of_pages(); | ||
| } | ||
| } |
There was a problem hiding this comment.
Changing from a column-based order to a row-based order may cause a performance downgrade.
May add a configuration to optionally sacrifice the page size limit:
If strict_page_size=true, use the row-based insertion with checking the page size each time.
Otherwise, insert each column wholly without the page size check, and perform it at the very end.
There was a problem hiding this comment.
Or, when there is no string/blob/text column, the memory size of each column can be directly computed with its length. Thus, we can divide the incoming column into splits that will cause a page seal, and write them one by one in a column-based order.
| } | ||
| int32_t time_pages_after = time_chunk_writer->num_of_pages(); | ||
| if (time_pages_after > time_pages_before) { | ||
| for (uint32_t k = 0; k < value_chunk_writers.size(); k++) { | ||
| if (!IS_NULL(value_chunk_writers[k]) && | ||
| value_chunk_writers[k]->get_point_numer() > 0 && | ||
| RET_FAIL( | ||
| value_chunk_writers[k]->seal_current_page())) { | ||
| return ret; | ||
| } | ||
| } |
There was a problem hiding this comment.
Why is the number of value pages not checked.
Summary
Implement aligned-model synchronized page sealing (matching Java semantics) and fix null-handling bugs in both writer and reader paths.
Changes
tsfile_writer.cc).ValuePageData::init; fixend_encode_chunkto correctly handle pre-sealed pages and avoid writing redundant all-null pages in the table model (value_page_writer.cc,value_chunk_writer.cc,time_chunk_writer.cc).get_point_numer() > 0instead ofhas_current_page_data()in the table write loop to prevent excessive sealing of all-null pages, which caused heap corruption (0xC0000374) inTestNullInTable4.read()inQDSWithoutTimeGenerator::next()to avoid reading past the value buffer for null rows.Testing
AlignedSealSync_PointCountWithNulls,AlignedSealSync_TimeMemoryFirst,AlignedSealSync_ValueMemoryFirst.TsFile_Testpasses in both Release and Debug.