Feature/get-sensor-statistic-and-restorable-writer #731
Open
hongzhi-gao wants to merge 23 commits into apache:develop from
… for all timeseries in the file
… model interfaces.
# Conflicts: # cpp/src/file/tsfile_io_writer.h # cpp/src/file/write_file.h
…et-sensor-statistic
jt2594838 approved these changes Feb 28, 2026
    if (cur_device_id != nullptr &&
        (static_cast<unsigned char>(chdr.chunk_type_) & 0x80) != 0) {
        aligned_devices_.insert(cur_device_id->get_table_name());
Contributor
Author
If the recovered chunk is an aligned time chunk, mark this device as aligned to keep post-recovery write behavior consistent.
Contributor
Then why is the table_name being inserted instead of the device itself?
Comment on lines +144 to +145
    auto device_id = std::make_shared<StringArrayDeviceID>(table_name);
    auto* ms_group = new MeasurementSchemaGroup;
Contributor
Author
We rebuild the mutable in-memory schema cache from the recovered schema; table_name is used as the device key, and ms_group supports lazy writer creation later.
…al write_stream position directly from recovered file size
Summary
This PR adds TsFile recovery and continued writing in C++: open an incomplete or corrupted TsFile, truncate bad tail data, and continue appending with the existing tree and table writers. The behavior is aligned with the Java `RestorableTsFileIOWriter` and enables safe recovery after crashes or interrupted writes.
Motivation
- … `close()` can write correct metadata index and file tail.
- … `DeviceTimeseriesMetadataMap` and overloads of `get_timeseries_metadata(device_ids)` / `get_timeseries_metadata()` returning a map; add `get_all_devices()` where appropriate. Only existing devices are included when querying by device list.
Implementation
RestorableTsFileIOWriter (new)
- `open(file_path, truncate_corrupted)` opens with `O_RDWR|O_CREAT` (no `O_TRUNC`) and runs a self-check:
  - … `can_write() = false`.
- … `cur_file_position()` is correct when generating tail metadata later.
- `flush_skip_leading_` is set so that `flush_stream_to_file()` skips these leading bytes and only writes new data. No change to the normal write path.
- `ChunkGroupMeta` entries are pushed via `push_chunk_group_meta()` and marked with `chunk_group_meta_from_recovery_` so `destroy()` does not free them (they live in the recovery arena).
TsFileIOWriter (base)
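To make the `flush_skip_leading_` behavior concrete, here is a minimal sketch rather than the PR's code. `SketchWriter`, `stream`, `skip_leading`, `flush_to`, and `demo_flush` are hypothetical names, and resetting the counter after the first flush is an assumption about the intended semantics.

```cpp
#include <cassert>
#include <cstddef>
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical stand-in for the writer's in-memory stream plus a
// skip counter; this is not the actual TsFileIOWriter layout.
struct SketchWriter {
    std::string stream;            // restored on-disk bytes followed by new bytes
    std::size_t skip_leading = 0;  // leading bytes that are already on disk

    // Writes only the bytes that are not yet on disk, then empties the buffer.
    // Returns the number of bytes written to `out`.
    std::size_t flush_to(std::ostream& out) {
        const std::size_t skip =
            skip_leading < stream.size() ? skip_leading : stream.size();
        const std::size_t n = stream.size() - skip;
        out.write(stream.data() + skip, static_cast<std::streamsize>(n));
        stream.clear();
        skip_leading = 0;  // assumption: later flushes start from an empty buffer
        return n;
    }
};

// Usage: 4 bytes are already on disk, 3 new bytes were appended in memory.
inline std::string demo_flush() {
    SketchWriter w;
    w.stream = "HEADnew";
    w.skip_leading = 4;
    std::ostringstream file;
    const std::size_t written = w.flush_to(file);
    assert(written == 3);
    return file.str();  // only the new bytes reach the output
}
```

With a skip count of zero the function degenerates to an ordinary flush, which matches the PR's statement that the normal write path is unchanged.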
- `flush_skip_leading_`: when > 0, `flush_stream_to_file()` skips that many leading bytes in the stream (already on disk) and writes the rest.
- `chunk_group_meta_from_recovery_`: when true, `destroy()` skips freeing chunk group meta (owned by recovery).
- `push_chunk_group_meta()`, `set_flush_skip_leading()` (protected) and `friend RestorableTsFileIOWriter`.
WriteFile
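As a rough illustration of the file-level helpers that recovery relies on (`truncate(size)`, `seek_to_end()`, `get_position()`, `get_fd()`, and an idempotent `close()`), the sketch below maps them onto standard POSIX calls. `SketchFile` and `open_rw` are hypothetical names, the 0644 mode is an assumption, and the real `WriteFile` class may differ in signatures and error handling.

```cpp
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

// Hypothetical wrapper; method names mirror the PR description only.
class SketchFile {
public:
    // Opens for read/write and creates the file if missing. O_TRUNC is
    // deliberately absent so existing content survives for recovery.
    bool open_rw(const char* path) {
        fd_ = ::open(path, O_RDWR | O_CREAT, 0644);
        return fd_ >= 0;
    }
    // Cuts the file to `size` bytes, dropping a corrupted tail.
    bool truncate(off_t size) {
        return fd_ >= 0 && ::ftruncate(fd_, size) == 0;
    }
    // Moves the file offset to the end so later writes append.
    off_t seek_to_end() { return fd_ < 0 ? -1 : ::lseek(fd_, 0, SEEK_END); }
    // Reports the current file offset without moving it.
    off_t get_position() const {
        return fd_ < 0 ? -1 : ::lseek(fd_, 0, SEEK_CUR);
    }
    int get_fd() const { return fd_; }
    // Idempotent: closing an already closed file succeeds.
    bool close() {
        if (fd_ < 0) return true;
        const int rc = ::close(fd_);
        fd_ = -1;
        return rc == 0;
    }
    ~SketchFile() { close(); }

private:
    int fd_ = -1;
};
```

A recovery routine would typically call `truncate()` with the last verified offset, then `seek_to_end()` before appending.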
- `truncate(size)`, `seek_to_end()`, `get_position()`, and `get_fd()` for recovery (truncate, append position, and reading file content to restore `write_stream_`).
- `close()` is idempotent when already closed.
TsFileWriter / TsFileTableWriter / TsFileTreeWriter
- `init(RestorableTsFileIOWriter* rw)` initializes from a recovered writer: takes schema from the file, does not own `io_writer_` (`io_writer_owned_ = false`). Ensures `time_chunk_writer_` is created when appending after recovery.
- … `RestorableTsFileIOWriter*` for appending after recovery (schema/alignment from restored file).
Reader API refactor
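The map-returning lookup in the reader API, where only devices that exist in the file appear in the result, can be sketched as follows. This is an illustration under stated assumptions: `SketchDeviceId`, `SketchTimeseriesIndex`, `SketchMetadataMap`, and `select_devices` are hypothetical stand-ins, whereas the PR keys its `DeviceTimeseriesMetadataMap` by `IDeviceID` and stores `shared_ptr<ITimeseriesIndex>` values.

```cpp
#include <map>
#include <memory>
#include <string>
#include <vector>

// Stand-in types (assumptions), simplified to keep the sketch self-contained.
struct SketchTimeseriesIndex {
    std::string measurement;
};
using SketchDeviceId = std::string;
using SketchMetadataMap =
    std::map<SketchDeviceId,
             std::vector<std::shared_ptr<SketchTimeseriesIndex>>>;

// Returns metadata for the requested devices, silently skipping any device
// that is not present in `all`.
SketchMetadataMap select_devices(const SketchMetadataMap& all,
                                 const std::vector<SketchDeviceId>& ids) {
    SketchMetadataMap result;
    for (const auto& id : ids) {
        const auto it = all.find(id);
        if (it != all.end()) {
            result.emplace(it->first, it->second);
        }
    }
    return result;
}
```

Callers can therefore detect a missing device by checking whether its key is absent from the returned map, without treating it as an error.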
- … `IDeviceID` → `vector<shared_ptr<ITimeseriesIndex>>`.
- `get_timeseries_metadata(device_ids)` returns a map (only existing devices); `get_timeseries_metadata()` returns metadata for all devices; `get_all_devices()` returns the same as `get_all_device_ids()`.
- `get_all_devices()` returns `vector<shared_ptr<IDeviceID>>` (underlying reader).
Testing
- … `TruncateFile` test for `truncate(size)`.
- … `get_timeseries_metadata()` / `get_timeseries_metadata(device_ids)` and `DeviceTimeseriesMetadataMap`; assertions unchanged.