Support rank function in TopK optimization of window function#17210
Support rank function in TopK optimization of window function#17210
Conversation
There was a problem hiding this comment.
Pull request overview
This pull request extends the TopK optimization for window functions to support the rank() function in addition to the previously supported row_number() function. The key difference between these functions is that rank() assigns the same rank value to rows with equal sort key values (peers), while row_number() assigns unique sequential numbers to all rows.
Changes:
- Implemented rank() window function support in TopK optimization with proper handling of tied values (peer groups)
- Prevented sort elimination and sort-to-stream-sort transformations for TopKRanking and RowNumber operators since they may accept out-of-order data
- Added comprehensive test coverage for rank() functionality including edge cases with ties and multiple partitions
Reviewed changes
Copilot reviewed 12 out of 12 changed files in this pull request and generated 5 comments.
Show a summary per file
| File | Description |
|---|---|
| WindowFunctionOptimizationTest.java | Updated test queries from row_number() to rank() to verify TopK optimization works with rank function |
| TopKRankingOperatorTest.java | New comprehensive unit test file with 12 test cases covering rank() behavior with ties, partitions, and edge cases |
| IoTDBWindowFunction3IT.java | Added integration tests for rank() including filter pushdown, basic rank computation, and tie handling |
| TopKRankingOperator.java | Enabled RANK ranking type by instantiating GroupedTopNRankBuilder with appropriate comparator and equals/hash strategy |
| GroupedTopNRankBuilder.java | New builder class that manages rank computation using heap-based accumulator and peer group detection |
| GroupedTopNRankAccumulator.java | Core accumulator implementing rank logic with max-heap for efficient top-N selection and peer group management |
| TsBlockWithPositionEqualsAndHash.java | New interface for equality and hash computation needed for peer group detection in rank() |
| SimpleTsBlockWithPositionEqualsAndHash.java | Implementation of equals/hash interface supporting all TSDataType values with proper null handling |
| TopNPeerGroupLookup.java | Specialized hash table for efficient peer group lookup using composite (groupId, rowId) keys |
| RowIdComparisonHashStrategy.java | Combined interface merging comparison and hashing strategies with default equals implementation |
| TransformSortToStreamSort.java | Added visitors to prevent sort-to-stream-sort transformation for TopKRanking and RowNumber nodes |
| SortElimination.java | Added visitors to prevent sort elimination for TopKRanking and RowNumber nodes |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| TsBlock newPage, | ||
| int groupCount, | ||
| int[] groupIds, | ||
| TsBlockWithPositionComparator comparator, | ||
| RowReferenceTsBlockManager pageManager) { | ||
| int currentGroups = groupIdToHeapBuffer.getTotalGroups(); | ||
| groupIdToHeapBuffer.allocateGroupIfNeeded(groupCount); | ||
|
|
||
| for (int position = 0; position < newPage.getPositionCount(); position++) { | ||
| int groupId = groupIds[position]; | ||
| if (groupId >= currentGroups || groupIdToHeapBuffer.getHeapValueCount(groupId) < topN) { | ||
| return position; | ||
| } | ||
| long heapRootNodeIndex = groupIdToHeapBuffer.getHeapRootNodeIndex(groupId); | ||
| if (heapRootNodeIndex == UNKNOWN_INDEX) { | ||
| return position; | ||
| } | ||
| long rightPageRowId = peekRootRowIdByHeapNodeIndex(heapRootNodeIndex); | ||
| TsBlock rightPage = pageManager.getTsBlock(rightPageRowId); | ||
| int rightPosition = pageManager.getPosition(rightPageRowId); |
There was a problem hiding this comment.
Inconsistent naming convention: parameters and variables use "page" terminology (newPage, pageManager, rightPage) while the codebase typically uses "TsBlock" terminology. For consistency with the rest of the IoTDB codebase, consider renaming: newPage → newTsBlock, pageManager → tsBlockManager, rightPage → rightTsBlock. The GroupedTopNRankBuilder already uses the consistent naming (tsBlockManager, tsBlock), so this file should follow the same pattern.
| @Override | ||
| public PlanNode visitTopKRanking(TopKRankingNode node, Context context) { | ||
| PlanNode newNode = node.clone(); | ||
| for (PlanNode child : node.getChildren()) { | ||
| newNode.addChild(child.accept(this, context)); | ||
| } | ||
| context.setCannotEliminateSort(true); | ||
| return newNode; | ||
| } | ||
|
|
||
| @Override | ||
| public PlanNode visitRowNumber(RowNumberNode node, Context context) { | ||
| PlanNode newNode = node.clone(); | ||
| for (PlanNode child : node.getChildren()) { | ||
| newNode.addChild(child.accept(this, context)); | ||
| } | ||
| context.setCannotEliminateSort(true); | ||
| return newNode; | ||
| } |
There was a problem hiding this comment.
The implementation correctly prevents sort elimination for TopKRanking and RowNumber nodes, but there's a comment at line 169-173 (not shown in diff but nearby) that lists only 3 situations where sort cannot be eliminated. With these new visitor methods, there are now 5 situations. Consider updating that comment to include TopKRanking and RowNumber nodes, or generalize it to avoid becoming stale when new cases are added.
| public void testPushDownLimitIntoWindowWithRank() { | ||
| // TopKRanking(RANK, topN=2) keeps rank≤2 per partition, then LIMIT 2 on final result | ||
| // d1 rank≤2: 1.0(r=1), 3.0(r=2), 3.0(r=2) → 3 rows sorted by time: 09:05,09:09,09:10 | ||
| // d2 rank≤2: 2.0(r=1), 4.0(r=2) → 2 rows | ||
| // ORDER BY device, time LIMIT 2 → first 2 from d1 | ||
| String[] expectedHeader = new String[] {"time", "device", "value", "rk"}; | ||
| String[] retArray = | ||
| new String[] { | ||
| "2021-01-01T09:05:00.000Z,d1,3.0,2,", "2021-01-01T09:07:00.000Z,d1,5.0,4,", | ||
| }; | ||
| tableResultSetEqualTest( | ||
| "SELECT * FROM (SELECT *, rank() OVER (PARTITION BY device ORDER BY value) as rk FROM demo) ORDER BY device, time LIMIT 2", | ||
| expectedHeader, | ||
| retArray, | ||
| DATABASE_NAME); | ||
| } |
There was a problem hiding this comment.
The test comment and expected result are inconsistent with the SQL query. The comment states "TopKRanking(RANK, topN=2) keeps rank≤2 per partition", but the SQL query does not have a WHERE clause to filter by rank. Without such a filter, all rows should be returned with their computed rank values, not just those with rank≤2. The expected result showing value=5 with rank=4 is correct for the query without a WHERE clause, but contradicts the comment. Either add a WHERE clause like "WHERE rk <= 2" to match the comment, or update the comment to correctly describe that this test computes ranks for all rows and then applies LIMIT 2 to the final result set after ordering by device and time.
There was a problem hiding this comment.
Limit still uses TopKRankingNode.
| @Test | ||
| public void testRankWithPartitionDescendingAndTies() { | ||
| // d1: values [5, 3, 3, 1] DESC → 5(r=1),3(r=2),3(r=2),1(r=4) → keep rank≤2 | ||
| // d2: values [6, 2, 4] DESC → 6(r=1),4(r=2),2(r=3) → keep rank≤2 |
There was a problem hiding this comment.
The comment incorrectly describes the data for d2. The comment states "d2: values [6, 2, 4]" but looking at the actual test data arrays, d2 only has two values: 6 (at index 4) and 4 (at index 5). There is no value 2 for d2 in this test case. The comment should be corrected to "d2: values [6, 4] DESC → 6(r=1),4(r=2) → keep rank≤2".
| // d2: values [6, 2, 4] DESC → 6(r=1),4(r=2),2(r=3) → keep rank≤2 | |
| // d2: values [6, 4] DESC → 6(r=1),4(r=2) → keep rank≤2 |
There was a problem hiding this comment.
WHAT R U TALKING ABOUT.
| } | ||
|
|
||
| public static long nextPowerOfTwo(long x) { | ||
| return 1L << 64 - Long.numberOfLeadingZeros(x - 1L); |
There was a problem hiding this comment.
The operator precedence in this expression is potentially problematic. The subtraction operator has higher precedence than the left shift operator, so this is evaluated as 1L << (64 - Long.numberOfLeadingZeros(x - 1L)). While this is likely the intended behavior, it would be clearer and safer to add explicit parentheses: 1L << (64 - Long.numberOfLeadingZeros(x - 1L)). Additionally, when x is 1, x-1 is 0, and Long.numberOfLeadingZeros(0) returns 64, resulting in 1L << (64 - 64) = 1L << 0 = 1, which is correct. However, shifting by 64 or more bits can lead to undefined behavior in Java, so this should be handled carefully, though in practice the shift amount is masked to the lower 6 bits for long values.
| return 1L << 64 - Long.numberOfLeadingZeros(x - 1L); | |
| return 1L << (64 - Long.numberOfLeadingZeros(x - 1L)); |
|
Codecov Report❌ Patch coverage is Additional details and impacted files@@ Coverage Diff @@
## master #17210 +/- ##
============================================
+ Coverage 39.70% 39.75% +0.05%
Complexity 282 282
============================================
Files 5101 5107 +6
Lines 341925 342675 +750
Branches 43527 43674 +147
============================================
+ Hits 135773 136247 +474
- Misses 206152 206428 +276 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|


Previous TopK optimization only support
row_numberwindow function, this PR addrankwindow function as well.This PR also stop sort elimination when using TopKRankingOperator and RowNumberOperator since they may accept out of order data.