[fix](paimon-cpp) deduplicate Arrow linking to fix SIGSEGV in FilterRowGroupsByPredicate #60883

Open
xylaaaaa wants to merge 3 commits into apache:master from xylaaaaa:fix/paimoncpp-dedup-arrow-linking

@xylaaaaa (Contributor)

Proposed changes

Problem

When ENABLE_PAIMON_CPP is ON, both Doris's own libarrow.a and paimon-cpp's libarrow.a are linked into doris_be, causing 3698 duplicate global symbols. This leads to SIGSEGV crashes in paimon::parquet::ParquetFileBatchReader::FilterRowGroupsByPredicate when libarrow_dataset.a resolves arrow core calls to the wrong copy (compiled with different feature flags).

Both are Arrow 17.0.0 but compiled with different options:

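| Feature    | Doris Arrow | paimon Arrow |
|------------|-------------|--------------|
| COMPUTE    | OFF         | ON           |
| DATASET    | OFF         | ON           |
| ACERO      | OFF         | ON           |
| FILESYSTEM | OFF         | ON           |
| FLIGHT     | ON          | OFF          |
| FLIGHT_SQL | ON          | OFF          |
| PARQUET    | ON          | ON           |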

Crash Stack

SIGSEGV invalid permissions for mapped object
 → std::string::basic_string(char const*, ...)
 → paimon::ToPaimonStatus(arrow::Status const&)
 → paimon::parquet::ParquetFileBatchReader::FilterRowGroupsByPredicate(...)

Root Cause

Inside -Wl,--start-group ... --end-group, the linker may resolve symbols from libarrow_dataset.a (paimon's) to Doris's libarrow.a, which was compiled without COMPUTE/FILESYSTEM modules. The internal object memory layout differs, causing arrow::Status and other objects to trigger illegal memory access when passed across library boundaries.

Fix

When the paimon_deps Arrow stack is selected (because Doris lacks libarrow_dataset.a / libarrow_acero.a), remove Doris's arrow from COMMON_THIRDPARTY.

paimon's libarrow.a is a superset of Doris's version (same 17.0.0, with additional modules enabled), so it provides all symbols needed by Doris's libarrow_flight.a / libarrow_flight_sql.a.
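
The removal described above can be sketched as a standalone CMake script (runnable with `cmake -P`). This is a minimal illustration, not the PR's actual diff: `COMMON_THIRDPARTY`, the `arrow` entry, and the `paimon_deps` stack name come from this description, while the other list entries and the hard-coded value of `_selected_arrow_stack` are placeholders assumed for the example.

```cmake
cmake_minimum_required(VERSION 3.16)

# Placeholder contents: the real COMMON_THIRDPARTY in be/CMakeLists.txt is much longer.
set(COMMON_THIRDPARTY arrow arrow_flight arrow_flight_sql parquet)

# Assumed here for illustration; in the real build this is decided by
# probing which Arrow archives exist on disk.
set(_selected_arrow_stack "paimon_deps")

if (_selected_arrow_stack STREQUAL "paimon_deps")
    # Drop only Doris's core Arrow archive so that a single libarrow.a
    # (paimon's) ends up on the link line. arrow_flight and arrow_flight_sql
    # stay and are expected to resolve against paimon's copy.
    list(REMOVE_ITEM COMMON_THIRDPARTY arrow)
    message(STATUS "Arrow stack: paimon_deps (Doris 'arrow' removed from COMMON_THIRDPARTY)")
endif()

message(STATUS "COMMON_THIRDPARTY = ${COMMON_THIRDPARTY}")
```

`list(REMOVE_ITEM ...)` matches whole list elements, so `arrow_flight` and `arrow_flight_sql` are left untouched even though their names begin with `arrow`.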

Impact

  • Only be/CMakeLists.txt changed (~10 lines).
  • No C++/Java business code changes.
  • No impact when ENABLE_PAIMON_CPP=OFF.

Types of changes

  • Bug fix (non-breaking change which fixes an issue)

…owGroupsByPredicate

When ENABLE_PAIMON_CPP is ON, both Doris's own libarrow.a and paimon-cpp's
libarrow.a were linked into doris_be, causing 3698 duplicate global symbols.
This led to SIGSEGV crashes in paimon::parquet::ParquetFileBatchReader::
FilterRowGroupsByPredicate when libarrow_dataset.a resolved arrow core calls
to the wrong copy (compiled with different feature flags).

Both are Arrow 17.0.0 but compiled with different options:
- Doris:  COMPUTE=OFF, DATASET=OFF, ACERO=OFF, FLIGHT=ON
- paimon: COMPUTE=ON,  DATASET=ON,  ACERO=ON,  FLIGHT=OFF

Fix: when paimon_deps Arrow stack is selected, remove Doris's 'arrow' from
COMMON_THIRDPARTY. paimon's libarrow.a is a superset and provides all symbols
needed by Doris's arrow_flight / arrow_flight_sql.
@xylaaaaa requested a review from zclllyybb as a code owner February 27, 2026 09:05
Copilot AI review requested due to automatic review settings February 27, 2026 09:05
@Thearas (Contributor) commented Feb 27, 2026

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR.

Please clearly describe your PR:

  1. What problem was fixed (it's best to include specific error reporting information). How it was fixed.
  2. Which behaviors were modified. What was the previous behavior, what is it now, why was it modified, and what possible impacts might there be.
  3. What features were added. Why was this function added?
  4. Which code was refactored and why was this part of the code refactored?
  5. Which functions were optimized and what is the difference before and after the optimization?

Copilot AI left a comment

Pull request overview

This PR fixes SIGSEGV crashes in paimon-cpp's ParquetFileBatchReader when ENABLE_PAIMON_CPP is ON. The crash was caused by linking both Doris's libarrow.a and paimon-cpp's libarrow.a into the binary, creating 3698 duplicate global symbols. Although both are Arrow 17.0.0, they were compiled with different feature flags (Doris: FLIGHT enabled, paimon: COMPUTE/DATASET/ACERO/FILESYSTEM enabled), causing memory layout incompatibilities that led to crashes when arrow_dataset resolved symbols to the wrong copy.

Changes:

  • Implement stack-based Arrow library selection logic that chooses either the complete Doris or paimon_deps Arrow stack
  • When paimon_deps stack is selected, remove Doris's arrow from COMMON_THIRDPARTY to eliminate duplicate symbols
  • Add status messages to indicate which Arrow stack is being used
Comments suppressed due to low confidence (1)

be/CMakeLists.txt:634

  • The comment states "mixing different Arrow versions" but both Doris and paimon use Arrow 17.0.0 according to the PR description. Consider updating the comment to clarify that the issue is mixing the same Arrow version compiled with different feature flags, not different versions.
    # mixing different Arrow versions (e.g. Doris core + paimon dataset/acero),

Comment on lines 678 to 689

    if (_doris_arrow_core AND _doris_arrow_dataset AND _doris_arrow_acero)
        set(_selected_arrow_stack "doris")
        set(_selected_arrow_core "${_doris_arrow_core}")
        set(_selected_arrow_dataset "${_doris_arrow_dataset}")
        set(_selected_arrow_acero "${_doris_arrow_acero}")
        set(_selected_arrow_filesystem "${_doris_arrow_filesystem}")
    elseif (_paimon_arrow_core AND _paimon_arrow_dataset AND _paimon_arrow_acero)
        set(_selected_arrow_stack "paimon_deps")
        set(_selected_arrow_core "${_paimon_arrow_core}")
        set(_selected_arrow_dataset "${_paimon_arrow_dataset}")
        set(_selected_arrow_acero "${_paimon_arrow_acero}")
        set(_selected_arrow_filesystem "${_paimon_arrow_filesystem}")

Copilot AI Feb 27, 2026

The selection logic prioritizes Doris's Arrow stack over paimon_deps when both are complete. If Doris's build configuration changes in the future to include DATASET and ACERO modules, this could cause the same duplicate symbol issues this PR is fixing, because paimon's code would be linked against Doris's Arrow instead of paimon_deps's Arrow. Consider adding a comment explaining this priority decision, or if paimon_deps should always be preferred when ENABLE_PAIMON_CPP is ON, swap the priority order.
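
One way to read the "swap the priority order" suggestion is sketched below as a standalone script (runnable with `cmake -P`). It is only an illustration of the reviewer's idea, not code from the PR: the `_doris_*` and `_paimon_*` variable names are taken from the snippet under review, but the archive paths, the `else()` branch, and its error message are hypothetical.

```cmake
cmake_minimum_required(VERSION 3.16)

# Hypothetical placeholder paths; the real build fills these in by
# checking which static archives exist.
set(_doris_arrow_core     "/placeholder/doris/libarrow.a")
set(_doris_arrow_dataset  "")   # Doris currently builds Arrow with DATASET=OFF
set(_doris_arrow_acero    "")   # ... and ACERO=OFF
set(_paimon_arrow_core    "/placeholder/paimon_deps/libarrow.a")
set(_paimon_arrow_dataset "/placeholder/paimon_deps/libarrow_dataset.a")
set(_paimon_arrow_acero   "/placeholder/paimon_deps/libarrow_acero.a")

# Prefer paimon_deps whenever it is complete, so paimon-cpp always links
# against the Arrow build it was compiled with, even if Doris later
# enables DATASET and ACERO in its own Arrow build.
if (_paimon_arrow_core AND _paimon_arrow_dataset AND _paimon_arrow_acero)
    set(_selected_arrow_stack "paimon_deps")
elseif (_doris_arrow_core AND _doris_arrow_dataset AND _doris_arrow_acero)
    set(_selected_arrow_stack "doris")
else()
    # Hypothetical fallback; the original snippet is cut off before this branch.
    message(FATAL_ERROR "No complete Arrow stack found for paimon-cpp")
endif()

message(STATUS "Selected Arrow stack: ${_selected_arrow_stack}")
```

With this ordering the choice no longer depends on how Doris's own Arrow is configured. The cost is that a complete Doris stack would be ignored whenever paimon_deps is also complete.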
