feat: Rerouted ReadRows to data client #1299

Open
gkevinzheng wants to merge 4 commits into v3_staging from read-rows-reduc

@gkevinzheng gkevinzheng commented Feb 26, 2026

Changes Made:

  • Added methods to convert Row and Cell objects from the data client into PartialRowData and Cell objects in the legacy client.
  • Removed legacy client code for processing ReadRowsResponse chunks, along with its tests.
  • Removed _update_message_request from RowSet because it is no longer needed to create a ReadRowsQuery.
  • Rerouted read_row and read_rows in table.py to use their data client counterparts.
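
The conversion in the first bullet can be pictured with a minimal sketch. It does not use the real classes: `DataCell`, `LegacyCell`, `LegacyRow`, and `from_data_client_row` are hypothetical stand-ins, and it assumes the data client row carries a flat list of self-describing cells while the legacy row groups cells by column family and then by qualifier.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class DataCell:
    """Hypothetical stand-in for a data client cell (flat, self-describing)."""
    family: str
    qualifier: bytes
    value: bytes
    timestamp_micros: int
    labels: List[str] = field(default_factory=list)


@dataclass
class LegacyCell:
    """Hypothetical stand-in for a legacy cell (no family or qualifier)."""
    value: bytes
    timestamp_micros: int
    labels: List[str] = field(default_factory=list)


@dataclass
class LegacyRow:
    """Hypothetical stand-in for PartialRowData: family -> qualifier -> cells."""
    row_key: bytes
    cells: Dict[str, Dict[bytes, List[LegacyCell]]] = field(default_factory=dict)


def from_data_client_row(row_key: bytes, data_cells: List[DataCell]) -> LegacyRow:
    """Group a flat cell list into the nested legacy layout, keeping cell order."""
    legacy = LegacyRow(row_key=row_key)
    for cell in data_cells:
        column = legacy.cells.setdefault(cell.family, {}).setdefault(cell.qualifier, [])
        column.append(LegacyCell(cell.value, cell.timestamp_micros, list(cell.labels)))
    return legacy


flat = [
    DataCell("cf1", b"q1", b"new", 2000),
    DataCell("cf1", b"q1", b"old", 1000),
    DataCell("cf2", b"q2", b"x", 1500, ["label"]),
]
converted = from_data_client_row(b"row-1", flat)
```

Here `converted.cells["cf1"][b"q1"]` holds both versions of the cell in their original order, which is the shape callers of the legacy client expect to index into.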

@gkevinzheng gkevinzheng requested a review from a team as a code owner February 26, 2026 17:11
@product-auto-label product-auto-label bot added the size: xl Pull request size is extra large. label Feb 26, 2026
@product-auto-label product-auto-label bot added the api: bigtable Issues related to the googleapis/python-bigtable API. label Feb 26, 2026
@gemini-code-assist

Summary of Changes

Hello @gkevinzheng, I'm Gemini Code Assist[1]! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request refactors the Bigtable client's row reading mechanism by migrating the ReadRows functionality to leverage the new data client. This change streamlines the internal implementation, removes outdated chunk processing logic, and ensures that row reading operations are handled consistently through the modern data client API. The primary impact is an internal architectural improvement, leading to a cleaner and more maintainable codebase for row retrieval.

Highlights

  • Data Client Conversion Methods: Implemented conversion methods for Row and Cell objects from the data client to the legacy PartialRowData and Cell types.
  • Legacy Chunk Processing Removal: Eliminated legacy code responsible for processing ReadRowsResponse chunks, along with the associated tests.
  • RowSet Method Removal: Removed the _update_message_request method from RowSet as it is no longer required for ReadRowsQuery creation.
  • ReadRows Rerouting: Rerouted read_row and read_rows methods in table.py to utilize their counterparts in the new data client.

Changelog

  • google/cloud/bigtable/row.py
    • Added _from_data_client_row class method to convert data client Row objects to PartialRowData.
    • Added _from_data_client_cell class method to convert data client Cell objects to legacy Cell objects.
  • google/cloud/bigtable/row_data.py
    • Removed several unused imports (copy, grpc, warnings, _to_bytes, _RowMerger, _State, data_messages_v2_pb2, data_v2_pb2).
    • Removed the RETRYABLE_INTERNAL_ERROR_MESSAGES constant and the _retriable_internal_server_error and _retry_read_rows_exception functions.
    • Updated DEFAULT_RETRY_READ_ROWS to remove the predicate argument.
    • Refactored PartialRowsData to accept a generator instead of read_method and request, removing all internal chunk processing logic and retry handling.
    • Removed the _ReadRowsRequestManager class entirely.
  • google/cloud/bigtable/row_merger.py
    • Removed the entire file, as its functionality for merging ReadRowsResponse chunks is no longer needed.
  • google/cloud/bigtable/row_set.py
    • Removed _to_bytes import.
    • Removed the _update_message_request method.
  • google/cloud/bigtable/table.py
    • Imported ReadRowsQuery and PartialRowData.
    • Removed RowSet and data_messages_v2_pb2 imports.
    • Modified read_row to use _table_impl.read_row and convert the result to PartialRowData.
    • Modified read_rows to use _table_impl.read_rows_stream and pass the resulting generator to PartialRowsData.
    • Updated _create_row_request to return a ReadRowsQuery object, removing table_name and app_profile_id parameters and adapting logic to ReadRowsQuery methods.
  • tests/system/v2_client/test_data_api.py
    • Updated test_table_read_rows_retry_unretriable_error_establishing_stream to expect exceptions.DataLoss instead of exceptions.Aborted and call consume_all.
    • Updated test_table_read_rows_retry_retriable_internal_errors_mid_stream to import _RETRYABLE_INTERNAL_ERROR_MESSAGES from google.cloud.bigtable.data._helpers.
    • Updated test_table_read_rows_retry_retriable_error_mid_stream_unretriable_error_reestablishing_stream to expect exceptions.DataLoss.
    • Updated test_table_read_rows_retry_timeout_mid_stream and test_table_read_rows_retry_timeout_establishing_stream to import _RETRYABLE_INTERNAL_ERROR_MESSAGES from google.cloud.bigtable.data._helpers, set default_read_rows_operation_timeout, and call consume_all.
  • tests/unit/v2_client/test_row_data.py
    • Updated imports to remove mock and _make_credentials, and added Row, Cell from google.cloud.bigtable.data.row.
    • Added ROWS constant for test data.
    • Added test_cell__from_data_client_cell and test_cell__from_data_client_cell_with_labels for new conversion methods.
    • Added test_partial_row_data__from_data_client_row for the new conversion method.
    • Removed all tests related to _retry_read_rows_exception.
    • Refactored _make_generator and test_partial_rows_data_consume_all to use the new generator-based PartialRowsData.
    • Removed numerous tests related to PartialRowsData internal state, chunk processing, and _ReadRowsRequestManager.
    • Added test_partial_rows_data_deadline_exceeded to test new error handling.
  • tests/unit/v2_client/test_row_merger.py
    • Removed imports for zip_longest, List, pytest, PartialRowsData, InvalidChunk, _RowMerger.
    • Removed all acceptance tests and unit tests related to _RowMerger and chunk processing.
  • tests/unit/v2_client/test_row_set.py
    • Removed a unit test for the _update_message_request method.
  • tests/unit/v2_client/test_table.py
    • Imported ReadRowsQuery.
    • Removed _table_read_row_helper and its associated tests (test_table_read_row_miss_no__responses, test_table_read_row_miss_no_chunks_in_response, test_table_read_row_complete, test_table_read_row_more_than_one_row_returned, test_table_read_row_still_partial).
    • Refactored test_table_read_rows to use mock.patch for read_rows_stream and assert against ReadRowsQuery and converted PartialRowData.
    • Removed test_table_read_retry_rows, test_table_read_retry_rows_no_full_table_scan, test_table_yield_retry_rows, test_table_yield_rows_with_row_set.
    • Removed test__create_row_request_table_name_only and test__create_row_request_with_app_profile_id.
    • Updated _create_row_request tests to reflect the removal of table_name and app_profile_id parameters and the return type being ReadRowsQuery.
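
The row_data.py and table.py entries above describe PartialRowsData taking a generator of already-merged rows rather than a read method and a request. A minimal sketch of that shape follows. `GeneratorRows`, `fake_stream`, and `to_legacy` are hypothetical names, and the rows are plain dicts rather than the library's types; only the `rows` attribute and `consume_all` mirror names mentioned in the changelog.

```python
from typing import Callable, Dict, Iterable, Iterator, Tuple

RawRow = Tuple[bytes, bytes]  # hypothetical (row_key, value) pair from a row stream


class GeneratorRows:
    """Hypothetical stand-in for a generator-backed PartialRowsData."""

    def __init__(self, stream: Iterable[RawRow], convert: Callable[[RawRow], dict]):
        self._stream = iter(stream)
        self._convert = convert
        self.rows: Dict[bytes, dict] = {}

    def __iter__(self) -> Iterator[dict]:
        # No chunk merging or retry logic here: the stream already yields whole rows.
        for raw in self._stream:
            yield self._convert(raw)

    def consume_all(self) -> None:
        """Drain the stream and index the converted rows by key."""
        for row in self:
            self.rows[row["row_key"]] = row


def fake_stream() -> Iterator[RawRow]:
    """Hypothetical replacement for the data client's row stream."""
    yield (b"a", b"1")
    yield (b"b", b"2")


def to_legacy(raw: RawRow) -> dict:
    """Hypothetical per-row conversion step."""
    return {"row_key": raw[0], "value": raw[1]}


rows = GeneratorRows(fake_stream(), to_legacy)
rows.consume_all()
```

Because the wrapper only iterates and converts, retries and timeouts are whatever the underlying stream provides, which is consistent with the system tests now expecting different exception types.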

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request is a nice refactoring that reroutes read_row and read_rows to use the new data client. This simplifies the code by removing the legacy chunk processing and retry logic. The changes look good overall, but I've found one potential issue in the cancel method of PartialRowsData which might lead to resource leaks.
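
The review does not spell out the issue with cancel here. As a general illustration only, one way a generator-backed reader can leak is when cancel sets a flag but never closes the underlying generator, so its cleanup is deferred until garbage collection. The sketch below assumes that scenario; `row_stream` and `CancellableRows` are hypothetical and are not the actual PartialRowsData code.

```python
from typing import Iterator, List

released: List[str] = []  # records cleanup, standing in for a closed network stream


def row_stream() -> Iterator[bytes]:
    """Hypothetical row source whose finally block releases the stream."""
    try:
        yield b"row-1"
        yield b"row-2"
    finally:
        released.append("stream closed")


class CancellableRows:
    """Hypothetical wrapper; not the actual PartialRowsData implementation."""

    def __init__(self, stream: Iterator[bytes]):
        self._stream = stream
        self._cancelled = False

    def __iter__(self) -> Iterator[bytes]:
        for row in self._stream:
            if self._cancelled:
                return
            yield row

    def cancel(self) -> None:
        self._cancelled = True
        # Closing the generator raises GeneratorExit at its paused yield,
        # so its finally block runs now rather than at garbage collection.
        self._stream.close()


rows = CancellableRows(row_stream())
it = iter(rows)
first = next(it)
rows.cancel()
remaining = list(it)
```

After `cancel()`, the stream's cleanup has already run and further iteration yields nothing, so a caller who stops early does not leave the stream open.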

@gkevinzheng gkevinzheng reopened this Feb 26, 2026