feat(storage): Use raw proto access for read resumption strategy#1764
googlyrahman wants to merge 1 commit into googleapis:main
Conversation
Summary of Changes: This pull request significantly enhances the performance of data ingestion by optimizing how Protobuf messages are handled in hot paths.
Code Review
This pull request refactors the read resumption strategy to use raw protobuf message access instead of proto-plus wrappers, aiming for a significant performance improvement in critical I/O paths. The changes correctly adapt field presence checks to use the raw proto API (e.g., HasField), which is a good improvement for both performance and correctness. I have one point of feedback regarding a type inconsistency introduced in the shared state, which could lead to issues.
Note: Security Review did not run due to the size of the PR.
google/cloud/storage/asyncio/retry/reads_resumption_strategy.py
/gemini review
Code Review
This pull request improves performance in the read resumption strategy by bypassing the proto-plus library and accessing the underlying raw protobuf messages directly in the hot path. The changes correctly use HasField for checking optional scalar fields and introduce local variables to reduce repeated attribute access, which are good micro-optimizations. The test files have also been updated with stylistic formatting improvements.
I have one suggestion to improve the robustness and consistency of the code when checking for the presence of a message field.
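The "local variables to reduce repeated attribute access" micro-optimization mentioned above can be sketched in isolation. `DownloadState` and the two loops below are hypothetical stand-ins for illustration, not the PR's actual code:

```python
class DownloadState:
    """Hypothetical stand-in for per-download shared state."""

    def __init__(self):
        self.bytes_received = 0


state = DownloadState()


def accumulate_repeated(chunks):
    # Resolves state.bytes_received twice on every iteration.
    for size in chunks:
        state.bytes_received = state.bytes_received + size


def accumulate_hoisted(chunks):
    # Hoist the attribute into a local, write it back once at the end.
    received = state.bytes_received
    for size in chunks:
        received += size
    state.bytes_received = received


chunks = [2 * 1024 * 1024] * 1000  # 1000 hypothetical 2MB chunks

state.bytes_received = 0
accumulate_repeated(chunks)
total_repeated = state.bytes_received

state.bytes_received = 0
accumulate_hoisted(chunks)
total_hoisted = state.bytes_received

# Both variants produce the same result; the hoisted one avoids
# per-iteration attribute lookups in the hot loop.
assert total_repeated == total_hoisted
```

Both functions are behaviorally equivalent for a single-threaded loop; hoisting just trades repeated attribute lookups for one read and one write.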
```diff
-if response.read_handle:
-    state["read_handle"] = response.read_handle
+proto = getattr(response, "_pb", response)
+if proto.read_handle:
```
For consistency and robustness, it's better to use HasField() to check for the presence of message fields on raw protobuf objects, as you've done for read_range. While checking the truthiness of proto.read_handle might work, HasField("read_handle") is the explicitly supported method and avoids any ambiguity.
```diff
-if proto.read_handle:
+if proto.HasField("read_handle"):
```
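The `getattr(response, "_pb", response)` pattern in the snippet above degrades gracefully whether it receives a proto-plus wrapper or an already-raw message. A minimal sketch with hypothetical stand-in classes (real proto-plus wrappers expose the underlying raw message as `._pb`):

```python
class RawMessage:
    """Hypothetical stand-in for a raw protobuf message (no _pb attribute)."""

    def __init__(self, read_handle=None):
        self.read_handle = read_handle


class WrappedMessage:
    """Hypothetical stand-in for a proto-plus wrapper around a raw message."""

    def __init__(self, pb):
        self._pb = pb


raw = RawMessage(read_handle="handle-123")
wrapped = WrappedMessage(raw)

# getattr with a default unwraps a proto-plus object and passes a raw
# message through unchanged, so the same code handles both.
assert getattr(wrapped, "_pb", wrapped) is raw
assert getattr(raw, "_pb", raw) is raw
```

The default argument makes the unwrap a no-op when the object is already raw, which is why the hot path can use it unconditionally.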
The proto-plus library provides protocol buffer message classes that behave like native Python types. While convenient, my profiling confirms this abstraction comes with a significant performance penalty in our hot paths. Every time we access a field on a proto-plus object, the library triggers dynamic lookups and wrapper instantiation. In our hot data ingestion loop, this overhead accumulates rapidly over multiple chunks.
My benchmarking shows that accessing the underlying C++ Protobuf directly is ~2x faster than going through the proto-plus wrapper (measured over 30,000 iterations). While 30,000 operations might sound high, it corresponds to downloading just 60GB of data in 2MB chunks, with only one attribute access per chunk; a high-performance workload downloading at 1GB/s covers that in about 60 seconds. Additionally, this wrapper overhead introduces measurable latency not just for data access, but for every metadata check and state update that repeats per chunk.
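The ~2x figure cannot be reproduced without the real proto-plus and protobuf runtimes, but the shape of such a micro-benchmark looks roughly like this. `PlusWrapper` is a hypothetical stand-in that routes attribute access through `__getattr__`, mimicking proto-plus's dynamic lookup; any numbers it produces are illustrative, not the PR's measurements:

```python
import timeit


class RawPB:
    """Hypothetical stand-in for a raw protobuf message."""

    __slots__ = ("read_handle",)

    def __init__(self):
        self.read_handle = b"h"


class PlusWrapper:
    """Hypothetical stand-in for a proto-plus wrapper."""

    def __init__(self, pb):
        self._pb = pb

    def __getattr__(self, name):
        # Only called for attributes not found normally, so every field
        # access pays an extra dynamic-lookup hop, like proto-plus.
        return getattr(self._pb, name)


wrapped = PlusWrapper(RawPB())

N = 30_000  # roughly one access per 2MB chunk of a 60GB download
t_wrapped = timeit.timeit(lambda: wrapped.read_handle, number=N)
t_raw = timeit.timeit(lambda: wrapped._pb.read_handle, number=N)
print(f"wrapped: {t_wrapped:.4f}s  raw: {t_raw:.4f}s")
```

Going through `._pb` skips the `__getattr__` hop entirely, which is the same shortcut the PR takes in the hot loop.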
This PR bypasses proto-plus in our critical I/O loops and interacts directly with the underlying C++ Protobuf structures. This eliminates the "wrapper tax" without changing the external behavior of the application.