feat: Mutations Batcher shim#1309

Open
gkevinzheng wants to merge 1 commit into v3_staging from mutations-batcher

@gkevinzheng (Contributor)

Changes Made:

  • Replaced the mutations batcher implementation with one based on the data client.
  • Reworked unit tests.
  • Added additional system tests.

@gkevinzheng gkevinzheng requested a review from a team as a code owner March 16, 2026 16:22
@product-auto-label bot added labels on Mar 16, 2026: "size: l" (Pull request size is large.) and "api: bigtable" (Issues related to the googleapis/python-bigtable API.)
@gemini-code-assist (Contributor)

Summary of Changes

Hello, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request refactors the MutationsBatcher to act as a shim, delegating its core batching responsibilities to an internal batcher provided by the data client. This change streamlines the batching mechanism, leveraging the data client's capabilities for more robust and efficient mutation operations, while maintaining the existing public interface.
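The shim pattern described above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: `InnerBatcher`, `BatcherShim`, the `batcher_factory` parameter, and the string entries are hypothetical stand-ins, and the real data client batcher has a different and richer interface.

```python
from typing import Callable, List


class InnerBatcher:
    """Hypothetical stand-in for the data client's batcher (not the real API)."""

    def __init__(self, sink: List[str]) -> None:
        self._sink = sink
        self._pending: List[str] = []
        self.closed = False

    def append(self, entry: str) -> None:
        if self.closed:
            raise RuntimeError("batcher is closed")
        self._pending.append(entry)

    def close(self) -> None:
        # In this sketch, closing sends whatever is still pending.
        self._sink.extend(self._pending)
        self._pending.clear()
        self.closed = True


class BatcherShim:
    """Keeps a legacy mutate/flush/close surface and delegates the work."""

    def __init__(self, batcher_factory: Callable[[], InnerBatcher]) -> None:
        self._batcher_factory = batcher_factory
        self._init_batcher()

    def _init_batcher(self) -> None:
        self._batcher = self._batcher_factory()

    def _close_batcher(self) -> None:
        self._batcher.close()

    def mutate(self, entry: str) -> None:
        self._batcher.append(entry)

    def flush(self) -> None:
        # Close the current inner batcher so pending entries are sent,
        # then create a fresh one so the shim stays usable.
        self._close_batcher()
        self._init_batcher()

    def close(self) -> None:
        self._close_batcher()


sent: List[str] = []
shim = BatcherShim(lambda: InnerBatcher(sent))
shim.mutate("row-1")
shim.mutate("row-2")
shim.flush()
shim.mutate("row-3")
shim.close()
```

In this sketch a flush is a close followed by a re-initialization, which matches the flush behavior listed in the changelog; exception collection is left out for brevity.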

Highlights

  • Mutations Batcher Implementation: The underlying implementation of the Mutations Batcher has been replaced with one based on the new data client, simplifying the batching logic.
  • Unit Tests Reworked: Existing unit tests have been refactored to align with the new batcher implementation and improve testability.
  • System Tests Added: New system tests have been introduced to cover exception handling and manual flushing scenarios for the Mutations Batcher.

Changelog

  • google/cloud/bigtable/batcher.py
    • Removed deprecated internal batching classes and methods, including _MutationsBatchQueue, _BatchInfo, _FlowControl, _flush_async, and related helpers.
    • Updated imports to utilize MutationsExceptionGroup and RowMutationEntry from the google.cloud.bigtable.data module.
    • Modified the MutationsBatcher constructor to initialize and manage an internal batcher instance from table._table_impl.mutations_batcher.
    • Implemented _init_batcher and _close_batcher methods to control the lifecycle and exception handling of the internal batcher.
    • Updated the mutate method to append mutations directly to the new internal batcher.
    • Refactored the flush method to close and re-initialize the internal batcher, ensuring pending mutations are processed.
    • Adjusted the close method to properly shut down the internal batcher and collect any exceptions.
  • google/cloud/bigtable/data/_async/client.py
    • Removed unused Callable and status_pb2 imports.
  • google/cloud/bigtable/data/_async/mutations_batcher.py
    • Removed unused Callable and Optional imports.
  • tests/system/v2_client/test_data_api.py
    • Added test_mutations_batcher_exceptions to verify the batcher's error propagation and handling.
    • Added test_mutations_batcher_manual_flush to confirm correct behavior when manually flushing mutations.
  • tests/unit/v2_client/test_batcher.py
    • Refactored unit tests to use _setup_batcher and _atexit_mock fixtures for improved test setup and isolation.
    • Updated constructor tests to verify the correct initialization parameters are passed to the internal batcher.
    • Modified tests for mutate_rows, mutate, flush, and close to assert interactions with the mocked internal batcher operations.
    • Updated exception handling tests to reflect the new MutationsExceptionGroup and FailedMutationEntryError types.
    • Removed obsolete test helper classes and functions related to the old batcher implementation.
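A unit test that asserts interactions with a mocked internal batcher might look roughly like the sketch below. The `Shim` class and its `factory` argument are hypothetical and exist only so the test has something to exercise; the PR's own `_setup_batcher` and `_atexit_mock` fixtures are not reproduced here.

```python
from unittest import mock


class Shim:
    """Hypothetical minimal shim, only here to give the test a subject."""

    def __init__(self, factory):
        self._factory = factory
        self._batcher = factory()

    def mutate(self, entry):
        self._batcher.append(entry)

    def flush(self):
        # Close the current inner batcher, then replace it with a new one.
        self._batcher.close()
        self._batcher = self._factory()


def test_flush_closes_and_reinitializes():
    first = mock.Mock(name="first_batcher")
    second = mock.Mock(name="second_batcher")
    # The factory hands out the two mocks in order.
    factory = mock.Mock(side_effect=[first, second])

    shim = Shim(factory)
    shim.mutate("entry")
    first.append.assert_called_once_with("entry")

    shim.flush()
    first.close.assert_called_once_with()
    assert factory.call_count == 2

    # Mutations after the flush go to the new inner batcher only.
    shim.mutate("later")
    second.append.assert_called_once_with("later")
    first.append.assert_called_once_with("entry")


test_flush_closes_and_reinitializes()
```

Using `side_effect` with a list lets the test tell the pre-flush and post-flush batchers apart without patching any real client code.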

Activity

  • No specific activity has been recorded for this pull request yet.


@gemini-code-assist bot (Contributor) left a comment

Code Review

The pull request successfully refactors the MutationsBatcher to leverage the new data client's batching capabilities. This significantly simplifies the batcher.py file by removing a substantial amount of custom queueing, flow control, and asynchronous flushing logic. The updated unit and system tests adequately cover the new implementation, including exception handling and manual flushing behavior. The changes align with the goal of migrating to the new data client.

Comment on lines +113 to +115:

    self._batcher._user_batch_completed_callback = (
        self._user_batch_completed_callback
    )

Priority: medium

Setting the private attribute _user_batch_completed_callback on the underlying _batcher makes the shim fragile: if the data client's batcher renames or removes that attribute, the shim will break. Consider whether there is a public API or a more robust way to pass this callback to the underlying batcher, or add a comment explaining this design choice and its implications for future maintenance.
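One way to contain the fragility, sketched under assumptions: keep the private access in a single helper that fails loudly if the hook disappears. The helper name `attach_batch_completed_callback` and the two `Inner...` classes are hypothetical, and the sketch assumes the inner batcher defines the attribute up front, which this thread does not confirm.

```python
from typing import Any, Callable

# Private attribute name taken from the diff under review.
_CALLBACK_ATTR = "_user_batch_completed_callback"


def attach_batch_completed_callback(
    batcher: Any, callback: Callable[..., None]
) -> None:
    """Set the private hook in one place, raising if it no longer exists."""
    if not hasattr(batcher, _CALLBACK_ATTR):
        raise AttributeError(
            f"{type(batcher).__name__} no longer exposes {_CALLBACK_ATTR}; "
            "update the shim"
        )
    setattr(batcher, _CALLBACK_ATTR, callback)


class InnerWithHook:
    """Hypothetical inner batcher that still declares the hook."""

    def __init__(self) -> None:
        self._user_batch_completed_callback = None


class InnerWithoutHook:
    """Hypothetical inner batcher after the hook was renamed or removed."""


calls = []
inner = InnerWithHook()
attach_batch_completed_callback(inner, calls.append)
inner._user_batch_completed_callback("batch-1")
```

This does not remove the dependency on a private name, but a change in the inner batcher then surfaces as an explicit error at construction time rather than as a callback that silently never fires.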
