Batch take step by kazewong · Pull Request #232 · kazewong/flowMC

kazewong · 2025-07-03T18:01:00Z

This PR introduces a new keyword, chain_batch_size, in the TakeSteps strategy. It lets the user process the chain dimension in batches instead of processing everything at once with vmap. This reduces the memory footprint of the TakeSteps class and allows users with smaller GPUs to run a large number of chains.
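
To illustrate the idea, here is a minimal sketch of batching a vmapped function along the chain axis. It is not the code from this PR: `batched_vmap` and `log_density` are hypothetical names, and the toy density stands in for whatever per-chain kernel is actually being vectorized.

```python
import jax
import jax.numpy as jnp


def batched_vmap(fn, xs, batch_size):
    """Apply vmap(fn) to slices of xs holding at most batch_size chains each.

    Only one slice is vectorized at a time, so the peak memory of the
    vmapped call scales with batch_size rather than with the chain count.
    """
    n_chains = xs.shape[0]
    vfn = jax.vmap(fn)
    outputs = [
        vfn(xs[start:start + batch_size])
        for start in range(0, n_chains, batch_size)
    ]
    # Stitch the per-batch results back together along the chain axis.
    return jnp.concatenate(outputs, axis=0)


def log_density(x):
    # Hypothetical per-chain function (standard normal, up to a constant).
    return -0.5 * jnp.sum(x ** 2)


# 10 chains in 3 dimensions, processed in batches of 4, 4 and 2.
positions = jnp.arange(30, dtype=jnp.float32).reshape(10, 3)
full = jax.vmap(log_density)(positions)
batched = batched_vmap(log_density, positions, batch_size=4)
```

Both calls should give the same values; the batched version trades some throughput for a smaller working set on the device.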

The following messages were generated by Copilot.

This pull request introduces several updates across different files to enhance functionality and improve batch processing capabilities. The changes primarily focus on adding support for chain_batch_size to manage memory constraints during computations, as well as refining variable naming for clarity in probability calculations.

Batch processing enhancements:

src/flowMC/resource_strategy_bundle/RQSpline_MALA.py: Added chain_batch_size as an optional parameter to the constructor and passed it to local and global step configurations. This facilitates splitting computations when processing a large number of chains.
src/flowMC/resource_strategy_bundle/RQSpline_MALA_PT.py: Similar to the above, added chain_batch_size to the constructor and integrated it into local and global step configurations for parallel processing.
src/flowMC/strategy/take_steps.py: Introduced chain_batch_size as a class attribute and constructor parameter, and implemented logic to split computations into batches when the number of chains exceeds the batch size. This prevents memory bottlenecks during vectorized operations.

Probability calculation refinements:

src/flowMC/resource/nf_model/NF_proposal.py: Renamed variables in scan_sample from position_current and log_prob_current to position_initial and log_prob_initial for improved clarity. Adjusted related calculations to reflect these changes.

Summary by CodeRabbit

New Features
- Introduced a configurable batch size option for chain processing to improve memory management during sampling.
- Added a new parameter to allow users to set the batch size for chains in relevant sampling classes.
Improvements
- Enhanced sampling efficiency and scalability by enabling batch-wise processing of chains when the batch size is specified.
Bug Fixes
- Corrected the acceptance ratio calculation in the sampling algorithm to ensure accurate probability updates during proposal evaluation.

coderabbitai · 2025-07-03T18:01:06Z

📥 Commits

Reviewing files that changed from the base of the PR and between a7eb07d and 9cd7427.

📒 Files selected for processing (1)

test/unit/test_strategies.py (4 hunks)

Walkthrough

The changes introduce a new chain_batch_size parameter to several classes to enable batching of chains during vectorized sampling, improving memory management. The Metropolis-Hastings acceptance logic in NFProposal is updated to use initial rather than current state variables. Minor adjustments to parameter passing and batching logic are also included.

Changes

| File(s) | Change Summary |
| --- | --- |
| src/flowMC/resource/nf_model/NF_proposal.py | Updated Metropolis-Hastings acceptance logic in `scan_sample` to use initial state variables instead of current. |
| src/flowMC/resource_strategy_bundle/RQSpline_MALA.py, src/flowMC/resource_strategy_bundle/RQSpline_MALA_PT.py | Added `chain_batch_size` parameter to constructors and passed it to stepper instances. |
| src/flowMC/strategy/take_steps.py | Added `chain_batch_size` attribute and batching logic in `TakeSteps` for memory-efficient vectorized sampling. |

Sequence Diagram(s)

sequenceDiagram
    participant User
    participant RQSpline_MALA_Bundle
    participant TakeSteps

    User->>RQSpline_MALA_Bundle: Initialize (chain_batch_size=N)
    RQSpline_MALA_Bundle->>TakeSteps: Initialize (chain_batch_size=N)
    User->>TakeSteps: Call __call__ with chains
    alt chain_batch_size batching
        TakeSteps->>TakeSteps: Split chains into batches
        loop For each batch
            TakeSteps->>TakeSteps: Process batch with sample()
        end
        TakeSteps->>User: Concatenate and return results
    else No batching
        TakeSteps->>TakeSteps: Process all chains with sample()
        TakeSteps->>User: Return results
    end
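
The two branches of the diagram can be sketched as a small callable. This is a hedged illustration rather than the actual `TakeSteps` implementation: `TakeStepsSketch` and `random_walk_step` are hypothetical, and treating `chain_batch_size=0` as "no batching" is an assumption based on the default value mentioned in the review.

```python
from dataclasses import dataclass

import jax
import jax.numpy as jnp


def random_walk_step(key, position):
    # Hypothetical single-chain kernel: one Gaussian random-walk move.
    return position + 0.1 * jax.random.normal(key, position.shape)


@dataclass
class TakeStepsSketch:
    # Assumption: 0 means "process all chains in a single vmap call".
    chain_batch_size: int = 0

    def __call__(self, keys, positions):
        step = jax.vmap(random_walk_step)
        n_chains = positions.shape[0]
        size = self.chain_batch_size
        if size > 0 and n_chains > size:
            # Batching branch: split, process each batch, concatenate.
            outputs = [
                step(keys[i:i + size], positions[i:i + size])
                for i in range(0, n_chains, size)
            ]
            return jnp.concatenate(outputs, axis=0)
        # No-batching branch: one vectorized call over every chain.
        return step(keys, positions)


# One PRNG key per chain, so each chain draws the same noise in either branch.
keys = jax.random.split(jax.random.PRNGKey(0), 6)
positions = jnp.zeros((6, 2))
unbatched = TakeStepsSketch()(keys, positions)
batched = TakeStepsSketch(chain_batch_size=4)(keys, positions)
```

Because every chain carries its own key, the result for a given chain does not depend on which batch it lands in, which is what makes the two branches interchangeable.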

Poem

A hop and a skip, we batch with delight,
Chains in neat bundles, memory kept light.
Acceptance now checks where the journey began,
With rabbits debugging, improving the plan.
Batching our steps, we leap ever higher—
CodeRabbit’s changes, swift to inspire!
🐇✨

coderabbitai

Actionable comments posted: 1

🧹 Nitpick comments (3)

src/flowMC/resource/nf_model/NF_proposal.py (2)
15-19: Remove commented-out debug code.

The commented-out debug imports and function should be removed to keep the code clean.
-# from jax import debug
-# def cond_print(log_prob_proposal, log_prob_current, log_prob_nf_proposal, log_prob_nf_current, do_accept, position_proposal, position_current, tag=""):
-#     if (do_accept == True and ((log_prob_proposal - log_prob_current) < -5)):
-#         print(f"{tag} pro_prob: {log_prob_proposal}, cur_prob: {log_prob_current}, pro_nf_prob: {log_prob_nf_proposal}, cur_nf_prob: {log_prob_nf_current}, accept: {do_accept}, pro_pos: {position_proposal[0]}, cur_pos: {position_current[0]}")
-
117-117: Remove commented-out debug callback.

The commented-out debug callback should be removed to keep the code clean.
-# debug.callback(cond_print, log_prob_proposal, log_prob_initial, log_prob_nf_proposal, log_prob_nf_initial, do_accept, position_proposal, position_initial)
src/flowMC/strategy/take_steps.py (1)
21-21: Fix inline comment formatting.

The inline comment needs at least two spaces before it according to PEP 8.
-    chain_batch_size: int # If vmap over a large number of chains is memory bounded, this splits the computation
+    chain_batch_size: int  # If vmap over a large number of chains is memory bounded, this splits the computation

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between c326bb7 and f91630c.

📒 Files selected for processing (4)

src/flowMC/resource/nf_model/NF_proposal.py (2 hunks)
src/flowMC/resource_strategy_bundle/RQSpline_MALA.py (3 hunks)
src/flowMC/resource_strategy_bundle/RQSpline_MALA_PT.py (3 hunks)
src/flowMC/strategy/take_steps.py (4 hunks)

🧰 Additional context used

🪛 Flake8 (7.2.0)

src/flowMC/strategy/take_steps.py

[error] 21-21: at least two spaces before inline comment

(E261)

🪛 GitHub Actions: pre-commit

src/flowMC/strategy/take_steps.py

[error] 18-18: Black formatting check failed. The file was reformatted by Black to fix spacing and trailing comma issues.

[error] 112-113: Black formatting check failed. The file was reformatted to adjust line breaks and trailing commas in function calls.

[error] 124-125: Black formatting check failed. The file was reformatted to adjust line breaks and trailing commas in function calls.

⏰ Context from checks skipped due to timeout of 90000ms (2)

GitHub Check: build (3.11)
GitHub Check: build (3.12)

🔇 Additional comments (6)

src/flowMC/resource_strategy_bundle/RQSpline_MALA.py (2)

45-45: LGTM: Parameter addition looks correct.

The new chain_batch_size parameter is properly added with a sensible default value of 0.

139-139: LGTM: Parameter propagation is consistent.

The chain_batch_size parameter is correctly passed to both TakeSerialSteps and TakeGroupSteps constructors, enabling the batching functionality in the underlying stepping strategies.

Also applies to: 150-150

src/flowMC/resource_strategy_bundle/RQSpline_MALA_PT.py (2)

51-51: LGTM: Parameter addition is consistent.

The chain_batch_size parameter is added with the same default value and pattern as in RQSpline_MALA.py, maintaining consistency across the codebase.

169-169: LGTM: Parameter propagation maintains consistency.

The parameter is correctly forwarded to both stepper constructors, consistent with the implementation in RQSpline_MALA.py.

Also applies to: 180-180

src/flowMC/resource/nf_model/NF_proposal.py (1)

100-102: LGTM: Correctness fix for Metropolis-Hastings acceptance logic.

The variable name changes from *_current to *_initial correctly fix the acceptance ratio calculation. The acceptance decision should compare the proposed state against the initial state of the chain step, not the current state. This is a proper implementation of the Metropolis-Hastings algorithm.

Also applies to: 106-107, 111-112, 114-114
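
For reference, a generic Metropolis-Hastings accept/reject step for an independence proposal can be sketched as below. This is not the body of `scan_sample`: the function name and the `log_q_*` arguments (log-density of the proposal distribution, for example a normalizing flow) are assumptions made for illustration. The `*_initial` values denote the state of the chain at the start of the step.

```python
import jax
import jax.numpy as jnp


def mh_independence_step(
    key,
    position_initial, log_prob_initial, log_q_initial,
    position_proposal, log_prob_proposal, log_q_proposal,
):
    # Log acceptance ratio for a proposal q that ignores the starting state:
    # log[ p(x') q(x) / (p(x) q(x')) ], with x the state at the start of the step.
    log_ratio = (log_prob_proposal - log_q_proposal) - (
        log_prob_initial - log_q_initial
    )
    do_accept = jnp.log(jax.random.uniform(key)) < log_ratio
    # Keep the proposal on acceptance, otherwise stay at the initial state.
    position = jnp.where(do_accept, position_proposal, position_initial)
    log_prob = jnp.where(do_accept, log_prob_proposal, log_prob_initial)
    log_q = jnp.where(do_accept, log_q_proposal, log_q_initial)
    return position, log_prob, log_q, do_accept


key = jax.random.PRNGKey(0)
x0, x1 = jnp.array([3.0]), jnp.array([0.0])

# A proposal with a far higher target density is always accepted.
accepted = mh_independence_step(key, x0, -100.0, 0.0, x1, 0.0, 0.0)

# A proposal with zero target density (log-prob of -inf) is always rejected.
rejected = mh_independence_step(key, x0, -1.0, 0.0, x1, -jnp.inf, 0.0)
```

The returned triple is what the next step would treat as its own initial state when the function is used as the body of a scan over pre-drawn proposals.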

src/flowMC/strategy/take_steps.py (1)

32-32: LGTM: Parameter addition is correct.

The chain_batch_size parameter is properly added to the constructor with a sensible default value and correctly assigned to the instance attribute.

Also applies to: 42-42

kazewong added 2 commits June 30, 2025 18:23

Add batching options for taking step strategy

b688f87

update NF proposal naming

f91630c

clean up import and remove debug messages

a7eb07d

coderabbitai Bot reviewed Jul 3, 2025

Comment thread src/flowMC/strategy/take_steps.py

kazewong added 3 commits July 3, 2025 14:16

Refactor TestLocalStep to use pytest fixture for setup

073681c

Add test for TakeSerialSteps with chain_batch_size batching

0fbc87e

Refactor test_local_step to improve buffer type checking

9cd7427

kazewong merged commit c502bf9 into main Jul 3, 2025
9 checks passed

kazewong deleted the batch_take_step branch July 3, 2025 18:34

kazewong merged 6 commits into main from batch_take_step
