Batch take step #232

Merged: kazewong merged 6 commits into main from batch_take_step on Jul 3, 2025
@kazewong kazewong commented Jul 3, 2025

Owner

This PR introduces a new keyword, chain_batch_size, in the TakeSteps strategy. It lets the user process the chain dimension in batches instead of handling every chain at once with vmap. This reduces the memory footprint of the TakeSteps class and allows users with smaller GPUs to run a large number of chains.

The following summary was generated by Copilot.

This pull request introduces several updates across different files to enhance functionality and improve batch processing capabilities. The changes primarily focus on adding support for chain_batch_size to manage memory constraints during computations, as well as refining variable naming for clarity in probability calculations.

Batch processing enhancements:

Probability calculation refinements:

  • src/flowMC/resource/nf_model/NF_proposal.py: Renamed variables in scan_sample from position_current and log_prob_current to position_initial and log_prob_initial for improved clarity. Adjusted related calculations to reflect these changes.

Summary by CodeRabbit

  • New Features

    • Introduced a configurable batch size option for chain processing to improve memory management during sampling.
    • Added a new parameter to allow users to set the batch size for chains in relevant sampling classes.
  • Improvements

    • Enhanced sampling efficiency and scalability by enabling batch-wise processing of chains when the batch size is specified.
  • Bug Fixes

    • Corrected the acceptance ratio calculation in the sampling algorithm to ensure accurate probability updates during proposal evaluation.

coderabbitai Bot commented Jul 3, 2025


📥 Commits

Reviewing files that changed from the base of the PR and between a7eb07d and 9cd7427.

📒 Files selected for processing (1)
  • test/unit/test_strategies.py (4 hunks)

Walkthrough

The changes introduce a new chain_batch_size parameter to several classes to enable batching of chains during vectorized sampling, improving memory management. The Metropolis-Hastings acceptance logic in NFProposal is updated to use initial rather than current state variables. Minor adjustments to parameter passing and batching logic are also included.

Changes

| File(s) | Change Summary |
|---|---|
| src/flowMC/resource/nf_model/NF_proposal.py | Updated Metropolis-Hastings acceptance logic in scan_sample to use initial state variables instead of current. |
| src/flowMC/resource_strategy_bundle/RQSpline_MALA.py, src/flowMC/resource_strategy_bundle/RQSpline_MALA_PT.py | Added chain_batch_size parameter to constructors and passed it to stepper instances. |
| src/flowMC/strategy/take_steps.py | Added chain_batch_size attribute and batching logic in TakeSteps for memory-efficient vectorized sampling. |

Sequence Diagram(s)

sequenceDiagram
    participant User
    participant RQSpline_MALA_Bundle
    participant TakeSteps

    User->>RQSpline_MALA_Bundle: Initialize (chain_batch_size=N)
    RQSpline_MALA_Bundle->>TakeSteps: Initialize (chain_batch_size=N)
    User->>TakeSteps: Call __call__ with chains
    alt chain_batch_size batching
        TakeSteps->>TakeSteps: Split chains into batches
        loop For each batch
            TakeSteps->>TakeSteps: Process batch with sample()
        end
        TakeSteps->>User: Concatenate and return results
    else No batching
        TakeSteps->>TakeSteps: Process all chains with sample()
        TakeSteps->>User: Return results
    end
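
The flow in the diagram above can be sketched in JAX roughly as follows. This is a minimal illustration, not the code in take_steps.py: step_one_chain and take_steps are hypothetical names, and the per-chain update is a placeholder random-walk move. The only behaviour taken from the PR is that a chain_batch_size of 0 (the default) means a single vmap over all chains, while a positive value splits the chain dimension into batches.

```python
import jax
import jax.numpy as jnp


def step_one_chain(key, position):
    """Hypothetical per-chain update: one Gaussian random-walk move."""
    return position + 0.1 * jax.random.normal(key, position.shape)


def take_steps(keys, positions, chain_batch_size=0):
    """Apply step_one_chain to every chain, optionally in batches.

    keys has one PRNG key per chain; positions has shape (n_chains, n_dim).
    chain_batch_size <= 0 (or >= n_chains) means one vmap over all chains.
    """
    n_chains = positions.shape[0]
    batched_step = jax.jit(jax.vmap(step_one_chain))
    if chain_batch_size <= 0 or chain_batch_size >= n_chains:
        return batched_step(keys, positions)

    outputs = []
    # Only one batch is vmapped at a time, which bounds the size of the
    # intermediate arrays created inside the vectorized call.
    for start in range(0, n_chains, chain_batch_size):
        stop = start + chain_batch_size  # slicing clips the final batch
        outputs.append(batched_step(keys[start:stop], positions[start:stop]))
    return jnp.concatenate(outputs, axis=0)


keys = jax.random.split(jax.random.PRNGKey(0), 10)
positions = jnp.zeros((10, 3))
full = take_steps(keys, positions)
batched = take_steps(keys, positions, chain_batch_size=4)
```

Because each chain carries its own key, the batched and unbatched calls should agree. When the number of chains is not a multiple of chain_batch_size, the last batch is smaller, which in this sketch costs one extra compilation for the new shape.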
Poem

A hop and a skip, we batch with delight,
Chains in neat bundles, memory kept light.
Acceptance now checks where the journey began,
With rabbits debugging, improving the plan.
Batching our steps, we leap ever higher—
CodeRabbit’s changes, swift to inspire!
🐇✨

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 1

🧹 Nitpick comments (3)
src/flowMC/resource/nf_model/NF_proposal.py (2)

15-19: Remove commented-out debug code.

The commented-out debug imports and function should be removed to keep the code clean.

-# from jax import debug
-# def cond_print(log_prob_proposal, log_prob_current, log_prob_nf_proposal, log_prob_nf_current, do_accept, position_proposal, position_current, tag=""):
-#     if (do_accept == True and ((log_prob_proposal - log_prob_current) < -5)):
-#         print(f"{tag} pro_prob: {log_prob_proposal}, cur_prob: {log_prob_current}, pro_nf_prob: {log_prob_nf_proposal}, cur_nf_prob: {log_prob_nf_current}, accept: {do_accept}, pro_pos: {position_proposal[0]}, cur_pos: {position_current[0]}")
-

117-117: Remove commented-out debug callback.

The commented-out debug callback should be removed to keep the code clean.

-# debug.callback(cond_print, log_prob_proposal, log_prob_initial, log_prob_nf_proposal, log_prob_nf_initial, do_accept, position_proposal, position_initial)
src/flowMC/strategy/take_steps.py (1)

21-21: Fix inline comment formatting.

The inline comment needs at least two spaces before it according to PEP 8.

-    chain_batch_size: int # If vmap over a large number of chains is memory bounded, this splits the computation
+    chain_batch_size: int  # If vmap over a large number of chains is memory bounded, this splits the computation
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between c326bb7 and f91630c.

📒 Files selected for processing (4)
  • src/flowMC/resource/nf_model/NF_proposal.py (2 hunks)
  • src/flowMC/resource_strategy_bundle/RQSpline_MALA.py (3 hunks)
  • src/flowMC/resource_strategy_bundle/RQSpline_MALA_PT.py (3 hunks)
  • src/flowMC/strategy/take_steps.py (4 hunks)
🧰 Additional context used
🪛 Flake8 (7.2.0)
src/flowMC/strategy/take_steps.py

[error] 21-21: at least two spaces before inline comment

(E261)

🪛 GitHub Actions: pre-commit
src/flowMC/strategy/take_steps.py

[error] 18-18: Black formatting check failed. The file was reformatted by Black to fix spacing and trailing comma issues.


[error] 112-113: Black formatting check failed. The file was reformatted to adjust line breaks and trailing commas in function calls.


[error] 124-125: Black formatting check failed. The file was reformatted to adjust line breaks and trailing commas in function calls.

⏰ Context from checks skipped due to timeout of 90000ms (2)
  • GitHub Check: build (3.11)
  • GitHub Check: build (3.12)
🔇 Additional comments (6)
src/flowMC/resource_strategy_bundle/RQSpline_MALA.py (2)

45-45: LGTM: Parameter addition looks correct.

The new chain_batch_size parameter is properly added with a sensible default value of 0.


139-139: LGTM: Parameter propagation is consistent.

The chain_batch_size parameter is correctly passed to both TakeSerialSteps and TakeGroupSteps constructors, enabling the batching functionality in the underlying stepping strategies.

Also applies to: 150-150

src/flowMC/resource_strategy_bundle/RQSpline_MALA_PT.py (2)

51-51: LGTM: Parameter addition is consistent.

The chain_batch_size parameter is added with the same default value and pattern as in RQSpline_MALA.py, maintaining consistency across the codebase.


169-169: LGTM: Parameter propagation maintains consistency.

The parameter is correctly forwarded to both stepper constructors, consistent with the implementation in RQSpline_MALA.py.

Also applies to: 180-180

src/flowMC/resource/nf_model/NF_proposal.py (1)

100-102: LGTM: Correctness fix for Metropolis-Hastings acceptance logic.

The variable name changes from *_current to *_initial correctly fix the acceptance ratio calculation. The acceptance decision should compare the proposed state against the initial state of the chain step, not the current state. This is a proper implementation of the Metropolis-Hastings algorithm.

Also applies to: 106-107, 111-112, 114-114
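
For illustration, a Metropolis-Hastings step with an independent proposal can be written with jax.lax.scan as below. This is a sketch under stated assumptions, not the scan_sample implementation: log_target and log_proposal are toy Gaussian stand-ins for the target and the normalizing flow, and mh_step is a hypothetical name. The variable names mirror those quoted in this review; the "initial" quantities are the state carried into each scan step, that is, the last accepted sample.

```python
import jax
import jax.numpy as jnp


def log_target(x):
    """Assumed target: standard normal, unnormalized log density."""
    return -0.5 * jnp.sum(x**2)


def log_proposal(x):
    """Assumed stand-in for the flow density: N(0, 2^2 I), unnormalized."""
    return -0.5 * jnp.sum((x / 2.0) ** 2)


def mh_step(carry, data):
    # State at the start of this step: the last accepted sample.
    position_initial, log_prob_initial, log_prob_nf_initial = carry
    position_proposal, log_prob_proposal, log_prob_nf_proposal, log_uniform = data

    # Independence-sampler log acceptance ratio:
    # [log p(x') - log q(x')] - [log p(x) - log q(x)]
    log_ratio = (log_prob_proposal - log_prob_nf_proposal) - (
        log_prob_initial - log_prob_nf_initial
    )
    do_accept = log_uniform < log_ratio

    # Keep the proposal if accepted, otherwise stay at the carried state.
    position_new = jnp.where(do_accept, position_proposal, position_initial)
    log_prob_new = jnp.where(do_accept, log_prob_proposal, log_prob_initial)
    log_prob_nf_new = jnp.where(do_accept, log_prob_nf_proposal, log_prob_nf_initial)
    return (position_new, log_prob_new, log_prob_nf_new), (position_new, do_accept)


key_prop, key_unif = jax.random.split(jax.random.PRNGKey(0))
n_steps, n_dim = 200, 2
proposals = 2.0 * jax.random.normal(key_prop, (n_steps, n_dim))
log_uniforms = jnp.log(jax.random.uniform(key_unif, (n_steps,)))

start = jnp.ones(n_dim)
carry0 = (start, log_target(start), log_proposal(start))
data = (
    proposals,
    jax.vmap(log_target)(proposals),
    jax.vmap(log_proposal)(proposals),
    log_uniforms,
)
_, (samples, accepted) = jax.lax.scan(mh_step, carry0, data)
```

The carry is what makes the comparison well defined: every proposal is tested against the state that entered the step, and a rejected proposal leaves that state, together with its two log densities, unchanged for the next step.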

src/flowMC/strategy/take_steps.py (1)

32-32: LGTM: Parameter addition is correct.

The chain_batch_size parameter is properly added to the constructor with a sensible default value and correctly assigned to the instance attribute.

Also applies to: 42-42

Comment thread src/flowMC/strategy/take_steps.py
@kazewong kazewong merged commit c502bf9 into main Jul 3, 2025
9 checks passed
@kazewong kazewong deleted the batch_take_step branch July 3, 2025 18:34