Enhancement: Dynamic Batching #37

Open
ryanznie wants to merge 5 commits into main from enhancement/dynamic-batching

ryanznie commented Feb 1, 2026

  • Added dynamic batching configs
  • Minor: cleaned comments

ryanznie self-assigned this Feb 1, 2026

github-actions Bot commented Feb 1, 2026

Coverage report

File: scripts/setup_triton_repo.py
Lines missing: 65-66, 82-90

This report was generated by python-coverage-comment-action

ryanznie commented Feb 1, 2026

@gemini-cli /review

github-actions Bot commented Feb 1, 2026

🤖 Hi @ryanznie, I've received your request, and I'm working on it now! You can track my progress in the logs for more details.

github-actions Bot left a comment

📋 Review Summary

This pull request focuses on improving the Triton inference backend and its configuration. The changes include code cleanup by removing unnecessary comments and a more robust client implementation. The dynamic batching configuration for the Triton server has been refined, which should lead to better performance. Overall, the changes are positive and improve the quality of the codebase.

🔍 General Feedback

  • The code is cleaner and more readable after the removal of commented-out code and explanatory comments.
  • The Triton client is more robust as it now dynamically determines the output name instead of assuming a fixed name.
  • The refined dynamic batching settings are a good step towards performance optimization.
  • In src/inference.py, the _get_triton_datatype function has a fallback to "FP32" for unknown numpy dtypes. This could hide potential data type mismatches. It would be more robust to raise a TypeError for unsupported types to ensure that only expected data types are handled.
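
The last point could look roughly like the sketch below. The actual body of _get_triton_datatype is not shown in this conversation, so the mapping table and the function name get_triton_datatype are hypothetical stand-ins. The only behavior illustrated is the suggested change: raise a TypeError for an unmapped dtype instead of falling back to "FP32".

```python
import numpy as np

# Hypothetical lookup table: numpy dtype -> Triton datatype string.
# The table actually used in src/inference.py is not shown in the PR
# conversation; this one covers common numeric types only.
_NUMPY_TO_TRITON = {
    np.dtype(np.bool_): "BOOL",
    np.dtype(np.uint8): "UINT8",
    np.dtype(np.int8): "INT8",
    np.dtype(np.int16): "INT16",
    np.dtype(np.int32): "INT32",
    np.dtype(np.int64): "INT64",
    np.dtype(np.float16): "FP16",
    np.dtype(np.float32): "FP32",
    np.dtype(np.float64): "FP64",
}


def get_triton_datatype(dtype) -> str:
    """Return the Triton datatype string for a numpy dtype.

    Raises TypeError for dtypes missing from the table, rather than
    silently defaulting to "FP32".
    """
    key = np.dtype(dtype)  # normalizes np.float32, "float32", arr.dtype, ...
    try:
        return _NUMPY_TO_TRITON[key]
    except KeyError:
        raise TypeError(f"Unsupported numpy dtype for Triton: {key}") from None
```

With this shape, a call such as get_triton_datatype(np.complex64) fails immediately at the client instead of sending mislabeled data to the server.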
