feat: TensorRT-LLM Benchmark Command Generator (M113)#250

Merged
hlin99 merged 1 commit into main from feat/m113-trtllm-commands on Apr 6, 2026

hlin99 commented Apr 6, 2026

Summary

TensorRT-LLM Benchmark Command Generator — generate ready-to-run TRT-LLM engine build, server launch, and benchmark commands for P:D ratio exploration.

Mirrors the existing vLLM and SGLang command generators.

Changes

  • TRTLLMCommandGenerator class in trtllm_commands.py
  • TRTLLMCommandConfig, TRTLLMServerCommand, TRTLLMBenchmarkCommand, TRTLLMCommandSet Pydantic models
  • TRT-LLM specific: trtllm-build engine build + tensorrt_llm.serve server + trtllm-bench benchmark
  • Options: max_batch_size, kv_cache_free_gpu_mem_fraction, pp_size, dtype, engine_dir
  • Shell script output with engine build → server → benchmark → cleanup lifecycle
  • CLI trtllm-commands subcommand with table + JSON + --output-script
  • Programmatic generate_trtllm_commands() API
  • 29 new tests

Closes #249
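
The change list above can be pictured with a minimal sketch. It is not the code from this PR: it uses stdlib dataclasses instead of Pydantic so it runs without dependencies, and `CommandConfig`, `enumerate_pd_ratios`, `build_command`, and `server_commands` are hypothetical names. The `trtllm-build` flags shown are the common ones, while the server invocation and its flags are assumptions that mirror the option names listed in this PR; check them against the installed TensorRT-LLM version.

```python
from dataclasses import dataclass
import shlex


@dataclass(frozen=True)
class CommandConfig:
    """Hypothetical stand-in for TRTLLMCommandConfig (the PR uses Pydantic)."""
    checkpoint_dir: str
    total_instances: int
    engine_dir: str = "./engine"
    max_batch_size: int = 64
    kv_cache_free_gpu_mem_fraction: float = 0.9


def enumerate_pd_ratios(total_instances: int) -> list[tuple[int, int]]:
    """Every (prefill, decode) split with at least one instance per side."""
    return [(p, total_instances - p) for p in range(1, total_instances)]


def build_command(cfg: CommandConfig) -> str:
    """Render one engine build command as a shell-safe string."""
    return shlex.join([
        "trtllm-build",
        "--checkpoint_dir", cfg.checkpoint_dir,
        "--output_dir", cfg.engine_dir,
        "--max_batch_size", str(cfg.max_batch_size),
    ])


def server_commands(cfg: CommandConfig, prefill: int, decode: int,
                    base_port: int = 8000) -> list[tuple[str, str]]:
    """Render one (role, command) pair per instance for a given P:D split."""
    roles = ["prefill"] * prefill + ["decode"] * decode
    commands = []
    for index, role in enumerate(roles):
        argv = [
            # Entry point as named in the PR; the exact invocation is assumed.
            "python", "-m", "tensorrt_llm.serve",
            # The flags below are assumptions, named after the PR's options.
            "--engine_dir", cfg.engine_dir,
            "--port", str(base_port + index),
            "--kv_cache_free_gpu_mem_fraction",
            str(cfg.kv_cache_free_gpu_mem_fraction),
        ]
        commands.append((role, shlex.join(argv)))
    return commands


if __name__ == "__main__":
    cfg = CommandConfig(checkpoint_dir="./ckpt", total_instances=4)
    print(build_command(cfg))
    for prefill, decode in enumerate_pd_ratios(cfg.total_instances):
        for role, command in server_commands(cfg, prefill, decode):
            print(f"[{prefill}:{decode}] {role}: {command}")
```

Building the argument list first and joining it with `shlex.join` keeps paths with spaces safe when the commands are later pasted into a script.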

- TRTLLMCommandGenerator class in trtllm_commands.py
- TRTLLMCommandConfig, TRTLLMServerCommand, TRTLLMBenchmarkCommand, TRTLLMCommandSet models
- Generate trtllm-build engine and tensorrt_llm.serve commands for each P:D ratio
- TRT-LLM options: max_batch_size, kv_cache_free_gpu_mem_fraction, pp_size, dtype, engine_dir
- Shell script output with engine build + server + benchmark lifecycle
- CLI trtllm-commands subcommand with table + JSON output
- Programmatic generate_trtllm_commands() API
- 29 new tests

Closes #249
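
As a rough illustration of the `trtllm-commands` subcommand with table and JSON output, here is an `argparse` sketch. The function names `register_trtllm_commands` and `_cmd_trtllm_commands` appear in the review of this PR, but their bodies here are guesses; `--total-instances` and `--format` are hypothetical flags, and only `--output-script` is taken from the PR description. The handler returns its text rather than printing it so the sketch is easy to exercise.

```python
import argparse
import json


def _cmd_trtllm_commands(args: argparse.Namespace) -> str:
    """Hypothetical handler: list P:D splits as a table or as JSON."""
    records = [
        {"prefill": p, "decode": args.total_instances - p,
         "ratio": f"{p}:{args.total_instances - p}"}
        for p in range(1, args.total_instances)
    ]
    if args.output_script:
        # Placeholder script body; the real generator emits full commands.
        with open(args.output_script, "w", encoding="utf-8") as handle:
            handle.write("#!/usr/bin/env bash\nset -euo pipefail\n")
            for record in records:
                handle.write(f"# P:D ratio {record['ratio']}\n")
    if args.format == "json":
        return json.dumps(records, indent=2)
    rows = [f"{'RATIO':<8}{'PREFILL':<9}{'DECODE':<8}"]
    rows += [f"{r['ratio']:<8}{r['prefill']:<9}{r['decode']:<8}" for r in records]
    return "\n".join(rows)


def register_trtllm_commands(subparsers) -> None:
    """Attach the subcommand to an existing argparse subparsers object."""
    parser = subparsers.add_parser(
        "trtllm-commands", help="Generate TRT-LLM benchmark commands")
    parser.add_argument("--total-instances", type=int, required=True)
    parser.add_argument("--format", choices=("table", "json"), default="table")
    parser.add_argument("--output-script", default=None)
    parser.set_defaults(func=_cmd_trtllm_commands)


def main(argv=None) -> str:
    root = argparse.ArgumentParser(prog="tool")  # program name is a placeholder
    subparsers = root.add_subparsers(dest="command", required=True)
    register_trtllm_commands(subparsers)
    args = root.parse_args(argv)
    return args.func(args)


if __name__ == "__main__":
    print(main(["trtllm-commands", "--total-instances", "4"]))
```

Binding the handler with `set_defaults(func=...)` lets each backend register itself without the top-level parser needing to know about it.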

hlin99-Review-Bot left a comment

Approved by hlin99-Review-Bot

Idea Value: Good — completes the TRT-LLM toolchain (importer M112 + command generator M113), consistent with vLLM and SGLang patterns.

Code Quality:

  • Clean Pydantic models (TRTLLMCommandConfig, TRTLLMServerCommand, TRTLLMBenchmarkCommand, TRTLLMCommandSet)
  • Generator correctly produces engine build → server → benchmark → cleanup lifecycle
  • CLI registration follows established pattern
  • 29 tests, all passing
  • docs/iterations/current.md and ROADMAP.md updated
  • CI all green (lint + tests 3.10/3.11/3.12)

LGTM 🚀

hlin99-Review-BotX left a comment

Approved by hlin99-Review-BotX

Idea Value: Good — M113 completes the TRT-LLM command generator, consistent with the vLLM (M109) and SGLang (M111) patterns. Aligns with roadmap.

Code Quality:

  • Clean Pydantic models (TRTLLMCommandConfig, TRTLLMServerCommand, TRTLLMBenchmarkCommand, TRTLLMCommandSet)
  • Generator correctly produces engine build → server → benchmark → cleanup lifecycle
  • CLI registration follows established pattern (register_trtllm_commands, _cmd_trtllm_commands)
  • Shell script generation with proper set -euo pipefail and server PID management
  • __init__.py exports and __all__ updated correctly
  • ROADMAP.md and docs/iterations/current.md updated
  • 29 tests, CI all green (lint + tests 3.10/3.11/3.12)

LGTM 🚀
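
For readers unfamiliar with the script lifecycle mentioned in the review (`set -euo pipefail` plus server PID management), a minimal sketch of how such a script could be rendered follows. `render_script` and `startup_wait_s` are hypothetical names, the fixed `sleep` is a simplification of whatever readiness check the real generator uses, and the commands are passed in as opaque strings so no TensorRT-LLM flags are assumed here.

```python
def render_script(build_cmd: str, server_cmd: str, bench_cmd: str,
                  startup_wait_s: int = 60) -> str:
    """Render a bash script: build, start server, benchmark, then clean up."""
    lines = [
        "#!/usr/bin/env bash",
        # Exit on errors, unset variables, and failures inside pipelines.
        "set -euo pipefail",
        "",
        "# 1. Build the engine",
        build_cmd,
        "",
        "# 2. Launch the server in the background and record its PID",
        f"{server_cmd} &",
        "SERVER_PID=$!",
        "",
        "# Cleanup runs on every exit path, including failures under set -e",
        "cleanup() {",
        '  kill "$SERVER_PID" 2>/dev/null || true',
        '  wait "$SERVER_PID" 2>/dev/null || true',
        "}",
        "trap cleanup EXIT",
        "",
        "# Simplification: wait a fixed time for the server to come up",
        f"sleep {startup_wait_s}",
        "",
        "# 3. Run the benchmark",
        bench_cmd,
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Placeholder commands; substitute the generated TRT-LLM commands.
    print(render_script("echo build", "sleep 300", "echo benchmark",
                        startup_wait_s=1))
```

Registering the `trap` right after capturing `$!` means the server is stopped even if the benchmark step fails and `set -e` aborts the script.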

hlin99 merged commit be15d4a into main on Apr 6, 2026
5 checks passed