Pull requests: pytorch/torchtitan
[GraphTrainer] Refactor SAC pass: use module_fqn for layer boundaries
ciflow/8gpu
CLA Signed
This label is managed by the Meta Open Source bot.
#3050
opened Apr 21, 2026 by
SherlockNoMad
Contributor
2 tasks done
ci: pin torchvision alongside torch in vlm 8-GPU workflow
ciflow/rocm-mi300
ciflow/8gpu
CLA Signed
module: rocm
#3047
opened Apr 21, 2026 by
rishisinhanj
Contributor
Support mark_dynamic in minimal_fx_tracer
ciflow/8gpu
CLA Signed
#3046
opened Apr 21, 2026 by
tugsbayasgalan
Contributor
[graph_trainer] Add CP support to CooR precompile
ciflow/8gpu
CLA Signed
#3042
opened Apr 21, 2026 by
bobrenjc93
Contributor
•
Draft
Move RL batch-invariant tests to A10G
ciflow/8gpu
CLA Signed
llm_trainer: promote cuda-graph replay on top of fused RoPE
ciflow/8gpu
CLA Signed
#3040
opened Apr 21, 2026 by
bobrenjc93
Contributor
•
Draft
llm_trainer: promote fused RoPE for llama3 8B tp2 fsdp4
ciflow/8gpu
CLA Signed
#3039
opened Apr 21, 2026 by
bobrenjc93
Contributor
•
Draft
[MoE][3/n]swap to torchao dispatcher and set pad_multiple during config time
ciflow/8gpu
CLA Signed
#3037
opened Apr 20, 2026 by
acisseJZhong
Contributor
•
Draft
4 tasks
quantize on config instead of on model
ciflow/8gpu
CLA Signed
#3032
opened Apr 20, 2026 by
acisseJZhong
Contributor
2 tasks
fix(hf_datasets): shuffle HuggingFaceTextDataset on re-loop and replay on resume
ciflow/8gpu
CLA Signed
#3023
opened Apr 18, 2026 by
CrepuscularIRIS
[GraphTrainer] Introduce CPU offload pass for activation memory savings
ciflow/h100.8
Trigger H100.8 CI
ciflow/8gpu
CLA Signed
#3015
opened Apr 17, 2026 by
mlazos
fix: normalize n_tokens_seen by cp_degree when context parallelism is…
ciflow/8gpu
CLA Signed
#3009
opened Apr 17, 2026 by
TryingtobeingNikhil
Fix: reproducible training resume across epoch boundaries for map and streaming datasets
ciflow/8gpu
CLA Signed
#3008
opened Apr 17, 2026 by
slimfrkha
Contributor
[HybridEP] Enable HybridEP with graph_trainer
ciflow/8gpu
CLA Signed
#3007
opened Apr 17, 2026 by
syed-ahmed
Contributor
[ignore-for-now][llm_trainer] Add experiment for LLM-driven model optimization
ciflow/8gpu
CLA Signed
#3006
opened Apr 17, 2026 by
bobrenjc93
Contributor
•
Draft
[rl] Generator refactor
ciflow/8gpu
CLA Signed
#3001
opened Apr 16, 2026 by
joecummings
Member
[moe] load-balancing aux loss
ciflow/8gpu
CLA Signed
Fix DTensor attr handling in make_fx tracer
ciflow/8gpu
CLA Signed
#2998
opened Apr 16, 2026 by
tugsbayasgalan
Contributor
[graph_trainer] Copy forward metadata to backward subgraphs
ciflow/8gpu
CLA Signed
#2995
opened Apr 16, 2026 by
tugsbayasgalan
Contributor
[NOT READY FOR REVIEW][Full DTensor] Config-based Full DTensor for Llama3
ciflow/8gpu
CLA Signed
[Module] Remove LocalMapInnerAttention, use static LocalMapSpec
ciflow/8gpu
CLA Signed
#2986
opened Apr 15, 2026 by
fegin
Contributor
Refactor Loss with Scale to fix PP last stage gradients
ciflow/8gpu
CLA Signed
#2984
opened Apr 15, 2026 by
wwwjn
Contributor
Increase timeout for features test to 60 minutes.
ciflow/8gpu
CLA Signed
#2982
opened Apr 15, 2026 by
akashveramd
Collaborator
•
Draft