
GitAhubI-Lover/IM-WorkOne


IM-work: ML Optimization Task (SwiGLU Fusion)

Task Objective

This task challenges an LLM to implement a fused SwiGLU activation layer in PyTorch, a key component of modern Transformer architectures such as Llama.

Difficulty & Validation

  • Model Tested: claude-3-haiku-20240307
  • Sample Size: 100 iterations
  • Pass Rate: about 25%
  • Key Challenge: The model must implement the formula $\mathrm{SwiGLU}(x) = \mathrm{SiLU}(xW + b) \otimes (xV + c)$, where $\otimes$ denotes element-wise multiplication, and must also satisfy the weight-fusion constraint: a single nn.Linear holds both $W$ and $V$.
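
As a rough illustration of the fusion constraint, the sketch below stores $W$ and $V$ (and the biases $b$ and $c$) in one nn.Linear of width 2 × hidden and splits its output into a gate half and a value half. The class name FusedSwiGLU, the attribute name proj, and the choice to put the gate in the first half are assumptions made for this example; the task's actual reference solution and grader are not shown in this README and may differ.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class FusedSwiGLU(nn.Module):
    """SwiGLU(x) = SiLU(xW + b) * (xV + c), with W and V held in one nn.Linear.

    Hypothetical sketch: names and layout are assumptions, not the task's API.
    """

    def __init__(self, in_features: int, hidden_features: int, bias: bool = True):
        super().__init__()
        # One projection of width 2 * hidden_features holds [W | V] and [b | c].
        self.proj = nn.Linear(in_features, 2 * hidden_features, bias=bias)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Single matmul, then split the last dimension into gate and value halves.
        gate, value = self.proj(x).chunk(2, dim=-1)
        # Element-wise product of the SiLU-activated gate and the linear value.
        return F.silu(gate) * value


layer = FusedSwiGLU(in_features=16, hidden_features=32)
out = layer(torch.randn(4, 16))
print(out.shape)  # torch.Size([4, 32])
```

Using one projection means the input is read once and a single matrix multiply is launched instead of two, which is the usual motivation for this kind of fusion.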

How to Run

  1. Set ANTHROPIC_API_KEY.
  2. Run uv run main.py.

About

What skill does this task teach? It teaches the model to balance mathematical correctness with high-performance ML engineering constraints (specifically tensor fusion and memory bandwidth optimization).

Pass rate details: The task achieved a pass rate of about 25% over 100 trials, so it is neither too trivial nor impossible for RL training.
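
One way to see how mathematical correctness and fusion fit together is to check that a fused projection reproduces an unfused two-layer reference. The sketch below is only an illustrative check under assumed names (w_proj, v_proj, fused) and assumed sizes; it is not the repository's grading code, which this page does not show.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)
d_in, d_hidden = 8, 12  # illustrative sizes, chosen arbitrarily

# Unfused reference: two separate projections, W (gate) and V (value).
w_proj = nn.Linear(d_in, d_hidden)
v_proj = nn.Linear(d_in, d_hidden)

# Fused version: one projection whose weight rows are W stacked on top of V.
fused = nn.Linear(d_in, 2 * d_hidden)
with torch.no_grad():
    fused.weight.copy_(torch.cat([w_proj.weight, v_proj.weight], dim=0))
    fused.bias.copy_(torch.cat([w_proj.bias, v_proj.bias], dim=0))

x = torch.randn(5, d_in)

# Reference output from the two separate layers.
reference = F.silu(w_proj(x)) * v_proj(x)

# Fused output: one matmul, split into halves, then gate the value half.
gate, value = fused(x).chunk(2, dim=-1)
fused_out = F.silu(gate) * value

max_err = (reference - fused_out).abs().max().item()
matches = torch.allclose(reference, fused_out, atol=1e-5)
print(matches, max_err)
```

The two paths compute the same function up to floating-point rounding, so any difference beyond a small tolerance would point to a wrong split order or a mismatched weight layout.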
