suga-ucsd/distributed-inference-

# Distributed LLM Inference Pipeline

Implements:

- Tensor parallelism across 2 GPUs
- Pipeline parallelism across 2 stages
- gRPC message passing between workers
- Prometheus metrics (latency, GPU utilization)
- Grafana dashboard
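For the gRPC message passing, a stage-to-stage transfer service might be defined along these lines. This is a hypothetical sketch — the message and service names (`TensorChunk`, `StageTransfer`, `SendActivations`) are illustrative assumptions, not taken from this repo's actual `.proto` files:

```proto
syntax = "proto3";

package inference;

// Hypothetical message carrying a flattened activation tensor
// from one pipeline stage to the next.
message TensorChunk {
  repeated float values = 1;   // flattened activation data
  repeated int64 shape = 2;    // original tensor shape
  int32 microbatch_id = 3;     // which microbatch this chunk belongs to
}

message Ack {
  bool ok = 1;
}

service StageTransfer {
  // Push activations downstream; the receiving stage acknowledges.
  rpc SendActivations(TensorChunk) returns (Ack);
}
```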

## Run Tensor Parallel

```sh
python tensor_parallel/shard_model.py
```
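The core idea behind tensor parallelism can be sketched as a column-parallel linear layer: the weight matrix is split column-wise across devices, each device computes a partial output on its shard, and gathering the partials reproduces the full layer output. The sketch below uses NumPy on CPU to show the math only — the function names are illustrative, not the ones in `shard_model.py`:

```python
import numpy as np

def shard_columns(w, n_shards=2):
    """Split a weight matrix column-wise into n_shards equal pieces."""
    return np.split(w, n_shards, axis=1)

def tensor_parallel_linear(x, shards):
    # Each "GPU" computes x @ W_i on its own shard; concatenating the
    # partial outputs stands in for the all-gather in a real multi-GPU run.
    partials = [x @ w for w in shards]
    return np.concatenate(partials, axis=1)

rng = np.random.default_rng(0)
x = rng.normal(size=(4, 8))   # batch of 4, hidden size 8
w = rng.normal(size=(8, 6))   # full linear layer: 8 -> 6

shards = shard_columns(w, 2)          # two column shards of shape (8, 3)
y_parallel = tensor_parallel_linear(x, shards)
y_full = x @ w                        # reference: unsharded computation

assert np.allclose(y_parallel, y_full)
```

The same decomposition extends to transformer blocks: attention heads and MLP columns split naturally across devices, with one collective per layer to reassemble activations.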

## Run Pipeline Parallel

```sh
python pipeline_parallel/pipe_runner.py
```
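Pipeline parallelism places different layers of the model on different stages and streams microbatches through them, so stages execute concurrently. A minimal two-stage sketch using only the standard library (threads and queues stand in for GPUs and gRPC; the stage functions are trivial placeholders, not the repo's `pipe_runner.py` logic):

```python
import queue
import threading

def stage1(inp_q, out_q):
    # First half of the "model": runs in its own thread, passes results on.
    while True:
        item = inp_q.get()
        if item is None:        # sentinel: shut down and propagate downstream
            out_q.put(None)
            return
        out_q.put(item * 2)     # placeholder for the first stage's layers

def stage2(inp_q, results):
    # Second half of the "model": consumes stage1 output as it arrives.
    while True:
        item = inp_q.get()
        if item is None:
            return
        results.append(item + 1)  # placeholder for the second stage's layers

q01, q12, results = queue.Queue(), queue.Queue(), []
t1 = threading.Thread(target=stage1, args=(q01, q12))
t2 = threading.Thread(target=stage2, args=(q12, results))
t1.start()
t2.start()

for microbatch in [1, 2, 3, 4]:   # feed microbatches, then the sentinel
    q01.put(microbatch)
q01.put(None)
t1.join()
t2.join()
# results == [3, 5, 7, 9]
```

Because stage2 starts on microbatch 1 while stage1 is already processing microbatch 2, the stages overlap; with more microbatches the pipeline "bubble" at startup amortizes away.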

## Monitor

- Prometheus: http://localhost:9090
- Grafana: http://localhost:3000
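Prometheus scrapes metrics as plain text in its exposition format. As a sketch of what the latency and GPU-utilization metrics might look like on the wire, here is a standard-library-only renderer (the metric names are assumptions; the repo presumably uses a Prometheus client library instead of hand-formatting):

```python
def render_metrics(latencies_s, gpu_util):
    """Render latency and GPU-utilization metrics in the Prometheus
    text exposition format, ready to serve from a /metrics endpoint."""
    lines = [
        "# HELP inference_latency_seconds Inference request latency.",
        "# TYPE inference_latency_seconds summary",
        f"inference_latency_seconds_sum {sum(latencies_s)}",
        f"inference_latency_seconds_count {len(latencies_s)}",
        "# HELP gpu_utilization_ratio GPU utilization (0 to 1).",
        "# TYPE gpu_utilization_ratio gauge",
        f"gpu_utilization_ratio {gpu_util}",
    ]
    return "\n".join(lines) + "\n"

# Example: two requests observed, GPU at 75% utilization.
text = render_metrics([0.12, 0.08], 0.75)
print(text)
```

Pointing a Prometheus scrape job at an endpoint serving this text is enough for the Grafana dashboard to chart latency and utilization over time.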
