A pull-based distributed computation engine built with gRPC, sharding, leases, and fault-tolerant aggregation.

AdityaPainuli/distributed-compute-engine

Distributed Matrix Compute Engine (Learning Project)

This repo contains a distributed computation engine built to explore concepts such as sharding, coordination, fault tolerance, gRPC, TLS, and idempotent execution.

The core idea is to distribute matrix multiplication across multiple worker processes, aggregate the results, and tolerate worker failures via lease-based retries.

High-level flow and architecture

Architecture

Core roles

  • Coordinator
      • Owns global job state
      • Splits jobs into shards
      • Assigns shards to workers using leases
      • Aggregates shard results deterministically

  • Workers
      • Stateless compute nodes
      • Pull shards from the coordinator
      • Compute partial matrix results
      • Report results back

  • Client
      • Submits jobs to the coordinator
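
The split, compute, and aggregate roles above can be illustrated with a small in-process sketch. It assumes row-wise sharding of the left matrix, which this README does not confirm; `Shard`, `SplitRows`, `ComputeShard`, and `Aggregate` are hypothetical names, not identifiers from this repo.

```go
package main

import "fmt"

// Shard is a hypothetical unit of work: rows [RowStart, RowEnd) of matrix A.
type Shard struct {
	ID       int
	RowStart int
	RowEnd   int
}

// SplitRows divides n rows into shards of at most size rows each.
func SplitRows(n, size int) []Shard {
	if size <= 0 {
		size = 1 // guard against a non-terminating loop
	}
	var shards []Shard
	for start, id := 0, 0; start < n; start, id = start+size, id+1 {
		end := start + size
		if end > n {
			end = n
		}
		shards = append(shards, Shard{ID: id, RowStart: start, RowEnd: end})
	}
	return shards
}

// ComputeShard multiplies the shard's rows of a by b (the worker's job).
// It assumes b is non-empty and len(a[i]) == len(b).
func ComputeShard(s Shard, a, b [][]float64) [][]float64 {
	out := make([][]float64, 0, s.RowEnd-s.RowStart)
	for i := s.RowStart; i < s.RowEnd; i++ {
		row := make([]float64, len(b[0]))
		for k := range b {
			for j := range b[k] {
				row[j] += a[i][k] * b[k][j]
			}
		}
		out = append(out, row)
	}
	return out
}

// Aggregate writes each partial result at its own row offset, so the final
// matrix does not depend on the order in which results arrived.
func Aggregate(n int, shards []Shard, partials map[int][][]float64) [][]float64 {
	c := make([][]float64, n)
	for _, s := range shards {
		copy(c[s.RowStart:s.RowEnd], partials[s.ID])
	}
	return c
}

func main() {
	a := [][]float64{{1, 2}, {3, 4}, {5, 6}}
	b := [][]float64{{7, 8}, {9, 10}}
	shards := SplitRows(len(a), 2)
	partials := map[int][][]float64{}
	for _, s := range shards {
		partials[s.ID] = ComputeShard(s, a, b)
	}
	fmt.Println(Aggregate(len(a), shards, partials))
}
```

Keying partial results by shard ID and placing them by row offset is one way to make aggregation deterministic regardless of which worker finishes first.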

Repository Structure

distributed-system-learning/
├── proto/ # Protobuf definitions and generated code
│ ├── main.proto
│ ├── main.pb.go
│ └── main_grpc.pb.go
│
├── coordinator/ # Coordinator (scheduler + aggregator)
│
├── worker/ # Worker gRPC client and compute logic
│
├── client/ # Job submission client
│
├── certs/ # TLS certificates (examples only)
│ └── README.md
│
├── .env.example # Example environment variables
├── .gitignore
└── README.md

Execution Guarantees

  • Shard execution: at-least-once
  • Final result: exactly-once
  • Safe retries on worker failure
  • Deterministic aggregation
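
As a rough sketch of how these guarantees can fit together: a lease that expires lets a shard be handed out again (at-least-once execution), while accepting only the first result per shard keeps the final result exactly-once. `LeaseTable`, `Acquire`, and `Complete` are hypothetical names, and measuring time as an offset since the table was created is an assumption made for this example, not something taken from this repo.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

type shardState struct {
	leaseUntil time.Duration // offset at which the current lease expires
	done       bool
	result     []float64
}

// LeaseTable is a hypothetical in-memory view of the coordinator's shard state.
type LeaseTable struct {
	mu     sync.Mutex
	ttl    time.Duration
	shards []shardState
}

func NewLeaseTable(n int, ttl time.Duration) *LeaseTable {
	return &LeaseTable{ttl: ttl, shards: make([]shardState, n)}
}

// Acquire leases the lowest-numbered unfinished shard whose lease has expired
// or was never granted. now is the time elapsed since the table was created.
func (t *LeaseTable) Acquire(now time.Duration) (int, bool) {
	t.mu.Lock()
	defer t.mu.Unlock()
	for id := range t.shards {
		s := &t.shards[id]
		if !s.done && now >= s.leaseUntil {
			s.leaseUntil = now + t.ttl
			return id, true
		}
	}
	return 0, false
}

// Complete stores the first result reported for a shard and ignores later
// duplicates, so a retried shard cannot change the final result.
func (t *LeaseTable) Complete(id int, result []float64) bool {
	t.mu.Lock()
	defer t.mu.Unlock()
	if id < 0 || id >= len(t.shards) || t.shards[id].done {
		return false
	}
	t.shards[id].done = true
	t.shards[id].result = result
	return true
}

func main() {
	start := time.Now()
	table := NewLeaseTable(2, 50*time.Millisecond)
	id, _ := table.Acquire(time.Since(start))
	fmt.Println("leased shard", id)
	fmt.Println("first result accepted:", table.Complete(id, []float64{1, 2}))
	fmt.Println("duplicate accepted:", table.Complete(id, []float64{1, 2}))
}
```

Because two workers may compute the same shard after a lease expires, this only works if shard computation is idempotent, which holds for a pure matrix product.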

Communication Model

  • gRPC (Protocol Buffers) for all internal communication
  • Unary RPCs only
  • Workers pull work from the coordinator; the coordinator never pushes it
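
The pull model can be pictured as a loop of two unary calls per shard. The sketch below uses a plain Go interface as a stand-in for the generated gRPC client in `proto/`; `Coordinator`, `RequestShard`, `ReportResult`, and `Task` are hypothetical names, and the actual RPCs in `main.proto` may look different.

```go
package main

import (
	"context"
	"fmt"
)

// Task is a hypothetical shard payload: one row of A and one column of B.
type Task struct {
	ShardID int
	Row     []float64
	Col     []float64
}

// Coordinator is a hypothetical stand-in for the generated gRPC client.
// Each method models one unary RPC.
type Coordinator interface {
	RequestShard(ctx context.Context, workerID string) (Task, bool, error)
	ReportResult(ctx context.Context, workerID string, shardID int, value float64) error
}

// RunWorker pulls shards until the coordinator has none left and returns
// how many shards it completed. Row and Col are assumed to have equal length.
func RunWorker(ctx context.Context, c Coordinator, workerID string) (int, error) {
	done := 0
	for {
		if err := ctx.Err(); err != nil {
			return done, err
		}
		task, ok, err := c.RequestShard(ctx, workerID)
		if err != nil {
			return done, err
		}
		if !ok {
			return done, nil // no work available
		}
		var sum float64
		for i := range task.Row {
			sum += task.Row[i] * task.Col[i]
		}
		if err := c.ReportResult(ctx, workerID, task.ShardID, sum); err != nil {
			return done, err
		}
		done++
	}
}

// fakeCoordinator is an in-memory implementation used only for this demo.
type fakeCoordinator struct {
	pending []Task
	results map[int]float64
}

func (f *fakeCoordinator) RequestShard(ctx context.Context, workerID string) (Task, bool, error) {
	if len(f.pending) == 0 {
		return Task{}, false, nil
	}
	task := f.pending[0]
	f.pending = f.pending[1:]
	return task, true, nil
}

func (f *fakeCoordinator) ReportResult(ctx context.Context, workerID string, shardID int, value float64) error {
	f.results[shardID] = value
	return nil
}

// runDemo drives one worker against the fake coordinator.
func runDemo() map[int]float64 {
	coord := &fakeCoordinator{
		pending: []Task{
			{ShardID: 0, Row: []float64{1, 2}, Col: []float64{3, 4}},
			{ShardID: 1, Row: []float64{1, 0}, Col: []float64{0, 1}},
		},
		results: map[int]float64{},
	}
	if _, err := RunWorker(context.Background(), coord, "worker-1"); err != nil {
		fmt.Println("worker error:", err)
	}
	return coord.results
}

func main() {
	fmt.Println(runDemo())
}
```

A long-running worker would more likely back off and poll again when no shard is available rather than return; the early return here just keeps the demo finite.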

Transport security

  • gRPC over TLS
  • Coordinator presents a server certificate
  • Clients/workers verify server identity using a CA certificate
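
On the client and worker side, verifying the coordinator against a CA certificate might look like the following standard-library sketch. `ClientTLSConfig`, the `certs/ca.pem` path, and the `localhost` server name are assumptions for illustration and may not match how this repo loads its certificates.

```go
package main

import (
	"crypto/tls"
	"crypto/x509"
	"errors"
	"fmt"
	"os"
)

// ClientTLSConfig builds a TLS config that trusts only the given CA and
// checks the server certificate against serverName.
func ClientTLSConfig(caPEM []byte, serverName string) (*tls.Config, error) {
	pool := x509.NewCertPool()
	if !pool.AppendCertsFromPEM(caPEM) {
		return nil, errors.New("no valid CA certificate found in PEM data")
	}
	return &tls.Config{
		RootCAs:    pool,
		ServerName: serverName,
		MinVersion: tls.VersionTLS12,
	}, nil
}

func main() {
	// Hypothetical path; see certs/README.md for the files this repo expects.
	caPEM, err := os.ReadFile("certs/ca.pem")
	if err != nil {
		fmt.Println("could not read CA certificate:", err)
		return
	}
	cfg, err := ClientTLSConfig(caPEM, "localhost")
	if err != nil {
		fmt.Println("could not build TLS config:", err)
		return
	}
	fmt.Println("TLS config ready for server name", cfg.ServerName)
}
```

With grpc-go, a config like this can be wrapped with `credentials.NewTLS` and passed to the dial call via `grpc.WithTransportCredentials`.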