A modular, distributed task orchestration system built from scratch using Python Sockets. Features a multi-threaded TLS-secured TCP broker, priority-based job scheduling, fault-tolerant worker nodes, a REST API, and a live web dashboard.
- Kurumeti Mitesh-PES2UG24AM080
- Mohammed Tahir Siddique-PES2UG24AM092
- Rohith G S-PES2UG24AM138
In modern software engineering, heavy tasks (such as video encoding, AI processing, or bulk emailing) are rarely run directly on a web server, because they would block request handling and freeze the user interface.
This project implements a Distributed Task Queue, a pattern that underpins large-scale backends at companies like Netflix, Amazon, and Google. By building this from scratch using sockets, you demonstrate:
- System Architecture: Understanding the Producer-Consumer pattern.
- Networking Fundamentals: Designing a custom application-layer protocol over TLS.
- Security: Mandatory SSL/TLS encryption for all control and data exchanges.
- Concurrency: Managing multiple simultaneous connections without data corruption.
- Reliability: Solving the problem of "What happens if a worker crashes mid-task?"
- API Design: Exposing backend state via a REST API.
- Frontend Integration: Visualizing a live backend system in a web dashboard.
- Performance Evaluation: Measuring throughput and latency under load.
Distributed-Job-Queue-System/
├── certs/
│   ├── gen_certs.py          # Script to generate self-signed TLS certificate
│   ├── server.crt            # TLS certificate (generated, not committed to git)
│   └── server.key            # TLS private key (generated, not committed to git)
├── docs/
│   └── protocol_v1.md        # Full protocol spec: TLS, message types, lifecycle
├── server/
│   ├── main.py               # Entry point: TLS socket listener + starts REST API
│   ├── queue_manager.py      # Priority Queue with Thread Locking & tie-breaker
│   ├── worker_registry.py    # Tracks health and status of connected workers
│   └── api.py                # REST API (port 8080): dashboard + JSON endpoints
├── worker/
│   ├── worker.py             # Execution node: TLS connect, poll, execute, return
│   └── tasks.py              # Task library with dynamic registration support
├── client/
│   └── submit_job.py         # Producer script: submits jobs over TLS
├── dashboard/
│   └── index.html            # Live web dashboard (auto-refreshes every 2 seconds)
├── tests/
│   └── stress_test.py        # 100 concurrent TLS jobs + latency/throughput report
├── README.md                 # You are here
└── requirements.txt          # No external dependencies (Python 3 built-ins only)
TLS encryption is mandatory for all TCP communication on port 65432. Plain (non-TLS) connections fail the handshake and are rejected with an `ssl.SSLError`.
Requires OpenSSL installed on your system.
python certs/gen_certs.py
This generates:
- `certs/server.crt`: self-signed X.509 certificate (trusted by workers and clients)
- `certs/server.key`: RSA 2048-bit private key (server only)

| Component | Role | Context Type |
|---|---|---|
| `server/main.py` | Wraps server socket | `PROTOCOL_TLS_SERVER` + cert + key |
| `worker/worker.py` | Connects securely | `PROTOCOL_TLS_CLIENT` + loads `server.crt` |
| `client/submit_job.py` | Connects securely | `PROTOCOL_TLS_CLIENT` + loads `server.crt` |
| `tests/stress_test.py` | 100 TLS connections | `PROTOCOL_TLS_CLIENT` + loads `server.crt` |

If a plain TCP client connects without TLS, the TLS handshake fails and the server prints:
[Server] SSLError - rejected non-TLS connection: ...

The server does not crash: it catches the `ssl.SSLError` and continues accepting new connections.
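
The context setup in the table above can be sketched roughly as follows. This is a minimal illustration, not the code in `server/main.py`: the helper names `make_server_context`, `make_client_context`, and `accept_tls` are hypothetical, and only standard `ssl` module calls are used.

```python
import ssl

def make_server_context(certfile: str, keyfile: str) -> ssl.SSLContext:
    """Server side: present the certificate and private key to every peer."""
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
    ctx.load_cert_chain(certfile=certfile, keyfile=keyfile)
    return ctx

def make_client_context(cafile=None) -> ssl.SSLContext:
    """Client side: verify the server, trusting only the given certificate file."""
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)
    if cafile is not None:
        # e.g. cafile="certs/server.crt" for the self-signed certificate
        ctx.load_verify_locations(cafile=cafile)
    return ctx

def accept_tls(listener, ctx):
    """Accept one connection; return a TLS socket, or None if the handshake fails."""
    raw, addr = listener.accept()
    try:
        # wrap_socket performs the TLS handshake on the already-connected socket
        return ctx.wrap_socket(raw, server_side=True)
    except ssl.SSLError as exc:
        # A plain TCP client lands here; the accept loop can simply continue
        print(f"[Server] SSLError - rejected non-TLS connection: {exc}")
        raw.close()
        return None
```

With `PROTOCOL_TLS_CLIENT`, certificate verification and hostname checking are enabled by default, so a client calling `ctx.wrap_socket(sock, server_hostname=...)` needs a hostname that matches the generated certificate.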
Python 3 only. No pip installs are required: the project uses only the built-in `socket`, `ssl`, `threading`, `json`, `heapq`, `uuid`, and `http.server` modules.
- Clone the repo and ensure all files are in their respective folders.
- Generate TLS certificates: `python certs/gen_certs.py`
- The `QueueManager` uses a `(priority, counter, job)` heap tuple to prevent `TypeError` when two jobs share the same priority.
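
To see why the counter matters: `heapq` compares tuples element by element, so two entries with equal priority would fall through to comparing the job dicts, which raises `TypeError`. The sketch below illustrates the idea under stated assumptions; the class name `TinyPriorityQueue` and its methods are hypothetical and not the actual `QueueManager` API.

```python
import heapq
import itertools
import threading

class TinyPriorityQueue:
    """Illustrative thread-safe priority queue with a FIFO tie-breaker."""

    def __init__(self):
        self._heap = []
        self._counter = itertools.count()   # strictly increasing tie-breaker
        self._lock = threading.Lock()

    def push(self, priority, job):
        with self._lock:
            # The unique counter is compared before the job dict is ever reached
            heapq.heappush(self._heap, (priority, next(self._counter), job))

    def pop(self):
        with self._lock:
            if not self._heap:
                return None
            _priority, _count, job = heapq.heappop(self._heap)
            return job

q = TinyPriorityQueue()
q.push(5, {"task": "slow_task"})
q.push(1, {"task": "reverse_string"})
q.push(1, {"task": "is_prime"})          # same priority as the previous job
order = [q.pop()["task"] for _ in range(3)]
print(order)
```

Jobs with equal priority come out in submission order, because the counter of the earlier job is smaller.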
Communication Flow:
- TLS Handshake: Every connection starts with a TCP 3-way handshake followed by a TLS handshake. The server presents `server.crt`; clients verify it.
- Ingestion: A Client connects over TLS, sends a JSON-encoded job with a priority level, receives an ACK, and disconnects.
- Scheduling: The Server places the job in a `heapq` Priority Queue. Priority 1 jobs always move to the front.
- Distribution: Workers poll with `GET_JOB`. The Server pops the highest-priority job and dispatches it.
- Execution: The Worker runs the task from `tasks.py` and returns a `RESULT` packet.
- Fault Tolerance: If a worker disconnects mid-task, the server detects it, sets the job status to `REQUEUED`, and pushes it back to the priority queue.
- Dashboard: The REST API (port 8080) exposes live stats, worker status, security info, and job log.
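
The fault-tolerance step can be sketched in memory, without any sockets. Everything here is an assumption made for illustration: the `in_flight` map, the function names, the job fields, and the `QUEUED` and `RUNNING` status names are hypothetical; only the `REQUEUED` status comes from the description above.

```python
import heapq
import itertools

queue = []                      # heap of (priority, counter, job)
counter = itertools.count()     # tie-breaker for equal priorities
in_flight = {}                  # worker_id -> job currently assigned to it

def enqueue(job):
    heapq.heappush(queue, (job["priority"], next(counter), job))

def dispatch(worker_id):
    """Hand the highest-priority job to a polling worker and remember it."""
    if not queue:
        return None
    _, _, job = heapq.heappop(queue)
    job["status"] = "RUNNING"   # hypothetical status name
    in_flight[worker_id] = job
    return job

def on_worker_disconnect(worker_id):
    """Called when a worker's connection drops; requeue any unfinished job."""
    job = in_flight.pop(worker_id, None)
    if job is not None:
        job["status"] = "REQUEUED"
        enqueue(job)            # back into the queue at its original priority
    return job

enqueue({"id": "a", "task": "slow_task", "priority": 5, "status": "QUEUED"})
job = dispatch("worker-1")
lost = on_worker_disconnect("worker-1")     # worker-1 crashed mid-task
status_after_requeue = lost["status"]
retry = dispatch("worker-2")                # another worker picks it up
```

Tracking the assignment per worker is what makes recovery possible: the server only knows which job to requeue if it recorded who was running it.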
| Task | Payload | Description |
|---|---|---|
| `reverse_string` | `{"data": "hello"}` | Reverses a string |
| `to_uppercase` | `{"data": "hello"}` | Converts string to uppercase |
| `fibonacci` | `{"n": 10}` | Returns the nth Fibonacci number |
| `is_prime` | `{"n": 97}` | Checks if a number is prime |
| `slow_task` | `{"seconds": 3}` | Sleeps N seconds (simulates heavy work) |

from tasks import register_task, unregister_task, list_tasks
register_task("double_number", lambda p: int(p["n"]) * 2)
unregister_task("slow_task")
print(list_tasks())

Tasks registered on one worker are not automatically available on others; each worker has its own `TASK_MAP` in memory.
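
A registry of this kind can be as small as a module-level dict. The sketch below is one plausible shape for it, not the contents of `worker/tasks.py`: the function bodies are assumptions, and `run_task` is a hypothetical helper added for the example.

```python
# Per-process registry: each worker holds its own copy in memory
TASK_MAP = {
    "reverse_string": lambda p: p["data"][::-1],
    "to_uppercase": lambda p: p["data"].upper(),
}

def register_task(name, func):
    """Add or replace a task; func receives the job payload dict."""
    if not callable(func):
        raise TypeError("task must be callable")
    TASK_MAP[name] = func

def unregister_task(name):
    """Remove a task if present; unknown names are ignored."""
    TASK_MAP.pop(name, None)

def list_tasks():
    return sorted(TASK_MAP)

def run_task(name, payload):
    """Hypothetical helper: look up a task by name and execute it."""
    if name not in TASK_MAP:
        raise KeyError(f"unknown task: {name}")
    return TASK_MAP[name](payload)

register_task("double_number", lambda p: int(p["n"]) * 2)
print(run_task("double_number", {"n": 21}))
print(list_tasks())
```

Because the dict lives in the worker process, a job naming a task that was registered elsewhere would hit the unknown-task branch on this worker.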
The API runs on port 8080 (plain HTTP, local only).
| Method | Endpoint | Description |
|---|---|---|
| GET | `/` | Serves the live dashboard |
| GET | `/api/stats` | Queue size, worker counts, job totals, requeued count |
| GET | `/api/workers` | Connected workers, status, current job |
| GET | `/api/jobs` | Full job log (newest first) |
| GET | `/api/security` | TLS status, protocol, cert file, port |
| POST | `/api/submit` | Submit a job via HTTP |

`POST /api/submit` body:

{ "task": "fibonacci", "payload": {"n": 10}, "priority": 2 }

Open http://localhost:8080 in your browser.
- 8 stat cards: queue size, workers, idle, busy, completed, failed, requeued, total
- TLS badge in header: shows TLS: ON / OFF based on `/api/security`
- Security panel: protocol, port, cert status, non-TLS rejection policy
- Worker panel: live worker list with idle/busy status and current job
- Job submit form: submit any task directly from the browser
- Status breakdown chart: bar chart of queued/completed/failed/requeued
- Task distribution chart: bar chart of jobs by task type
- Job log table: full history with job ID, task, priority badge, status badge, worker, result, timestamp
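
Jobs can also be submitted to `POST /api/submit` from a script instead of the dashboard form. The following is a sketch using only the standard library; the helper names are hypothetical, and it assumes the endpoint accepts a JSON body of the shape shown earlier and replies with JSON, which is not confirmed here.

```python
import json
import urllib.request

def build_submit_request(task, payload, priority, base_url="http://localhost:8080"):
    """Build (but do not send) the HTTP request for /api/submit."""
    body = json.dumps(
        {"task": task, "payload": payload, "priority": priority}
    ).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/api/submit",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )

def submit_job(task, payload, priority):
    """Send the job; assumes the server is running and answers with JSON."""
    req = build_submit_request(task, payload, priority)
    with urllib.request.urlopen(req, timeout=5) as resp:
        return json.loads(resp.read().decode("utf-8"))

req = build_submit_request("fibonacci", {"n": 10}, 2)
print(req.get_method(), req.full_url)
```

Calling `submit_job("fibonacci", {"n": 10}, 2)` with the server running should make the new job appear in the job log on the next dashboard refresh.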
Step 1 - Generate certs (once only):
python certs/gen_certs.py
Step 2 - Start server:
python server/main.py
Step 3 - Start workers (two terminals):
python worker/worker.py
Step 4 - Submit jobs:
python client/submit_job.py
Dashboard:
http://localhost:8080
What to watch for:
- Dashboard header shows TLS: ON
- Workers appear live as green (idle) and turn yellow (busy) when executing
- `submit_job.py` submits `slow_task` (priority 5) first, then `reverse_string` (priority 1); the high-priority job completes first
- Try submitting jobs from the dashboard form and watch the job log and charts update in real time

python tests/stress_test.py
Submits 100 concurrent jobs over TLS across all task types with random priorities. Output:
=============================================
Total Jobs : 100
Success : 100
Failed : 0
Total Time : 1.83s
Avg Latency : 18.45 ms/job
Throughput : 54.64 jobs/sec
=============================================
- Avg Latency: measured per job using `time.perf_counter()` (high-resolution timer)
- Throughput: successful jobs per second over total wall-clock time
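
The two metrics above can be computed as in the sketch below. It is not the code in `tests/stress_test.py`: `timed` and `run_benchmark` are hypothetical names, the use of a thread pool is an assumption, and a short `time.sleep` stands in for a real job submission over TLS.

```python
import time
from concurrent.futures import ThreadPoolExecutor

def timed(job_fn):
    """Run one job and return (succeeded, elapsed_seconds)."""
    start = time.perf_counter()
    try:
        job_fn()
        ok = True
    except Exception:
        ok = False
    return ok, time.perf_counter() - start

def run_benchmark(job_fn, total_jobs=100, max_workers=20):
    wall_start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        results = list(pool.map(lambda _: timed(job_fn), range(total_jobs)))
    wall = time.perf_counter() - wall_start

    latencies = [elapsed for ok, elapsed in results if ok]
    success = len(latencies)
    return {
        "total": total_jobs,
        "success": success,
        "failed": total_jobs - success,
        "total_time_s": wall,
        # Mean per-job latency, in milliseconds
        "avg_latency_ms": 1000 * sum(latencies) / success if success else 0.0,
        # Successful jobs divided by total wall-clock time
        "throughput": success / wall if wall > 0 else 0.0,
    }

# Stand-in workload: replace the lambda with a real job submission
report = run_benchmark(lambda: time.sleep(0.001), total_jobs=20, max_workers=5)
print(report)
```

The sample report above is consistent with this definition of throughput: 100 successful jobs in 1.83 s gives roughly 54.6 jobs/sec.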
- Kurumeti Mitesh
- Tahir
- Rohith