🧩 Multi-Threaded Proxy Server (with and without Cache)

A high-performance, concurrent HTTP/HTTPS proxy server built in C using core concepts of
Operating Systems (threads, semaphores, synchronization, caching) and Computer Networks
(socket programming, HTTP parsing, request forwarding, and tunneling).

Developed and maintained at 👉 github.com/Prayas248/MultiThreaded-Proxy-Server


🧠 Introduction

This project implements a multi-threaded caching proxy server that sits between clients and web servers.

It can:

  • Handle multiple simultaneous HTTP/HTTPS client requests using threads and semaphores.
  • Cache frequently accessed responses using an LRU (Least Recently Used) algorithm.
  • Expose runtime metrics via an HTTP endpoint (/__metrics).
  • Log every transaction with structured logging for observability.

⚙️ Key Concepts Used

🧩 Operating Systems

  • Multithreading: Implemented using pthread_create() for concurrent client handling.
  • Semaphores: Limit the number of active threads (sem_wait() / sem_post()).
  • Mutex Locks: Protect cache and shared resources against data races.
  • LRU Cache Management: Automatically evicts least-recently used entries when full.
  • Thread Pools: Efficient reuse of worker threads to reduce context-switch overhead.
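
A minimal sketch of how a counting semaphore could cap the number of handler threads running at once. The names `MAX_CLIENTS`, `MAX_SPAWN`, `client_slots`, `handle_client`, and `run_clients` are hypothetical and not taken from the repository, and the handler here returns immediately instead of serving a socket. It assumes unnamed POSIX semaphores (`sem_init`), which work on Linux but not on macOS.

```c
#include <assert.h>
#include <pthread.h>
#include <semaphore.h>
#include <stdatomic.h>

#define MAX_CLIENTS 4   /* hypothetical cap on concurrent handlers */
#define MAX_SPAWN   64  /* hypothetical upper bound for this demo */

static sem_t client_slots;
static atomic_int active_clients;
static atomic_int peak_clients;

/* Worker body: a real handler would read and serve the client socket here. */
static void *handle_client(void *arg) {
    (void)arg;
    int now = atomic_fetch_add(&active_clients, 1) + 1;

    /* Remember the highest concurrency observed so far. */
    int peak = atomic_load(&peak_clients);
    while (now > peak &&
           !atomic_compare_exchange_weak(&peak_clients, &peak, now)) {
        /* peak was refreshed by the failed exchange; retry */
    }

    atomic_fetch_sub(&active_clients, 1);
    sem_post(&client_slots);   /* free the slot for the next connection */
    return NULL;
}

/* Spawns n handlers, never more than MAX_CLIENTS at once.
 * Returns the peak number of concurrent handlers, or -1 on error. */
int run_clients(int n) {
    pthread_t tids[MAX_SPAWN];
    int started = 0;

    assert(n >= 0 && n <= MAX_SPAWN);
    if (sem_init(&client_slots, 0, MAX_CLIENTS) != 0)
        return -1;

    for (int i = 0; i < n; i++) {
        sem_wait(&client_slots);   /* blocks while all slots are taken */
        if (pthread_create(&tids[started], NULL, handle_client, NULL) != 0) {
            sem_post(&client_slots);   /* thread never ran; return its slot */
            continue;
        }
        started++;
    }
    for (int i = 0; i < started; i++)
        pthread_join(tids[i], NULL);

    sem_destroy(&client_slots);
    return atomic_load(&peak_clients);
}
```

Calling `run_clients(16)` should report a peak between 1 and `MAX_CLIENTS`, because the accepting loop waits on the semaphore before each `pthread_create()` and every handler posts it on exit. Build with `-pthread`.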

🌐 Computer Networks

  • Socket Programming (TCP/IP): End-to-end client-proxy-server communication.
  • HTTP Request Parsing: Parses GET, CONNECT, headers, and HTTP versions.
  • HTTPS Tunneling: Handles encrypted traffic via the CONNECT method.
  • Timeout Handling: Prevents blocking on slow or dead connections.
  • Dynamic Response Forwarding: Streams data from origin servers to clients efficiently.
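
As an illustration of the request parsing step, the sketch below splits a request line into method, target, and version, and separates a `host:port` authority as used by `CONNECT`. The struct and function names are hypothetical, the buffer sizes are arbitrary, and IPv6 literals are not handled; the repository's parser may work differently.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

struct request_line {
    char method[16];
    char target[2048];
    char version[16];
};

/* Parses "METHOD target HTTP/x.y". Returns 0 on success, -1 if malformed. */
int parse_request_line(const char *line, struct request_line *out) {
    assert(line != NULL && out != NULL);
    if (sscanf(line, "%15s %2047s %15s",
               out->method, out->target, out->version) != 3)
        return -1;
    if (strncmp(out->version, "HTTP/", 5) != 0)
        return -1;
    return 0;
}

/* Splits "host[:port]" into host and port.
 * Returns the port (default_port when none is given) or -1 on error. */
int split_host_port(const char *authority, char *host, size_t host_cap,
                    int default_port) {
    const char *colon = strrchr(authority, ':');
    size_t n = colon ? (size_t)(colon - authority) : strlen(authority);

    if (n == 0 || n >= host_cap)
        return -1;
    memcpy(host, authority, n);
    host[n] = '\0';
    if (!colon)
        return default_port;

    char *end;
    long port = strtol(colon + 1, &end, 10);
    if (end == colon + 1 || *end != '\0' || port < 1 || port > 65535)
        return -1;
    return (int)port;
}
```

For `CONNECT google.com:443 HTTP/1.1`, the parsed target is `google.com:443`, which `split_host_port()` turns into host `google.com` and port 443.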

🧩 Features Added (v2)

| Feature | Description |
| --- | --- |
| 1️⃣ Thread Pool & Semaphores | Manages concurrency safely and prevents overload. |
| 2️⃣ HTTPS CONNECT Support | Enables secure HTTPS proxy tunneling. |
| 3️⃣ Timeout Handling | Adds recv/send timeouts for both client and server sockets. |
| 4️⃣ `/__metrics` Endpoint | Exposes live proxy statistics: `total_requests`, `active_clients`, `cache_hits`, `cache_misses`. |
| 5️⃣ Structured Logging | Prints one-line logs for each request: `[timestamp] METHOD HOST PATH STATUS BYTES TIME(ms)` |
| 6️⃣ LRU Cache | Caches responses to improve speed on repeated requests. |
| 7️⃣ Graceful Shutdown & Resource Handling | Ensures sockets, threads, and cache memory are safely released. |
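
One common way to add recv/send timeouts is the `SO_RCVTIMEO` and `SO_SNDTIMEO` socket options. The sketch below assumes that approach; the helper names `set_socket_timeouts` and `io_timed_out` are hypothetical, and the repository may implement timeouts differently (for example with `select()` or `poll()`).

```c
#include <assert.h>
#include <errno.h>
#include <sys/socket.h>
#include <sys/time.h>

/* Applies the same receive and send timeout to a socket.
 * Returns 0 on success, -1 if either setsockopt() call fails. */
int set_socket_timeouts(int fd, int seconds) {
    assert(seconds > 0);
    struct timeval tv = { .tv_sec = seconds, .tv_usec = 0 };

    if (setsockopt(fd, SOL_SOCKET, SO_RCVTIMEO, &tv, sizeof tv) < 0)
        return -1;
    if (setsockopt(fd, SOL_SOCKET, SO_SNDTIMEO, &tv, sizeof tv) < 0)
        return -1;
    return 0;
}

/* True when the last recv()/send() failed because the timeout expired. */
int io_timed_out(void) {
    return errno == EAGAIN || errno == EWOULDBLOCK;
}
```

With a timeout set, a blocking `recv()` on an idle connection returns -1 once the interval elapses, so a worker thread can close the socket instead of waiting indefinitely.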

🧱 System Architecture

    ┌────────────┐        ┌────────────────┐        ┌────────────┐
    │  Browser   │ ─────▶ │  Proxy Server  │ ─────▶ │ Web Server │
    └────────────┘        └────────────────┘        └────────────┘
                                  │
                                  ▼
                             ┌──────────┐
                             │  Cache   │
                             │  (LRU)   │
                             └──────────┘

  • Proxy Server: Listens on a user-defined port, parses requests, forwards them to target servers.
  • Cache: Stores frequently accessed responses; managed using timestamps and LRU logic.
  • Thread Pool: Spawns worker threads to handle concurrent client sessions.
  • Metrics & Logger: Collects statistics and outputs structured logs for monitoring.
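
To make the cache component concrete, here is a small sketch of a mutex-protected LRU cache that tracks recency with a timestamp per entry. It is an illustration under assumptions: a fixed array of `CACHE_SLOTS` entries with a linear scan, and a logical counter (`tick`) standing in for wall-clock timestamps. All identifiers are hypothetical, and the repository's data structure may differ.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>
#include <string.h>

#define CACHE_SLOTS 3   /* hypothetical capacity, kept tiny for illustration */

struct cache_entry {
    char *url;                /* NULL while the slot is empty */
    char *data;
    size_t len;
    unsigned long last_used;  /* logical timestamp; 0 means never used */
};

static struct cache_entry slots[CACHE_SLOTS];
static unsigned long tick;    /* bumped on every hit and every store */
static pthread_mutex_t cache_lock = PTHREAD_MUTEX_INITIALIZER;

/* Returns a NUL-terminated malloc'd copy of len bytes, or NULL on failure. */
static char *dup_bytes(const char *src, size_t len) {
    char *p = malloc(len + 1);
    if (p) {
        memcpy(p, src, len);
        p[len] = '\0';
    }
    return p;
}

/* Hit: returns a malloc'd copy (caller frees) and refreshes the timestamp.
 * Miss: returns NULL. */
char *cache_get(const char *url, size_t *len_out) {
    char *copy = NULL;
    assert(url != NULL && len_out != NULL);

    pthread_mutex_lock(&cache_lock);
    for (int i = 0; i < CACHE_SLOTS; i++) {
        if (slots[i].url && strcmp(slots[i].url, url) == 0) {
            copy = dup_bytes(slots[i].data, slots[i].len);
            if (copy) {
                slots[i].last_used = ++tick;
                *len_out = slots[i].len;
            }
            break;
        }
    }
    pthread_mutex_unlock(&cache_lock);
    return copy;
}

/* Stores a response, replacing the same URL if present, otherwise the
 * empty or least recently used slot. Returns 0 on success, -1 on failure. */
int cache_put(const char *url, const char *data, size_t len) {
    char *u = dup_bytes(url, strlen(url));
    char *d = dup_bytes(data, len);
    if (!u || !d) {
        free(u);
        free(d);
        return -1;
    }

    pthread_mutex_lock(&cache_lock);
    int victim = 0;
    for (int i = 0; i < CACHE_SLOTS; i++) {
        if (slots[i].url && strcmp(slots[i].url, url) == 0) {
            victim = i;   /* same URL: overwrite in place */
            break;
        }
        if (slots[i].last_used < slots[victim].last_used)
            victim = i;   /* oldest (or empty) slot seen so far */
    }
    free(slots[victim].url);
    free(slots[victim].data);
    slots[victim] = (struct cache_entry){ u, d, len, ++tick };
    pthread_mutex_unlock(&cache_lock);
    return 0;
}
```

Returning a copy from `cache_get()` keeps the lock held only for the lookup, so a slow client cannot block other threads while the cached body is being sent.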

⚙️ Networking Workflow

  1. Client Connection: Proxy accepts new TCP connections using accept().
  2. Thread Handling: Each connection is assigned to a thread from the pool.
  3. Request Parsing: HTTP request is parsed (method, host, port, headers).
  4. Cache Lookup:
    • If found → send from cache.
    • If not → forward to remote server via connect().
  5. Response Forwarding: Proxy receives data from remote server using recv() and forwards to client using send().
  6. Cache Update: Response stored in cache for future requests.
  7. Logging: Each request logged with duration, bytes, and status.
  8. Metrics Update: Counters updated in atomic variables for /__metrics.
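
Step 5 can be sketched as a simple relay loop. The function names `send_all` and `relay` are hypothetical; the sketch assumes blocking sockets and forwards until the origin closes its side. Because `send()` may accept fewer bytes than requested, the inner helper loops until the whole chunk is written.

```c
#include <assert.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Writes the whole buffer, retrying after short writes.
 * Returns 0 on success, -1 on error. */
static int send_all(int fd, const char *buf, size_t len) {
    while (len > 0) {
        ssize_t n = send(fd, buf, len, 0);
        if (n <= 0)
            return -1;
        buf += n;
        len -= (size_t)n;
    }
    return 0;
}

/* Copies bytes from origin_fd to client_fd until the origin closes.
 * Returns the number of bytes forwarded, or -1 on error. */
long relay(int origin_fd, int client_fd) {
    char buf[4096];
    long total = 0;

    assert(origin_fd >= 0 && client_fd >= 0);
    for (;;) {
        ssize_t n = recv(origin_fd, buf, sizeof buf, 0);
        if (n == 0)
            return total;   /* origin finished sending */
        if (n < 0)
            return -1;      /* error or timeout */
        if (send_all(client_fd, buf, (size_t)n) < 0)
            return -1;
        total += n;
    }
}
```

The byte count returned by `relay()` is the kind of value that would feed the size field of a log line.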

🧪 How to Run

$ git clone https://github.com/Prayas248/MultiThreaded-Proxy-Server.git
$ cd MultiThreaded-Proxy-Server
$ make all
$ ./proxy <port_number>

---

## 📊 Structured Logging

Every request (HTTP, HTTPS tunnel, or cache hit) is logged on one line with timestamp, method, host, path, status, response size, and latency.

Example output:
[2025-10-14 19:55:02] CONNECT google.com (tunnel) 200 0B 14ms
[2025-10-14 19:55:06] GET example.com /index.html 200 4839B 88ms
[2025-10-14 19:55:10] CACHE - - 200 4839B 2ms
[2025-10-14 19:55:13] GET localhost /__metrics 200 84B 1ms
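
A line in this format could be produced with `strftime()` and `snprintf()`. The function below is a hypothetical sketch, not the project's logger: `format_log_line` and its parameters are assumed names, and it uses `localtime_r()` because the proxy is multi-threaded.

```c
#define _POSIX_C_SOURCE 200809L   /* for localtime_r() */
#include <assert.h>
#include <stdio.h>
#include <time.h>

/* Writes "[YYYY-MM-DD HH:MM:SS] METHOD HOST PATH STATUS BYTESB MSms" to out.
 * Returns the line length as reported by snprintf(), or -1 on error. */
int format_log_line(char *out, size_t cap, time_t when,
                    const char *method, const char *host, const char *path,
                    int status, long bytes, long ms) {
    char stamp[32];
    struct tm tm_buf;

    assert(out != NULL && cap > 0);
    if (localtime_r(&when, &tm_buf) == NULL)
        return -1;
    if (strftime(stamp, sizeof stamp, "%Y-%m-%d %H:%M:%S", &tm_buf) == 0)
        return -1;

    return snprintf(out, cap, "[%s] %s %s %s %d %ldB %ldms",
                    stamp, method, host, path, status, bytes, ms);
}
```

Formatting into a buffer first and then emitting it with a single write keeps lines from different threads from interleaving mid-line.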


---

## 🧪 Demo

![](https://github.com/Prayas248/MultiThreaded-Proxy-Server/blob/main/pics/cache.png)

### Cache Behavior
- **First Visit:** Cache miss — `"url not found"` printed.  
- **Subsequent Visit:** Cache hit — `"Data retrieved from Cache"` printed.

---

## 📈 Metrics Endpoint

The proxy exposes a live statistics endpoint for real-time monitoring.
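
The counters behind such an endpoint can be kept in C11 atomics so worker threads update them without taking a lock. The sketch below is illustrative only: the counter names mirror the metric names, while `metrics_request_begin`, `metrics_request_end`, and `render_metrics` are hypothetical helpers.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdio.h>

/* Zero-initialized counters shared by all worker threads. */
static atomic_ulong total_requests;
static atomic_ulong active_clients;
static atomic_ulong cache_hits;
static atomic_ulong cache_misses;

/* Called when a worker starts handling a request. */
void metrics_request_begin(void) {
    atomic_fetch_add(&total_requests, 1);
    atomic_fetch_add(&active_clients, 1);
}

/* Called when the request finishes; cache_hit is nonzero for a hit. */
void metrics_request_end(int cache_hit) {
    atomic_fetch_sub(&active_clients, 1);
    if (cache_hit)
        atomic_fetch_add(&cache_hits, 1);
    else
        atomic_fetch_add(&cache_misses, 1);
}

/* Renders the counters as "name value" lines.
 * Returns the body length as reported by snprintf(). */
int render_metrics(char *out, size_t cap) {
    assert(out != NULL && cap > 0);
    return snprintf(out, cap,
                    "total_requests %lu\n"
                    "active_clients %lu\n"
                    "cache_hits %lu\n"
                    "cache_misses %lu\n",
                    atomic_load(&total_requests),
                    atomic_load(&active_clients),
                    atomic_load(&cache_hits),
                    atomic_load(&cache_misses));
}
```

Each counter is read independently, so a snapshot taken under load may be slightly inconsistent across lines; for monitoring purposes that is usually acceptable.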

**View metrics:**
```bash
curl http://localhost:8080/__metrics
```

Example response (assuming the proxy was started on port 8080):

```
total_requests 125
active_clients 4
cache_hits 39
cache_misses 86
```