This project implements an instrumented causal language model wrapper and server that generates text token by token and captures a per-token trace of model internals.
- Instrumented Model: A wrapper around a Hugging Face causal language model that captures logits, hidden states, and attention matrices.
- Streaming API: A WebSocket-based API for real-time streaming of generated tokens and traces.
- Persistence: Session traces are saved to compressed artifacts for later analysis.
- Intervention API: An API for modifying previous generations and re-running them.
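A per-token trace record captured by the wrapper might look like the following sketch. The field names and shapes here are assumptions for illustration, not the project's actual schema:

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class TokenTrace:
    """Hypothetical per-token trace emitted by the instrumented wrapper."""
    step: int                  # generation step index
    token_id: int              # sampled token id
    token_text: str            # decoded token
    logprob: float             # log-probability of the sampled token
    top_logits: List[float] = field(default_factory=list)    # top-k raw logits
    hidden_norms: List[float] = field(default_factory=list)  # per-layer hidden-state norms
    attn_entropy: List[float] = field(default_factory=list)  # per-layer attention entropy

# Example: a trace for one generated token
trace = TokenTrace(step=0, token_id=15496, token_text="Hello", logprob=-1.2)
print(trace.step, trace.token_text)
```

Storing summaries (norms, entropies) rather than full hidden-state and attention tensors keeps per-session artifacts small; the real wrapper may persist the full tensors instead.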
Clone the repository and install the required dependencies:

```bash
git clone https://github.com/your-username/instrumented-llm.git
cd instrumented-llm
pip install -r requirements.txt
```

Start the FastAPI server using Uvicorn:
```bash
uvicorn app.main:app --host 0.0.0.0 --port 8000
```

To run the test suite, use the following command:
```bash
python -m unittest discover tests
```

Example request to the `/api/generate` endpoint:

```bash
curl -X 'POST' \
  'http://localhost:8000/api/generate' \
  -H 'accept: application/json' \
  -H 'Content-Type: application/json' \
  -d '{
    "prompt": "Hello, world!",
    "max_new_tokens": 10
  }'
```

To stream tokens instead of waiting for the full completion:

- Make a request to the `/api/generate` endpoint with `"stream": true` to get a WebSocket URL.
- Connect to the WebSocket URL to receive the streaming results.
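Each WebSocket message presumably carries one generated token plus its trace as JSON. The exact frame schema is an assumption here, but a minimal client-side parse might look like:

```python
import json

def parse_frame(raw: str) -> dict:
    """Parse one streamed WebSocket frame.

    Assumed (hypothetical) schema: {"token": str, "index": int, "done": bool}.
    """
    frame = json.loads(raw)
    # Minimal validation of the assumed fields
    if not isinstance(frame.get("index"), int):
        raise ValueError("frame missing integer 'index'")
    return frame

# Example frame as the server might stream it (schema is an assumption)
raw = '{"token": " world", "index": 1, "done": false}'
frame = parse_frame(raw)
print(frame["token"])  # -> " world"
```

In a real client this would run inside the receive loop of a WebSocket library (e.g. `websockets`), appending tokens until a frame with `"done": true` arrives.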
- `GET /api/session/{session_id}/metadata`: Get the metadata for a session.
- `GET /api/session/{session_id}/artifact`: Get the path to the session artifact.
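The compressed session artifact could, for instance, be a gzip-compressed JSON file of per-token traces. This round-trip sketch assumes that format; the project's real serialization may differ:

```python
import gzip
import json
import os
import tempfile

def save_artifact(path: str, traces: list) -> None:
    """Write session traces as gzip-compressed JSON (assumed format)."""
    with gzip.open(path, "wt", encoding="utf-8") as f:
        json.dump({"traces": traces}, f)

def load_artifact(path: str) -> list:
    """Read traces back from a compressed artifact."""
    with gzip.open(path, "rt", encoding="utf-8") as f:
        return json.load(f)["traces"]

# Round-trip example with a single dummy trace
traces = [{"step": 0, "token": "Hello", "logprob": -1.2}]
path = os.path.join(tempfile.mkdtemp(), "session.json.gz")
save_artifact(path, traces)
print(load_artifact(path) == traces)  # -> True
```

Gzip-compressed JSON keeps the artifact human-inspectable (`zcat session.json.gz`) while shrinking the repetitive numeric trace data considerably.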