sw30labs/sst-autoresearch

SST-AutoResearch

Speaker State Trajectory analysis via a Karpathy-style autoresearch loop.

Treats a speaker's voice as a nonlinear dynamical system: it extracts acoustic features per frame, reconstructs a phase space from them by time-delay embedding (Takens' theorem), and analyzes the resulting trajectory for attractors, bifurcations, and regime changes, alongside Lyapunov exponent estimates. An LLM drives the research cycle: hypothesize → design experiment → execute → evaluate → reflect → loop.

Quick Start

# Install
cd sst_autoresearch
pip install -e .

# Configure model in .env (see .env for options)

# Smoke test
python smoke_test.py

# Run autoresearch (2 iterations to start)
python -m src.graph data/wav/your_file.wav --max-iter=2

Architecture

START → ingest_audio → hypothesize → design_experiment → run_experiment → evaluate → reflect
reflect → hypothesize          (pivot — try new direction)
reflect → design_experiment    (deepen — more evidence)
reflect → synthesize → END     (conclude — write report)

All DSP/statistics run as deterministic Python. The LLM handles hypothesis generation, experiment design, result interpretation, and meta-reasoning about what to explore next.
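The routing in the diagram above can be sketched without any framework. This is a dependency-free illustration, not the code in src/graph.py (which uses LangGraph): the state keys `decision`, `iteration`, `max_iter`, and `trace`, and the idea that `--max-iter` caps the number of reflect passes, are assumptions made for the example.

```python
from typing import Callable

State = dict  # hypothetical: a plain dict carried from node to node

# Fixed edges taken from the diagram; only "reflect" branches.
LINEAR_EDGES = {
    "ingest_audio": "hypothesize",
    "hypothesize": "design_experiment",
    "design_experiment": "run_experiment",
    "run_experiment": "evaluate",
    "evaluate": "reflect",
}

# The three outcomes of the reflect step and where each one leads.
REFLECT_ROUTES = {
    "pivot": "hypothesize",
    "deepen": "design_experiment",
    "conclude": "synthesize",
}


def route_after_reflect(state: State) -> str:
    """Pick the next node after reflect (assumed iteration cap applies first)."""
    if state["iteration"] >= state["max_iter"]:
        return "synthesize"
    return REFLECT_ROUTES[state["decision"]]


def run_loop(nodes: dict[str, Callable[[State], State]], state: State) -> State:
    """Run nodes from ingest_audio until synthesize has executed."""
    current = "ingest_audio"
    while True:
        state = nodes[current](state)
        state["trace"].append(current)
        if current == "synthesize":
            return state
        if current == "reflect":
            current = route_after_reflect(state)
        else:
            current = LINEAR_EDGES[current]
```

In this sketch each node is a function from state to state, so the deterministic DSP nodes and the LLM-backed nodes share one interface, and only the reflect step influences control flow.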

Feature Extraction

Per-frame features (25 ms frames, 10 ms hop) via Parselmouth (Praat) and librosa: F0, jitter, shimmer, HNR, formants F1 to F3, MFCCs 1 to 13, spectral centroid, rolloff, flux, RMS energy, and zero crossing rate.
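To make the framing concrete, here is a minimal standard-library sketch of 25 ms / 10 ms framing with two of the simpler features, RMS energy and zero crossing rate. It does not reproduce the Parselmouth or librosa code paths the project uses: the function names are hypothetical, and details such as padding and frame centering are assumptions that may differ from librosa's defaults.

```python
import math


def frame_signal(samples, sr, frame_ms=25.0, hop_ms=10.0):
    """Split samples into overlapping frames (no padding; a short tail is dropped)."""
    frame_len = int(round(sr * frame_ms / 1000))
    hop_len = int(round(sr * hop_ms / 1000))
    return [
        samples[start:start + frame_len]
        for start in range(0, len(samples) - frame_len + 1, hop_len)
    ]


def rms(frame):
    """Root-mean-square energy of one frame."""
    return math.sqrt(sum(x * x for x in frame) / len(frame))


def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
    return crossings / (len(frame) - 1)


# Example input: one second of a 200 Hz sine sampled at 16 kHz.
sr = 16000
signal = [math.sin(2 * math.pi * 200 * n / sr) for n in range(sr)]
frames = frame_signal(signal, sr)
features = [(rms(f), zero_crossing_rate(f)) for f in frames]
```

Each entry of `features` is one point of a per-frame feature series, which is the kind of sequence the phase-space and dynamics steps operate on.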

Dynamical Analysis

Recurrence plots with recurrence quantification analysis (RQA), Lyapunov exponent estimation (Rosenstein's method), Grassberger-Procaccia correlation dimension, sample entropy, and regime change detection.
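As an illustration of one of these measures, the sketch below computes sample entropy from its textbook definition, SampEn(m, r) = -ln(A/B), where B counts pairs of length-m templates within tolerance r (Chebyshev distance) and A counts the same for length m+1. It is a plain O(n²) reference version, not the implementation in src/features/dynamics.py. The defaults m=2 and r=0.2 are assumptions, and r is an absolute tolerance here (a common convention is 0.2 times the series' standard deviation).

```python
import math


def sample_entropy(x, m=2, r=0.2):
    """SampEn(m, r) of a 1-D series, using Chebyshev distance between templates."""
    n = len(x)

    def count_matches(length):
        # Use the same n - m template start points for both template lengths.
        templates = [x[i:i + length] for i in range(n - m)]
        count = 0
        for i in range(len(templates)):
            for j in range(i + 1, len(templates)):  # i < j excludes self-matches
                dist = max(abs(a - b) for a, b in zip(templates[i], templates[j]))
                if dist <= r:
                    count += 1
        return count

    b = count_matches(m)
    a = count_matches(m + 1)
    if a == 0 or b == 0:
        # Undefined when no matches exist (series too short or tolerance too small).
        return float("inf")
    return -math.log(a / b)
```

A perfectly periodic series gives a value of zero, because every length-m match extends to a length-(m+1) match, while irregular series give larger values.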

Phase Space

Time-delay embeddings (Takens' theorem) with automatic selection of the delay τ (via average mutual information, AMI) and of the embedding dimension.
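A time-delay embedding maps a scalar series x into vectors (x[i], x[i+τ], ..., x[i+(d-1)τ]). The sketch below shows only that mapping and takes τ and the dimension d as given. The function name is hypothetical, and the automatic AMI-based delay selection and dimension selection mentioned above are not reproduced here.

```python
import math


def delay_embed(x, dim, tau):
    """Return the delay vectors [x[i], x[i + tau], ..., x[i + (dim - 1) * tau]]."""
    n_vectors = len(x) - (dim - 1) * tau
    if n_vectors <= 0:
        raise ValueError("series too short for this dim and tau")
    return [[x[i + k * tau] for k in range(dim)] for i in range(n_vectors)]


# Example: a sine with a 40-sample period, embedded in 2-D with a
# quarter-period delay, so each point is (sin θ, cos θ).
series = [math.sin(2 * math.pi * n / 40) for n in range(200)]
trajectory = delay_embed(series, dim=2, tau=10)
```

With the quarter-period delay, the embedded sine traces a circle in the plane, the simplest example of a closed orbit recovered from a single observed variable.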

Backends

Configured via .env:

| Backend | Setting | Hardware | Model |
|---------|---------|----------|-------|
| MLX | SST_BACKEND=mlx | Mac Studio M3 Ultra | mlx-community/Qwen3.5-122B-A10B-bf16 |
| Ollama | SST_BACKEND=ollama | NVIDIA DGX | qwen3.5:122b |
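For illustration, backend selection from `SST_BACKEND` could look like the sketch below. The helper name `resolve_backend` and the validation behavior are assumptions, not the contents of src/llm.py. The sketch also assumes the values from .env have already been loaded into the process environment, and it hard-codes the model names from the table above rather than reading a model setting.

```python
import os

# Model per backend, copied from the backend table in this README.
DEFAULT_MODELS = {
    "mlx": "mlx-community/Qwen3.5-122B-A10B-bf16",
    "ollama": "qwen3.5:122b",
}


def resolve_backend(env=None):
    """Return (backend, model) based on SST_BACKEND. Hypothetical helper."""
    env = os.environ if env is None else env
    backend = env.get("SST_BACKEND", "").strip().lower()
    if backend not in DEFAULT_MODELS:
        raise ValueError(
            f"SST_BACKEND must be one of {sorted(DEFAULT_MODELS)}, got {backend!r}"
        )
    return backend, DEFAULT_MODELS[backend]
```

Passing a plain dict as `env` keeps the helper easy to test without touching the real environment, for example `resolve_backend({"SST_BACKEND": "ollama"})`.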

Project Structure

sst_autoresearch/
├── .env                    # Backend + model config
├── smoke_test.py           # Layer-by-layer validation
├── pyproject.toml
├── src/
│   ├── graph.py            # LangGraph state machine
│   ├── llm.py              # LLM interface (MLX + Ollama)
│   ├── nodes/              # Autoresearch loop nodes
│   │   ├── hypothesize.py
│   │   ├── design.py
│   │   ├── execute.py
│   │   ├── evaluate.py
│   │   ├── reflect.py
│   │   └── synthesize.py
│   ├── features/           # DSP pipeline
│   │   ├── acoustic.py     # Parselmouth + librosa extraction
│   │   └── dynamics.py     # RQA, Lyapunov, entropy, regimes
│   └── prompts/            # LLM prompt templates
├── data/wav/               # Input audio files
├── outputs/reports/        # Generated research reports
└── notebooks/

Requirements

  • Python ≥ 3.11
  • For MLX backend: Apple Silicon Mac with mlx-lm
  • For Ollama backend: Ollama server with a Qwen3.5 model pulled
  • Audio dependencies: librosa, parselmouth, scipy
