Speaker State Trajectory analysis via a Karpathy-style autoresearch loop.
Treats a speaker's voice as a nonlinear dynamical system — extracts acoustic features per frame, embeds them into phase space (Takens' theorem), and analyzes the resulting trajectory for attractors, bifurcations, regime changes, and Lyapunov exponents. An LLM drives the research cycle: hypothesize → design experiment → execute → evaluate → reflect → loop.
```bash
# Install
cd sst_autoresearch
pip install -e .

# Configure model in .env (see .env for options)

# Smoke test
python smoke_test.py

# Run autoresearch (2 iterations to start)
python -m src.graph data/wav/your_file.wav --max-iter=2
```

The research loop:

```
START → ingest_audio → hypothesize → design_experiment → run_experiment → evaluate → reflect
reflect → hypothesize          (pivot: try a new direction)
reflect → design_experiment    (deepen: gather more evidence)
reflect → synthesize → END     (conclude: write the report)
```
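The routing above can be sketched in plain Python. The real project wires these nodes into a LangGraph state machine (`src/graph.py`); here a dict-based loop illustrates the same control flow, with the node names taken from the diagram and the reflect decisions ("pivot" / "deepen" / "conclude") as stand-ins for the LLM's choice:

```python
# Illustrative sketch of the autoresearch control flow, not the actual
# LangGraph wiring. Each node is a function state -> state, except
# reflect, which returns one of "pivot", "deepen", or "conclude".

def run_loop(state, nodes, max_iter=2):
    """Run hypothesize → design → execute → evaluate → reflect until
    reflect decides to conclude, or max_iter hypothesis rounds pass."""
    state = nodes["ingest_audio"](state)
    for _ in range(max_iter):
        state = nodes["hypothesize"](state)
        while True:
            state = nodes["design_experiment"](state)
            state = nodes["run_experiment"](state)
            state = nodes["evaluate"](state)
            decision = nodes["reflect"](state)  # "pivot" | "deepen" | "conclude"
            if decision != "deepen":
                break  # pivot or conclude leaves the inner evidence loop
        if decision == "conclude":
            break
    return nodes["synthesize"](state)
```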
All DSP and statistics run as deterministic Python code. The LLM handles hypothesis generation, experiment design, result interpretation, and meta-reasoning about what to explore next.
Per-frame features (25ms frames, 10ms hop) via Parselmouth (Praat) and librosa: F0, jitter, shimmer, HNR, formants F1-F3, MFCC 1-13, spectral centroid, rolloff, flux, RMS energy, zero crossing rate.
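The full feature set goes through Parselmouth and librosa, but the 25 ms / 10 ms framing arithmetic and the simplest per-frame features (RMS energy, zero-crossing rate) can be sketched in plain Python:

```python
# Sketch of 25 ms frames with a 10 ms hop, computing RMS energy and
# zero-crossing rate per frame. The real pipeline uses Parselmouth
# (Praat) and librosa for F0, jitter, formants, MFCCs, etc.
import math

def frame_features(samples, sr, frame_ms=25, hop_ms=10):
    frame = int(sr * frame_ms / 1000)  # e.g. 400 samples at 16 kHz
    hop = int(sr * hop_ms / 1000)      # e.g. 160 samples at 16 kHz
    feats = []
    for start in range(0, len(samples) - frame + 1, hop):
        w = samples[start:start + frame]
        rms = math.sqrt(sum(x * x for x in w) / frame)
        # zero-crossing rate: fraction of adjacent pairs with a sign change
        zcr = sum(1 for a, b in zip(w, w[1:]) if (a < 0) != (b < 0)) / (frame - 1)
        feats.append({"rms": rms, "zcr": zcr})
    return feats
```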
Recurrence plots + RQA, Lyapunov exponents (Rosenstein), Grassberger-Procaccia correlation dimension, sample entropy, regime change detection.
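Of the measures above, sample entropy is compact enough to sketch in full. This is a didactic O(N²) version of the standard SampEn definition (template length `m`, Chebyshev distance, tolerance `r`, self-matches excluded); the project's `dynamics.py` presumably uses an optimized implementation:

```python
import math

def sample_entropy(x, m=2, r=0.2):
    """SampEn(m, r) = -ln(A/B), where B counts pairs of length-m templates
    within tolerance r (Chebyshev distance) and A the same for length m+1.
    r is commonly set to 0.2 * std of the series."""
    n = len(x) - m
    def count(k):
        c = 0
        for i in range(n):
            for j in range(i + 1, n):
                if max(abs(x[i + d] - x[j + d]) for d in range(k)) < r:
                    c += 1
        return c
    b = count(m)      # length-m template matches
    a = count(m + 1)  # length-(m+1) template matches
    if a == 0 or b == 0:
        return float("inf")  # undefined for short or highly irregular series
    return -math.log(a / b)
```

A perfectly regular series gives SampEn 0; irregular series give larger values.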
Time-delay embeddings (Takens' theorem) with automatic optimal τ (AMI) and embedding dimension selection.
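A minimal sketch of the delay embedding, with τ chosen at the first minimum of a crude histogram-based average mutual information. The binning and the simple "first τ where AMI stops decreasing" rule are illustrative stand-ins for the production estimator:

```python
# Takens delay embedding with AMI-based delay selection (didactic version).
import math
from collections import Counter

def ami(x, tau, bins=16):
    """Binned average mutual information between x[t] and x[t + tau]."""
    lo, hi = min(x), max(x)
    scale = (bins - 1) / (hi - lo) if hi > lo else 0.0
    pairs = [(int((a - lo) * scale), int((b - lo) * scale))
             for a, b in zip(x, x[tau:])]
    n = len(pairs)
    pj = Counter(pairs)                 # joint histogram
    pa = Counter(a for a, _ in pairs)   # marginal of x[t]
    pb = Counter(b for _, b in pairs)   # marginal of x[t + tau]
    return sum(c / n * math.log((c / n) / ((pa[a] / n) * (pb[b] / n)))
               for (a, b), c in pj.items())

def first_minimum_tau(x, max_tau=50):
    """Pick tau at the first local minimum of AMI."""
    prev = ami(x, 1)
    for tau in range(2, max_tau + 1):
        cur = ami(x, tau)
        if cur > prev:
            return tau - 1
        prev = cur
    return max_tau

def delay_embed(x, dim, tau):
    """Delay vectors (x[t], x[t + tau], ..., x[t + (dim-1)*tau])."""
    span = (dim - 1) * tau
    return [tuple(x[t + k * tau] for k in range(dim))
            for t in range(len(x) - span)]
```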
Configured via .env:
| Backend | Setting | Hardware | Model |
|---|---|---|---|
| MLX | `SST_BACKEND=mlx` | Mac Studio M3 Ultra | `mlx-community/Qwen3.5-122B-A10B-bf16` |
| Ollama | `SST_BACKEND=ollama` | NVIDIA DGX | `qwen3.5:122b` |
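A minimal `.env` for the MLX backend might look like the fragment below. `SST_BACKEND` comes from the table above; the model-identifier key name (`SST_MODEL`) is a hypothetical placeholder, so check the shipped `.env` for the actual keys:

```ini
# Backend selection (value from the table above)
SST_BACKEND=mlx
# Model identifier; "SST_MODEL" is a hypothetical key name, see .env for the real one
SST_MODEL=mlx-community/Qwen3.5-122B-A10B-bf16
```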
```
sst_autoresearch/
├── .env                  # Backend + model config
├── smoke_test.py         # Layer-by-layer validation
├── pyproject.toml
├── src/
│   ├── graph.py          # LangGraph state machine
│   ├── llm.py            # LLM interface (MLX + Ollama)
│   ├── nodes/            # Autoresearch loop nodes
│   │   ├── hypothesize.py
│   │   ├── design.py
│   │   ├── execute.py
│   │   ├── evaluate.py
│   │   ├── reflect.py
│   │   └── synthesize.py
│   ├── features/         # DSP pipeline
│   │   ├── acoustic.py   # Parselmouth + librosa extraction
│   │   └── dynamics.py   # RQA, Lyapunov, entropy, regimes
│   └── prompts/          # LLM prompt templates
├── data/wav/             # Input audio files
├── outputs/reports/      # Generated research reports
└── notebooks/
```
- Python ≥ 3.11
- For MLX backend: Apple Silicon Mac with `mlx-lm`
- For Ollama backend: Ollama server with a Qwen3.5 model pulled
- Audio dependencies: `librosa`, `parselmouth`, `scipy`