maduarte95/essay-trajectories

'THINKING ABOUT THINKING' INTENSIVE -- QUANTITATIVE ANALYSIS


This repo processes and analyses human and AI-generated essays produced during a five-day "Thinking about Thinking" intensive.

It includes scripts for:

  1. Chunking essays into logical steps
  2. Cleaning chunks into bullet points
  3. Embedding cleaned chunks with a sentence-embedding model
  4. Analyzing trajectories with several metrics

Data

Essay data is organized in two directories:

  • FINAL/ - Human-written essays (7 essays: students A-G)
  • META/ - AI-generated essays from blind prompts (7 essays: students A-G)

Each essay is a markdown file named student-{ID}-{source}-{title}.md where:

  • {ID} is the student identifier (A through G)
  • {source} is either FINAL (human) or META (AI)
  • {title} is the essay topic

The essays were produced during a 5-day "Thinking about Thinking" intensive. On Day 2, each student captured their essay intentions in a "blind prompt." On Day 5, that frozen prompt was submitted to an LLM with a standardizing metaprompt, producing an AI-generated essay from the same seed. Students also produced their own essays through 5 days of structured thinking.
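The filename convention above can be parsed with a small helper. This is only an illustrative sketch; `parse_essay_filename` is a hypothetical name, not one of the repo's scripts:

```python
import re

# Matches student-{ID}-{source}-{title}.md, e.g. student-A-FINAL-attention.md
FILENAME_RE = re.compile(r"^student-([A-G])-(FINAL|META)-(.+)\.md$")

def parse_essay_filename(name: str) -> dict:
    """Split an essay filename into student ID, source, and title parts."""
    m = FILENAME_RE.match(name)
    if m is None:
        raise ValueError(f"unexpected essay filename: {name}")
    student, source, title = m.groups()
    return {"student": student, "source": source, "title": title}
```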

Processed data outputs are stored in:

Pipeline

The analysis pipeline consists of five main steps:

1. Chunking (01_chunk.py)

Segments each essay into sequential argumentative units using the Anthropic Batch API with Claude Sonnet 4.5. Each chunk represents one argumentative move. The chunking preserves exact original text without paraphrasing.

Output: JSON files in output/data/chunks/ with format {student}-{source}.json

2. Chunk Cleaning (01_02_clean_chunks_batch.py)

Cleans and simplifies chunks into bullet points using the Anthropic Batch API. This optional but recommended step condenses verbose chunks into concise bullet-point summaries while preserving semantic content, which improves embedding quality.

Output: Updated JSON files in output/data/chunks/ with added cleaned_chunks field

3. Embedding (02_embed.py)

Generates semantic embeddings for each chunk using the sentence-transformers library with the all-MiniLM-L6-v2 model (384 dimensions). Embeddings are normalized for cosine similarity calculations.

Output: JSON files in output/data/embeddings/ containing chunks and their embedding vectors

4. Metrics Computation (03_metrics.py)

Computes trajectory metrics for each essay in PCA-reduced space:

  • Displacement metrics: mean, variance, and max step sizes between consecutive chunks
  • Tortuosity: ratio of path length to straight-line distance (path efficiency)
  • Momentum: directional consistency between consecutive steps
  • Divergence curves: position-aligned comparison of human vs AI trajectories
  • Homogeneity: pairwise similarity within human and AI groups

All metrics are computed using cosine distance in a 138-component PCA space (95% variance explained).

Output:

5. Visualization (04_figures.py)

Generates publication-ready figures:

  • Figure 1: 2D PCA trajectory visualization (one subplot per student)
  • Figure 2: Aggregate metric comparison (human vs AI)
  • Figure 2b: Paired metric comparison (per-student lines)
  • Figure 3: Divergence curves showing how human/AI trajectories drift apart
  • Figure 4: Homogeneity matrix heatmap (pairwise similarity)
  • Figures 5-7: Displacement profile visualizations (violin plots, series, distributions)

Output: PDF and PNG figures in plots/

Additional Scripts

Usage

This project uses uv for dependency management. All scripts should be run using uv run from the project root.

Setup

  1. Install dependencies:
uv sync
  2. Copy .env.EXAMPLE to .env and add your Anthropic API key:
cp .env.EXAMPLE .env
# Then edit .env and add your actual API key

Running the Pipeline

Option 1: Streamlined Pipeline (Recommended)

Run the entire pipeline with a single interactive command:

uv run python main.py

This will guide you through all steps with:

  • Interactive confirmations before using the Anthropic API
  • Automatic detection of existing outputs
  • Option to run all steps or select specific ones
  • Clear progress tracking and error handling

Option 2: Manual Step-by-Step

Run scripts individually in order:

# 1. Chunk essays into argumentative units
uv run python scripts/01_chunk.py

# 2. Clean chunks into bullet points (optional but recommended)
uv run python scripts/01_02_clean_chunks_batch.py

# 3. Generate embeddings for chunks
uv run python scripts/02_embed.py

# 4. Compute trajectory metrics
uv run python scripts/03_metrics.py

# 5. Generate figures
uv run python scripts/04_figures.py

Expected Processing Time

  • Chunking (01): ~30 minutes - 1 hour (Anthropic Batch API)
  • Chunk Cleaning (01_02): ~30 minutes - 1 hour (Anthropic Batch API)
  • Embedding (02): ~1-2 minutes (local model)

Supported Measures

All trajectory analyses use cosine distance as the primary distance metric and are performed in PCA-reduced embedding space to reduce noise and focus on main patterns of semantic variation. We work in embedding space (not a decoded schema space) since our essays differ in thesis and argument structure.

Distance Metric

Cosine Distance measures the angular separation between embedding vectors:

cosine_distance(u, v) = 1 - (u · v) / (||u|| ||v||)

Range: [0, 2], where 0 = identical, 1 = orthogonal, 2 = opposite

Cosine Similarity (used for the similarity matrix) is the complement:

cosine_similarity(u, v) = 1 - cosine_distance(u, v) = (u · v) / (||u|| ||v||)

Range: [-1, 1], where 1 = identical, 0 = orthogonal, -1 = opposite
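The two formulas translate directly into NumPy. A minimal sketch (function names are illustrative, not the repo's own; for already-normalized embeddings the dot product alone suffices):

```python
import numpy as np

def cosine_similarity(u: np.ndarray, v: np.ndarray) -> float:
    """(u · v) / (||u|| ||v||): 1 = identical direction, 0 = orthogonal."""
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

def cosine_distance(u: np.ndarray, v: np.ndarray) -> float:
    """Complement of cosine similarity, in [0, 2]."""
    return 1.0 - cosine_similarity(u, v)
```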

PCA Dimensionality Reduction

Before computing trajectory metrics, all embeddings are projected into a shared PCA subspace:

  1. Fit PCA on pooled embeddings from all essays (human + AI, all students)
  2. Transform each essay's embeddings using the fitted PCA
  3. Compute metrics in the PCA-reduced space

Dimensions:

  • Original: 384 dimensions (all-MiniLM-L6-v2 model)
  • PCA-reduced: 138 components (95% variance explained)
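The pooled-fit / shared-transform procedure can be sketched in a few lines. This numpy-only version (via SVD) stands in for whatever standard PCA implementation the pipeline actually uses; it just illustrates steps 1-3:

```python
import numpy as np

def fit_pca(pooled: np.ndarray, var_target: float = 0.95):
    """Fit PCA on pooled embeddings; keep enough components for var_target."""
    mean = pooled.mean(axis=0)
    centered = pooled - mean
    # SVD of the centered data matrix yields the principal axes in vt
    _, s, vt = np.linalg.svd(centered, full_matrices=False)
    explained = (s ** 2) / np.sum(s ** 2)
    n = int(np.searchsorted(np.cumsum(explained), var_target)) + 1
    return mean, vt[:n]

def pca_transform(X: np.ndarray, mean: np.ndarray, components: np.ndarray) -> np.ndarray:
    """Project one essay's embeddings into the shared PCA subspace."""
    return (X - mean) @ components.T
```

Fitting once on the pooled data and reusing the same `mean` and `components` for every essay is what makes the reduced space shared across students and sources.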

Trajectory Metrics

All metrics computed in PCA-reduced space using cosine distance:

1. Step-to-Step Displacement

Cosine distance between consecutive chunk embeddings. Derived metrics:

  • Mean displacement: Average semantic step size
  • Displacement variance: Variability in step sizes
  • Max displacement: Largest single conceptual jump

Interpretation: Higher variance suggests an exploratory process with uneven pacing.
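The three derived metrics can be sketched as follows (an illustrative numpy version, not the repo's `03_metrics.py` code):

```python
import numpy as np

def displacement_profile(traj: np.ndarray) -> dict:
    """Summarize cosine distances between consecutive chunk embeddings."""
    unit = traj / np.linalg.norm(traj, axis=1, keepdims=True)
    # Cosine distance between each pair of consecutive (normalized) rows
    steps = 1.0 - np.sum(unit[:-1] * unit[1:], axis=1)
    return {
        "mean": float(steps.mean()),      # average semantic step size
        "variance": float(steps.var()),   # variability in step sizes
        "max": float(steps.max()),        # largest single conceptual jump
    }
```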

2. Tortuosity (Path Efficiency)

Ratio of total path length to straight-line endpoint distance:

tortuosity = (sum of displacements) / endpoint_distance

Range: [1, ∞), where 1 = perfectly direct path, >1 = circuitous/wandering

Expected pattern: AI essays should have lower tortuosity (planned all at once, direct path). Human essays should have higher tortuosity (5-day iterative process, revisiting and reframing).
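A minimal sketch of the ratio (illustrative; a robust implementation would also guard against a near-zero endpoint distance, i.e. essays that end where they began):

```python
import numpy as np

def cosine_dist(u: np.ndarray, v: np.ndarray) -> float:
    return 1.0 - np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v))

def tortuosity(traj: np.ndarray) -> float:
    """Total path length over straight-line endpoint distance; 1 = direct."""
    path = sum(cosine_dist(traj[i], traj[i + 1]) for i in range(len(traj) - 1))
    return path / cosine_dist(traj[0], traj[-1])
```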

3. Momentum (Directional Consistency)

Average cosine similarity between consecutive direction vectors. Measures whether an essay maintains consistent conceptual direction.

direction[i] = embedding[i+1] - embedding[i]
momentum = mean(cosine_similarity(direction[i], direction[i+1]))

Range: [-1, 1], where 1 = perfect momentum, 0 = orthogonal changes, -1 = reversing

Adapted from: Nour et al. (2025), "Charting trajectories of human thought using large language models" (VECTOR framework)
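The pseudocode above can be sketched in numpy (illustrative only; assumes at least three points so that two consecutive directions exist):

```python
import numpy as np

def momentum(traj: np.ndarray) -> float:
    """Mean cosine similarity between consecutive direction vectors."""
    directions = np.diff(traj, axis=0)
    unit = directions / np.linalg.norm(directions, axis=1, keepdims=True)
    # Cosine similarity of each direction with the next one
    return float(np.mean(np.sum(unit[:-1] * unit[1:], axis=1)))
```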

4. Matched-Pair Divergence Curve

Position-aligned comparison showing how human and AI trajectories drift apart over the essay:

  1. Interpolate both trajectories to 20 common points (normalized positions 0 to 1)
  2. Compute cosine distance at each position: divergence[t] = cosine_distance(human[t], ai[t])

Interpolation method: Linear interpolation in embedding space.

Expected pattern: Divergence should increase over the essay. Both start from the same blind prompt (similar openings), but the 5-day human process produces increasing departure from AI baseline.

5. Inter-Essay Homogeneity (Pairwise Similarity)

Measures similarity within human group and within AI group:

  1. Interpolate all trajectories to 20 points
  2. Flatten to single vectors (20 × embedding_dim)
  3. Compute pairwise cosine similarities
  4. Compare mean within-group similarities

Expected pattern: AI essays should be more homogeneous (higher AI-AI similarity). This operationalizes "process convergence" — collapsing to a single prompt produces more similar outputs across different people.

Visualization Methods

2D PCA Projection

To visualize high-dimensional trajectories (384D) in 2D:

  1. Pool all chunk embeddings from all essays
  2. Fit PCA extracting top 2 components
  3. Project all embeddings to 2D
  4. Plot each essay as a trajectory with points connected by lines

Rationale for pooled PCA: Ensures visual comparability across essays

Limitations: 2D projection captures only ~20-30% of variance. Qualitative illustration only — should not be over-interpreted.

Statistical Considerations

With N=7 student pairs (6 complete pairs), we report:

  • Descriptive statistics (means, standard deviations)
  • Individual paired comparisons
  • Visualizations of patterns

Documentation

Detailed documentation is available in the docs/ directory:

These documents provide comprehensive details on:

  • Computational methods and mathematical formulas
  • PCA preprocessing and dimensionality reduction
  • Each trajectory metric with interpretation guides
  • Visualization techniques and their limitations
  • Implementation code examples
  • Design decisions and rationale

They were largely used to provide context for development with Claude Code. Some might not be updated.

References

Nour, M. M., et al. (2025). "Charting trajectories of human thought using large language models." Nature.

Caveat: Unlike Nour et al. (2025) which used supervised decoding (participants retold the same known story), our essays differ in thesis, argument, and conclusions. No ground-truth schema exists. We work in semantic embedding space (with PCA preprocessing) — appropriate for our design but means we cannot decode trajectories into shared human-interpretable argumentative moves.

Created with Claude Code, in line with the intensive's theme.

About

No description, website, or topics provided.

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors