Research Program: 3 (Representation, Language, and Cultural Cognition)

Status: Reproducible artifact (Zenodo DOI preprint); not submitted to any venue; target: TACL

Relationship to other work: Anchor of Program 3 (companions: macaronic, third-vertex-llm, habitus)
The Platonic Representation Hypothesis (PRH) claims that networks trained on different modalities converge toward a shared latent
- `paper/main.tex`: canonical manuscript (ACL-styled)
- `paper/references.bib`: shared bibliography
- `submissions/emnlp-2026/main.tex`, `submissions/colm-2026/main.tex`: legacy frozen venue snapshots (not the active target; retained for provenance)
- `experiments/`: reproducible pilot of 100 stimuli (50 computational + 50 judgment) × 5 languages (~1,800 inputs incl. variants), embedded through 7 models (UniXcoder, MiniLM-L12, Nomic v1.5, E5-small/base/large, BGE-M3), pinned by HuggingFace revision SHA in `experiments/src/model_registry.py`. Results: NL-code alignment 35/35 tier-1 + 35/35 OOD (tier-2/3); cross-lingual P3 probing across 7 models (model-class dependent); P7 spacing/punctuation robustness. P1/P2 honestly reported as not-supported / failed-and-reinterpreted
- `planning/`: TODO, decisions log, review notes, P2-strategy audit
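As an illustration of what pinning by revision SHA can look like, here is a minimal sketch. It is not the contents of `experiments/src/model_registry.py`: the `ModelSpec` class, the `build_registry` helper, and the all-zero placeholder SHA are hypothetical, and the repository ID shown is an assumption about which Hub repository the BGE-M3 label refers to.

```python
import re
from dataclasses import dataclass

# A full Git commit SHA is 40 lowercase hex characters.
_SHA_RE = re.compile(r"[0-9a-f]{40}")


@dataclass(frozen=True)
class ModelSpec:
    """Hypothetical registry entry; the real registry may be shaped differently."""

    name: str      # short label used in result tables
    repo_id: str   # HuggingFace Hub repository ID (assumed, not taken from the repo)
    revision: str  # full commit SHA, so a moving branch like "main" is rejected

    def __post_init__(self) -> None:
        if not _SHA_RE.fullmatch(self.revision):
            raise ValueError(
                f"{self.name}: revision must be a 40-character commit SHA, "
                f"got {self.revision!r}"
            )


def build_registry(specs):
    """Index specs by name, refusing duplicate labels."""
    registry = {}
    for spec in specs:
        if spec.name in registry:
            raise ValueError(f"duplicate model name: {spec.name}")
        registry[spec.name] = spec
    return registry


# Placeholder SHA for illustration only; it is not a real pinned revision.
PLACEHOLDER_SHA = "0" * 40
REGISTRY = build_registry([ModelSpec("bge-m3", "BAAI/bge-m3", PLACEHOLDER_SHA)])
```

A loader could then pass `spec.revision` as the `revision=` argument of a `from_pretrained` call in `transformers`, so that a re-run fetches the same weights even if the upstream repository changes.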
- Target venue: TACL (journal, OpenReview, rolling; not yet submitted). Rationale in `planning/decisions.md`, 2026-06-03 entry.
- CodeSage-Large-v2 as a modern code-trained robustness model (closes the single-code-trained-model gap before submission)
- Real cross-dialect evaluation via MADAR / NADI (Arabic) + AI Hub (Korean) corpora, replacing the retracted within-English dialect probe; see `planning/decisions.md`, 2026-06-03 entry.
- Native-speaker validation of the 5-language stimulus set (camera-ready)
- Reconcile content drift between `paper/main.tex` and the frozen `submissions/*/main.tex` snapshots manually
- DDD-style layout (`paper/` canonical, `submissions/<venue>/` frozen snapshots): forces editorial drift between venues to be explicit rather than silently mutating one shared file. Rationale in `planning/decisions.md`, 2026-04-19 entry.
- `experiments/scripts/` vs `experiments/src/` is an entry-point-vs-library split (`scripts/` holds the runnable entry points, `src/` the importable library code), not a version distinction.
- 5 languages × 100 ops is sized as a pilot, not a benchmark: enough to show the qualitative P2 break, small enough to remain reproducible end-to-end on a single machine.
- $Z$ stratification is the load-bearing theoretical move: it lets PRH stay true while explaining why two systems sharing $Z$ can still fail to communicate.
- Refuting PRH. The paper refines PRH rather than arguing against it.
- A single auto-synced manuscript across venues. Venue snapshots are intentionally frozen.
- A general theory of "communicability" for arbitrary modalities — scope is NL ↔ code.
- Claims about subjective experience or consciousness from representational similarity. The neuroscience parallel is by analogy only.
- (none — this repo carries no external persons, tokens, or third-party identifiers)
cd experiments
python -m venv .venv && source .venv/bin/activate
pip install -r requirements.txt
cp .env.example .env # OpenAI + Mistral keys
python scripts/run_all.py

See `experiments/README.md` for model list and prediction-to-script mapping.
MIT