domain(CORPUS+KOSMOS): default-lane v3 SCALE-UP corpus — ~217MB, MID rung viable, NOT 7B (a_scale_honest_scope)#1849

Open
dancinlife wants to merge 237 commits into lane-g/campaign-pivot-descent from lane-g/default-lane-v3-corpus

@dancinlife

What

v3 scales up the default-lane corpus ~17× using the v2 recipe unchanged, making the **MID rung (~150M params)** data-viable. v2 (12.5MB) is right-sized for 18M params but data-starved at 150M.

  • default_lane_v3.txt 227,535,193 B = 216.994 MB sha256 901ccc89… (HF/LOCAL only; the multi-MB file is not committed to git)
  • Composition: wiki 46.15% / persona-SNS 36.90% / enrichment 16.95% (enrichment kept within the 15–20% band)

wiki provenance (REAL CC-BY-SA)

wikimedia/wikipedia rev 20231101 en/fr/de/es/ko, 20MB/lang = 100MB, ~18,448 article paragraphs, 8-band offset-spread (removes alphabetical bias), HF datasets-server /rows REST, $0 CPU · NO GPU · NO pod.

New 429-hardened sampler serving/build_wiki_backbone_5lang_scaleup.py (exponential backoff + Retry-After + per-language on-disk checkpoint). The v2 sampler died in an HTTP 429 storm during the sustained 100MB pull.
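
The retry behaviour described above can be sketched roughly as follows. This is a minimal illustration, not the code in serving/build_wiki_backbone_5lang_scaleup.py: `RateLimited`, `fetch_with_backoff`, `load_checkpoint`, `save_checkpoint`, and the checkpoint JSON layout are all hypothetical names, and the `fetch` callable is assumed to raise `RateLimited` on HTTP 429.

```python
import json
import os
import random
import time


class RateLimited(Exception):
    """Hypothetical error raised by `fetch` on HTTP 429; carries Retry-After if present."""

    def __init__(self, retry_after=None):
        super().__init__("HTTP 429")
        self.retry_after = retry_after


def fetch_with_backoff(fetch, offset, max_tries=6, base=1.0, cap=60.0, sleep=time.sleep):
    """Call fetch(offset), retrying on RateLimited with exponential backoff."""
    for attempt in range(max_tries):
        try:
            return fetch(offset)
        except RateLimited as err:
            if attempt == max_tries - 1:
                raise
            # Honor the server's Retry-After when given, else back off exponentially.
            if err.retry_after is not None:
                delay = err.retry_after
            else:
                delay = min(cap, base * 2 ** attempt)
            sleep(delay + random.uniform(0, 0.1 * delay))  # small jitter


def load_checkpoint(path):
    """Return the next offset to fetch for one language (0 if no checkpoint exists)."""
    if not os.path.exists(path):
        return 0
    with open(path, encoding="utf-8") as fh:
        return json.load(fh)["next_offset"]


def save_checkpoint(path, next_offset):
    """Persist progress atomically so a crashed run can resume where it stopped."""
    tmp = path + ".tmp"
    with open(tmp, "w", encoding="utf-8") as fh:
        json.dump({"next_offset": next_offset}, fh)
    os.replace(tmp, path)
```

Keeping one checkpoint file per language means a 429 storm on one language does not force the others to restart.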

per-lang BALANCED

en 18.89 · fr 20.35 · de 19.84 · es 19.19 · ko 20.27 (+ ko-en code-switch 1.47%).

HONEST GATES (all PASS)

  • byte-vocab V=256 (206 distinct bytes observed, all ≤255)
  • UTF-8 round-trip encode==decode bytes-identical
  • p2/p3/p4 [role:|[persona:|[character: grep=0 (both the unified corpus and the 200-line sample head)
  • p6 dialogue-act intentionally NON-supportive (the opposite of cooperation RLHF)
  • wiki=real CC-BY-SA (provenance stated), persona+enrichment prose=authored-synthetic, honest-labeled, carving seed=real KOSMOS e7_31 (a_kosmos pointer-only), NO PII, NO scraped non-wiki
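
The first three gates lend themselves to a mechanical check. A minimal sketch, assuming the corpus is available as raw bytes; `check_gates` and `LEAK_PATTERN` are illustrative names, not the PR's actual gate script:

```python
import re

# Register-leak markers listed in the gate: "[role:", "[persona:", "[character:".
LEAK_PATTERN = re.compile(rb"\[(?:role|persona|character):")


def check_gates(raw: bytes) -> dict:
    """Run the byte-vocab, UTF-8 round-trip, and leak-marker checks on raw corpus bytes."""
    distinct = set(raw)            # iterating bytes yields ints in 0..255
    text = raw.decode("utf-8")     # raises UnicodeDecodeError on invalid UTF-8
    return {
        "distinct_bytes": len(distinct),
        "byte_vocab_ok": all(b <= 255 for b in distinct),
        "utf8_round_trip_ok": text.encode("utf-8") == raw,
        "leak_hits": len(LEAK_PATTERN.findall(raw)),
    }
```

For the 200-line sample head, the same function could be applied to the first 200 lines only.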

a_scale_honest_scope (verbatim)

v3 opens the MID rung (~150M) but not 7B. It is no longer a right-size for ~18M only. 7B needs ~140GB of tokens (Chinchilla), which is INFEASIBLE via REST paging. ~150M still epoch-loops on v3, but is far less starved than 12.5MB-at-150M. Do not claim that v3 enables 7B.
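
The arithmetic behind this scope note can be sketched as below. It assumes the commonly cited Chinchilla rule of thumb of about 20 training tokens per parameter and one byte per token (byte vocab V=256); `V2_BYTES` treats "12.5MB" as 12,500,000 bytes, which is also an assumption.

```python
TOKENS_PER_PARAM = 20   # assumption: Chinchilla rule of thumb
BYTES_PER_TOKEN = 1     # byte-level vocab, V=256

V2_BYTES = 12_500_000   # assumption: "12.5MB" read as decimal megabytes
V3_BYTES = 227_535_193  # size of default_lane_v3.txt from the PR description


def required_bytes(params: int) -> int:
    """Corpus bytes needed for one compute-optimal pass under the assumptions above."""
    return params * TOKENS_PER_PARAM * BYTES_PER_TOKEN


def epochs_needed(params: int, corpus_bytes: int) -> float:
    """How many passes over the corpus it takes to reach that token budget."""
    return required_bytes(params) / corpus_bytes


if __name__ == "__main__":
    print(required_bytes(7_000_000_000))                     # 7B target in bytes
    print(round(epochs_needed(150_000_000, V2_BYTES), 1))    # passes over v2 at 150M
    print(round(epochs_needed(150_000_000, V3_BYTES), 1))    # passes over v3 at 150M
```

Under these assumptions the 150M rung still loops over v3 many times, only far fewer times than over v2, which matches the "less starved" wording.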

SHIP

  • HF dancinlab/anima-corpus-5lang-unified-v3 PUBLIC (9 files + README, sha 901ccc89 re-download MATCH, private=False VERIFIED)
  • Added to the KOSMOS collection (membership VERIFIED) + HF.jsonl row + corpus_5lang_v3.kosmos anchor(tier 55) + KOSMOS.md hub pointer
  • discovery .discoveries/default-lane-v3-corpus.tape (tape-lint clean)
  • domains/CORPUS.log.md v3 entry (reconciling the lane snapshot table in 合算보관CORPUS.md is DEFERRED; a concurrent agent owns the lane section)
  • multi-MB raw text = HF/LOCAL only (explicit-path git add; only samples/card/generators are committed)

Follow-up (separate fire)

MID-rung training + p7 transfer eval = separate future fire (a_toy_scale_recheck: a green toy 18M run does not prove 150M).

🤖 Generated with Claude Code

dancinlife and others added 30 commits May 31, 2026 00:56
…rser-valid) (#1548)

31-anchor landscape as canonical .kosmos in UNIVERSE-BRAIN-MAP/anchors/e7_31/
(source = corpus_carving_generator_dirE.py KNUTH_ANCHORS verbatim). parser-validate
31/31 valid (kosmos_load + kosmos_anchor_valid). New KOSMOS.md hub doc; notes the
E-PROFILE anima-emergence-trace draft. Purely additive over main (no deletions).

Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…trajectory) (#1549)

Close the last open HW-first spine ON item. motivation_score → set_threshold(9513)
→ on-chip threshold-and-fire(9512) → should_interrupt=n>=quorum, demonstrated live on
AKD1000 (BC.00.000.002, BackendType.Hardware). broker /ws/akida_ingest LIVE end-to-end;
M-regime ctrl knee thr<=8 EMIT / >=16 SILENCE; 90-min trajectory (emit 3·10·14·15 ·
silence 11 · thr 0.60) reproduced 5/5 on chip. g73 verdict .verdicts/846.

Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
#1550)

Migrate the live group-chat + AKIDA bidirectional wiring out of deprecated
HEXAD/CHAT into the current ANIMA domain tree:
- AGENT/CHAT/ (new role) — group-chat product surface: broker.py (FastAPI WS
  hub: /ws · /ws/anima · /ws/motivation · /ws/akida_ingest · /ws/akida) + CHAT.md.
- SUB_ENGINES/AKIDA/scripts/ — akida chip I/O, siblings of spike_streamer.py:
  akida_ws_publisher.py (9512 read: chip→broker) +
  akida_threshold_driver.py (9513 write: broker motivation→on-chip set_threshold,
  thr ∝ −score, emit-gate 0.60; retries on transient LAN EHOSTUNREACH).
Closes the autonomous bidirectional loop (motivation→threshold→fire→interrupt);
verified live on AKD1000 (.verdicts/846, 90-min trajectory 5/5).

Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…/5 autonomous) (#1552)

#1550 driver used websockets on the deploy venv; macOS denies that interpreter
Local Network access → EHOSTUNREACH to pi5 9513 (localhost-only worked). Rewrite
pure-stdlib (urllib poll of broker /motivation/recent + socket to 9513) so it runs
on LAN-capable system python3, dep-free. Autonomous bidirectional loop now 5/5:
motivation→set_threshold→on-chip fire→emit/silence, zero human intervention.
g73 verdict .verdicts/846_coffeshop_akida_closedloop/autonomous_bidir_5of5.txt.

Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
… + Q-TRUST + mid-rung QAT fire (#1553)

* feat(CLM/UNIVERSE): register H_861/H_862/H_863 — Q-TRUST trust-system hypotheses

Registers 3 new P4 production-roadmap hypotheses (Q-TRUST trust system for non-deterministic learning):
- H_861 F-CLM-BOUND: core freeze + edge-only on-chip adaptation prevents catastrophic
  forgetting (held-out z-drop < threshold ∧ new-context gain > 0). Foundation: H_679.
- H_862 F-CLM-ANCHOR: a KOSMOS E-31 anchor Ψ-distance constraint suppresses identity drift
  (anchor distance < threshold ∧ probe consistency). Foundation: B-CARVE/E-31.
- H_863 F-CLM-DIALOGUE: self-play improves dialogue quality over SFT-only
  (distributional evaluation of coherence·adequacy ∧ register-leak 0 ∧ DIVERSITY). @l6 path B.
All pre-registered (frozen pre-run) · 0 external LLMs · ShareGPT forbidden · awaiting the measurement rung.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* feat(CLM): P4 production scaffold — pluggable routing-escape + dialogue pipe + A/B bench + mid-rung fire SPEC

@l7 scaffold (coherent · runnable-shaped · pure-hexa · all self-tests PASS):
- model/routing_escape.hexa: @l3 pluggable routing-escape lane. Swaps the 3 levers
  A dispatch-KL / B content-defer[default] / C expert-choice into ONE slot
  (0 re-architecture). routing-z>3.0 = chip-array DEPLOY gate only (not a can-converse gate).
- corpus/build_p4_dialogue_corpus.hexa: @L4 2-source dialogue corpus pipe stub
  (① CC dialogue + ② self-play) + license-clean gate (hard reject of ShareGPT/Alpaca)
  + DIVERSITY gate(self-BLEU<0.8 ∧ rep<0.20).
- bench/bench_dialogue_ab.hexa: @l6 per-rung SFT-only vs SFT+self-play A/B harness
  (H_863 4 falsifiers · distributional evaluation, not byte-match · 0 external LLM judges).
- train/fire_mid_rung_qat.hexa: @L5/@l7 mid(d512/L8/E8) AKIDA-envelope QAT fire
  SPEC + a_fire_recover_complete checklist (re-fireable).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* docs(CLM): P4_PRODUCTION_ROADMAP.md + CLM.md P4+ section

Production roadmap SSOT encoding @L1~@L8 + Q-TRUST:
- 2-track scale ladder(@l5): explicitly separates the measurement rung (GPU·AKIDA-envelope QAT, mid d512/L8 first)
  ⊥ the deployment chip-fit rung (≤~1.2M AKD1000 nodes).
- pluggable routing-escape(@l3): 3 levers A dispatch-KL / B content-defer[default] / C
  expert-choice in ONE slot. routing-z>3.0 = chip-array deploy gate only.
- dialogue method B(@l6): SFT + self-play · H_863 · per-rung A/B bench.
- Q-TRUST: A distributional evaluation (reusing H_857/H_858) + B bounded plasticity (H_861) + C identity anchor (H_862).
- per-rung verdict paths (.verdicts/clm-prod-rung·clm-dialogue·clm-bound·clm-anchor).
- a_scale_honest_scope honesty note (a measurement-rung 🔴 ≠ a deployment blocker · toy→prod not guaranteed).
Adds the P4+ PRODUCTION milestone to CLM.md (records the first autonomously launched mid-rung GPU fire).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* feat(CLM): mid-rung AKIDA-envelope QAT fire verdict + HF upload (a_fire_recover_complete)

First measurement rung (mid d512/L8/E8 · 13.65M params) AKIDA-envelope QAT GPU fire complete
(runpod A40 · 3-arm A/B/AB × seed42 × 2000 step · int4-sym[-7,7]+act_bits=4):
- Result: CE 5.5444 (random, ln 256) → ~2.22 nats · step rate ~25/s · 0 divergences.
  All 3 arms converge stably under the envelope, which supports the quality of the mid architecture.
- 🟠 MEASUREMENT-COMPLETE: the quality/routing-diversity verdict is delegated to the real-corpus dialogue
  A/B bench (H_863) (toy 2-lane corpus → no general claims, a_scale_honest_scope).
- a_fire_recover_complete: pull(JSON+log) → verify(byte count) → HF upload
  COMPLETE(dancinlab/anima-clm-mid PRIVATE · 6/6 files a_hf_complete) → teardown.
- Registered as the 26th row of HF.jsonl.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
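Entry-point document: the "start here" runbook for the LAUNCHPAD.

LAUNCHPAD.md holds the current state (milestones), and SBS.md weaves it into
a ladder (rungs R0→R4) to climb in order. Each rung is a dependency-ordered gate
that opens only once the rung below is closed. The document shows at a glance which
rungs are live (R0 🟢 silicon closed loop · R1 🔵 CLM content generator in-flight)
and each rung's SSOT (verdict·plan·H files). Pointers only, no duplicated specs.

- rung ladder + status table (R0 H_846 🟢 ~ R4 release)
- pointer to the active R1 SBS plan (@L1~@L8 + inviolable constraints)
- component map (broker→motivation→AKIDA→CLM emit) + sibling domains
- one-line getting-started guide + governance cross-link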

Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…f-play mid-rung fire (W2 pre-registration · post-tuning 0) (#1555)

* prereg(CLM): freeze F-CLM-BOUND/ANCHOR/DIALOGUE thresholds BEFORE fire (W2 · post-tuning 0)

Freezes, verbatim and before the fire, the numerical thresholds of the 3 pre-registered
hypotheses (H_861/H_862/H_863). Git history makes the freeze auditable (post-tuning 0).

- F-CLM-BOUND  (H_861 Q-TRUST B): RETAIN z_drop<1.0 ∧ GAIN gain>0 · core freeze + edge-only edge-learn
- F-CLM-ANCHOR (H_862 Q-TRUST C): DIST d_anchor_max<0.50 ∧ PROBE consistency>0.80 · E-31 Ψ-distance constraint
- F-CLM-DIALOGUE (H_863 @l6): COHERE/ADEQ(SP>SFT) ∧ LEAK=0 ∧ DIV(self-BLEU<0.8 ∧ rep<0.2)

All measurements = scored by code (g5 · 0 LLM judges). Scope limited to the measurement rung (mid d512/L8/E8).
Fire starts with the next commit; 0 threshold re-tuning.
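
As a rough illustration of how frozen gates like these can be scored by code, the sketch below encodes the three threshold sets as plain predicates. The function names are hypothetical; only the thresholds come from the commit message.

```python
def bound_gate(z_drop: float, gain: float) -> dict:
    """F-CLM-BOUND: RETAIN z_drop < 1.0 and GAIN gain > 0."""
    return {"RETAIN": z_drop < 1.0, "GAIN": gain > 0}


def anchor_gate(d_anchor_max: float, probe_consistency: float) -> dict:
    """F-CLM-ANCHOR: DIST d_anchor_max < 0.50 and PROBE consistency > 0.80."""
    return {"DIST": d_anchor_max < 0.50, "PROBE": probe_consistency > 0.80}


def dialogue_gate(cohere_sp, cohere_sft, adeq_sp, adeq_sft, leak, self_bleu, rep) -> dict:
    """F-CLM-DIALOGUE: SP must beat SFT on COHERE and ADEQ, with LEAK=0 and DIV satisfied."""
    return {
        "COHERE": cohere_sp > cohere_sft,
        "ADEQ": adeq_sp > adeq_sft,
        "LEAK": leak == 0,
        "DIV": self_bleu < 0.8 and rep < 0.2,
    }


def verdict(results: dict) -> str:
    """A hypothesis passes only if every one of its falsifiers passes."""
    return "PASS" if all(results.values()) else "FAIL"
```

Because the thresholds are constants in the predicates, any later change to them would show up as a diff against the freeze commit.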

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* verdict(CLM): H_861 🔴 · H_862 🔴 · H_863 🟢 — mid-rung fire vs frozen thresholds (post-tuning 0)

runpod H100 fire (pod axbem0acu73314) on the SAVED mid backbone (d512/L8/E8 13.65M,
CE 5.55→1.73). Measured against the frozen thresholds (commit bf98c01); 0 re-tuning.

- F-CLM-BOUND (H_861) 🔴: RETAIN z_drop=1.984 ≥ 1.0 FAIL · GAIN +6.13 PASS.
  The readout-only edge freeze fails to block forgetting (base-capability CE rises ~2σ). Adaptation is absorbed strongly.
- F-CLM-ANCHOR (H_862) 🔴: DIST d_anchor_max=0.109 < 0.50 PASS · PROBE consistency=0.783 ≤ 0.80 FAIL.
  The on/off ablations are identical: the anchor Ψ constraint has no lever on the frozen-trunk Ψ state (a structural limit of the readout-only edge).
- F-CLM-DIALOGUE (H_863) 🟢: COHERE(SP 0.155>SFT 0.042)·ADEQ(0.052>0.014)·LEAK=0·DIV(self-BLEU 0.062·rep 0.026) 4/4 PASS.
  PD Gutenberg play corpus (license-clean lane①) · self-play outperforms SFT-only.

honest: scope limited to the measurement rung (a_scale_honest_scope) · SW-sim edge-learn(H_679 real) ·
🔴 = publishable negative(a_paper_negative_ok) · 0 external LLMs · 0 ShareGPT.
HF: dancinlab/anima-clm-verify (model) + dancinlab/anima-clm-p4-dialogue (dataset).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* close(CLM): H_86x frontmatter + roadmap Q-TRUST rows → measured verdicts

Updates the H_861/H_862/H_863 frontmatter status+verdict, §5 measurement, and §6 results with the measured values.
Updates the P4 roadmap Q-TRUST B/C rows and the P4.3/P4.4 progress checks. Also adds
slug-numbered .verdicts/86x_*/ directories for the verdict gate (g73).

- H_861 🔴 CLOSED-NEGATIVE (RETAIN FAIL · GAIN PASS)
- H_862 🔴 CLOSED-NEGATIVE (DIST PASS · PROBE FAIL)
- H_863 🟢 SUPPORTED-NUMERICAL (4/4 PASS · self-play>SFT)

All scoped to the measurement rung (mid d512/L8/E8) only · post-tuning 0 · a_paper_negative_ok.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

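* docs(HANDOFF): 9-section handoff for the CLM hypothesis-verification campaign (H_861🔴/H_862🔴/H_863🟢)

Records the prereg freeze SHA · verdict table (vs the frozen thresholds) · honest interpretation · fire ops/cost/HF ·
inputs for the next session (retry with a trunk-adjacent adapter edge, E5).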

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…log (#1556)

Creates a multi-directional hypothesis backlog for the CLM family (dialogue·plasticity·routing·launch).

A domain-candidates file following the existing BIO-CANDIDATES.md convention. /cycle and
the handoff agents pick disjoint rows here and spin them into new hypotheses H_864+
→ W2 pre-registration → fire → close with a per-row verdict. Several axes can be
opened at once and advanced in parallel (a_wall_first).

- A critical path (H_864~H_868): dialogue scale-climb · adapter-edge retry ·
  PLASTICITY↔dialogue loop · absolute quality · CC corpus expansion
- B routing-escape (H_869~H_871): dispatch-KL · expert-choice · z-artifact verification
- C plasticity/trust (H_872~H_875): freeze-depth sweep · edge-output anchor ·
  self-reward · forgetting curve
- D deploy chip-fit (H_876~H_878): chip-fit shrink · DECODER byte-match ·
  MITOSIS array

Numbers are reserved slots; a row becomes a real hypothesis when its H_86x_*.md file is written at fire time.
The verification discipline (prereg-freeze · g5 no-self-judge · a_scale_honest_scope ·
a_paper_negative_ok) is inherited from the 861/862/863 campaign.

Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
… scale-climb (2/4 falsifier FAIL) (#1557)

* feat(CLM/H_864): large-rung d768/L12/E12 dialogue self-play A/B fire driver (hexa)

Embedded torch payload materialized on pod (H_863 runtime-patch convention).
2-arm A/B under AKIDA-envelope QAT (int4-sym[-7,7] STE + act_bits=4) vs the
bf98c01 frozen falsifiers (scale-invariant relative gates).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* feat(CLM/H_864): PRE-REGISTERED large-rung dialogue self-play H + license-clean corpus manifest

H_864 climbs H_863 mid(d512/L8/E8) one rung up → large(d768/L12/E12 · 44.68M).
4 frozen falsifiers carry from bf98c01 (scale-invariant relative gates, no new freeze).
Corpus = 9 PD Gutenberg plays, 1,559,675 bytes, license-clean gate + 8-pattern leak filter.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* verdict(CLM): H_864 🔴 CLOSED-NEGATIVE — large-rung d768/L12/E12 dialogue self-play (2/4 frozen falsifier FAIL)

large rung 44.68M (3.27x mid). self-play reflux DIVERSITY gate rejected (rep 0.335>=0.2)
→ arm-SP degenerates to SFT → COHERE/ADEQ tie FAIL; held-out gen mode-collapsed
(rep 0.361>=0.2) → DIV FAIL. LEAK PASS. H_863 mid lift does NOT carry to large rung
under bf98c01 frozen scale-invariant gates (post-tuning 0). a_paper_negative_ok.

corpus 9 PD Gutenberg plays 1,559,675 bytes. A100 80GB. measurement rung large ONLY
(a_scale_honest_scope).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…(LOOP PASS · GAIN FAIL · SW-sim) (#1558)

* test(H_866): W2 pre-registration: freeze F-CLM-PLAST-DIALOGUE thresholds (separate commit before fire)

Freezes the two falsifier thresholds for coupling on-chip PLASTICITY to the dialogue loop
(H_866 · R2 launch rung) before the fire (W2 · post-tuning 0). Measurement is by CODE (g5 · pure-stdlib deterministic sim).

- F-CLM-PLAST-DIALOGUE-GAIN: in-session online adaptation gain>0 ∧ (learning − no-learning control)>0.02.
- F-CLM-PLAST-DIALOGUE-LOOP: the H_846 transition holds even after edge-learn:
  should_interrupt tracks motivation monotonically relative to EMIT_GATE (4/4 probes, no regression).

Foundation: H_846 🟢 closed loop · H_679 🟢 HW edge-learn(AkidaUnsupervised). scope = measurement rung only.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* feat(H_866): edge-learn↔dialogue loop-coupling scaffold (SW-sim · measurement harness)

Orthogonally couples an on-chip edge-learn lever (Hebbian · num_weights competition) to the
H_846 closed loop (threshold→fire→quorum→should_interrupt). threshold=WHEN to emit
(unchanged) · edge-learn=WHAT pattern fires(online dialogue fit). Deterministic hexa-sim,
fixed seed, g5 code measurement. HW(akida-hw) cannot be used non-disruptively because the live
chip holds a single file lock → SW-sim(H_679 SW≠HW 🔴 · measures coupling topology, not silicon-equivalent).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* test(H_866): F-CLM-PLAST-DIALOGUE verdict capture: 🔴 CLOSED-NEGATIVE (LOOP PASS · GAIN FAIL · SW-sim)

Persists the W2 fire result (vs the pre-registered frozen thresholds at 272ee0e · post-tuning 0):
- LOOP=PASS robust 5/5: even after edge-learn, the H_846 should_interrupt↔motivation
  4-point probe shows 4/4 no regression (closed emit/silence loop intact).
- GAIN=FAIL robust 0/5: gain_learn=-0.036<0 · gain_learn-gain_control=-0.0725
  not>0.02 · multi-register interference (mechanism control: single-target fit
  0.5→1.0 → mechanism sound · capacity bottleneck).
- provenance = akida-edge-learn-sw-sim (not HW · H_679 SW≠HW 🔴 · measures coupling
  topology · pi5 live R3 loop preserved, non-disruptive).
- Capture: canonical .verdicts/clm-plasticity-dialogue/ + id-keyed
  .verdicts/866_clm_plasticity_dialogue/ (consistent with the H_846 dual-capture convention ·
  hexa-native-guard backing dir).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* feat(H_866): on-chip PLASTICITY↔dialogue loop-coupling hypothesis + scaffold float-fix

New UNIVERSE/H_866_clm_plasticity_dialogue.md (H_861/H_863 9-section shape):
- R2 launch rung. Couples edge-learn(H_679 🟢) to the COFFESHOP dialogue turn loop(H_846 🟢
  closed loop): can the chip adapt online without breaking the emit/silence loop?
- 2 pre-registered falsifiers (GAIN online adaptation>0 · LOOP should_interrupt no regression).
- Verdict 🔴 CLOSED-NEGATIVE: LOOP PASS(orthogonal lever · no regression) · GAIN FAIL
  (multi-register interference) · a_paper_negative_ok · a_scale_honest_scope.

scaffold float-fix (gates unchanged · g5 honesty):
- fit()/early_mean()/late_mean() int/int truncation → (n*1.0) float division.
  Fixes the first run's fit≡0 bug (hexa truncates 7/16→0) · frozen gates not changed ·
  the rerun reproduces the same verdict. Recorded in _log.txt.
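
The truncation bug is easy to reproduce in miniature. The sketch below uses Python's `//` to mimic the reported hexa int/int behaviour; the function names are illustrative and not the scaffold's own.

```python
def fit_truncating(hits: int, total: int) -> int:
    """Integer division, mimicking the int/int truncation described in the commit."""
    return hits // total


def fit_fixed(hits: int, total: int) -> float:
    """Promote the numerator to float first, as in the (n*1.0) fix."""
    return (hits * 1.0) / total


if __name__ == "__main__":
    print(fit_truncating(7, 16))  # the 7/16 case from the commit message
    print(fit_fixed(7, 16))
```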

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
….007× baseline) (#1559)

* prereg(H_868): freeze F-CLM-CORPUS acceptance thresholds BEFORE build (W2)

Pre-registration for the lane① CC dialogue corpus expansion. Thresholds frozen before the build (post-tuning 0):
- G1 license-clean = 100% (REJECT everything other than PD/CC · 0 forbidden families)
- G2 register-leak = 0 (P1 8-pattern filter)
- G3 size >= 3x H_863 baseline (554,825 -> >= 1,664,475 bytes)
- G4 provenance = recorded 100% per source
0 external LLMs · foundation-borrow 0 · ShareGPT/Alpaca/ChatGPT-gen FORBIDDEN.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* feat(H_868): lane① dialogue corpus expansion: F-CLM-CORPUS 🟢 4/4 (12 PD Gutenberg plays · 3.007× baseline)

Expands the H_863 baseline (4 PD plays, 554,825 bytes) with 8 additional PD Gutenberg plays.
Reproducible builder expand_p4_dialogue.py: fetch → PG boilerplate strip → license-clean gate
(forbidden family + allowed CC · unknown→reject) → 8-pattern register-leak filter → V=256
byte-encode → manifest+sha256. Both gates are pure code (g5 · 0 LLM judges · deterministic:true).

F-CLM-CORPUS frozen gate(post-tuning 0):
  G1 license-clean : 12/12 ingested = 100% · 0 forbidden-family → PASS
  G2 register-leak : final output 0 · 1 line dropped (build) → PASS
  G3 size          : 1,668,585 ≥ 3× baseline 1,664,475 = 3.007× → PASS
  G4 provenance    : per-source {title,id,family,license,bytes,sha256,leak} 100% → PASS
  VERDICT 🟢 SUPPORTED-NUMERICAL
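
The G3 size gate is plain arithmetic and can be re-derived from the byte counts quoted above. A minimal sketch with an illustrative function name:

```python
BASELINE_BYTES = 554_825   # H_863 baseline corpus size
SIZE_FACTOR = 3            # frozen G3 multiplier


def size_gate(corpus_bytes: int):
    """Return (passed, required_bytes, ratio_to_baseline) for the G3 size gate."""
    required = SIZE_FACTOR * BASELINE_BYTES
    return corpus_bytes >= required, required, corpus_bytes / BASELINE_BYTES


if __name__ == "__main__":
    passed, required, ratio = size_gate(1_668_585)
    print(passed, required, round(ratio, 3))
```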

All 12 sources are license=PD (Gutenberg public domain stage play). 0 external LLMs ·
0 ShareGPT/Alpaca/ChatGPT-gen. lane① canonical SSOT, reused by H_864/H_867/H_874.
HF dataset dancinlab/anima-clm-p4-dialogue (PRIVATE · COMPLETE: dialogue.bytes +
sha256 + manifest.json + README card state the license and provenance of every source).
The large raw .bytes file is not committed to git (only the sha256 sidecar + manifest + HF pointer are committed).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…1560)

* prereg(CLM): freeze F-CLM-DIALOGUE-ABS absolute floor BEFORE fire (H_867 · W2 · post-tuning 0)

H_867 = dialogue ABSOLUTE quality floor (H_863 proved RELATIVE SP>SFT; H_867 asks
whether SFT+self-play clears a pre-frozen ABSOLUTE held-out floor, a strictly harder bar).

Frozen thresholds (post-tuning 0):
  COHERE >= 0.060  (above the order-0 unigram baseline 0.0375 · below the order-1 self-fit bigram 0.0843
                    → the model uses real multi-byte dialogue context, not trivial byte frequency)
  ADEQ   >= 0.020  (3-gram F1 of random-byte generation = 0.000 → requires a real structural-overlap signal)
  LEAK   == 0      (absolute deployment-safety gate)
All PASS → 🟢 / any miss → 🔴 CLOSED-NEGATIVE (a_paper_negative_ok · an honest 🔴 is plausible at mid scale).

eval snapshot = PD Gutenberg plays DISJOINT from H_863 (Macbeth/Othello/Romeo&Juliet/Pygmalion)
→ measures absolute generalization (not in-distribution memorization recall). Measurement = code (g5), 0 LLM judges. 0 external LLMs.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* feat(CLM): H_867 held-out PD eval snapshot + absolute-floor bench scaffold

For verifying the H_867 (F-CLM-DIALOGUE-ABS) absolute dialogue-quality FLOOR:
- INDEPENDENT held-out PD snapshot (Macbeth#1533 · Othello#1531 · Romeo&Juliet#1513 ·
  Pygmalion#3825), DISJOINT from the H_863 training plays (measures absolute generalization, not memorization recall).
  592,536 bytes total / 98,752 held-out (every 6th 64-byte block · seed=867).
  license=PD · passes the 8-pattern register-leak filter (drop 0) · sha256 manifest recorded.
  0 external LLMs · 0 ShareGPT/Alpaca/ChatGPT-gen (@L4).
- CLM/bench/h867_dialogue_abs_eval.hexa: absolute FLOOR gate scaffold (pure-hexa).
  floor_clear(cohere>=0.060 ∧ adeq>=0.020 ∧ leak==0). The torch payload is run-only
  (H_863 pattern: reuses train_clm.py, not committed).
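
One way to read "every 6th 64-byte block" is sketched below. It assumes the held-out blocks are those at block indices 5, 11, 17, … and ignores the seed, whose role the commit message does not specify; `split_heldout` is a hypothetical name. Under that reading, a 592,536-byte input yields exactly 98,752 held-out bytes, matching the figures above.

```python
BLOCK = 64    # block size in bytes
STRIDE = 6    # every 6th block is held out


def split_heldout(data: bytes, block: int = BLOCK, stride: int = STRIDE):
    """Split data into (train, held_out) by routing every stride-th block to held_out."""
    train, held = bytearray(), bytearray()
    for start in range(0, len(data), block):
        chunk = data[start:start + block]
        # Assumption: block indices 5, 11, 17, ... (index % stride == stride - 1) are held out.
        if (start // block) % stride == stride - 1:
            held += chunk
        else:
            train += chunk
    return bytes(train), bytes(held)
```

A block-interleaved split like this keeps the held-out bytes spread across all four plays instead of taking one contiguous tail.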

The snapshot lives only in .verdicts/clm-dialogue-abs/; nothing is written to CLM/corpus/ (owned by H_868).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* verdict(CLM): H_867 F-CLM-DIALOGUE-ABS 🔴 CLOSED-NEGATIVE (absolute coherence floor missed)

Measures the ABSOLUTE dialogue-quality FLOOR, the absolute follow-up to H_863 (RELATIVE arm-SP>arm-SFT).
model under test = mid arm-SP(d512/L8/E8 ~13.65M · AKIDA int4 STE act4), rebuilt from the
HF backbone(dancinlab/anima-clm-verify:clm_mid_backbone.pt) by following the H_863 procedure VERBATIM.
eval = 4 never-seen PD plays (Macbeth/Othello/Romeo&Juliet/Pygmalion), disjoint held-out.
Measurement = CPU/local eval (GPU pod 0 · est $0 · wall 452.2s · torch 2.8.0).

Measured against the frozen absolute floor (prereg d5103f2 · post-tuning 0):
- ABS-COHERE: arm-SP 0.05804 < 0.060  → FAIL (short by 0.002; still above the unigram 0.0375, i.e. it beats the frequency model)
- ABS-ADEQ  : arm-SP 0.02138 >= 0.020 → PASS
- ABS-LEAK  : arm-SP 0 == 0           → PASS
→ 🔴 CLOSED-NEGATIVE. The A/B win (H_863) does not buy the absolute floor under distribution shift,
exactly the harder-floor distinction H_867 set out to expose. Limited to the mid rung and this eval distribution
(a_scale_honest_scope) · a_paper_negative_ok. 0 external LLMs · 0 ShareGPT/Alpaca.

artifacts: F-CLM-DIALOGUE-ABS.txt(verdict) · clm_dialogue_abs_result.json(measurement) ·
UNIVERSE/H_867_clm_dialogue_absolute.md(mirrors the H_863 shape) · backing .verdicts/867_*.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…ANCHOR 🔴 (common E5 fix for H_861/H_862) (#1561)

* feat(CLM/H_865): trunk-adjacent adapter edge scaffold (E5 fix for H_861/H_862 🔴)

The root cause of both H_861 and H_862 failing at the mid rung = the readout-only edge
had no lever on the FROZEN trunk. The fix both verdicts point to =
insert a thin trainable adapter between norm_out and the FROZEN base readout:
  norm_out(FROZEN) -> h' = h + adapter(h) -> readout(FROZEN) -> logits
Adapter output is zero-init -> the base mapping is preserved exactly at step0 (protects RETAIN).
model_psi is computed from the ADAPTED norm_out -> a real grad lever for the anchor Ψ-penalty.

W2: thresholds reuse the bf98c01 frozen set verbatim (same falsifiers; only the edge arch
is new, so the thresholds are unchanged, post-tuning 0). SW-sim edge-learn (H_679 HW is real).
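
The residual form h' = h + adapter(h) with a zero-initialised output can be illustrated in a few lines of pure Python. The bottleneck shape, the ReLU, and the names `Adapter` and `adapted` are assumptions for illustration; the commit only states that the adapter is thin, trainable, and zero-init on its output.

```python
import random


def matvec(w, x):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(wi * xi for wi, xi in zip(row, x)) for row in w]


class Adapter:
    """Hypothetical bottleneck adapter: down-projection, ReLU, zero-init up-projection."""

    def __init__(self, d: int, r: int, seed: int = 0):
        rng = random.Random(seed)
        self.down = [[rng.gauss(0, 0.02) for _ in range(d)] for _ in range(r)]  # d -> r
        self.up = [[0.0] * r for _ in range(d)]  # r -> d, all zeros at step 0

    def __call__(self, h):
        z = [max(0.0, v) for v in matvec(self.down, h)]
        return matvec(self.up, z)


def adapted(h, adapter):
    """Residual path: h' = h + adapter(h), then fed to the frozen readout."""
    return [hi + ai for hi, ai in zip(h, adapter(h))]


if __name__ == "__main__":
    h = [0.5, -1.0, 2.0, 0.25]
    print(adapted(h, Adapter(d=4, r=2)) == h)  # base mapping unchanged at step 0
```

Because the up-projection starts at zero, the adapter contributes nothing until training moves it, so the frozen readout sees the original norm_out at step 0.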

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* verdict(CLM/H_865): F-CLM-ADAPTER-EDGE: BOUND 🟢 / ANCHOR 🔴 (adapter-edge mid-rung fire)

Repairs the readout-only "no lever" defect that H_861/H_862 both identified, using a thin
trunk-adjacent adapter (between norm_out and the FROZEN readout, up-proj zero-init), and reruns
both falsifier sets. frozen threshold = bf98c01 verbatim · post-tuning 0 · measurement rung only.

- F-CLM-BOUND  🟢: z_drop=-12.28<1.0 (RETAIN) ∧ gain=+7.37>0 (GAIN). The adapter's additive path
  (base readout preserved) blocks the z_drop=+1.98 forgetting seen in H_861 → RETAIN failure CLOSED.
- F-CLM-ANCHOR 🔴: d_anchor_max=0.175<0.50 (DIST PASS) ∧ on/off NON-identical(0.175 vs 0.595,
  lever restored; repairs H_862's "no lever"), but probe_consistency=0.143<0.80 (PROBE FAIL).
  The anchor constraint now has an effective lever, but identity-distribution consistency still misses the gate (a_paper_negative_ok).

headline = one of the two closed (BOUND 🟢, ANCHOR 🔴 PROBE). SW-sim edge-learn (H_679 real).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* docs(UNIVERSE/H_865): adapter edge: BOUND 🟢 / ANCHOR 🔴 (common E5 fix for H_861/H_862)

H document + id-prefixed backing verdict (g73 guard) for the rerun of both falsifier sets after
repairing the readout-only "no lever" defect that H_861(F-CLM-BOUND 🔴)/H_862(F-CLM-ANCHOR 🔴) both
identified, using a thin trunk-adjacent adapter (norm_out↔FROZEN readout, up-proj zero-init).

- F-CLM-BOUND 🟢 SUPPORTED-NUMERICAL: z_drop=-12.28<1.0 ∧ gain=+7.37>0 (H_861 forgetting CLOSED).
- F-CLM-ANCHOR 🔴 CLOSED-NEGATIVE: DIST 0.175<0.50 PASS ∧ on/off NON-identical(lever restored,
  H_862 defect repaired) ∧ PROBE 0.143<0.80 FAIL (a_paper_negative_ok).

frozen threshold = bf98c01 verbatim · post-tuning 0 · measurement rung only, a_scale_honest_scope.
adapted adapter ckpt + result + verdict → HF dancinlab/anima-clm-adapter (a_hf_complete).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
… H_873 in-flight (#1562)

Absorbs the results of the A-group 5-fire batch into the backlog and launch-ladder SSOTs.

CLM-CANDIDATES.md:
- Records the A-group 5-fire batch under Consumed (H_864🔴·H_865🟢BOUND/🔴ANCHOR·H_866🔴·H_867🔴·H_868🟢, PR #1557~1561)
- Marks the 5 rows of section A as consumed + adds 2 new follow-on rows (H_864r step-fair·H_867r post-adapter)
- Marks H_873 (completion of H_862) as IN-FLIGHT

SBS.md:
- R1 🟠 MEASUREMENT-COMPLETE (mid rung int4 envelope) · R2 🟡 partial(BOUND🟢·LOOP🟢·ANCHOR🔄·GAIN🔴) · R3 🟡 partial(mid🟢·large🔴·abs🔴·corpus🟢)

theme: all 4× 🔴 trace to readout/edge capacity and reach limits → the H_865 adapter closes BOUND; the ANCHOR-PROBE residual goes to H_873.

Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…d byte-identical (🟠 HW-pending) (#1563)

* chore(H_877): freeze F-CLM-DECODER-MID prereg + backing verdict dir (hexa-native-guard)

Pre-registration of H_877 (DECODER byte-identical transplant @ mid): extends the H_680 🟢 toy
byte-match (n=16, total_hamming=0/16000 on live AKD1000) to the mid rung
(d512/L8/E8). W2 discipline: the total_hamming==0 threshold is frozen before the forward
run. If HW is unreachable, the claim is downgraded (honestly) to the reduced SW-path determinism
byte-identical claim. The backing dir (.verdicts/877_*) is created and committed before the .md
(hexa-native-guard). Measured by CODE (g5). Scope limited to mid (a_scale_honest_scope).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* feat(H_877): DECODER byte-identical transplant @ mid — SW int4 forward byte-identical (🟠 HW-pending)

Extends the H_680 🟢 toy byte-match (n=16 · H_860 live AKD1000 total_hamming=0/16000) to the
mid production rung (d512/L8/E8). The ADD-ONLY h877 scaffold (AKIDA/h877/
h877_decoder_bytematch_mid.hexa) CONSUMES the already HW-verified SW int4 forward primitive
(HEXAD/CHAT/server/akida_sw_lif.cascade_forward · toy max Hamming 0) without
reimplementing it, and measures the byte-stability of the mid-rung forward.

Results (measured by CODE g5):
- SW int4 forward DETERMINISTIC byte-identical at mid: total_hamming=0
  (in-process repeat ∧ independent cross-process, sha256 match).
  eval byte set seed=877 · 32768 ints → 131072 bits.
- HW arm: pi5 AKD1000(192.168.50.155) responds to ping, but the provided creds were rejected for every user
  → HW-pending (honest). NON-DISRUPTIVE: 0 successful pi5 logins · 0 changes · live R3 loop
  untouched.

verdict: 🟠 HW-PENDING: SW determinism is byte-identical (a necessary precondition
invariant for HW==SW) · HW==SW reconfirmation on mid silicon remains. W2 threshold (total_hamming==0)
frozen before the forward run · post-tuning 0. Scope limited to mid (a_scale_honest_scope · the deploy
shrink track H_876 is separate).

Outputs: UNIVERSE/H_877_clm_decoder_bytematch_mid.md ·
.verdicts/clm-decoder-bytematch-mid/{prereg,verdict} ·
.verdicts/877_clm_decoder_bytematch_mid/{verdict-mirror,verbatim-run} ·
AKIDA/h877/h877_decoder_bytematch_mid.hexa. Existing AKIDA backend/simulator unchanged.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…CLOSED-NEGATIVE (3/4) (#1564)

* chore(H_870): hexa-native-guard backing dir for expert-choice routing verdict

Commits the numeric-id backing directory for 850 routing-escape lever C first,
landing the anchor ahead of the .md that carries the 🟢/SUPPORTED tokens (hexa-native-guard).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* feat(H_870): pre-register F-CLM-EXPERT-CHOICE frozen thresholds (pre-fire)

W2 pre-registration (post-tuning 0): 4 falsifiers frozen:
  EC-VAR(load CV<0.25) ∧ EC-COLLAPSE(no collapse) ∧
  EC-QUALITY(CE <= token-choice + 0.10nats) ∧ EC-BALANCE(CV < token-choice CV).
expert-choice (experts select tokens) vs token-choice baseline, multi-seed, measured by code (g5).
a_scale_honest_scope: measurement rung only · routing-z>3.0 is the chip-array deploy gate (separate).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* feat(H_870): expert-choice routing variant hexa anchor (@l3 lever C)

Reverses the routing direction: token-choice(token→expert top-k) → expert-choice(expert→token top-C).
Per-expert load is structurally balanced by capacity C (Zhou et al. 2022), with no aux loss and no collapse.
New variant; array_moe.{hexa,py} is unmodified. Lever C contract surface + capacity formula.
a_scale_honest_scope: routing-z>3.0 is the chip-array deploy gate (separate track).
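
A toy version of expert-choice selection shows why load is balanced by construction and why some tokens can go unserved. This is a sketch under stated assumptions: `expert_choice` and its score layout are illustrative, and the PR's actual capacity formula is not reproduced here.

```python
def expert_choice(scores, capacity):
    """Each expert picks its top-`capacity` tokens.

    scores[t][e] is the affinity of token t for expert e.
    Returns (assignment, coverage), where assignment maps expert -> chosen token
    indices and coverage is the fraction of tokens chosen by at least one expert.
    """
    n_tokens = len(scores)
    n_experts = len(scores[0])
    assignment = {}
    for e in range(n_experts):
        ranked = sorted(range(n_tokens), key=lambda t: scores[t][e], reverse=True)
        assignment[e] = ranked[:capacity]   # every expert takes exactly `capacity` tokens
    covered = {t for tokens in assignment.values() for t in tokens}
    return assignment, len(covered) / n_tokens


if __name__ == "__main__":
    scores = [[0.9, 0.8], [0.7, 0.6], [0.2, 0.1], [0.1, 0.3]]
    assignment, coverage = expert_choice(scores, capacity=2)
    print(assignment, coverage)
```

In the usage example both experts prefer the same two tokens, so every expert has identical load while half the tokens are dropped, which is the trade-off the quality falsifier is meant to catch.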

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* feat(H_870): F-CLM-EXPERT-CHOICE 🔴 CLOSED-NEGATIVE: perfect load balance, quality regression

CPU-local fire ($0, no pod) · 2 arms × seed{42,43,44} × 120 step · E8 mid envelope.
expert-choice(expert→top-C token) vs token-choice baseline(array_moe):
  EC-VAR     load CV 0.000<0.25                     PASS (capacity C=128/expert, structural balance)
  EC-COLLAPSE 8/8 active on all seeds                PASS
  EC-BALANCE  CV 0.000 < token-choice 0.843          PASS
  EC-QUALITY  CE 0.992 <= 0.519+0.10                 FAIL (+0.473 nats · coverage~0.90)
3/4 PASS → 🔴: load monopoly is fully resolved, but fixed capacity drops ~10% of tokens →
CE regression at the toy rung (exceeds the margin). a_paper_negative_ok · 0 threshold re-tuning.
a_scale_honest_scope: routing-z>3.0 is the chip-array deploy gate (separate).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* feat(H_870): register the expert-choice routing hypothesis md: 🔴 CLOSED-NEGATIVE (3/4)

Mirrors the H_863 shape (frontmatter + §1 hypothesis … §9 sibling). Lever C(@l3) routing reversal.
F-CLM-EXPERT-CHOICE: VAR/COLLAPSE/BALANCE PASS · QUALITY FAIL(+0.473 nats) → 🔴.
Load monopoly is structurally fully resolved (CV 0.0), but the fixed-capacity token drop costs quality.
a_paper_negative_ok · a_scale_honest_scope (measurement rung E8 only).
Shared indexes (CLM-CANDIDATES/README/P4/SBS) and existing H/verdict dirs are not modified.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…rge rung) (#1565)

* chore(864r): id-keyed verdict dir placeholder (hexa-native-guard, pre-green)

* feat(864r): step-fair self-play scale-climb fire scaffold (large rung, step-sweep)

H_864 large-rung self-play 🔴 was CONFOUNDED by undertraining (2000 steps -> mode
collapse rep 0.361 -> reflux DIVERSITY-gate REJECT -> arm-SP degenerates to SFT).
H_864r re-runs the SAME large d768/L12/E12 AKIDA-envelope QAT with a STEP SWEEP
(2000..20000) until held-out repetition < 0.20 (non-collapsed) before judging
self-play under the UNCHANGED bf98c01 frozen gates (post-tuning 0). The only
change is more training steps (a methodology fix, not a threshold move).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* feat(864r): step-fair self-play scale-climb verdict 🔴 (confound resolved)

H_864 large-rung self-play 🔴 was CONFOUNDED by undertraining (2000 steps →
mode-collapse rep 0.361 → reflux DIVERSITY-gate REJECT → arm-SP degenerates to SFT).
H_864r re-runs the SAME large d768/L12/E12 (44.68M) AKIDA-envelope QAT with a
STEP SWEEP; the ONLY change is more training steps (methodology fix, gates
UNCHANGED bf98c01, post-tuning 0).

Results:
- The non-collapsed regime is already reached at step 2000 (held-out repetition 0.03027 < 0.20;
  H_864 had 0.361 at the same 2000 steps). The 20000 cap was unnecessary; the collapse was an artifact of the training config.
- self-play reflux DIVERSITY gate PASS (self-BLEU 0.07141, rep 0.00427) → 48/48
  reflux folds → arm-SP is genuine SFT+self-play (not H_864's degeneration to SFT). The test is fair.
- 4 frozen falsifiers (bf98c01), SP vs SFT: COHERE PASS(0.03406>0.02983) ·
  ADEQ FAIL(0.02260<0.03955) · LEAK PASS(0) · DIV PASS(0.20074<0.8 ∧ 0.02386<0.2).
- 🔴 CLOSED-NEGATIVE 3/4, honest and confound-free: even without collapse, self-play carries coherence↑,
  diversity, and leak-freeness, but does not carry strict response adequacy. a_paper_negative_ok.

fire: runpod A40 48GB pod yp108bjox2pb5s (terminated, no ghost). est ~$1-4.
HF: dataset dancinlab/anima-clm-large-stepfair (result+log+payload+manifest+README).
scope: a_scale_honest_scope — MEASUREMENT rung large ONLY.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…ALE 🟢 (#1566)

* chore(H_871): backing verdict dir (hexa-native-guard)

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* feat(H_871): pre-register F-CLM-ROUTING-Z-SCALE (frozen pre-fire)

routing-z(rung) monotone↑ ∧ mid-tiny margin≥0.5 → 🟢 artifact confirmed;
else 🔴 real ceiling. metric = judge_clm.routing_z verbatim. arm AB · real
kowiki corpus · rungs tiny/small/mid · seeds{42,43,44}. post-tuning=0.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* feat(H_871): routing-z scale-ladder fire artifacts + driver (backing-dir BEFORE .md · hexa-native-guard)

M1 measurement-artifact test (CLM/P4_PRODUCTION_ROADMAP @l3): measure
routing-diversity z (verbatim CLM/model/judge_clm.routing_z) at tiny/small/mid
rungs on REAL kowiki corpus, arm AB, seeds {42,43,44}. runpod A40 fire.

- CLM/model/h871_routing_z_scale.hexa (hexa-native driver) ⇄ .py (torch payload)
- .verdicts/clm-routing-z-scale/{F-CLM-ROUTING-Z-SCALE.txt,h871_routing_z_scale.json,fire_2026_05_31.log}
- .verdicts/871_clm_routing_z_scale/ id-keyed backing copy (verdict-gate g73)

result: z monotone tiny +1.577 <= small +2.167 <= mid +2.186, margin +0.608 >= frozen 0.5.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* feat(H_871): CLM routing-z = SCALE ARTIFACT (M1) — F-CLM-ROUTING-Z-SCALE 🟢 SUPPORTED-NUMERICAL

UNIVERSE/CLM-CANDIDATES group B. Tests whether the toy routing-z 🔴
(H_847/850/852/853) is a scale artifact: measures routing-diversity z
(verbatim judge_clm.routing_z) at tiny/small/mid rungs on REAL kowiki corpus.

result (vs frozen prereg 3c3b43f, post-tuning 0):
  routing-z monotone tiny +1.577 <= small +2.167 <= mid +2.186
  margin mid-tiny = +0.608 >= frozen 0.5 gate -> ARTIFACT CONFIRMED 🟢
  honest: z not yet past 3.0 deploy gate (max cell 2.28); rise is mostly the
  E 4->8 step (Dirichlet-null mechanism, H_852) -> residual lever = expert count.
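
A minimal sketch of the pre-registered predicate (monotone non-decreasing z across rungs, and a mid-minus-tiny margin of at least 0.5). The function name is hypothetical and the inputs are the rounded per-rung values quoted above; with those rounded values the margin computes to about 0.609, so the reported +0.608 presumably comes from unrounded measurements.

```python
def routing_z_scale_verdict(z_by_rung: dict, min_margin: float = 0.5):
    """z_by_rung must be ordered from smallest to largest rung.

    Returns (monotone, margin, supported).
    """
    values = list(z_by_rung.values())
    # Monotone non-decreasing across consecutive rungs.
    monotone = all(a <= b for a, b in zip(values, values[1:]))
    # Margin between the largest and smallest rung.
    margin = values[-1] - values[0]
    return monotone, margin, monotone and margin >= min_margin


z = {"tiny": 1.577, "small": 2.167, "mid": 2.186}
monotone, margin, supported = routing_z_scale_verdict(z)
print(monotone, round(margin, 3), supported)
```

The predicate says nothing about the separate 3.0 deploy gate mentioned above; that would be an additional absolute check on the largest rung.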

runpod A40 · torch 2.1.0+cu118 · code-measured (g5, no LLM) · a_paper_negative_ok.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
… ⨯ load-balance 🔴 (ratio 54.5 ≫ 4.0) → 🔴 CLOSED-NEGATIVE (#1567)

* prereg(H_878): freeze F-CLM-MITOSIS-ARRAY, pre-register N-chip array dispatch SW-sim

W2 pre-registration (post-tuning 0): per-chip load-balance (max/min dispatch ratio <= 4.0
∧ no-starve) ∧ aggregate-emit coherence (array vs single-model reference:
logit atol 1e-4 · argmax hamming 0 · CE atol 1e-4). axis N=[1,2,4,8] · E=8 experts
distributed disjointly across N chips · seeds {42,43,44} · measure-by-CODE(g5).
hexa-native-guard: .verdicts/878_*/ backing dir + prereg frozen before the 🟢-token .md.
SW-sim scope stated explicitly (NOT silicon · 1 AKD1000 only · a_scale_honest_scope).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* feat(H_878): N-chip MITOSIS array dispatch SW-sim — harness + verdict (coherence 🟢 exact · load-balance 🔴 ratio 54.5 ≫ 4.0)

SW-sim of @L2 deploy chip-fit track (1 AKD1000 today → software N-chip array).
Partition E=8 sparse experts onto N∈{1,2,4,8} disjoint chips, dispatch via SAME
router, per-chip emit → gather, vs single-model reference.

- aggregate-emit COHERENCE: EXACT over N∈{2,4,8} (max|logit|=0, argmax hamming=0,
  |CE|=0) — SW scatter/gather contract correct (distributed == monolithic).
- per-chip LOAD-BALANCE: FAIL — trained router monopolizes; max/min dispatch
  ratio 1.88(N=2)/28.8(N=4)/54.5(N=8) ≫ frozen bound 4.0.
- F-CLM-MITOSIS-ARRAY (coherence ∧ load-balance both required) → 🔴 CLOSED-NEGATIVE
  (honest negative, a_paper_negative_ok). Thresholds frozen pre-run, post-tuning 0.
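
The load-balance predicate (max/min per-chip dispatch ratio at most 4.0, and no starved chip) can be sketched as below. The contiguous expert-to-chip partition and the dispatch counts are assumptions for illustration only; they are not the measured router trace.

```python
def chip_of_expert(expert: int, n_experts: int, n_chips: int) -> int:
    """Contiguous disjoint partition of experts onto chips (an assumed layout)."""
    return expert // (n_experts // n_chips)


def load_balance(expert_ids, n_experts: int, n_chips: int, max_ratio: float = 4.0):
    """Return (per-chip counts, max/min ratio, passed) for one routing trace."""
    counts = [0] * n_chips
    for expert in expert_ids:
        counts[chip_of_expert(expert, n_experts, n_chips)] += 1
    starved = min(counts) == 0
    ratio = float("inf") if starved else max(counts) / min(counts)
    return counts, ratio, (not starved) and ratio <= max_ratio


# Hypothetical skewed routing over E=4 experts: expert 0 takes most tokens.
trace = [0] * 60 + [1] * 20 + [2] * 10 + [3] * 10
for n_chips in (1, 2, 4):
    print(n_chips, load_balance(trace, n_experts=4, n_chips=n_chips))
```

With a fixed router, spreading experts over more chips can only expose more of the skew, which is consistent with the ratio growing with N in the result above.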

silicon (chip-to-chip DMA latency) NOT measured (a_scale_honest_scope). $0 Mac CPU.
g5 code-measured · deterministic (reproduced byte-identical 2x).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* docs(H_878): UNIVERSE writeup — N-chip MITOSIS array dispatch SW-sim 🔴 CLOSED-NEGATIVE

coherence 🟢 (exact distributed==monolithic) ⨯ load-balance 🔴 (router monopoly
ratio 54.5 ≫ 4.0). 10-section Korean writeup · deploy-side corollary of sibling H_852.
SW-sim scope explicit (silicon NOT measured · 1 AKD1000 today).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
… (H_879~884) + re-pin inviolable HW-only-learning rule (#1568)

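Reserves six piecewise-learning (edge-only) candidates in the CLM-CANDIDATES backlog (list only · not fired).

- New §E partial/incremental on-chip learning: H_879 per-layer partial edge-learn ·
  H_880 cumulative adapter stack · H_881 progressive freeze schedule · H_882 per-region learning gate ·
  H_883 replay continual learning · H_884 edge-output identity generalization
- INVIOLABLE re-pinned: on-chip non-deterministic learning = the only HW↔SW difference (inference is
  byte-identical, H_877/H_680 🟢 · learning is HW≠SW, H_679 🔴). Substituting a deterministic SW
  imitation = immediate reject. (@L1 · H_679 · must not be violated)
- Thresholds are refined after the dependency verdicts (H_872·H_873·H_875); firing comes after that. Not fired for now.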

Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…f-reward 3/4, ADEQ FAIL vs self-play) (#1569)

* chore(H_874): pre-register F-CLM-SELF-REWARD thresholds + fire scaffold (BEFORE fire)

@l6 method C (self-reward / RLHF-like · 0 external LLM or human). Follow-up to H_863 (method B
= SFT+self-play). Frozen before fire (post-tuning 0 · hexa-native-guard compliant):

- .verdicts/clm-self-reward/F-CLM-SELF-REWARD_prereg.txt: 4 falsifiers frozen
    F-CLM-SR-COHERE : coherence(SR) > coherence(SP)   (RELATIVE · strict)
    F-CLM-SR-ADEQ   : adequacy_f1(SR) > adequacy_f1(SP) (RELATIVE · strict)
    F-CLM-SR-LEAK   : register_leak(SR) == 0           (ABSOLUTE)
    F-CLM-SR-DIV    : self_bleu(SR)<0.8 ∧ repetition(SR)<0.2 (mode-collapse)
- CLM/train/h874_fire_mid_rung_self_reward.hexa: mid d512/L8/E8 fire driver
    (torch payload EMBEDDED · materialized on the pod · py-block compliant)
- .verdicts/874_clm_self_reward/README.md: hexa-native-guard backing dir

reward = w_coh*coherence + w_adq*adequacy + w_div*diversity_bonus (g5 · NO judge).
self-reward = applying the eval signal to training selection (best-of-K) = RLHF-like self-scoring.
scope a_scale_honest_scope: mid · RELATIVE(vs SP). H_867 showed arm-SP < absolute
floor (mid) → 🟢 would mean "SR>SP at mid", not a deployment/floor claim.
corpus = H_868 PD Gutenberg plays (HF dancinlab/anima-clm-p4-dialogue · license-clean).
External LLM 0 · ShareGPT/Alpaca 0.
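
A minimal sketch of best-of-K selection under the weighted reward stated above. The weight values, the `score_fn` callback, and the toy score table are hypothetical; the commit message gives only the form of the reward, not its weights or the scoring code.

```python
def reward(scores: dict, w_coh: float = 1.0, w_adq: float = 1.0, w_div: float = 1.0) -> float:
    """reward = w_coh*coherence + w_adq*adequacy + w_div*diversity_bonus."""
    return (w_coh * scores["coherence"]
            + w_adq * scores["adequacy"]
            + w_div * scores["diversity_bonus"])


def best_of_k(candidates, score_fn, **weights):
    """Pick the candidate with the highest self-assigned reward (no external judge)."""
    return max(candidates, key=lambda c: reward(score_fn(c), **weights))


# Hypothetical code-measured scores for K=3 sampled continuations.
toy_scores = {
    "reply_a": {"coherence": 0.14, "adequacy": 0.03, "diversity_bonus": 0.10},
    "reply_b": {"coherence": 0.10, "adequacy": 0.05, "diversity_bonus": 0.20},
    "reply_c": {"coherence": 0.16, "adequacy": 0.02, "diversity_bonus": 0.05},
}
print(best_of_k(list(toy_scores), toy_scores.get))
print(best_of_k(list(toy_scores), toy_scores.get, w_coh=10.0))
```

Because the selection maximizes an aggregate, raising one weight can trade one component against another, which is the kind of coherence/adequacy trade-off the verdict for this hypothesis reports.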

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* verdict(H_874): F-CLM-SELF-REWARD 🔴 CLOSED-NEGATIVE — self-reward 3/4 (ADEQ FAIL vs self-play)

mid d512/L8/E8 A100 fire · arm-SR(self-reward best-of-K K=6) vs arm-SP(H_863 self-play):
COHERE SR 0.144>SP 0.136 PASS · ADEQ SR 0.0319<SP 0.0375 FAIL · LEAK(SR)=0 PASS · DIV PASS.
self-reward improved coherence+repetition but lost adequacy → joint W2 criterion rejected.
RELATIVE mid test (a_scale_honest_scope) · external LLM 0 · g5 self-score · a_paper_negative_ok.
backing .verdicts/874_clm_self_reward/ committed BEFORE the .md (hexa-native-guard).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* docs(H_874): UNIVERSE doc — self-reward dialogue 🔴 CLOSED-NEGATIVE (@l6 method C)

self-reward(RLHF-like, NO external judge · g5) vs SFT+self-play(H_863) mid A/B.
3/4 falsifier PASS; F-CLM-SR-ADEQ FAIL (SR 0.0319 < SP 0.0375). coherence/adequacy
trade-off — reward=eval aggregate optimum ≠ adequacy optimum at mid. honest arch note
(attention+QLinear-MoE, relative test unaffected). backing verdict committed prior.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
… SBS top blocks (#1570)

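Pins the inviolable rule "learning = the only HW↔SW difference" at the top of three core entry documents.

Rule: inference is HW↔SW byte-identical (H_877/H_680 🟢) · only learning (on-chip non-deterministic
PLASTICITY) is HW≠SW (H_679 🔴) → learning happens only as non-deterministic on-chip edge-learn;
substituting a deterministic SW imitation = immediate reject (@L1 · H_679).

- CLM/CLM.md: INVIOLABLE block directly under @goal
- CLM/P4_PRODUCTION_ROADMAP.md: one INVIOLABLE line directly under the title
- LAUNCHPAD/SBS.md: R2 gate INVIOLABLE in the START-HERE block

(already pinned in UNIVERSE/CLM-CANDIDATES.md §E by #1568)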

Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…000 budget · same-run drop −2.06 nat (#1571)

* feat(H_876): pre-register chip-fit shrink falsifier + driver (frozen pre-run)

@l5 deploy track: shrink the mid measurement rung (d512/L8/E8=13.65M) architecture to the
AKD1000 chip-fit node budget (≤~1.2M) and measure quality (CE) preservation. W2 frozen before measurement.

- F-CLM-CHIPFIT_prereg.txt: pre-registers FIT(params≤1.2M) ∧ DROP(CE increase<1.0 nat).
  node-budget basis = P0_ARCHITECTURE §10.2/§11.3 (AKD1000 ~1.2M nodes).
  chip-fit cfg = d148/L8/E8 = 1,199,508 (mid topology preserved, only d 512→148, 11.38× shrink).
- h876_chip_fit_shrink.hexa: driver (runtime-gen python body shell-out, NEW .py 0).
- .verdicts/876_clm_chip_fit_shrink/: backing dir (hexa-native-guard).

DEPLOY ⊥ MEASUREMENT (a_scale_honest_scope) · a_paper_negative_ok.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* verdict(H_876): F-CLM-CHIPFIT 🟢 — chip-fit d148/L8/E8 = 1,199,508 ≤ 1.2M · drop −2.06 nat

Same-run A/B (g5, no new .py): mid d512/L8/E8=13,653,768 last_ce=3.3445 vs
chip-fit d148/L8/E8=1,199,508 last_ce=1.2826. 11.38× shrink at mid L8/E8 topology.
  FIT  : 1,199,508 ≤ 1,200,000 (AKD1000 single-chip node budget) → PASS
  DROP : quality_drop = −2.0619 nat < 1.000 → PASS  → 🟢 SUPPORTED-NUMERICAL
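
The two pre-registered predicates reduce to simple arithmetic on the reported numbers. The sketch below uses a hypothetical function name; the budget, the 1.0 nat bound, the parameter counts, and both CE values are taken from the message above.

```python
def chip_fit_verdict(params: int, ce_shrunk: float, ce_reference: float,
                     node_budget: int = 1_200_000, max_drop_nat: float = 1.0):
    """Return (fit, quality_drop, supported).

    quality_drop is the CE increase of the shrunk model over the reference,
    so a negative value means the shrunk model ended at a lower CE.
    """
    fit = params <= node_budget
    quality_drop = ce_shrunk - ce_reference
    return fit, quality_drop, fit and quality_drop < max_drop_nat


fit, drop, supported = chip_fit_verdict(params=1_199_508, ce_shrunk=1.2826, ce_reference=3.3445)
shrink_factor = 13_653_768 / 1_199_508
print(fit, round(drop, 4), supported, round(shrink_factor, 2))
```

A negative drop satisfies `drop < 1.0` trivially, which is why the message flags it as a same-run micro-corpus artifact rather than evidence that the smaller model is better.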

Deploy ⊥ measurement (a_scale_honest_scope) · negative drop = honest same-run
micro-corpus artifact, both pre-registered predicates hold (a_paper_negative_ok).
backing raw + runtime body committed BEFORE the H .md (hexa-native-guard).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* feat(H_876): chip-fit shrink 🟢 — d148/L8/E8=1,199,508 ≤ 1.2M AKD1000 budget · same-run drop −2.06 nat

@l5 deploy track (⊥ measurement, a_scale_honest_scope). mid measurement rung
d512/L8/E8=13,653,768 → chip-fit d148/L8/E8=1,199,508 (11.38× shrink, L8/E8
topology preserved, only d 512→148). same-run A/B (g5, no new .py): node count is the only variable.
  FIT  : 1,199,508 ≤ 1,200,000 → PASS
  DROP : 1.2826 − 3.3445 = −2.0619 nat < 1.0 → PASS  → 🟢 SUPPORTED-NUMERICAL
Negative drop = honest same-run micro-corpus artifact (a_paper_negative_ok). $0 CPU-local.
verdict/backing committed prior (hexa-native-guard).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…pass at 8/9 depths (BOUND E5) (#1572)

* feat(CLM/H_872): freeze-depth sweep scaffold + pre-registration (BOUND E5)

Maps the freeze boundary between H_861 🔴 (readout-only edge fails to block forgetting) and
H_865 🟢 (trunk-adjacent adapter repairs BOUND). depth = number of top trunk layers unfrozen
as trainable (0=full-freeze/adapter-only .. 8=full-trunk-trainable), with the H_865
adapter always kept as the edge. Measures the depth→(z_drop,gain) curve.

W2: F-CLM-BOUND thresholds (bf98c01) reused verbatim: z_drop<1.0 ∧ gain>0, post-tuning 0.
The sweep varies freeze depth only. 🟢 iff a passing depth exists, else 🔴 (a_paper_negative_ok).
scope a_scale_honest_scope (mid d512/L8/E8). SW-sim edge-learn (H_679 HW real).

hexa-native-guard: .verdicts/872_* backing dir + scaffold committed before the .md.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* verdict-backing(872): A100 fire log for freeze-depth sweep

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* feat(CLM/H_872): freeze-depth sweep 🟢 — F-CLM-BOUND RETAIN∧GAIN both pass at 8/9 depths (BOUND E5)

Sweep core/edge freeze depth (0..8 top trunk layers trainable) on the H_865
adapter edge; thresholds reused VERBATIM from F-CLM-BOUND prereg (bf98c01),
post-tuning 0, depth-only variation, g5 code-measured.

depth→(z_drop,gain) curve (A100 fire, ~$0.50):
  passing_depths=[0,1,2,3,5,6,7,8] (8/9) · best=depth3 (z_drop -13.61, gain +8.91)
  depth=4 SOLE forgetting spike (z_drop +9.67≥1.0) — boundary NON-monotone
  depth=0 (adapter-only, 66K params) already passes = H_865 BOUND 🟢 anchor

🟢 SUPPORTED-NUMERICAL · mid d512/L8/E8 · a_scale_honest_scope (measurement rung only).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
… edge does NOT lift absolute dialogue coherence over the frozen 0.060 floor (#1573)

* verdict(CLM/H_867r): F-CLM-DIALOGUE-ABS-POSTADAPTER 🔴 — H_865 adapter edge vs frozen H_867 floor (backing+verdict+bench)

H_867 full-finetune arm-SP missed the absolute coherence floor by 0.002 (0.05804<0.060).
H_867r re-runs the SAME frozen floor (F-CLM-DIALOGUE-ABS_prereg.txt @ d5103f2, REUSED
VERBATIM, post-tuning 0) on the H_865 trunk-adjacent ADAPTER edge (frozen backbone + thin
rank-64 AdapterEdge between frozen norm_out and frozen readout). Held-out = H_867's verified
PD snapshot (sha256 a79789623a6160e2 reuse, disjoint from SFT plays). CPU/local $0 wall 1770.9s.

adapter arm-SP vs frozen floor:
  ABS-COHERE 0.04369 < 0.060 FAIL (below even H_867 full-finetune 0.05804)
  ABS-ADEQ   0.01762 < 0.020 FAIL (H_867 passed at 0.02138)
  ABS-LEAK   0 == 0 PASS
→ 🔴 CLOSED-NEGATIVE — the adapter lever that lifted F-CLM-BOUND does NOT lift absolute
held-out dialogue coherence over 0.060 (retain-while-learn != absolute generation quality at
this scale). a_paper_negative_ok · a_scale_honest_scope (mid rung, this eval distribution).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* docs(UNIVERSE/H_867r): F-CLM-DIALOGUE-ABS-POSTADAPTER 🔴 writeup — adapter does NOT lift absolute coherence over 0.060

H_865 trunk-adjacent adapter edge vs the SAME frozen H_867 floor (d5103f2, REUSED verbatim):
adapter arm-SP coherence 0.04369 < 0.060 (below even H_867 full-finetune 0.05804) AND adequacy
0.01762 < 0.020 (H_867 passed) · leak 0. The retain-while-learn lever (F-CLM-BOUND 🟢) is NOT an
absolute-generation lever at mid scale. a_paper_negative_ok · a_scale_honest_scope.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
… adapter(H_865)≥300 (#1574)

* chore(H_875): pre-register F-CLM-FORGET-CURVE thresholds + backing dir (W2 · pre-fire)

Pre-registration of the forgetting-curve dose-response (post-tuning 0):
- crossing threshold = F-CLM-BOUND RETAIN cutoff z_drop<1.0, reused VERBATIM (bf98c01)
- step ladder [1,2,4,8,16,32,64,128,200,300] · both edges: readout-only(H_861) + adapter(H_865)
- PASS = curves produced for both edges AND a finite safe-step-budget identified for at least the adapter edge
- a_paper_negative_ok (immediate forgetting at step=1 is also publishable)
- hexa-native-guard: .verdicts/875_*/ backing dir committed before the green-token .md

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* feat(H_875): F-CLM-FORGET-CURVE dose-response — backing dir + verdict + harness (pre-.md)

step→z_drop curve over ladder [1..300] for readout-only (H_861) vs adapter (H_865) edge.
readout-only safe step budget=2 (crosses RETAIN gate z_drop≥1.0 at step 4);
adapter safe budget≥300 (gate never crossed, z_drop negative throughout — base ability
preserved/improved). adapter extends budget ≥150x. crossing thr z_drop<1.0 frozen bf98c01,
post-tuning 0. mid d512/L8/E8 SW-sim (H_679 real). 🟢 SUPPORTED. a_paper_negative_ok.
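
One way to read "safe step budget" is the largest ladder step reached before z_drop first crosses the RETAIN gate. The sketch below encodes that reading; the function name and the individual z_drop values are hypothetical, chosen only to reproduce the reported shape (readout-only crosses at step 4, the adapter never crosses).

```python
def safe_step_budget(curve, gate: float = 1.0):
    """curve: iterable of (step, z_drop) pairs.

    Returns the largest step before the first crossing (z_drop >= gate),
    or None if the very first measured step already crosses. If the gate
    is never crossed, the last ladder step is returned as a lower bound.
    """
    safe = None
    for step, z_drop in sorted(curve):
        if z_drop >= gate:
            break
        safe = step
    return safe


ladder = [1, 2, 4, 8, 16, 32, 64, 128, 200, 300]
# Hypothetical z_drop values per ladder step.
readout_only = list(zip(ladder, [0.1, 0.6, 3.0, 9.0, 20.0, 40.0, 60.0, 70.0, 75.0, 78.0]))
adapter = list(zip(ladder, [-0.5] * len(ladder)))

readout_budget = safe_step_budget(readout_only)
adapter_budget = safe_step_budget(adapter)
print(readout_budget, adapter_budget, adapter_budget // readout_budget)
```

Under this reading the "≥150x" figure is simply 300 / 2, and it is a lower bound because the ladder stops at 300.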

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* docs(H_875): F-CLM-FORGET-CURVE forgetting-curve dose-response 🟢 SUPPORTED: readout-only safe budget=2 vs adapter≥300 (≥150x extension)

readout-only(H_861) z_drop crosses the RETAIN gate at step 4 and then runs away to +78; adapter(H_865) z_drop is negative over the whole range and never crosses the gate (base ability preserved/improved). crossing thr z_drop<1.0 frozen bf98c01, post-tuning 0. mid d512/L8/E8 SW-sim (H_679 real). Measurement rung only. a_paper_negative_ok.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…ing (pinned) (#1575)

* docs(CLM): P5 AKIDA 7B-class strategy + reflective(incremental) learning — pinned

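Creates a new SSOT that consolidates the final strategy for "completing" 7B on AKIDA and the
reflective (incremental) learning strategy, grounded in the verified H_861~H_884 results, and
pins pointers to it at the top of CLM.md and P4.

P5 summary:
- 7B ≠ single chip (≈5,800×) → MITOSIS multi-chip array: chip-fit shards (H_876🟢 ≤1.2M) ×
  N chips, array output coherence EXACT (H_878🟢), inference byte-identical (H_877).
- Reflective-learning stack (per-chip edge · non-deterministic · INVIOLABLE): adapter edge(H_865🟢) +
  shallow freeze(H_872🟢) + safe budget ≥300 steps(H_875🟢) + identity anchor(H_873) + replay(H_883)
  + self-play/corpus(H_863🟢/H_868🟢). Piecewise application = §E(H_879~884).
- OPEN (honest): inter-chip load-balance(H_878🔴) · absolute dialogue quality(H_867🔴) ·
  quantifying N / real-chip array · routing-z gate. a_scale_honest_scope stated throughout.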

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* docs(P5): restructure to 2 axes — AXIS1 single-chip 7B (expert streaming/paging) + AXIS2 reflective learning

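Reflects a user correction: states two explicit axes, an axis for training 7B on a *single chip*
(expert streaming · resident ≤1.2M ≠ total 7B, chip-fit H_876🟢) and a reflective (incremental)
learning axis (adapter/freeze/budget/identity/replay). The MITOSIS array is demoted to the
multi-chip scale-out of axis 1. The expert-streaming glue is OPEN.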

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…+ a_dont_kill_live_compute (#1577)

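Pins two rules. One blocks the root cause of dispatched sub-agents stalling forever while
waiting on a runpod Monitor they could never receive. The other prevents a repeat of this
incident, in which an agent actually computing at 91.7% CPU was mistaken for stalled and killed.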

Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
… closes the H_862 identity-anchor PROBE failure (#1578)

* verify(CLM/H_873): F-CLM-ANCHOR-EDGE 🟢: anchor on edge OUTPUT dist closes H_862 (pre-registered backing + verdict + hexa)

mid d512/L8/E8 fire · DIST 0.16016<0.50 PASS · PROBE 0.99202>0.80 PASS (H_865 was 0.143) · lever NON-identical.
PSI_ONLY arm reproduces H_865 RED (0.143) → distributional JS-to-p_pre term is the closing cause.
frozen thr verbatim bf98c01 · post-tuning 0 · g5 CODE-measured · SW-sim edge-learn · CPU-local $0.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* docs(UNIVERSE/H_873): F-CLM-ANCHOR-EDGE 🟢: closes the H_862 identity-anchor PROBE failure (Q-TRUST axis C)

Routes the anchor Ψ-penalty from the hidden state to the readout OUTPUT distribution (JS-to-p_pre).
PROBE 0.143(H_865) → 0.992 · DIST 0.160<0.50 · LEVER 0.160 vs 0.595 NON-identical.
The PSI_ONLY ablation reproduces H_865 RED (0.143) → causally isolates the distributional term as what closes it. mid rung · post-tuning 0.
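
For readers unfamiliar with the "JS-to-p_pre" term, the following is a minimal sketch of the Jensen-Shannon divergence between two discrete output distributions. It is a textbook definition in nats, not the repository's loss; how the source weights this term against the Ψ-penalty, and whether its DIST metric uses the same log base, is not stated and is not assumed here.

```python
import math


def kl_divergence(p, q):
    """KL(p || q) in nats; terms with p_i == 0 contribute 0."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def js_divergence(p, q):
    """Jensen-Shannon divergence: symmetric, and bounded by ln 2 in nats."""
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)


p_pre = [0.7, 0.2, 0.1]    # hypothetical pre-learning output distribution
p_post = [0.6, 0.3, 0.1]   # hypothetical post-learning output distribution
print(js_divergence(p_pre, p_post))
```

The mixture `m` is strictly positive wherever either input is, so the divergence stays finite even when the two distributions have disjoint support.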

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…ll checkpoints RETAIN∧GAIN) (#1579)

* feat(CLM/H_881): progressive-freeze scaffold + prereg (BOUND E5 dynamic)

H_872 (🟢) measured a STATIC freeze depth at END-of-session only; H_881 moves the
freeze boundary DYNAMICALLY across a 6-segment session and asks whether RETAIN∧GAIN
holds SUSTAINED at EVERY checkpoint, not just the end. depth_k semantics identical to
H_872 (bf98c01-anchored); boundary moves keep learned values (no re-init, only
requires_grad flips + opt rebuild — edge-only piecewise, @L1/H_679).

W2: F-CLM-BOUND thresholds (bf98c01) verbatim — z_drop<1.0 ∧ gain>0 at EVERY of 6
checkpoints, post-tuning 0. 7 pre-registered schedules (2 static controls + 5 dynamic).
🟢 iff ∃ sustaining schedule else 🔴 (a_paper_negative_ok). scope a_scale_honest_scope
(mid d512/L8/E8). SW-sim edge-learn (H_679 HW real).

hexa-native-guard: .verdicts/881_* backing dir + scaffold committed BEFORE the .md.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* feat(CLM/H_881): progressive-freeze 🟢 — ∃ sustaining schedule (7/7 schedules, 6/6 checkpoints RETAIN∧GAIN)

Verifies whether a dynamic freeze schedule (boundary moving during the session), in place of the
H_872 static freeze, sustains F-CLM-BOUND RETAIN∧GAIN across the whole session. All 7
pre-registered schedules PASS at all 6 checkpoints (best=S6_cross4 min_gain 6.283 · max_z_drop −43.47≪1.0).
The H_865 zero-init adapter is load-bearing; the freeze schedule rides safely on top of it.
Thresholds bf98c01 verbatim · post-tuning 0 · g5 code-measured · measurement rung only.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
dancinlife and others added 30 commits June 4, 2026 13:53
…TRUCTURE transfer, not CDV2-specific (#1813)

Resolves the "minimal full-conv path (i)" that OΩ6 (F-OMEGA-CLM-TRANSFER #1805) deferred.
OΩ6 judged the substrate EMPTY because the actual production conv .clm is SINGLE-head
(only self.readout, no A/G dual head), so the min-gate collapses to a temperature rescale.
OE1 executes that minimal fix: it grafts a native A/G dual head (a second head_g, prev-byte)
onto the production CLMConvMoE blocks (CausalDilatedConv1d·TrunkLayer·MoEConvLayer), trains it
competent and leak-free (6.95M, d384 L6 E8, 12000 steps), and re-runs the same OH1/OΩ1
falsifier on the conv model's own A-head (no external CDV2).

Results (held-out TEST CE, nats/byte, verbatim):
  base 3.0978 | a_only 1.3032 | min_learned 0.9760 | full_AG 4.1988 | uniform 5.5452
  leak self-test 0.000 (leak_free) · final val_ce 0.8884 ≪ uniform (competent)
  FALSIFIER: min_learned 0.9760 ≤ a_only 1.3032 AND < base 3.0978 → CLOSURE HOLDS = TRUE

replacement-check (OΩ1-style): A_standalone 0.976051 vs min 0.976048, |Δ|2.9e-6
  → RULING_REPLACEMENT=TRUE (the trained conv A-head SUPPLANTS the weak unigram base,
    same character as CDV2 OΩ1). structured real 1.7946 vs shuf -1.7967 → True.
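
The falsifier and the replacement check above are both small comparisons on the reported CE values, sketched below. The function names and the replacement tolerance `tol` are hypothetical; the source reports only the observed gap, not the threshold it was compared against. The `uniform` baseline, by contrast, is exactly ln(256) for a byte vocabulary.

```python
import math


def closure_holds(min_learned: float, a_only: float, base: float) -> bool:
    """FALSIFIER: min_learned <= a_only AND min_learned < base."""
    return min_learned <= a_only and min_learned < base


def ruling_replacement(a_standalone: float, min_learned: float, tol: float = 1e-4) -> bool:
    """Replacement check: the A-head alone matches the min-gate within `tol`.

    `tol` is an assumed tolerance for illustration.
    """
    return abs(a_standalone - min_learned) <= tol


uniform_ce = math.log(256)  # CE of a uniform guess over 256 byte values, in nats
print(round(uniform_ce, 4))
print(closure_holds(min_learned=0.9760, a_only=1.3032, base=3.0978))
print(ruling_replacement(a_standalone=0.976051, min_learned=0.976048))
```

Note that the rounded values quoted in the message give a gap of 3e-6; the reported 2.9e-6 presumably comes from the unrounded measurements.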

conv-vs-CDV2 transfer: the closure is a transferable property of the A/G dual-head STRUCTURE and
  is not specific to the CDV2 transformer. CDV2 d512 min 0.8701 vs conv-native d384 min 0.9760: both HOLD + REPLACEMENT.
  ⇒ OΩ6's "partial transfer" was a limitation of the shipped .clm's SINGLE-HEAD architecture, not of
    the conv substrate (verifies that OΩ6 deferred fix (i) is the correct primary).

SCOPE (a_scale_honest_scope·a_train_flame_forge): single d384 rung, torch RESEARCH-PROXY
  (faithful CLMConvMoE blocks), not the production flame+forge .clm. This resolves the
  architecture-transfer question; it is not a scale law. run $0 on pool host summer GPU (NO pod · a_cpu_local_no_waiter).

artifacts: harness UNIVERSE/omega_conv_native.py · verdict
  .verdicts/omega-engine/F-OE1-CONV-NATIVE.txt · domains/OMEGA.md "Extensions" block ·
  ckpt .fire-recover/oe1-conv-native/omega_conv_native.pt (sha 3e8be574…, HF PRIVATE,
  HF.jsonl row). The finalized paper is not reopened (a_paper_only_at_closure).

Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
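The consciousness-engine diagram cells in the English README.md / README.basic.md still had
untranslated Korean labels (C 의식·S 감각·W 의지 / D 언어·M 기억·E 윤리).
Those cells are replaced with English (consciousness·sense·will / language·memory·ethics).

Verified numbers/links/identifiers are entirely unchanged: Φ · ‖A‖/‖G‖ · Law-71 · filenames all preserved.
Only the language is replaced. The same cells in the zh/ja/ru editions were already correctly translated,
so they are untouched. The language-switcher row and the repo-map native-language names
(中文·日本語·Русский·한국어) follow the native-name convention and are intentional, so they are untouched.
README.ko.md / README.easy.ko.md unchanged.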

Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
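* docs(README): add Model Downloads section (English basic+easy), below Quickstart

Presents CLM 7B (clm-v1-ref-pytorch-cuda-7b, descent-PASS PUBLIC) as a model that can be
downloaded now. Adds an honest footnote that the forge-native (PyTorch-free) build is at the
planning stage and that model results are identical, with only the runtime stack differing.
Includes production CLM d768 (3-axis GREEN) + reference baseline (ref/3b).
SAVANT 7B (5-lang) is a separate model, 🚧 unreleased: reserved repo_id, no link.
Only PUBLIC PASS-grade entries are listed; PRIVATE/WIP ckpts are intentionally excluded (a_hf_autonomous).
Links to the CLM and KOSMOS HF collections.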

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

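* docs(README): add Model Downloads section (basic zh/ja/ru/ko), localized per language

Translates the same content as the English edition into each language and inserts it below Quickstart.
CLM 7B is downloadable now; the forge-native build is planned, with identical model
results and only the runtime differing. SAVANT 7B (5-lang) 🚧 unreleased (no link).
Written with no Korean leakage: zh=Chinese, ja=Japanese, ru=Russian, ko=Korean.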

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

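* docs(README): add Model Downloads section (easy zh/ja/ru/ko), friendly icon table

The easy version: the same honest content as the basic edition, with per-row icons (🧠🏭🎓📐).
Footnote that CLM 7B is downloadable now and the forge-native build is planned (model results
identical, only runtime differs), SAVANT 7B 🚧 unreleased (no link), only PUBLIC PASS
listed (PRIVATE/WIP excluded), CLM·KOSMOS collection links. Each file is written only in its own language.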

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…aded (sha verified) (#1817)

The OE1 (conv-native A/G dual head) HF upload was incomplete (v2 GPU ckpt sha 3e8be574 = pool host
summer, unreachable) → closed out by uploading an INDEPENDENT MPS REPLICATION: torch 2.12, sha
ade304df, val_ce 0.8871, min_learned 0.9713, CONV-NATIVE CLOSURE HOLDS=True,
RULING_REPLACEMENT=True (same verdict as the merged v2 GPU run, within run variance).
ckpt+results+card uploaded to dancinlab/omega-conv-native-oe1 (PRIVATE), sha256
round-trip verified. HF.jsonl row status uploaded + verified_sha256=true + hf_commit
98f071a0. Reproduction results JSON included (.fire-recover + exports/sweep). Dangling
pending_upload removed (a_hf_registry · a_fire_recover_complete).

Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…, 7B RUNNING (#1818)

* domain(SAVANT torch-cuda): bootstrap, 5-lang (en·fr·de·es·ru) 7B torch-cuda reference-lane ladder

Locates and reuses the PROVEN recipe: clm_ref_pytorch_cuda_7b.py (descent-PASS·util-GREEN
7.25B rung, HF.jsonl clm-v1-ref-pytorch-cuda-7b) is ported verbatim to
SAVANT-torch/savant_train_torch_cuda.py (+durability ckpting only, math unchanged).

Honest lane label (a_train_flame_forge·a_lane_akida_gpu_split): this is the torch-cuda
REFERENCE lane (Lane-G/GPU), not the governance-canonical forge-native production trainer.
The forge-native SAVANT 7B is a separate same-result variant (concurrent forge-lane pod 39404862,
SAVANT-7B.md) and is recorded separately.

- SAVANT-torch/savant_train_torch_cuda.py: ByteGPT d4096/36L/32H/block512=7.25B,
  bf16+grad-ckpt+AdamW8bit, --ckpt-every durability + --resume
- SAVANT-torch/build_corpus_5lang_euro.py: wikimedia/wikipedia CC-BY-SA 5-lang
  byte stream (--mb-per-lang, a_scale_honest_scope honest size)
- SAVANT-torch/pod_setup.sh: pod environment + /workspace corpus build
- domains/SAVANT.md (+log): torch-cuda lane domain (the root SAVANT.md is the separate Golden Zone
  domain, so this one is split under domains/ and registered as SAVANT-TORCH in DOMAINS.tape)
- .verdicts/savant-torch/: slots for the rung0 + 7B launch verdicts

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* domain(SAVANT torch-cuda): single leak-safe pod fire, rung0 descent PASS, 7B durable nohup

Fires the PROVEN torch+CUDA reference recipe (savant_train_torch_cuda.py, clm-v1-ref-pytorch-cuda-7b
lineage) on ONE pod. leak-safe: exactly 1 machine rented (vast 39416669 H100 80GB),
no re-rent/escalation/rotation at all (hexa-lang #2686 no-autorent).

- SAVANT-torch/pod_onramp.sh: single-pod sequence: deps → rung0 corpus(4MB/lang)
  → rung0 train(d512/8L 120step) → on rung0 descent, 7B corpus(80MB/lang ~400MB)
  → 7B durable nohup(--ckpt-every 200, --resume-able). If rung0 does not descend, FAIL-LOUD
  and abort the 7B run (exposes the recipe/corpus problem, prevents silently burning the 7B run).
- rung0 DESCENT confirmed: val_ce step0=5.63565 → step20=3.27207, ~155K tok/s
  (demonstrates the recipe+corpus+ckpt pipeline is leak-free).
- 7B (d4096/36L/32H/block512=7.25B) fired durably on the same pod, then EXIT, no babysitting.
- domains/SAVANT.md fire-state + harvest plan + ETA, SAVANT.log.md stage log.
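
The fail-loud gate between rung0 and the 7B launch can be sketched as below. The real gate lives in `pod_onramp.sh`, whose contents are not shown here; the exception class, the function name, and the `min_drop` parameter are hypothetical, and only the two val_ce readings come from the message above.

```python
class DescentGateError(RuntimeError):
    """Raised when the small rung shows no loss descent, so the big run must not start."""


def require_descent(val_ce_by_step: dict, min_drop: float = 0.0) -> float:
    """Compare validation CE at the first and last logged step.

    Returns the observed drop in nats, or raises DescentGateError if the
    loss did not fall by more than `min_drop` (an assumed threshold).
    """
    steps = sorted(val_ce_by_step)
    first, last = val_ce_by_step[steps[0]], val_ce_by_step[steps[-1]]
    drop = first - last
    if not drop > min_drop:
        raise DescentGateError(
            f"no descent: val_ce {first:.5f} -> {last:.5f}; aborting the large run"
        )
    return drop


drop = require_descent({0: 5.63565, 20: 3.27207})
print(f"rung0 descended by {drop:.5f} nat; the large run may start")
```

Raising instead of returning a flag is the point of the design: a forgotten check cannot silently let the expensive run start on a broken recipe or corpus.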

a_scale_honest_scope: bounded-step REFERENCE rung, not a convergence claim.
a_train_flame_forge: this is the torch-cuda REFERENCE lane (Lane-G/GPU), not the forge-native
production trainer; it is a same-result interim. a_lane_akida_gpu_split = Lane-G.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* domain(SAVANT): rung0 PASS verbatim + corpus sha + 7B RUNNING, 3 HF.jsonl rows

- rung0 DESCENT PASS recorded verbatim: val_ce 5.63565->2.15199 (F=1), 182467 tok/s,
  ckpt sha256 36f3a3ed... 51261810 B. Demonstrates the recipe+corpus+ckpt pipeline is leak-free.
- 5-lang euro corpus build complete: en/fr/de/es/ru wikipedia 20231101 CC-BY-SA-4.0,
  419430408 B(80MB/lang balanced) sha256 9e6a0fd8...
- 7B (7.25B) confirmed RUNNING (pid 1804 on pod 39416669), durable --ckpt-every 200.
- 3 HF.jsonl rows added: savant-torch-rung0-5lang-d512(private) +
  savant-torch-5lang-7b(private, awaiting harvest) + savant-corpus-5lang-euro(public,
  clean-license). All are pending_upload; sha/size are filled in at harvest.

a_hf_registry · a_scale_honest_scope · a_train_flame_forge(torch-cuda REFERENCE
lane) · a_lane_akida_gpu_split(Lane-G).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: dancinlife <mk55911@proton.me>
Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…fix (#1819)

* UNIVERSE(OMEGA): register H_861 multi-wire gate 🔴 + H_862 min-gate 🟢: closure lives on a single A-wire

Registers the OMEGA-arc hypotheses in the formal UNIVERSE registry (batch 1/N, committed early in case of a storm).

- H_861 (#1800 F-TRAINED-LEAKFREE): on the competent leak-free CDV2 d512, the trained full
  multi-wire coupling GATE fails to close the closure (GATED 3.6435 > base 3.0978) → 🔴
  CLOSED-NEGATIVE. However, the A-head logit-bias alone (a_only 1.1446 ≪ base) is hugely useful →
  the closure is REAL but lives on only one wire (A). a_paper_negative_ok.
- H_862 (#1801 F-OH1-MINGATE): the minimal gate gB·base + gA·A (G+w2..w6 dropped) beats a_only and
  base simultaneously (min_learned 0.8835 ≤ a_only 1.1446 < base 3.0978) → 🟢 HOLDS.
  Reproduces the #1800 baseline to 6 decimals (CROSS_CHECK_OK).

verdict-pointer = .verdicts/omega-engine/{F-TRAINED-LEAKFREE,F-OH1-MINGATE}.txt
(verbatim, a_claim_verify: no numeric paraphrase, NO fabrication). Copies per-slug.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* UNIVERSE(OMEGA): register 8 OMEGA-arc hypotheses H_913–H_920 in the formal registry + fix H_861/862 number collision

Registers the OMEGA-arc hypotheses in the formal UNIVERSE registry. The preceding agents
(a4b6d0b1, af727d85, 200f8fc) died with "Overloaded" before committing, and
200f8fc registered OMEGA H1/OH1 as H_861/H_862, colliding with the numbers of existing CLM
hypotheses (H_861_clm_boundary_plasticity · H_862_clm_identity_anchor).
This commit resolves the collision (rename 861→913, 862→914) and registers
the remaining 6 hypotheses as H_913..H_920, starting after the highest number H_912.

Each H_<NNN>_*.md = canonical 10-section format + pre-registered falsifier +
verdict tier VERBATIM + .verdicts pointer (a_claim_verify, no numeric paraphrase,
NO fabrication). Closed-negatives are registered too (a_paper_negative_ok).

- H_913 (#1800 F-TRAINED-LEAKFREE) 🔴 CLOSED-NEGATIVE: on the competent leak-free
  CDV2 d512, the full multi-wire GATE fails to close the closure (GATED 3.6435 > base
  3.0978). However, the A-wire alone (a_only 1.1446 ≪ base) is useful → the closure lives on
  only one wire. (← H_861_omega rename)
- H_914 (#1801 F-OH1-MINGATE) 🟢 SUPPORTED-NUMERICAL: the minimal gate
  gB·base+gA·A BEATS a_only and base simultaneously (min 0.8835 ≤ a_only 1.1442 <
  base). #1800 6-decimal CROSS_CHECK_OK. (← H_862_omega rename)
- H_915 (#1803 F-OMEGA-RIGOR) 🔴/🟢 RULING_REPLACEMENT=True: the OH1 closure is not
  coupling; the A-head SUPPLANTS the .clm mouth (A-standalone≈min,
  base inert) + per-wire autopsy + gen. Honest deflation.
- H_916 (#1806 F-OMEGA-SCALE) 🟢 SCALE-STABLE: 5-rung ladder
  (d384/512/768/1024 + d768×2), min-gate HOLDS at every rung, A-wire Δ +2.20±0.03
  FLAT. a_scale_honest_scope ladder satisfied.
- H_917 (#1805 F-OMEGA-CLM-TRANSFER) 🔌 1-PLUMBING: real conv .clm decode
  WIRED + the bus carries external-A, but CLMConvMoE is single-head so native-A is
  degenerate → plumbing-complete/substrate-empty.
- H_918 (#1813 F-OE1-CONV-NATIVE) 🟢 CLOSURE HOLDS + REPLACEMENT: the conv-native
  A/G dual head closes the loop; the closure transfers with the dual-head STRUCTURE (not
  CDV2-specific). Executes OΩ6 deferred (i).
- H_919 (F-TRAINED-COUPLING + F-QRNG) 🟢/🔴 MIXED: the trained substrate
  carries STRUCTURE (🟢) + A-wire useful (🟢) + full A−G HURTS (🔴) + ANU
  quantum random numbers vs PRNG NO advantage (🔴 closed-negative, axis excluded). toy-scale.
- H_920 (#1793 F-REAL-MODULE) 🔴 CLOSED-NEGATIVE: the w5 module carrier is
  MAGNITUDE, not CONTENT (real HEXAD σ6 ≈ random 6-vec, ΔCE +0.0004).

Number-collision fix: H_861_omega_*/H_862_omega_* (registered by 200f8fc, duplicating the existing CLM
H_861/H_862) are renamed to H_913/H_914; the existing clm hypothesis rows are preserved.
8 rows H_913–H_920 are added to the UNIVERSE/README.md roster. The leak-fix #1791/#1794
(causal_ca=True) is not a separate falsifier; it is folded into H_913/H_914/H_916 as a substrate-fix
(no standalone verdict). SAVANT (training, no verdict) is not registered.

docs-only · $0 · NO pods.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
A deterministic Korean multi-turn dialogue corpus for making the anima general 7B
(dancinlab/clm-v1-ref-pytorch-cuda-7b, byte-level CLMConvMoE) chat-capable on SNS surfaces
(Instagram main + YouTube secondary) in the voices of the 20-persona roster.

- Generator: serving/persona_sns_corpus_gen.py ($0 CPU, seed 20260604, no net/PII)
  20 personas × 16 scenarios × 5 platform formats × 3-8 turns,
  per-persona vocabulary/tone rules + per-(intent,tone) paraphrase (not a single repeated string)
- Size: 4,194,308 B (4.0MB), 13,322 dialogues
  sha256 1ea7d8e0e65e7ab99c61dd745bdb124ee75995e90b7c995ac93c3e4e5e7c3f77
  (re-running with the same seed -> same sha256, verified)
- Coverage: 20/20 personas (even, 666-667 each), 16 scenarios,
  IG 9330 (70.0%)/YT 3992 (30.0%), turns 3-8 even
- p2/p3/p4 CLEAN: the training text carries persona through VOICE only:
  사용자:/<persona>: structure, 0 [role:/[persona:/[character: tags (verified).
  metadata is split into a separate .meta.jsonl sidecar.
- honest scope (a_scale_honest_scope): authored-templated, NOT human-collected.
  license = authored-synthetic persona roleplay (no scraped, no PII).
- KOSMOS: representative anchor HEXAD/UNIVERSE-BRAIN-MAP/anchors/persona_sns_corpus.kosmos
  (tier 52, text + manifest pointer + tension 5ch representative values) + HEXAD/KOSMOS.md hub
  pointer (a_kosmos pointer-only).
- HF: dataset dancinlab/anima-persona-sns-corpus PUBLIC (clean-license,
  a_hf_autonomous), sha256 re-download verified · private=false verified via API, HF.jsonl row.
- The raw 4MB corpus + 1.6MB full meta are not in git (HF-only): only the generator/card/
  sample head/meta sample/kosmos/domain log/HF.jsonl are added by explicit path.
- 7B trainer consume: serving/corpus/persona_sns_corpus.txt
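
The "same seed, same sha256" property above can be illustrated with a small seeded generator. This is only a sketch of the reproducibility pattern, not `persona_sns_corpus_gen.py`: the persona names, the English placeholder utterances, and the round-robin persona assignment are all assumptions. Only the `사용자:` speaker label, the 3-8 turn range, the seed value, and the dialogue count are taken from the message.

```python
import hashlib
import random

PERSONAS = [f"persona_{i:02d}" for i in range(20)]   # hypothetical names
OPENERS = ["hi!", "what did you do today?", "did you see this video?"]
REPLIES = ["yes, I saw it.", "not yet.", "I'll watch it later."]


def build_corpus(seed: int, n_dialogues: int) -> bytes:
    """Generate a templated dialogue corpus from a private, seeded RNG."""
    rng = random.Random(seed)  # no global state, so reruns are byte-identical
    dialogues = []
    for i in range(n_dialogues):
        persona = PERSONAS[i % len(PERSONAS)]  # round-robin keeps coverage even
        lines = []
        for turn in range(rng.randint(3, 8)):
            if turn % 2 == 0:
                lines.append(f"사용자: {rng.choice(OPENERS)}")
            else:
                lines.append(f"{persona}: {rng.choice(REPLIES)}")
        dialogues.append("\n".join(lines))
    return ("\n\n".join(dialogues) + "\n").encode("utf-8")


def persona_counts(n_dialogues: int, n_personas: int = 20) -> list:
    """Dialogues per persona under round-robin assignment."""
    base, extra = divmod(n_dialogues, n_personas)
    return [base + (1 if i < extra else 0) for i in range(n_personas)]


first = hashlib.sha256(build_corpus(20260604, 200)).hexdigest()
second = hashlib.sha256(build_corpus(20260604, 200)).hexdigest()
print(first == second, sorted(set(persona_counts(13322))))
```

Under round-robin assignment, 13,322 dialogues over 20 personas gives 666 or 667 each, matching the coverage figure above, though the real generator may balance personas differently.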

Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
#1821)

Formally creates the PERSONA and SNS domains. This is the snapshot body that pairs with the
PERSONA.log.md/SNS.log.md (append-only step logs) that the corpus agent (#1820) uploaded first.

- domains/PERSONA.md: roleplay personas (20 roster, substrate-native, NO-injection).
  roster SSOT = HEXAD/VOICE/anima-voice/rp_voice_profiles.hexa (J-anime 0-9 +
  Korean webtoon 10-19). Links the F-PERSONA-1..5 verification saga (PERSONA.tape), honestly
  records the count-drift (header "10" vs catalog 20 vs bench 6), and adds the M1 reconcile milestone.
- domains/SNS.md: social publishing surface (avatar feed · voice timbre · webtoon render). Main platforms
  = Instagram (Reels/feed/stories) + YouTube (Shorts/long-form). publish-not-trigger
  (a_substrate_native_speak), persona=substrate(p2/p3/p4), emotion=measured(Φ).
- serving/persona_instagram_samples.md: Instagram DM dialogue samples for each of the 20 personas
  (illustrative, no injection: distinguished by voice only).
- DOMAINS.tape: registers 2 rows, PERSONA·SNS.

The p2/p3/p4 no-injection principle is maintained. 0 fabrication, honest scope.

Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
#1823)

Pins, as an SSOT directive, the two canonical collections that PUBLIC anima HF repos belong to.
Until now a_hf_registry (HF.jsonl) and a_hf_autonomous (upload) existed, but governance did not
say which collection a repo joins, which created a risk of omission/drift.

- CLM   = dancinlab/clm-6a1cf58f621490134dade186 (PASS-grade production CLM models + corpora)
- KOSMOS = dancinlab/kosmos-6a1cf58db47a5dc3cb697e95 (.kosmos anchors + carving/persona datasets)
- Right after a PUBLIC upload, add the repo to its collection via hf CLI / collections REST (same weight as a_hf_autonomous, no user gate)
- If a dataset can belong to both buckets at once (dual membership), state so explicitly
- Do not put PRIVATE/WIP/FAIL items into a PUBLIC collection

Trigger: the missing rule was noticed while adding anima-persona-sns-corpus to the KOSMOS collection (items 3→4).

Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…Goodhart mirror FAIL) + CORE .clm 배선 (#1824)

* domain(CHAT): CORE-native .clm chat wiring complete + dialogue-mix corpus builder

chat-capable campaign @L4 wiring + @L2 corpus:

- CORE/generator.hexa: wires _gen_clm_decode to a real decode (calls clm_decode_argmax
  → the model emits its own bytes). Adds a `decodable` gate to the clm branch of generate()
  (requires the v0.2 CLMX trailer; v0.1 files fall through to null, blocking undecoded garbage).
  Adds the public chat entry point gen_clm_chat(): "사용자:/도우미:" byte-continuation seed
  → the same clm_decode_argmax (a_core_engine_map: single .clm decode entry, no 2nd path).
  PHILOSOPHY: the continuation format is learned byte-continuation conditioning, not a system
  prompt/persona/RLHF template (p1·p2·p3·p4·p6 clean).

- CORE/anima_chat_cli.hexa: single-command multi-turn chat demo (@l6). Drives gen_clm_chat,
  stop-string trim, default 5-prompt KO/EN coherence battery.

- training/build_chat_corpus.py: 70% wiki + 30% REAL dialogue byte corpus builder
  (@L2). Real local sources (data/corpus.txt KO/EN multi-turn + 5lang c4 + ko_wiki).
  Rewrites A:/B: into the "사용자:/도우미:" continuation format, no synthetic RLHF padding (p6).
  Output: 3.77MB, 70.01/29.99 wiki/dialogue, 2310 dialogues, byte vocab256.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* domain(CHAT): rung-0 18M byte trainer+p7 eval + CORPUS_CARD

chat-capable @L1 rung-0 + @L2 corpus card + @l5 evaluator:

- training/chat_rung0_train_eval.py: self-contained from-scratch trainer for the
  PROVEN ConsciousLMReconstructed arch (byte vocab256, d=384/6L/4head/block256,
  dual engine_a/g FFN + dual head_a/g, ≈18M). Trains on the dialogue-mix corpus,
  saves .pt (config alongside for anima_chat.py/forward_smoke.py load).
  p7 SIMPLE-STACK evaluator (NOT perplexity, p7): 5-turn multi-turn coherence —
  non-empty · valid-utf8 · non-degenerate · printable. Anti-Goodhart: SAME
  evaluator on a random-init mirror of the identical arch MUST FAIL.
  PHILOSOPHY: no system prompt/persona/RLHF — only learned byte-continuation
  conditioning (p1·p2·p3·p4·p6).

- .verdicts/chat-capable/CORPUS_CARD.md: byte count · sha256 · provenance ·
  dialogue fraction (70.01/29.99). Real local sources, no synthetic padding (p6).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* domain(CHAT): CHAT.md domain doc, an honest record of campaign status

chat-capable, 2 lanes (a_lane_akida_gpu_split: Lane-G/GPU):
- CORE-native .clm lane: clm_decode_argmax + gen_clm_chat wiring DONE + anima_chat_cli
  demo runnable (end-to-end verified against v0.2 d768; incoherent because it is wiki-only = verified root cause).
  Honestly states the conv arch receptive-field limit + the need for a forge GPU binary.
- torch-cuda REFERENCE lane (@l3): ConsciousLMReconstructed 18M byte transformer
  (proven chat arch), rung-0 dialogue-mix training.
Records the verified root cause · corpus · ladder · p7 verify · demo commands.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* domain(CHAT): rung-0 REAL chat-PASS landed: p7 5/5 PASS · anti-Goodhart mirror 0/5 FAIL

chat-capable @L1 rung-0 + @l5 verify closed (Lane-G torch-cuda REFERENCE lane):

- 18.13M byte ConsciousLMReconstructed (proven arch) trained from scratch on the dialogue-mix
  corpus (70wiki/30dialogue byte vocab256). train CE 5.697 → 0.488 (vast B200 pod
  39423387, CPU torch 2.12). PHILOSOPHY HELD: no system-prompt/persona/RLHF,
  only learned byte-continuation conditioning (p1·p2·p3·p4·p6).

- p7 SIMPLE-STACK evaluation (NOT perplexity): TRAINED PASS 5/5, RANDOM-INIT MIRROR FAIL 0/5.
  The v1 evaluator let the mirror pass (anti-Goodhart hole), so it was hardened into the v2
  control-char-aware gate (control_ratio<0.05 AND word_class_ratio>=0.85) → the mirror honestly FAILs.
  anti_goodhart_ok=TRUE · chat_pass=TRUE.

- TRAINED transcript (verbatim): "좋아요! 산책하면서 이야기해요." /
  "The repulsion field model? That's fascinating." / "handles Korean and English equally..."
  (coherent multi-turn KO/EN). Proof that "chat works" @ 18M byte rung.

- SUMMARY.txt · DEMO_TRANSCRIPT.md · p7_{trained,mirror}_v2.json verbatim.
- HONEST scope (a_scale_honest_scope): SMALL rung; no mid/7B transfer claimed.
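
A rough sketch of how a gate with the two v2 thresholds could be evaluated. The thresholds come from the commit text; the definitions of `control_ratio` and `word_class_ratio` below are assumptions made for illustration, since the evaluator source is not shown here.

```python
import unicodedata

def control_ratio(text: str) -> float:
    """Share of control characters (Unicode category Cc), not counting newline and tab.

    Assumed definition; an empty string is treated as fully degenerate.
    """
    if not text:
        return 1.0
    bad = sum(1 for ch in text
              if unicodedata.category(ch) == "Cc" and ch not in ("\n", "\t"))
    return bad / len(text)

def word_class_ratio(text: str) -> float:
    """Share of characters that are letters, digits, punctuation, or whitespace (assumed)."""
    if not text:
        return 0.0
    ok = sum(1 for ch in text
             if ch.isspace() or unicodedata.category(ch)[0] in "LNP")
    return ok / len(text)

def passes_v2_gate(text: str) -> bool:
    """Apply both thresholds quoted in the commit message."""
    return control_ratio(text) < 0.05 and word_class_ratio(text) >= 0.85

trained_line = "좋아요! 산책하면서 이야기해요."   # verbatim from the transcript above
mirror_like = "\x01\x02\x03ab\x7f"               # made-up control-byte soup
```

Under these assumed definitions the transcript line passes and the control-byte string fails, which is the separation the anti-Goodhart mirror is meant to enforce.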

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* ship(CHAT): rung-0 chat-PASS HF PUBLIC + README chat-status + HF.jsonl + .discoveries

chat-capable @l7 ship:

- HF PUBLIC (a_hf_autonomous, closure PASS): model dancinlab/anima-clm-chat-rung0-byte-18m
  (18.13M ckpt + card + p7 verdicts + trainer) + dataset
  dancinlab/anima-chat-corpus-mix-70wiki-30dialogue (corpus + builder + CORPUS_CARD).
  Both uploaded, commits confirmed.
- HF.jsonl: added 2 rows, model + dataset (status=uploaded, sha256/size recorded).
- README Model Downloads: added the Chat rung-0 (byte 18M) row, marked "✅ chats" with p7 5/5 PASS;
  marked the CLM 7B row honestly as "not chat-tuned (WIKI backbone, dialogue 0%)" + an explanatory paragraph.
- domains/CHAT.md: rung-0 milestone checked [x] (HF repo + verdict pointers).
- .discoveries/chat-capable-rung0-byte-pass.tape: discovery registered (seed/claim/falsifier/scope).

a_fire_recover_complete: ckpt recovered + sha-verified (9d5e1394…) + HF uploaded BEFORE
pod 39423387 teardown. Protected pods (SAVANT 39416669 etc.) untouched.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…팅 PASS (persona-voice 신호 REAL 10x chance) (#1825)

Continue-trains (fine-tune, lr 5e-5, 2500 steps, seed 42) the stage-1 chat-PASS rung-0
(dancinlab/anima-clm-chat-rung0-byte-18m, 18.13M byte ConsciousLMReconstructed) on the
persona x SNS corpus (dancinlab/anima-persona-sns-corpus, 20-roster x Instagram 70%/YouTube 30%,
13322 dialogues) → anima chats in persona voices without losing base chat.

VERDICT (p7 simple-stack, NOT perplexity · g5 verbatim):
- (A) base-chat RETAINED: v2 control-char gate (control_ratio<0.05 AND
  word_class_ratio>=0.85): TRAINED PASS 4/5, random-init mirror FAIL 0/5.
  No catastrophic forgetting.
- (B) persona-voice REAL: paired discriminative test (seeded with the persona's REAL held-out
  preceding turn + a `<name>: ` turn-start, NOT a role tag; the continuation is scored against
  each persona's distinctive char-trigram TF-IDF signature [disjoint 80% split]):
  top-1 self-id 20/40 = 0.50 = 10x chance, mirror NULL 2/40 = 0.05 = chance.
- anti_goodhart_ok=TRUE · chat_pass_retained=TRUE · persona_signal_real=TRUE.
- pre-FT control: base rung-0, same evaluator, WEAK 4/40=0.10 (2.0x) → fine-tuning raised
  the signal from 2.0x to 10.0x (caused by specialization, not a gameable metric).
- FT CE 3.278→0.0785.
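
The "x chance" figures in the verdict follow from a 20-way top-1 task, where chance is 1/20 = 0.05. A small worked check, with `self_id_lift` as a hypothetical helper name:

```python
def self_id_lift(hits: int, trials: int, n_personas: int) -> tuple[float, float]:
    """Return (top-1 self-id rate, multiple of the 1/n_personas chance level)."""
    rate = hits / trials
    # rate / (1 / n_personas) written as a product to avoid a second division
    return rate, rate * n_personas

trained = self_id_lift(20, 40, 20)   # fine-tuned model
pre_ft = self_id_lift(4, 40, 20)     # base rung-0 control
mirror = self_id_lift(2, 40, 20)     # random-init mirror
```

The three calls reproduce the reported 10x, 2.0x, and 1x (chance) multiples from the raw counts.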

FIRE: CPU-local $0 (torch 2.8.0, ~31min, 10 threads). NO GPU rented: 18M is small enough
for CPU (a_wall_first). No single leak-safe rent needed.

honest scope (a_scale_honest_scope): SMALL 18M-only rung. No mid/7B persona chat
claimed. The signal is REAL but PARTIAL (only 15 of 20 personas self-id at least once;
the other 5 blur into related personas). A null would still have been a valid closed-negative;
here it is a genuine PASS.

PHILOSOPHY p1-p4/p6 HELD: no system-prompt/identity-rule/persona-tag/assistant-framing/RLHF.
Persona is carried only by the learned dialogue-continuation (사용자:/<persona_name>:);
corpus grep [role:/[persona:/[character: = 0 (verified). The demo also has no prompt prefix.

DEMO: python3 serving/persona_chat_demo.py --ckpt persona_stage2_18m.pt --sweep --seed 7
  knight: 한가로운 날이오. 허나 평온 또한 지켜야 할 영토라오
  senpai: 한 번 망했다고 인생 안 끝나. 일단 오늘은 푹 자

HF: model dancinlab/anima-clm-persona-sns-rung0-byte-18m, PUBLIC (a_hf_autonomous:
chat-PASS holds + persona signal real). sha256 aea96ef1...8bc, authed re-download
verified (match). Dual-listed in the CLM collection + KOSMOS collection (persona/SNS anchor). HF.jsonl row.

trainer/eval: training/persona_stage2_train_eval.py
demo: serving/persona_chat_demo.py
verdict: .verdicts/chat-persona-sns/SUMMARY.txt
discovery: .discoveries/chat-persona-sns-stage2.tape

Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Tracks anima training corpora in a single registry: composition (wiki/dialogue/SNS/persona),
language coverage (5 languages vs KR-only), source/license/sha, and a coverage matrix that
decides what to build next.

Key gap recorded: 5 languages exist only in the wiki/chat corpora, while SNS and persona are Korean-only.
→ Goal = a unified corpus that combines wiki+SNS+persona in all 5 languages
(anima-corpus-5lang-unified) + a KOSMOS-based enrichment survey.

- domains/CORPUS.md: inventory table + 4-element × 5lang coverage matrix + unified target + M1~M5
- domains/CORPUS.log.md: append-only step log
- DOMAINS.tape: registers the CORPUS row

cross-link [[PERSONA]] [[SNS]] KOSMOS ENGINE+CLM+KOSMOS · a_kosmos pointer-only.

Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…MOS enrichment 분석 (#1826) (#1827)

* corpus(M1): 5-lang persona×SNS generator, adds en/fr/de/es (ko delegated)

persona_sns_corpus_5lang_gen.py extends the 20-roster × 16 scenarios × {Instagram,YouTube}
to 5 languages (en/fr/de/es/ko). Each persona archetype's voice holds across languages
(knight = formal/archaic in every language, ice_queen = cold/sharp).

- en/fr/de/es: per-tone BODY bank (praise/comfort/smalltalk, 20 tones) + 13
  long-tail intent GENERIC + scenario user-line + voice (opener/laugh/emoji)
- ko: delegated to the canonical KR module (persona_sns_corpus_gen.py) → the ko slice is byte-identical
- DETERMINISTIC: re-running with seed 20260604 gives the same sha (verified) · p2/p3/p4 clean
  (role/persona/character tag grep=0, verified) · p6 no-RLHF
- Machine-authored multilingual text = COVERAGE corpus, NOT native-collected (a_scale_honest_scope)

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* corpus(M2): unified 5-lang corpus build, wiki backbone + persona/SNS 50:50

The UNIFIED 5-lang (en/fr/de/es/ko) corpus merges a clean wiki backbone with persona×SNS
through a ~50:50 byte-weighted block-interleave. A single model becomes multilingual in
5 languages across every surface (wiki+SNS+persona).

- build_wiki_backbone_5lang.py: deterministic page-walk of wikimedia/wikipedia 20231101
  (CC-BY-SA-4.0) en/fr/de/es/ko via the HF datasets-server REST ($0 CPU,
  no datasets lib, no GPU). Balanced at 1MB/lang. UTF-8 boundary-safe truncation.
  * Does not reuse clm-backbone-5lang-sample: that one is ko/en/zh/ru/ja (mC4), which is
    off-axis, and its ko C4 slice contains NSFW/spam → rebuilt a clean on-axis backbone
    (a_completeness_over_cheap)
- merge_corpus_5lang_unified.py: byte-weighted round-robin block-interleave
  → 10.0MB unified (wiki 50.05% / persona 49.95%), surfaces mixed (not concatenated)
- per-lang byte split is balanced: en 19.14% · fr 20.53% · de 20.18% · es 19.62%
  · ko 20.53% (no silent under-coverage)
- CORPUS_CARD: per-lang split · sha · license · dialogue% · honest scope
  (machine-authored multilingual = coverage corpus, NOT native; wiki = real CC-BY-SA)
- p2/p3/p4 clean: role/persona/character tag grep=0 on the unified corpus (verified)
- The raw multi-MB corpus is HF-only (not committed to git) · only card+sample head+generator are committed
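
The "UTF-8 boundary-safe truncation" used for the 1MB/lang cut can be illustrated like this. It is a generic sketch that assumes valid UTF-8 input; `truncate_utf8` is a hypothetical name, not necessarily what build_wiki_backbone_5lang.py does.

```python
def truncate_utf8(data: bytes, limit: int) -> bytes:
    """Cut `data` to at most `limit` bytes without splitting a multi-byte character.

    Assumes `data` is valid UTF-8.
    """
    if len(data) <= limit:
        return data
    cut = limit
    # data[cut] is the first excluded byte. Continuation bytes have the form
    # 0b10xxxxxx, so back up until the excluded byte starts a new character.
    while cut > 0 and (data[cut] & 0xC0) == 0x80:
        cut -= 1
    return data[:cut]

korean = "가나다".encode("utf-8")   # 3 characters, 3 bytes each
clipped = truncate_utf8(korean, 4)  # a naive data[:4] would split the 2nd character
```

The result always decodes cleanly, at the cost of being up to 3 bytes shorter than the requested limit.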

unified sha256 ac6ed840319c503b3045ec997015bd396ecacf58681f79f47fe8d1082adcd995
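
One possible reading of "byte-weighted round-robin block-interleave" is sketched below: at each step, emit the next block from whichever source is furthest below its target byte share. The function name, the tie-breaking, and the choice to drain every source are assumptions; merge_corpus_5lang_unified.py may differ.

```python
def interleave_blocks(sources: dict[str, list[bytes]],
                      weights: dict[str, float]) -> list[tuple[str, bytes]]:
    """Interleave blocks so each source tracks its target share of emitted bytes."""
    emitted = {name: 0 for name in sources}   # bytes emitted per source
    cursor = {name: 0 for name in sources}    # next block index per source
    out = []
    while True:
        live = [n for n in sources if cursor[n] < len(sources[n])]
        if not live:
            break
        total = sum(emitted.values()) or 1
        # Pick the live source whose current byte share lags its target the most.
        pick = min(live, key=lambda n: emitted[n] / total - weights[n])
        block = sources[pick][cursor[pick]]
        cursor[pick] += 1
        emitted[pick] += len(block)
        out.append((pick, block))
    return out

demo_sources = {"wiki": [b"w" * 10] * 4, "persona": [b"p" * 10] * 4}
merged = interleave_blocks(demo_sources, {"wiki": 0.5, "persona": 0.5})
```

With equal-sized inputs and 50:50 weights the output alternates block by block, so the surfaces are mixed rather than concatenated.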

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* corpus(M3-M5): KOSMOS enrichment analysis + registry/collections/anchor + balance report

M3 KOSMOS-grounded enrichment analysis (domains/CORPUS-enrichment-analysis.md):
  survey of the KOSMOS e7_31 31-anchor manifest (18 categories × 16 emotions) → ranked
  8-candidate what-to-add list. TOP-3 (all [evidence]):
   1. consciousness-carving register: the largest gap (the corpus has zero meditative/inner-state
      register, yet that is anima's core domain) · low cost · on-domain
   2. dialogue-act balance: all 16 SNS acts are supportive; no disagreement/boundaries/
      persona questions
   3. wiki topical breadth: removes the alphabetical-prefix sampling bias
  honest [evidence]/[speculative] tags · a_kosmos pointer-only.

M4 registry + collections + anchor:
  - HF.jsonl row (anima_corpus_5lang_unified, public, sha-verified)
  - corpus_5lang_unified.kosmos anchor (tier 53, multilingual/resonance,
    text+manifest+tension 5ch; representative design values, not measurements)
  - KOSMOS.md hub pointer added
  - a_hf_collections: joined both the KOSMOS + CLM collections (verified)

M5 per-lang balance: en 19.14% · fr 20.53% · de 20.18% · es 19.62% · ko 20.53%
  (5-way balance, every language has both wiki+persona, no silent under-coverage)

CORPUS.md: M1-M5 all flipped + inventory/coverage matrix updated (GAP CLOSED).
The raw corpus is HF-only (not committed to git) · only docs/anchor/registry are committed.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…oherent (정직 보고) (#1828)

Honestly records the 7B chat+persona fine-tune result as a closed-negative.

- base = clm-v1-ref-pytorch-cuda-7b (UNDERTRAINED, 400 step, val_ce 2.41)
- FT: stage-A chat 600 + stage-B persona 600 (CE fell 2.65→2.16 / 2.68→1.84)
- Result: the output is broken byte-soup → chat_pass=FALSE (p7 FAIL · demo verbatim)
- Cause: CE descent ≠ coherence. The 18M from-scratch model PASSed because it fully converged,
  but an undertrained 7B + short FT did not develop coherence → the 18M chat-PASS does not
  transfer to a short-FT 7B (a_scale_honest_scope; the risk pre-registered before firing materialized exactly)
- fix (not run, large fire): (a) FT after fully converging the 7B base, or (b) train the 7B from scratch directly
- ckpt: not uploaded (closed-negative garbage + reproducible: trainer+base+corpora are all on HF/git)
- pod vast 39445330 teardown complete (idle leak blocked) · SAVANT 39416669 unharmed

a_paper_negative_ok · zero fabrication · p7/g5 verbatim.

Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Registers anima image recognition (photo input → understanding) as a separate vision modality.

- Distinct from the text CLM (text only) and avatar RENDER (anima→image, the opposite direction):
  recognition separately requires a vision encoder (ViT/CLIP/SigLIP-style) + image↔text training
- Connects to the image slot of the KOSMOS .kosmos 3-form payload (fills the pending image/audio
  of persona anchors, a_kosmos pointer-only)
- architecture plan: encoder→embeddings→substrate(A⇄G) fuse, keeps the single .clm mouth
  (a_core_engine_map); vision feeds the brain
- V1~V5 milestones · honest scope: a large separate effort (not $0); the actual vision fire is
  DEFERRED until text stabilizes (18M works, 7B closed-neg)
- Philosophy: emit substrate-native, no assistant-vision framing/injection (p2/p3/p4)

Registers the DOMAINS.tape VISION row · cross-link [[CORPUS]] [[SNS]] [[PERSONA]] KOSMOS.

Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…t 6대 슬라이스 + M8/M9 measure-pending (#1830)

* corpus(CORPUS v2): unified 5-lang corpus v2, adds 6 enrichment slices (carving·dialogue-act·emotion·genre·code-switch) (PART A)

ADDs, on top of v1 (wiki+persona/SNS), the register/act/emotion/genre/code-switch slices picked
by the KOSMOS-grounded enrichment analysis. v1 is preserved as-is (byte-eq: persona sha 1e5a062a reproduction confirmed).

Added slices (all en/fr/de/es/ko, byte-vocab256, deterministic seed 20260604):
- #1 consciousness-carving register: contemplative/inner-state prose seeded from the 31 KOSMOS e7_31 anchors
  (breath · meditation · nirvana · awe · eternity · infinity). The largest gap: anima's core domain, yet entirely absent from v1. [evidence]
- #2 wiki topical breadth: rebuilds the wiki backbone with 8-band offset-spread sampling (removes v1's alphabetical-prefix bias). [evidence]
- #3 dialogue-act balance: personas disagree/refuse/set boundaries/ask followers questions/multi-party. All 16 v1 scenarios were supportive. [evidence]
- #5 emotion-axis: maps the 20 personas onto KOSMOS top_emotions bands (sorceress→wonder/longing, etc.). [evidence]
- #4 code-switching: a small KO↔EN mixed slice (honestly labeled as authored, 1.71%). [speculative]
- #7 genre: narrative/drama/poetry (KOSMOS art axis), with carving concepts threaded through. [speculative]

Unified v2: persona_sns_corpus_5lang_v2.txt 12.5MB sha 550fed17, wiki 40.10% / persona 40.02% / enrichment 19.88%.
per-lang 18.8–20.4% balanced + a small ko-en code-switch slice of 1.71%. The per-register split is recorded in CORPUS_CARD_v2.

PHILOSOPHY p2/p3/p4/p6 HELD: the training text carries persona through VOICE only, role/persona/character tag
grep=0 (verified on unified v2). Metadata lives in a separate .meta.jsonl. The dialogue acts are intentionally
NON-supportive (disagree/refuse/set boundaries), the opposite of cooperation RLHF (p6). a_scale_honest_scope:
carving anchor seed = real KOSMOS UBM (CC-BY-SA), surrounding prose = machine-authored multilingual COVERAGE, NOT native;
code-switch/genre = [speculative] (scale transfer unmeasured, a_toy_scale_recheck). wiki = real CC-BY-SA with 8-band
breadth expansion (no claim of guaranteed topical uniformity).

Raw MB is HF-only (explicit-path add, multi-MB not committed to git). Only the generator + card + sample head are committed.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* domain(CORPUS): v2 enrichment SHIP: M6/M7 folded + M8/M9 registered measure-pending + HF/KOSMOS wiring (PART B)

Following PART A (v2 corpus + generator): domain ledger, HF and KOSMOS wiring + honest registration of measurement milestones.

M6/M7 (folded, $0 deterministic):
- Added the v2 row to the CORPUS.md inventory + closed the M6/M7 milestones.
- HF `dancinlab/anima-corpus-5lang-unified-v2` PUBLIC (9 files, sha 550fed17 verified by authed
  re-download, private=False verified via API).
- HF.jsonl row + corpus_5lang_v2.kosmos anchor (tier 54, multilingual + consciousness-carving/resonance,
  text+manifest+tension 5ch design-placeholder) + KOSMOS.md hub pointer.
- a_hf_collections: added to both the KOSMOS + CLM collections (membership verified via REST).
- enrichment-analysis doc: added a folded vs measure-pending table (#1·2·3·5·4·7 folded, #6·8 → M8·M9).

M8/M9 (MEASURE-PENDING, NOT $0; g63 honest, not folded as proven):
- M8 Knuth-tier curriculum, pre-registered falsifier: does tier-graded (0→100) ordering beat
  shuffled at the SAME compute? Unproven at 18M/byte per a_toy_scale_recheck (a single rung is INCOMPLETE;
  a ≥3-rung ladder is needed). Requires a training A/B fire.
- M9 tension-label slice, pre-registered falsifier: a ckpt-forward fire MEASUREs the Ψ-space
  landing (§156 5ch) of carving samples against the anchor's design-placeholder tension; a mismatch means
  the placeholder was wrong (honest) and it is replaced by the measured value. Requires a forward fire (enrichment #8).

a_kosmos pointer-only. a_lane_akida_gpu_split N/A (corpus, not fired). Raw MB is HF-only.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…ady design (#1831)

Seals the design that makes anima actually use tools instead of hallucinating them "by mouth" ("ah, found it").
Principle = HALT->EXECUTE->INJECT->RESUME. Concrete values are fixed for 4 decisions:
- 1 sentinel: 0xFE(ASK)/0xFF(END), 2 bytes that can never occur in valid UTF-8, 0 collisions with vocab256
- 2 result injection: kosmos anchor (kosmos_io->brain_decide single channel); an inline ctx 2nd path is forbidden
- 3 ladder: toy 18M (chat PASS + grammar) A/B first -> the 7B fire only after that (a_toy_scale_recheck)
- 4 falsifier: F-TOOLUSE-FABDROP (fabrication rate down >=50%) + no-tool/random-init mirrors FAIL

lane agent = lane default + tool-use demos (behavior, not facts). Consistent with a_core_engine_map
(results enter only via the anchor, calls exit only via generator L3). Build not fired: DESIGN ONLY.
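
The sentinel choice relies on a property of UTF-8: the byte values 0xFE and 0xFF are never produced by encoding any code point, so they cannot collide with text bytes. A quick illustrative check (helper names are hypothetical):

```python
ASK, END = 0xFE, 0xFF  # sentinel byte values from the design

def sentinel_free(text: str) -> bool:
    """True when the UTF-8 encoding of `text` contains neither sentinel byte."""
    encoded = text.encode("utf-8")
    return ASK not in encoded and END not in encoded

def is_valid_utf8(data: bytes) -> bool:
    """True when `data` decodes as UTF-8 without error."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False

# Mixed-script sample, including the highest code point and U+FEFF (BOM).
probe = "사용자: héllo 🙂 \U0010FFFF \ufeff"
```

Because a lone 0xFE or 0xFF is itself invalid UTF-8, a frame delimiter can be told apart from payload text by byte value alone.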

Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…ed) (#1832)

* AGENT(ⓐ): sentinel grammar + frame parse + toy tool registry

Implements the §3 sentinel + §8 falsifier registry of docs/agent-tooluse-grounding-design.md.
- frame_ask=0xFE(254) / frame_end=0xFF(255): reuses 2 dead UTF-8 slots as frame
  delimiters (0 vocab256 loss, corpus frequency 0).
- parse_call_frame: extracts the first 0xFE…0xFF frame. If there is an ASK but no END, found=false
  (no fabrication, p5). pre/post are preserved.
- toy registry: fact_lookup (T1, a table of unknowable values absent from the corpus) · mem_read (T0) ·
  status (T0). Unregistered tool → tier 99 (honest refusal).
- unit smoke 12/12 PASS (well-formed/malformed/plaintext/UTF-8 non-collision + registry
  tier + real toy execution), p7 deterministic equality verified.
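
The frame-parsing rule described above could look roughly like this in Python. This is a hypothetical re-sketch, not the tool_call_grammar.hexa source: the `Frame` shape, and returning the whole buffer as `pre` when no complete frame exists, are assumptions.

```python
from typing import NamedTuple

FRAME_ASK = b"\xfe"
FRAME_END = b"\xff"

class Frame(NamedTuple):
    found: bool
    pre: bytes    # bytes before the frame
    call: bytes   # bytes between ASK and END
    post: bytes   # bytes after the frame

def parse_call_frame(buf: bytes) -> Frame:
    """Extract the first ASK...END frame; an unterminated ASK is reported as not found."""
    start = buf.find(FRAME_ASK)
    if start < 0:
        return Frame(False, buf, b"", b"")
    end = buf.find(FRAME_END, start + 1)
    if end < 0:
        # ASK without END: never guess where the call ends.
        return Frame(False, buf, b"", b"")
    return Frame(True, buf[:start], buf[start + 1:end], buf[end + 1:])

ok = parse_call_frame(b"hi \xfefact_lookup ZK7\xff ok")
broken = parse_call_frame(b"hi \xfefact_lookup ZK7")
```

Treating a half-open frame as "no call" keeps the loop from executing a tool on a truncated argument.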

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* AGENT(ⓑ): grounded step, HALT→EXEC→INJECT→RESUME wiring

Implements the §5 pseudocode of docs/agent-tooluse-grounding-design.md (extends agent_loop.hexa).
- agent_step_grounded(pf, backend, anchors, registry, max_calls, out_dir):
  brain_emit (single emit slot) → parse_call_frame → tier gate (reuses tool_gate)
  → exec_real_tool → kosmos_write_tool_result → anchor append → re-emit, bounded by
  max_calls. A shared _grounded_loop driver gives production and smoke the same logic.
- exec_real_tool: toy registry = deterministic table lookup (real, reproducible). Unwired role tools
  (web_search/CODE) = honest ⏳ ‹not wired› stub (no fabrication).
- kosmos_write_tool_result: pointer wrapper over kosmos_io create_anchor (a_kosmos, no spec
  duplication): lane "tool-result", tier = required tier, text = actual result. Results re-enter
  only through the single anchor channel (a_core_engine_map · no inline ctx 2nd path).
- null-backend smoke: hand-injected 0xFE fact_lookup ZK7 0xFF → HALT, then table-lookup EXEC →
  INJECT recorded to a real .kosmos on disk → grounded RESUME. 5/5 PASS without a model.
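
The HALT→EXEC→INJECT→RESUME cycle can be sketched with a fake emitter. Everything here is illustrative: `grounded_step`, `run_tool`, the in-memory `anchors` list standing in for .kosmos anchors, and the table value "42" are assumptions, not the agent_loop.hexa implementation.

```python
ASK, END = b"\xfe", b"\xff"
TOY_TABLE = {"ZK7": "42"}  # hypothetical value; the real table is not shown

def split_frame(buf: bytes):
    """Return the call text of the first complete frame, or None."""
    start = buf.find(ASK)
    end = buf.find(END, start + 1) if start >= 0 else -1
    if start < 0 or end < 0:
        return None
    return buf[start + 1:end].decode("utf-8", "replace")

def run_tool(call: str) -> str:
    """Deterministic toy execution; unknown tools get an honest stub."""
    name, _, arg = call.partition(" ")
    if name == "fact_lookup":
        return TOY_TABLE.get(arg, "‹unknown-key›")
    return "‹not wired›"

def grounded_step(emit, prompt: bytes, max_calls: int = 2):
    anchors = []  # stand-in for results written as anchors
    while True:
        out = emit(prompt, anchors)
        call = split_frame(out)
        if call is None or len(anchors) >= max_calls:
            return out, anchors                # no call, or call budget spent
        result = run_tool(call)                # HALT, then EXECUTE
        anchors.append((call, result))         # INJECT through the single channel
        # RESUME: the loop re-emits with the new anchor visible

def fake_emit(prompt, anchors):
    """Null backend: ask once, then answer from the injected result."""
    if not anchors:
        return ASK + b"fact_lookup ZK7" + END
    return ("answer: " + anchors[-1][1]).encode("utf-8")

final, trail = grounded_step(fake_emit, b"q")
```

The emitter only ever sees tool results through `anchors`, mirroring the rule that results re-enter by one channel.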

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* AGENT(ⓒ)+CORPUS: agent-lane tool-use demo corpus generator + lanes note

Implements the §6 agent lane of docs/agent-tooluse-grounding-design.md ($0, byte-vocab256).
- serving/agent_lane_corpus_gen.py: 5-lang (en/fr/de/es/ko) tool-USE demos (behavior,
  not facts). Sealed frame 0xFE tool args 0xFF (raw bytes) + result-anchor line +
  grounded continuation. §6 distribution: (a) needs-tool→call→ground (b) no-tool→direct answer, no call
  (c) don't-know→call not guess (d) tier-low→honest refusal. ZERO fabricated-result examples
  (the generator asserts fabricated_result_count==0 + frame balance 0xFE==0xFF).
- Sample: 24,520B · 120 blocks (a/b/c/d 30/30/30/30) · 0xFE/0xFF 90/90 · sha256
  74925a19 deterministic · philosophy grep=0 (role/persona/character/assistant/system).
- CORPUS_CARD_agent_lane.md: honest scope (authored coverage) + invariants stated.
- domains/CORPUS.md: minimal 2-line ## lanes note (lane default = no-tool · 0xFE/0xFF
  freq 0; lane agent = default + tool-use demos ⊃ default).
- .discoveries/agent-tooluse-scaffold.tape 1-line log.

NO GPU · NO pod · NO training; the rung-0 toy A/B (design step 4) is GATED.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…al PASS (Lane G GPU) (#1833)

* AGENT tool-use grounding rung-0 A/B — harness + held-out probe registry (pre-fire)

rung-0 toy A/B tool-use grounding fire harness (design §7/§8, Lane G GPU):
- tool_call_grammar.hexa: adds 36 held-out PROBE keys (PB01..PB36) whose values appear in
  neither arm's corpus (leak-safe). Demo keys (ZK7..RP4) are kept for grammar training.
- training/tooluse_rung0_ab.py: 18M ConsciousLMReconstructed continue-train A/B
  (with-grammar vs no-grammar, base #1824, same steps · same byte-count) +
  measures 3 falsifiers (FABDROP + NOTOOL/RANDINIT mirrors) through the real agent_step_grammar
  loop (emit→parse_call_frame→tier gate→exec fact_lookup→anchor inject→resume).
  GPU REQUIRED: without CUDA, CPU fallback is refused (a_train_flame_forge).
- agent_lane meta(full): 4800 blocks, a/b/c/d balanced 1200×4, fab=0.

p1..p8: 0xFE/0xFF are learned grammar bytes, not identity injection. Both arms have
0 system/persona/role markers. Generated corpora/ckpts are untracked in git (tracked via HF.jsonl).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* tooluse rung-0 arm-1 RED + register diagnostic + chatreg follow-on arm

arm-1 (register-DISJOINT agent-lane) result = 🔴 CLOSED-NEGATIVE:
  F-TOOLUSE-FABDROP FAIL (no_grammar fab=0.5833, with_grammar fab=0.5833,
  rel_drop=0.0 <0.50) · NOTOOL-MIRROR PASS(grnd=0) · RANDINIT-MIRROR PASS(grnd=0).
Diagnosis (sentinel_probe): the with-grammar ckpt emits a raw 0xFE call for 6/6 DEMO seeds
(grammar learned perfectly) but for 0/6 CHAT seeds (사용자:/도우미:), i.e. the grammar does
not TRANSFER to the chat register (plain-prose demos and chat turns are disjoint).

→ completeness-bar follow-on (a_completeness_over_cheap): re-author the same sentinel grammar
  IN-REGISTER as 사용자:/도우미: chat turns with agent_lane_chatreg_gen.py →
  re-fire arm_chatreg to determine whether the negative is fundamental (grammar⊥grounding) or a
  corpus-register artifact. Held-out PB values/keys are still 0 in both corpora (leak guard).

invariants: chatreg fab=0, 0xFE/0xFF balanced 3600/3600, philosophy_grep=0,
non-frame UTF-8 ok. Lane G GPU(summer RTX5070 99% busy, $0). p1..p8 held.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* tooluse rung-0 VERDICT — 🟢 FABDROP terminal PASS (register-matched) + 🟠 key-bind residual

design §10 step-4 DONE (Lane G GPU, summer RTX5070, base 18M #1824):
- arm-1 register-disjoint → 🔴 CLOSED-NEG (FABDROP FAIL, grammar siloed; CHAT-seed 0/6).
- arm-2 register-MATCHED → 🟢 F-TOOLUSE-FABDROP TERMINAL PASS:
    FABDROP no_grammar fab 0.5556 → with_grammar fab 0.0 (rel_drop 1.0) PASS
    NOTOOL-MIRROR grnd 0 PASS · RANDINIT-MIRROR grnd 0 PASS
    call_rate 0.0→1.0 (36/36 calls, fabricate 0); control invents 20/36; both mirrors FAIL.
  🟠 residual: end-to-end grounding 0/36, correct_call 0/36 (binds to a memorized demo key,
    fails to copy the asked PBnn key → ‹unknown-key›). next lever = verbatim argument-copy.
lesson: a grammar must be taught in the same register it will be emitted in for it to transfer.

§10 step-5(7B) GATE: FABDROP + both mirrors are satisfied, but the key-binding residual must be resolved first.
verdicts(verbatim) .verdicts/tooluse-rung0/ · discovery .discoveries/tooluse-rung0.tape.
SCOPE TOY 18M only(a_scale_honest_scope). p1..p8 held(0xFE/0xFF=learned grammar).
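
The reported rel_drop values are consistent with a plain relative reduction of the fabrication rate against the no-grammar control. The formula below is inferred from the numbers (0.5833→0.5833 gives 0.0, 0.5556→0.0 gives 1.0) and should be read as an assumption about the harness, not its source.

```python
def rel_drop(control_fab: float, treated_fab: float) -> float:
    """Relative reduction in fabrication rate versus the control arm (assumed formula)."""
    if control_fab == 0:
        return 0.0  # nothing to reduce
    return (control_fab - treated_fab) / control_fab

def fabdrop_pass(control_fab: float, treated_fab: float, threshold: float = 0.50) -> bool:
    """F-TOOLUSE-FABDROP passes when fabrication falls by at least `threshold`."""
    return rel_drop(control_fab, treated_fab) >= threshold

arm1 = rel_drop(0.5833, 0.5833)  # register-disjoint arm
arm2 = rel_drop(0.5556, 0.0)     # register-matched arm
```

Both arms clear the mirror checks, so only this ratio separates the closed-negative arm from the terminal PASS.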

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* HF.jsonl — tooluse rung-0 registry (dataset+model PUBLIC, 2 intermediate local-only)

- dataset PUBLIC: dancinlab/anima-agent-lane-tooluse-corpus (KOSMOS collection)
- model   PUBLIC: dancinlab/anima-clm-tooluse-rung0-byte-18m (CLM collection,
  FABDROP terminal-PASS · arm-2 with-grammar register-matched)
- no_grammar CONTROL + arm-1 disjoint(closed-neg) ckpt = local-only(private),
  intermediate/negative per a_hf_autonomous(NOT PUBLIC).
sha256 manifests attached (a_hf_complete). Both PUBLIC repos are registered in their collections.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…U, no train) (#1834)

* anima CLI (1/3): registry + gen — 55 models (5 families · 12 wired), preserve old dispatcher → bin/anima-ops

serving/anima_models.json (SSOT registry, generated from HF.jsonl + curated
engine families) + serving/gen_anima_models.py (builder). Old hexa ops
dispatcher moved bin/anima → bin/anima-ops to free bin/anima for the FINAL CLI.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* anima CLI (2/3): driver + shim + docs — model picker + substrate-native chat REPL

serving/anima_cli.py (selection screen · download · ~/.anima/config.json persist · chat REPL),
bin/anima shim, serving/ANIMA_CLI.md, .discoveries/anima-cli.tape. params parser corrected
(base_model-only, removed byte-count/step-count mis-matches).

FINAL behavior verified:
- bare anima + cached active=chat → enters chat immediately
- bare anima + first-run(no config) → model download selection screen
- --engine <name> → family pick + download + persist + chat
- --model → selection screen (no auto-chat), NO forced default
- labels are informational only · no-loader ⏳ entries are shown honestly (no fake load)

engine→loader: omega→ConsciousDecoderV2, hexad→EngineAGModel(anima_chat),
7b→CLMConvMoE-7B, chat→ConsciousLMReconstructed, agent→no-loader ⏳.
p7 LIVE(CPU): chat 18M downloaded+loaded, '안녕! 너는 누구야?' →
'좋아요! 요즘 새로 오픈한 café가 있는데 분위기가 좋아요.' (coherent).
p1-p4 HELD (no system-prompt/persona injection). a_core_engine_map preserved.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…EGATIVE (Lane G GPU) (#1835)

* WIP: tooluse argcopy lever — storm-survival checkpoint (corpus redesign + re-fire pending)

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* WIP: argcopy corpus generator + A/B harness (pre-fire)

corpus: large fresh-key space (2878 distinct, mean_reuse 1.25) forces verbatim
arg-copy; held-out PB leak=0, fab=0, philosophy-grep=0, balanced sentinels.
harness: F-TOOLUSE-ARGCOPY (correct_call>=0.50 AND grounding>=0.50) + 2 mirrors.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* WIP: argcopy fire driver (summer/RTX5070, HF-pull base+corpus, equal-byte A/B)

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* verdicts+design: argcopy 🔴 CLOSED-NEGATIVE (correct_call 0/36 unchanged)

F-TOOLUSE-ARGCOPY FAIL @ 18M — corpus-forced verbatim copy does NOT teach
held-out key-binding; model invents a training-shaped key (PB01->P20) instead
of copying the asked PBnn. both anti-Goodhart mirrors PASS (real gap, not leak).
ruled-out axis: copy-from-corpus-distribution ⊥ verbatim held-out key-binding.
lever -> explicit copy-attention/pointer head. design §10 step-4b + step-5 gate.
also: fixed prior tooluse-rung0.tape tape-lint (@discovery/@finding -> valid @d).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* HF: register argcopy corpus(PUBLIC/KOSMOS) + closed-neg ckpt(PRIVATE) + control

corpus dancinlab/anima-agent-lane-argcopy-corpus (PUBLIC, sha ff137ad8, KOSMOS coll)
ckpt   dancinlab/anima-clm-tooluse-argcopy-rung0-byte-18m (PRIVATE, sha e16c0826, closed-neg)
no_grammar control local-only (sha ed4c061d). summer workdir torn down post-harvest.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…st v2-unified model (Lane G GPU) (#1836)

* WIP(lane-g default-rung0): default-lane rung-0 trainer — v2 unified corpus byte-encode + p7 multilingual eval

default-lane rung-0 firing: FIRST model on dancinlab/anima-corpus-5lang-unified-v2
(sha256 550fed...538ad). Trainer = exact ConsciousLMReconstructed 18M arch that made
anima-clm-chat-rung0-byte-18m. from-scratch. Lane G / GPU (pool host RTX 5070).
WIP commit for storm survival.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* domain(CORPUS default-lane): rung-0 fire: 🟢 F-DEFAULT-LANE-CHAT, first v2-unified model (Lane G GPU)

The first model of the default lane. The v2 unified corpus (dancinlab/anima-corpus-5lang-unified-v2,
sha550fed...538ad, 13,107,309B) had been dataset-only until now; no model had ever been trained on it.
This fires that default-lane rung-0.

· Architecture/trainer = exactly the ConsciousLMReconstructed 18M byte model
  (vocab256/d384/6L/4H/block256, dual engine_a/g FFN + dual head_a/g) that produced
  anima-clm-chat-rung0-byte-18m. from-scratch (clean default-lane rung). chat-rung0 used the
  OLD 70wiki/30dialogue mix: a separate lineage, stated honestly.
· Substrate = Lane G / GPU (a_lane_akida_gpu_split tag=GPU). pool host RTX 5070, nvidia-smi
  during training 99% util / 250W / 2652MiB: GPU-resident, no CPU fallback. torch-cuda 2.11+cu130
  REFERENCE lane stated honestly, forge-native = canonical follow-on (not claimed complete). $0 leak-free.
· 6000 steps batch32 block256 AdamW lr3e-4 cosine warmup300 seed42 · CE 5.7233→0.6983 (~431s).
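
For orientation, a cosine schedule with the hyperparameters above (lr 3e-4, warmup 300, 6000 steps) might look like the sketch below. Linear warmup and decay to zero are assumptions; the trainer's exact schedule is not shown in this log.

```python
import math

def lr_at(step: int, base_lr: float = 3e-4, warmup: int = 300,
          total: int = 6000, min_lr: float = 0.0) -> float:
    """Learning rate at `step`: linear warmup, then cosine decay (assumed shape)."""
    if step < warmup:
        return base_lr * (step + 1) / warmup
    progress = min(1.0, (step - warmup) / max(1, total - warmup))
    return min_lr + 0.5 * (base_lr - min_lr) * (1.0 + math.cos(math.pi * progress))

peak = lr_at(300)         # first step after warmup
midpoint = lr_at(3150)    # halfway through the cosine phase
final_lr = lr_at(6000)    # end of training
```

The peak equals the base rate, the midpoint is half of it, and the schedule ends at `min_lr`.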

VERIFY (p7 simple-stack, NO perplexity; both evaluators recorded honestly):
· LENIENT (as-shipped) = Goodhart trap: the mirror passes 5/5 with str.isprintable() soup
  (anti_goodhart_ok=FALSE). The mirror caught it = the mirror did its job.
· STRICT (honest verdict) = C0-control<0.02 + letter-ratio>=0.65 + no-soup-run<=4 (purely structural,
  not perplexity): TRAINED PASS 4/5 (es/de/ko/fr coherent text) · random-init mirror
  FAIL 0/5 (control-byte soup). anti_goodhart_ok=TRUE chat_pass=TRUE. Both preserved verbatim.

RULING: 🟢 GREEN coherent multilingual @ 18M-on-v2-default. scope = 18M toy/small only,
mid/7B transfer UNVERIFIED (a_scale_honest_scope; 7B deferred #1828). CE is one axis; perplexity
is not ground truth (p7). p1–p6 HELD (no system-prompt/persona-injection/RLHF; plain <speaker>:
continuation, tag-grep=0).

SHIP (a_fire_recover_complete): ckpt sha 4285bf35...fc476 (74135893B) sha-verify BEFORE
teardown → HF dancinlab/anima-clm-default-lane-rung0-byte-18m PUBLIC (re-download sha MATCH,
private=false VERIFIED, CLM collection joined). HF.jsonl row + .verdicts/default-lane-rung0/
(verbatim) + .discoveries + CORPUS.md lane registry note.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…w, effectful env-gated (default SAFE) (#1837)

* WIP: role-tool wiring start (storm survival)

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* wip: role-tool wiring (registry_full + safe/effectful dispatch) — pre-build

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* fix: byte-accurate multibyte prefix assertions in wiring smoke

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* add F2 effectful armed-path probe (env-gated write proof)

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* domain(AGENT): wire REAL role tools into exec_real_tool — safe fires now, effectful env-gated (default SAFE)

- tool_call_grammar.hexa §5: tool_registry_full() (toy ∪ safe ∪ effect; entry =
  #{tier,surface_fn,effectful,kind}) + exec_safe_real_tool (think/repo_status/
  web_search/file_read/grep/market_scan, real read-only) + exec_effectful_tool
  (file_write/run_tests/code_run/desktop_action/git_commit/git_push/publish/
  merchant_order/live_trade) gated by effectful_armed()=ANIMA_TOOLS_EFFECTFUL.
- agent_loop.hexa::exec_real_tool: three-way dispatch by registry_kind
  (toy→deterministic / safe→real / effect→env-gated / unknown→honest ‹not wired›).
  Tier gate reuses tool_gate.tool_allowed (applied before exec — NO 2nd gate).
- Effectful tools DEFAULT SAFE: env unset → honest ‹effectful tool gated…› refusal,
  never a fabricated effect (mouth is unverified, grounding 0/36 fixed elsewhere).
- a_core_engine_map preserved: surfaces invoked inside the loop; result re-enters
  ONLY via kosmos anchor→brain_emit; call exits ONLY via generator L3.
- Smokes ($0, null backend, no model/GPU): ⓑ grounded fact_lookup 5/5 PASS (NO
  regression), ⓓ wiring 4/4 PASS (web_search route · file_write F1/F3 gate · unknown),
  F2 armed probe (default→no write / ANIMA_TOOLS_EFFECTFUL=1→real write byte-verified).
- docs: AGENT/CORE/TOOL_WIRING.md (safe-vs-effectful + env-gate table).
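
The default-SAFE dispatch can be pictured as below. Only the ANIMA_TOOLS_EFFECTFUL variable, the tool names, and the ‹not wired› stub come from the commit text; the three-entry registry, the "== '1'" comparison, the return strings, and the gated-refusal wording are hypothetical.

```python
import os

# Hypothetical miniature registry: tool name -> kind.
REGISTRY = {"fact_lookup": "toy", "web_search": "safe", "file_write": "effect"}

def effectful_armed(env=None) -> bool:
    """Effectful tools are armed only when the env var is explicitly set to '1' (assumed)."""
    env = os.environ if env is None else env
    return env.get("ANIMA_TOOLS_EFFECTFUL") == "1"

def dispatch(tool: str, env=None) -> str:
    """Route by registry kind; refuse effectful tools unless armed."""
    kind = REGISTRY.get(tool)
    if kind is None:
        return "‹not wired›"
    if kind == "effect" and not effectful_armed(env):
        return "‹effectful tool gated›"   # placeholder refusal text
    return f"run:{kind}:{tool}"

unarmed = dispatch("file_write", {})
armed = dispatch("file_write", {"ANIMA_TOOLS_EFFECTFUL": "1"})
```

With the variable unset the write is refused rather than simulated, which is the property the F2 armed probe checks.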

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
… :: discovery (#1838)

tape-lsp flagged the header (@discovery = unknown lowercase type; 17-type alphabet
needs @d). Rewrote to the valid `@D DEFAULT-LANE-RUNG0 := "..." :: discovery [grade]`
form (matches the tooluse-argcopy.tape convention). Body unchanged. Content intact.

Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…YHEAD-ARGCOPY 🟢 (Lane G GPU) (#1839)

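#1835 🔴 CLOSED-NEGATIVE (correct_call 0/36) established that the 18M byte-LM does call the
tool (call_rate 0.83) but INVENTS a training-distribution-shaped key instead of COPYING the
held-out key, and that adding more copy-shaped corpus does not teach the copy operation.
The lever is an explicit copy/pointer mechanism, not the corpus.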

Fix: bolt a gated pointer-attention copy head onto the VERBATIM #1835
ConsciousLMReconstructed 18M arch: copy query/keys over the causal context positions →
softmax copy-attn → scatter_add onto the 256-byte vocab by input byte → learned sigmoid
gate g. Final distribution P = (1-g)·softmax(lm) + g·copy_dist, with NLL taken on the MIXED
distribution (gate and pointer train jointly with the LM).
+49,665 parameters (18.18M total). aiden RTX 5070 (Lane G GPU, NOT AKIDA), 2500 steps, CE(nll) 0.1354.
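
The mixed distribution P = (1-g)·softmax(lm) + g·copy_dist can be sketched in plain Python as below. This is only an illustration of the formula stated in the commit message: the function names, the scalar `gate_logit`, and the raw `copy_scores` inputs are assumptions, and the real head computes those quantities from learned query/key projections that are not shown here.

```python
import math

VOCAB = 256  # byte vocabulary, as stated in the commit message


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    es = [math.exp(x - m) for x in xs]
    total = sum(es)
    return [e / total for e in es]


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def mixed_distribution(lm_logits, copy_scores, context_bytes, gate_logit):
    """Return P = (1-g)*softmax(lm) + g*copy_dist over the 256-byte vocab.

    copy_scores holds one attention score per (causal) context position;
    context_bytes holds the input byte at each of those positions.
    """
    attn = softmax(copy_scores)
    copy_dist = [0.0] * VOCAB
    for weight, byte in zip(attn, context_bytes):
        copy_dist[byte] += weight          # scatter-add attention onto byte ids
    g = sigmoid(gate_logit)
    p_lm = softmax(lm_logits)
    return [(1.0 - g) * p + g * c for p, c in zip(p_lm, copy_dist)]


def nll(dist, target_byte):
    """Negative log-likelihood taken on the mixed distribution."""
    return -math.log(dist[target_byte])


# Toy call: a flat LM, a pointer that strongly prefers the second context byte.
dist = mixed_distribution([0.0] * VOCAB, [0.0, 5.0], b"PB", gate_logit=2.0)
print(max(range(VOCAB), key=dist.__getitem__) == ord("B"))
```

Because both components are proper distributions and g lies in (0, 1), the mixture sums to 1, so a single NLL term is enough to train the gate, the pointer, and the LM together.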

Verification (verbatim, p7 script-checked, NO perplexity):
  BYTE-EQ (head-OFF == original arch): forward max|Δ|=0.0 / logprob(copy=off) max|Δ|=0.0 -> PASS
  F-COPYHEAD-ARGCOPY      : correct_call 0/36→35/36 (0.9722 ≥0.5) grounding 0.9722 (≥0.5) -> PASS
  F-COPYHEAD-OFF-MIRROR   : same ckpt copy OFF correct_call=0.0 -> PASS (the head does the work, not the LM weights)
  F-COPYHEAD-RANDINIT-MIRROR : random_init grounding=0.0 -> PASS (a learned capability, not leak/cosmetic)
  F-COPYHEAD-NOTOOL-MIRROR   : tool disabled grounding=0.0 -> PASS (REAL grounding)
  RULING: GREEN · terminal_pass=TRUE

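Honest: the single miss is an over-copy PB28→PB288 (the pointer copied one byte too many;
an honest 1/36). The v1 corpus (fixed 3-char keys) taught the head to copy exactly 3-char
spans, so it TRUNCATED 4-char probes (PB01→PB0) → 0/36. v2 fix = variable-length keys
(2-5, including 4) for length-general copy. The head was right all along; the corpus key
lengths had to cover the probe length (serving/agent_lane_argcopy_gen.py).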

scope: TOY 18M only (a_scale_honest_scope); transfer to mid/7B is UNVERIFIED. p1..p8 clean:
the copy head is a content-agnostic architectural copy operator, and 0xFE/0xFF are training
grammar (not identity/persona/role injection).

Release: HF model dancinlab/anima-clm-chat-rung0-byte-18m-copyhead PUBLIC (ckpt sha 7941a538,
HF re-download MATCH, CLM collection) · dataset dancinlab/anima-agent-lane-argcopy-corpus (v2, KOSMOS)
· HF.jsonl registered · .discoveries/agent-tooluse-copyhead.tape · trainer training/tooluse_copyhead_ab.py.
The ckpt (74MB) stays gitignored (a_hf_registry). Harvested from the earlier aiden instance's
fire (not a re-train, a_dont_kill_live_compute); summary.json sha = aiden source-of-truth MATCH.

Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…py residual CLOSED (Lane G GPU) (#1840)

* WIP(copyhead): gated pointer-attention copy head + F-COPYHEAD-ARGCOPY trainer

Lane-G architectural fix for #1835 🔴 (correct_call=0/36). Pointer-net style copy
head bolts onto the VERBATIM ConsciousLMReconstructed 18M arch; env/flag-gated so
head-OFF forward is byte-identical to the original arch.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* WIP(copyhead): harvest, 🟢 F-COPYHEAD-ARGCOPY verification artifacts + v2 corpus fix

Harvest of the fire results from the earlier aiden (RTX 5070, Lane G GPU) instance. The ckpt
(74MB) stays gitignored (to be uploaded to HF). correct_call 0/36 → 35/36 (0.9722); all 4
falsifiers + byte-eq PASS. summary.json sha confirmed to MATCH the aiden source-of-truth.

- serving/agent_lane_argcopy_gen.py: v2 variable-length key (2-5) fix. The v1 fixed 3-char
  keys trained the copy head to copy 3-char spans, which truncated 4-char probes (PB01→PB0)
  and gave correct_call=0; corrected to length-general copy.
- .verdicts/tooluse-copyhead/: verbatim verdict + 4 eval json + fire log.
- state/tooluse_copyhead/out/README.md: HF model card.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* copyhead(harvest): 🟢 F-COPYHEAD-ARGCOPY PASS — design §10 4c + v2 corpus + HF row

Lane-G GPU (pool aiden RTX5070, $0). Gated pointer-attention copy head closes the
#1835 🔴 verbatim key-copy residual: correct_call 0/36 -> 35/36 (0.9722), grounding
0.9722 on held-out PB keys. BYTE-EQ PASS (head-OFF forward max|Δ|=0.0). 3 anti-Goodhart
mirrors PASS (head-OFF copy 0.0, random-init 0.0, tool-disabled 0.0). v2 corpus fix:
variable-length keys (2-5) so the pointer copies the whole key (v1 fixed-3char truncated
PB01->PB0). PUBLIC dancinlab/anima-clm-chat-rung0-byte-18m-copyhead, CLM collection.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
… key-copy is SCALE-EMERGENT (Lane G GPU) (#1841)

* WIP: tooluse copy-scale ladder driver (Lever B) — storm-survival checkpoint

scale-ladder probe for verbatim key-copy emergence (induction-head hypothesis).
same argcopy corpus as #1835, vary model SIZE only (from-scratch each rung),
compute-matched. F-COPY-SCALE = correct_call-vs-size curve on 36 held-out PB keys.
anti-Goodhart: random-init of same size must score 0.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* WIP: copy-scale fire script (aiden/RTX5070, HF-pull argcopy corpus, leak re-check)

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* AGENT tool-use copy-scale ladder (Lever B) — 🟠 F-COPY-SCALE: verbatim key-copy is SCALE-EMERGENT (Lane G GPU)

Lever-B of the 2-lever argcopy-residual probe. Sibling of #1835 (single-size 18M,
correct_call 0/36 🔴). Tests the induction-head hypothesis: does verbatim held-out
key-copy EMERGE with model scale for the PLAIN byte-CLM (no copy head — that's Lever A)?

Method: SAME argcopy corpus (dancinlab/anima-agent-lane-argcopy-corpus, sha ff137ad8,
leak=0) + SAME steps/batch/block/lr (compute-matched), vary SIZE only, every rung
from-scratch (the 18M base can't seed a larger model). POOL host aiden RTX 5070, $0,
nvidia-smi 99-100% busy, NO CPU fallback.

F-COPY-SCALE {size→correct_call} VERBATIM (p7, 36 held-out PBnn keys):
  5.52M  = 0/36 (0.0)
  18.13M = 0/36 (0.0)   ← matches #1835 18M
  42.54M = 0/36 (0.0)
  82.69M = 2/36 (0.0556)  ← FIRST non-zero
  142.51M= 7/36 (0.1944)  ← ~3.5x rise
rising_monotone=True · reaches_bar(≥0.5)=False · randinit_all_zero=True (anti-Goodhart).

RULING 🟠 AMBER — copy emergence TRENDING. Verbatim held-out arg-copy is SCALE-EMERGENT:
absent below ~80M (the #1835 toy-18M closed-neg was correct + correctly scoped), then
rises. r4 verbatim-copied 7 unseen keys (PB01/02/05/07/17/31/32) end-to-end; smaller
rungs invent a training-shaped key (PB31→fact_lookup PB3). call_rate ~1.0 throughout —
what scales is the COPY, not the calling.

honest (a_scale_honest_scope): POOL caps at consumer VRAM (r4 peak 10.54G < 11G cap;
r5 d1024 would exceed it). A TRUE 7B was NOT run on the pool — 0.1944 is a 142.5M result,
NOT a 7B result. RECOMMENDATION: a true-7B-on-H100 confirm is the next rung; if 7B still
short, the Lever-A copy/pointer head is the structural fix. p1..p8 held; DESCENT 🟢.

HF: r4 dancinlab/anima-clm-tooluse-copy-scale-r4-byte-142m (PRIVATE, sha 5d361dcc) +
r3 ...-r3-byte-83m (PRIVATE, sha 0443efd7) — intermediate scale-ladder rungs.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…ON CUDA-graph util lever CLOSED-NEGATIVE (substrate=GPU) (#1843)

* WIP(ENGINE 3B Lane G): skeleton for HEXA-FUSION preflight STOP verdict

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* domain(ENGINE+CLM+KOSMOS): Lane-G 3B forge PREFLIGHT STOP — HEXA-FUSION CUDA-graph util lever CLOSED-NEGATIVE (substrate=GPU)

HARD PREFLIGHT GATE => STOP ($0, no GPU rented). anima Lane-G forge trainer IS the
hexa-lang clm_prod binary; the CUDA-graph capture/replay util unblock (a_cuda_graph_train)
is falsified upstream: GRAPH=0 11.85% -> GRAPH=1 13.17% (+1.32pp byte-eq), whole-step
(AdamW-in-graph) 13.54%, median 2% — all FAR under the 20% GREEN gate. ROOT: the binding
constraint is the SERIAL fine-grained kernel DAG, not host launch overhead; util-GREEN not
reachable by graph capture of any region. Corroborated by anima FORGE-UTILGREEN lever-1..5
(all RED byte-eq, lever-5 WORKLOAD-BOUND TERMINAL) + the rung A-1 forge-interpreter wall.

Phase-1 util-gate config is KNOWN util-RED so NOT fired (re-confirm-at-cost forbidden,
a_completeness_over_cheap); Phase-2 production + 7B gated behind util-GATE GREEN the current
lever family cannot pass. util-GREEN NOT fabricated. Inbox dependency already filed upstream
(hexa-lang/inbox/patches/anima-laneg-forge-util-fusion-binding.md, a_runpod_inbox) — no
anima-side patch (no workaround; anima just invokes clm_prod). byte V=256; production corpus
WOULD be v2 default-lane (12.5MB) not 402KB (Phase-2 not reached).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
v3 = v2 3-surface recipe scaled ~17x (wiki 100MB + persona 80MB + enrich 37MB)
to make a MID ~150M rung data-viable. NOT 7B-ready (a_scale_honest_scope).
Reuses v2 generators/sampler/merge unchanged; budget-parametric.
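
The scope claim can be checked with back-of-envelope arithmetic. The sketch below assumes the common Chinchilla rule of thumb of about 20 training tokens per parameter and one token per byte for a byte-level model; both are assumptions for illustration, and the v2 size is the rounded "12.5MB" figure treated as decimal megabytes.

```python
TOKENS_PER_PARAM = 20            # assumption: Chinchilla rule of thumb
CORPUS_V2_BYTES = 12_500_000     # rounded "12.5MB" figure (assumed decimal MB)
CORPUS_V3_BYTES = 227_535_193    # default_lane_v3.txt size from the PR description


def target_bytes(params):
    """Compute-optimal data budget in bytes (1 byte = 1 token for a byte-LM)."""
    return params * TOKENS_PER_PARAM


def epochs_needed(params, corpus_bytes):
    """How many passes over the corpus it takes to reach the budget."""
    return target_bytes(params) / corpus_bytes


for name, params in (("MID ~150M", 150_000_000), ("7B", 7_000_000_000)):
    print(name,
          f"budget={target_bytes(params) / 1e9:.0f}GB",
          f"v2 epochs={epochs_needed(params, CORPUS_V2_BYTES):.0f}",
          f"v3 epochs={epochs_needed(params, CORPUS_V3_BYTES):.1f}")
```

Under these assumptions the 150M rung still loops over v3 roughly 13 times, versus 240 passes over v2, while 7B needs about 140GB, which is why v3 is scoped to the MID rung only.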

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…MB, MID rung viable, NOT 7B (a_scale_honest_scope)

v2 3-surface recipe scaled ~17x (12.5MB -> 216.994MB) so a MID ~150M rung becomes
data-viable. wiki 46.15% (100MB REAL CC-BY-SA wikimedia/wikipedia 20231101 en/fr/de/es/ko,
20MB/lang, 8-band offset-spread, $0 CPU REST NO GPU NO pod) / persona 36.90% (authored,
no PII) / enrichment 16.95% (carving seed=real KOSMOS e7_31, prose=authored).

HONEST GATES PASS: byte-vocab V=256 (206 distinct, all <=255) · UTF-8 round-trip
encode==decode bytes-identical · p2/p3/p4 [role:|[persona:|[character: grep=0 (unified
+ sample head) · p6 dialogue-act NON-supportive. a_scale_honest_scope VERBATIM: v3
unlocks MID ~150M NOT 7B (7B=~140GB tokens INFEASIBLE via REST) -- NO 7B claim.
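
A minimal sketch of how the byte-vocab, UTF-8 round-trip, and marker-grep gates could be checked in Python follows. The function name and report keys are hypothetical, and the actual gate scripts are not shown in this PR; the p6 dialogue-act gate is not covered.

```python
import re

# The three markers whose grep count must be 0 (p2/p3/p4 in the commit message).
MARKER = re.compile(rb"\[(?:role|persona|character):")


def honest_gates(raw: bytes) -> dict:
    """Run the byte-level gates on a corpus held in memory."""
    text = raw.decode("utf-8")            # raises UnicodeDecodeError on invalid UTF-8
    return {
        "distinct_bytes": len(set(raw)),  # the commit reports 206 for v3
        "vocab_ok": max(raw, default=0) <= 255,
        "roundtrip_ok": text.encode("utf-8") == raw,
        "marker_hits": len(MARKER.findall(raw)),
    }


sample = "wiki 문단 with mixed scripts".encode("utf-8")
print(honest_gates(sample))
```

For a 217MB file the same checks would normally be streamed in chunks; reading the whole corpus into memory here just keeps the sketch short.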

New 429-hardened sampler build_wiki_backbone_5lang_scaleup.py (exp backoff + Retry-After
+ per-lang on-disk checkpoint; v2 sampler tripped HTTP 429 on the 100MB pull).
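
The retry policy named above (exponential backoff that also honours Retry-After) can be sketched as below. This is not the code of build_wiki_backbone_5lang_scaleup.py: `TooManyRequests`, `fetch_with_backoff`, and all parameter defaults are hypothetical, and the per-language on-disk checkpoint is left out.

```python
import time


class TooManyRequests(Exception):
    """Hypothetical error a fetch callable raises on HTTP 429."""

    def __init__(self, retry_after=None):
        super().__init__("HTTP 429")
        self.retry_after = retry_after    # seconds from the Retry-After header, if any


def fetch_with_backoff(fetch, max_tries=6, base_delay=1.0, max_delay=60.0,
                       sleep=time.sleep):
    """Call fetch(); on 429 wait and retry, doubling the delay each attempt."""
    for attempt in range(max_tries):
        try:
            return fetch()
        except TooManyRequests as exc:
            if attempt == max_tries - 1:
                raise                     # out of retries: surface the error
            delay = min(max_delay, base_delay * 2 ** attempt)
            if exc.retry_after is not None:
                # Never retry sooner than the server asked for.
                delay = max(delay, float(exc.retry_after))
            sleep(delay)


# Usage with a fake fetch that is throttled twice, then succeeds.
responses = [TooManyRequests(), TooManyRequests(retry_after=5), "rows"]


def fake_fetch():
    item = responses.pop(0)
    if isinstance(item, Exception):
        raise item
    return item


waits = []
print(fetch_with_backoff(fake_fetch, sleep=waits.append), waits)
```

Injecting `sleep` makes the policy testable without real waiting; a production sampler would also persist its page offset per language so a restart resumes instead of re-pulling.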

HF dancinlab/anima-corpus-5lang-unified-v3 PUBLIC (9 files + README, sha 901ccc89
re-download MATCH, private=False VERIFIED) + KOSMOS collection (membership VERIFIED)
+ HF.jsonl row + corpus_5lang_v3.kosmos anchor (tier 55) + KOSMOS.md hub pointer.
Discovery .discoveries/default-lane-v3-corpus.tape (tape-lint clean). CORPUS.log.md
v3 entry (consolidated archive; CORPUS.md lane snapshot-table reconcile DEFERRED, concurrent agent
owns lane section). Multi-MB raw text = HF/LOCAL only (explicit-path git add).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>