feat: protlabel EAT engine + protspace transfer subcommand by tsenoner · Pull Request #55 · tsenoner/protspace

tsenoner · 2026-06-11T21:18:18Z

Summary

Backend for Embedding Annotation Transfer (EAT) — the engine from #54, packaged so the conference users' proximity-mining workflow becomes a thin layer on top rather than a parallel reimplementation.

New protlabel package (numpy/scipy/h5py only, strict no-protspace-imports boundary): kNN in true pLM embedding space + goPredSim reliability index (RI = 0.5/(0.5+d), Eq. 5) + a persistable .npz lookup sidecar. Ships as a second top-level package in this repo (built into the wheel); a future standalone PyPI split is mechanical.
New protspace transfer subcommand: classifies query vs reference proteins (ID-prefix / col~substr, no hardcoded biology), transfers each query's missing annotation value from its nearest annotated reference, and writes a per-cell overlay into the bundle.
Overlay format: appends <col>__pred_value (string), <col>__pred_confidence (float32, RI in [0,1]), <col>__pred_source (string) — the curated <col> is left untouched, and the bundle keeps its protein_id id column, so existing web readers stay compatible.
Defaults: Euclidean (cosine opt-in via --metric), k=1. Distances are computed in the original embedding space (HDF5), not in the 2-D/3-D projection (DR is non-isometric).
Storage: the reference matrix is a rebuildable sidecar, never shipped in the bundle (sizing/feasibility in the spec); brute-force kNN is laptop-feasible to full Swiss-Prot, with adaptive per-chunk memory bounding.

Design & scope

Spec: docs/superpowers/specs/2026-06-11-eat-annotation-transfer-design.md
Plan: docs/superpowers/plans/2026-06-11-eat-transfer-backend.md
Out of scope (follow-ups): the web frontend rendering (separate protspace_web PR — a value-level "predicted-by-transfer" layer orthogonal to PR #272's column-level badge), optional gating/consensus/EDD elbow, neighborhood mining, HTML report, faiss-cpu accelerator, ProtTucker learned distance.
Implements the backend scope of [FEATURE] EAT — Embedding Annotation Transfer (protlabel lookup table) #54.

Test plan

uv run pytest tests/ -m "not slow" → 545 passed
protlabel boundary: no protspace imports
uv run ruff check src/ tests/ clean
End-to-end: real protein_id bundle round-trip through the CLI (load_h5 → transfer → write) — overlay values correct, projection + settings parts preserved byte-for-byte
Reviewer: sanity-check on a real ProtT5 dataset (RI is ProtT5-calibrated; monotone-but-uncalibrated for other embedders)

🤖 Generated with Claude Code

…entation plan

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

…tion Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

Add `add_overlay_columns()` in `src/protspace/data/io/predictions.py` that appends three aligned Arrow columns (`COL__pred_value`, `COL__pred_confidence`, `COL__pred_source`) from a list of `protlabel.Prediction` objects, leaving the curated column untouched. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

Implements Task 9: the EAT orchestration core (run_transfer) and the 'protspace transfer' Typer CLI command, wiring classification, nearest- neighbour lookup (protlabel.eat), and overlay-column writing into a single pipeline for filling missing annotation values from pLM embedding space. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

…rrors

- Normalize protein_id→identifier before run_transfer and rename back after so real bundles (produced by protspace prepare) no longer KeyError.
- Add ValueError when no bundle proteins match any embedding key.
- Correct misleading comment in test_run_transfer_predicts_for_query_with_missing_value.
- Add end-to-end regression test exercising the protein_id rename path.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

…ty, robustness

Resolve issues found in code review of the EAT transfer backend (PR #55):

- predictions: make the overlay idempotent: drop existing <col>__pred_* columns before re-appending, so re-running transfer replaces them instead of producing a duplicate-column bundle that can no longer be read back
- bundle: atomic writes (temp file + os.replace) in write_bundle and the replace_* helpers, so an interrupted in-place overwrite (-b X -o X) can no longer destroy the bundle; reject the reserved delimiter in serialized parts
- backends: replace scipy.cdist with a pure-numpy BLAS GEMM path and recompute the surviving top-k distances in float64 (precise for near-identical vectors); guard cosine against zero-norm NaN
- lookup: store float32 + unicode arrays, load with allow_pickle=False (no pickle/RCE surface; lossless round-trip)
- transfer/classification: materialize only the needed columns (no full to_pylist); deterministic RI tie-break; translate input errors to BadParameter
- cli: colon/Windows-safe -e/-i parsing via a shared split_h5_spec helper
- docs/notebook: qualify the reliability-index formula per metric and k

Adds tests for protlabel engine, overlay idempotency, atomic write, spec parsing, and CLI error handling. Full suite: 572 passed; ruff clean.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
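For readers unfamiliar with the "temp file + os.replace" pattern named in the bundle fix, here is a minimal sketch. It is an illustration only, not the code in write_bundle; the helper name `atomic_write_bytes` is hypothetical.

```python
import os
import tempfile
from pathlib import Path


def atomic_write_bytes(path: Path, payload: bytes) -> None:
    """Write payload so readers see either the old file or the new one."""
    path = Path(path)
    # The temp file lives in the target directory: os.replace is only
    # atomic when source and destination are on the same filesystem.
    fd, tmp_name = tempfile.mkstemp(dir=path.parent, prefix=path.name, suffix=".tmp")
    try:
        with os.fdopen(fd, "wb") as handle:
            handle.write(payload)
            handle.flush()
            os.fsync(handle.fileno())  # push bytes to disk before the rename
        os.replace(tmp_name, path)  # atomic swap over any existing file
    except BaseException:
        # The original file is untouched; remove the partial temp file.
        if os.path.exists(tmp_name):
            os.unlink(tmp_name)
        raise
```

With this shape, an interrupted in-place overwrite (the `-b X -o X` case) leaves the previous bundle intact, because the destination is only touched by the final rename.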

…onfidence The per-cell prediction overlay now writes only <col>__pred_value and <col>__pred_confidence. The reference id (source) is noise as a colour feature, so it is dropped from the bundle; it remains available on protlabel's Prediction. A legacy <col>__pred_source is dropped on re-run so older bundles are cleaned up. Keeping confidence as a separate numeric column lets the web frontend colour and threshold by reliability (gradient legend) — which inline label|score values do not enable (those render tooltip-only). Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
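The two-column overlay and its re-run cleanup can be sketched as below. This is a simplified illustration: a dict of equal-length lists stands in for the Arrow table, and the function name `apply_overlay` and the `(value, confidence)` tuple layout are assumptions, not the signature of `add_overlay_columns()`.

```python
from typing import Dict, List, Optional, Tuple

Table = Dict[str, list]  # column name -> values; all columns share one length


def apply_overlay(
    table: Table,
    col: str,
    predictions: Dict[str, Tuple[str, float]],  # protein id -> (value, confidence)
    id_col: str = "protein_id",
) -> Table:
    """Return a copy of table with <col>__pred_value and <col>__pred_confidence."""
    prefix = f"{col}__pred_"
    # Idempotency: drop any earlier overlay for this column, including a
    # legacy <col>__pred_source, so a re-run replaces instead of duplicating.
    out = {name: list(vals) for name, vals in table.items() if not name.startswith(prefix)}
    values: List[Optional[str]] = []
    confidences: List[Optional[float]] = []
    for pid in out[id_col]:
        value, conf = predictions.get(pid, (None, None))
        values.append(value)
        confidences.append(conf)
    out[prefix + "value"] = values
    out[prefix + "confidence"] = confidences
    return out
```

The curated `<col>` is copied through unchanged, and rows without a prediction get nulls in both overlay columns.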

t03i · 2026-06-25T08:19:48Z

+
+The reliability index depends on the `--metric` and `--k` used during transfer:
+
+- **Default (`--metric euclidean`, `--k 1`):** `0.5 / (0.5 + distance)`.


default should be cosine.

Done in 7ea9eeb — protspace transfer now defaults to --metric cosine (bounded, interpretable reliability index); euclidean stays opt-in. cli.md / annotations.md / the notebook are updated.

t03i · 2026-06-25T08:22:30Z

+Verified against primary sources (goPredSim / Littmann et al. *Sci Rep* 2021; EAT tool / Heinzinger et al. *NAR Genom Bioinform* 2022):
+
+- **Space:** original pLM embedding space (mean-pooled per-protein vectors). **Not** DR coordinates.
+- **Metric:** **Euclidean (L2)**, default. *Nuance (verifier correction):* the strong "Euclidean beats cosine for pLM embeddings" statement is from the **2022** paper (citing prior work); the **2021** paper only found cosine "changed little." Euclidean is still the right default because it is the canonical tool default and the documented 2022 finding — but the basis is "tool convention + 2022 claim," not "both papers." Cosine stays an opt-in `--metric`.


This is outdated. Most of the RAG body of work suggests cosine.
While it has clear weaknesses, it should still be the default.

Default flipped to cosine. The design doc was rewritten to as-built and now states cosine-by-default with the rationale (7ea9eeb).

t03i · 2026-06-25T08:24:24Z

+  ```
+
+  For the default k=1 this collapses to `RI = 0.5/(0.5 + d)`. The `(1/k)·Σ_{neighbours carrying p}` term *is* the multi-neighbour agreement weighting; report `RI` directly as the `[0,1]` confidence.
+- **Distance→accuracy calibration (reference point, ProtT5/CATH):** at Euclidean distance ≤ 1.1, ~75% coverage with ~90% accuracy at CATH H-level; ProtTucker (contrastive) reaches ~76% H-level vs raw ProtT5 EAT ~64% and HMMER ~77%. **Caveat (critical):** the `0.5` constant in `s(d)` and the `1.1` threshold are **ProtT5-specific**. ProtSpace supports 12 embedders (320–2560 dim) with different distance scales — RI stays *monotone* (good for ranking) but is **not a calibrated probability** for other models without re-validation. Document this loudly.


Also, this only holds for one specific dataset. That is not a reliable foundation to base reasoning on. We'd need to run our own experiments or refer to this only loosely.

Removed the dataset-specific accuracy numbers. The rewritten spec only notes euclidean RI is monotone-but-not-calibrated and points to the small measured sanity-check in data/eat_demo/ (7ea9eeb).
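The k-neighbour formula quoted in the hunk above, `RI(p) = (1/k)·Σ s(d)` over the neighbours carrying label `p`, with the euclidean `s(d) = 0.5/(0.5 + d)`, can be worked through in a few lines. The function name `label_reliability` is hypothetical, and the sketch assumes non-negative distances.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def label_reliability(neighbours: List[Tuple[str, float]]) -> Dict[str, float]:
    """neighbours: (label, euclidean distance) for the k nearest references."""
    k = len(neighbours)
    if k == 0:
        return {}
    totals: Dict[str, float] = defaultdict(float)
    for label, distance in neighbours:
        totals[label] += 0.5 / (0.5 + distance)  # s(d) for one neighbour
    # Dividing by k (not by the per-label count) is what rewards agreement:
    # a label carried by only one of k neighbours is capped at 1/k.
    return {label: total / k for label, total in totals.items()}
```

For k=1 this reduces to `0.5/(0.5 + d)` for the single neighbour's label, matching the collapse stated in the hunk.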

t03i · 2026-06-25T08:25:07Z

+
+**Output contract (mirror eat.py for interoperability):** per query → `query_id`, transferred `label`, `source_id` (nearest reference), `source_label`, `distance`, `reliability`. Accept goPredSim's 2-column `id → comma-separated labels` lookup-label file so existing EAT/goPredSim lookups drop in.
+
+**Optional upgrade path (documented, not built first):** ProtTucker-style contrastive projection or CLEAN-style EC centroids as a future "learned distance" mode. Ship raw-embedding Euclidean EAT first — it needs no training and is the published baseline.


This goes far beyond EAT and is not planned for protspace.

Dropped — the ProtTucker/CLEAN learned-distance path is removed and now listed as an explicit non-goal (7ea9eeb).

t03i · 2026-06-25T08:25:32Z

+    ├── protlabel/                     # NEW second top-level package — the EAT engine (issue #54)
+    │   ├── __init__.py                # public API: eat(), Lookup, Prediction
+    │   ├── reliability.py             # goPredSim distance→[0,1] reliability transform
+    │   ├── backends.py                # brute-force (default) | faiss (optional, later) NN search


The brute force as it is right now works and is fast. However, we might want to implement query batching to speed up / parallelize computation.
Faiss is not the right alternative; see https://pypi.org/project/usearch/
However, my preliminary tests on a resource-constrained box suggest brute force is faster. Worth testing a bit more, but this is only relevant if we support large lookup sets.

Removed faiss from the design. The shipped engine is exact brute-force with query batching; usearch is noted as the future ANN option if scale ever requires it. A dedicated brute-force-vs-usearch scaling study is planned next (7ea9eeb).

t03i · 2026-06-25T09:03:58Z

+**Brute-force kNN is laptop-feasible across the entire range, including full Swiss-Prot.** Measured (Apple Silicon, chunked numpy GEMM + argpartition; reproduced by an independent verifier within ~10–25%):
+
+| Query batch × references × dim | wall time |
+|---|---|
+| 1,000 × 100K × 1024 | ~0.8–0.9 s |
+| 1,000 × 573K × 1024 | ~4–4.6 s (~4 ms/query) |
+| 1,000 × 573K × 2560 | ~6 s (~6 ms/query) |
+| single query × 573K | ~4–6 ms |
+


This is highly parallelized across many cores with plenty of RAM. A realistic target is a virtual machine with a 4-core Intel CPU and 4 GB RAM.
Also, 6 s is slow for a deployed solution. Batching will give vastly better results (e.g. 128 lookups at once in parallel).

Removed the Apple-Silicon benchmark table; the doc no longer makes hardware-specific throughput claims. Query batching is in the shipped engine (7ea9eeb).

Re-measured on the real envelope: a docker --cpus=4 --memory=4g container, one fresh process per config. Full Swiss-Prot (570K x 1024): ~9.7 ms/query euclidean, ~7.4 ms/query cosine on 4 arm64 cores (expect ~2-3x on a slower Intel VM, still fine for batch transfer). Query batching is in the engine (a whole query block per GEMM). Reproducible benchmark + numbers in docs/superpowers/research/2026-06-29-usearch-vs-bruteforce.md (d5023ae).
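To make "a whole query block per GEMM" concrete, here is a minimal sketch of exact 1-NN euclidean search with the reference axis chunked. It is not the code in backends.py; the function name `nearest_euclidean` and the chunk size are assumptions, and it handles k=1 only.

```python
import numpy as np


def nearest_euclidean(queries: np.ndarray, refs: np.ndarray, chunk: int = 4096):
    """Exact 1-NN of every query row among refs; returns (indices, distances)."""
    q = np.asarray(queries, dtype=np.float32)
    q_sq = np.einsum("ij,ij->i", q, q)  # ||q||^2 per query
    best_d2 = np.full(q.shape[0], np.inf, dtype=np.float32)
    best_idx = np.zeros(q.shape[0], dtype=np.int64)
    rows = np.arange(q.shape[0])
    for start in range(0, refs.shape[0], chunk):
        r = np.asarray(refs[start:start + chunk], dtype=np.float32)
        r_sq = np.einsum("ij,ij->i", r, r)
        # ||q - r||^2 = ||q||^2 - 2 q.r + ||r||^2; q @ r.T is the one GEMM
        # per chunk, and only a Q x chunk block is ever materialized.
        d2 = q_sq[:, None] - 2.0 * (q @ r.T) + r_sq[None, :]
        local = d2.argmin(axis=1)
        local_d2 = d2[rows, local]
        better = local_d2 < best_d2
        best_d2[better] = local_d2[better]
        best_idx[better] = start + local[better]
    # The expanded form loses precision for near-identical vectors, so the
    # winning distances are recomputed directly in float64.
    diff = q.astype(np.float64) - np.asarray(refs[best_idx], dtype=np.float64)
    return best_idx, np.sqrt(np.einsum("ij,ij->i", diff, diff))
```

Peak extra memory is governed by `Q * chunk` floats rather than `Q * N`, which is the property that matters on a 4 GB target.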

t03i · 2026-06-25T09:07:32Z

+| 1,000 × 573K × 2560 | ~6 s (~6 ms/query) |
+| single query × 573K | ~4–6 ms |
+
+**The binding constraint is RAM (to hold the reference matrix), not compute.** Mitigation: load the reference as fp16 and upcast per chunk, chunk the N axis so the Q×N distance block never materializes at full size. This stays within a 16 GB laptop at D=1024 and is borderline-but-workable at D=2560. Older Intel/CI machines run ~2–5× slower but stay sub-minute for a few queries at Swiss-Prot scale.


Not relevant: we're working with 4 GB deployed / 64 GB Colab.

Removed the 16GB framing — no longer relevant in the rewritten doc (7ea9eeb).

Tested against the 4GB target directly, which surfaced a real issue: the cosine path held the reference matrix twice (raw + normalized copy), so cosine at Swiss-Prot / dim 1024 needed ~4.7 GB and would OOM a 4 GB box. Fixed in d5023ae by folding the per-reference norm into the dot product, so cosine now holds 1x references like euclidean. Measured in a 4 GB container: full Swiss-Prot (570K x 1024) fits at ~3 GB peak for both metrics. dim 2560 (ESM2-3B) references are ~5.8 GB and still exceed 4 GB (smaller model or fp16 there).
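The memory fix described above (fold the per-reference norm into the dot product instead of keeping a normalized copy) can be sketched as follows. This is an illustration, not the code in backends.nearest: the function name is hypothetical, the zero-norm handling (distance 1) is an assumption, and a real implementation would chunk the reference axis rather than build the full Q x N block.

```python
import numpy as np


def cosine_distances(queries: np.ndarray, refs: np.ndarray) -> np.ndarray:
    """Cosine distance matrix (Q x N) without a normalized copy of refs."""
    q_norm = np.linalg.norm(queries, axis=1)  # shape (Q,)
    r_norm = np.linalg.norm(refs, axis=1)     # shape (N,): N floats, not N x D
    dots = (queries @ refs.T).astype(np.float64)  # raw dot products
    denom = q_norm[:, None] * r_norm[None, :]
    # cos = q.r / (||q|| ||r||). Zero-norm vectors have no direction, so
    # their similarity is left at 0 (distance 1) instead of becoming NaN.
    # That choice is an assumption of this sketch.
    sim = np.divide(dots, denom, out=np.zeros_like(dots), where=denom > 0)
    return 1.0 - np.clip(sim, -1.0, 1.0)
```

The only per-reference state beyond the raw matrix is the length-N norm vector, which is why cosine ends up holding 1x references like euclidean.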

t03i · 2026-06-25T09:09:56Z

+## 9. Frontend representation (extends PR #272, does not duplicate it)
+
+**Two orthogonal axes — codify this mental model:**
+
+- **Axis A (existing, #272): column-level provenance** — "this whole column is a model output" (Biocentral / Phobius / TED). Keep `AnnotationMeta.isPredicted`, the ⚡ dropdown/legend badge, and the info-popover **unchanged**.
+- **Axis B (new, EAT): cell-level provenance** — "this specific protein's value was *transferred from a neighbour*, confidence X, source Y." New visual language below. Never overload the ⚡ badge to mean both.
+
+### 9.1 Scatter plot — the primary cue is *shape*, not colour
+
+- **Observed/curated cells → filled markers** (current behaviour). **EAT-imputed cells → hollow (outline-only) markers in the same category hue**, so cluster identity is preserved while provenance reads at a glance. This is an established convention (filled = observed, open = imputed) and satisfies "never colour-only" (accessibility; ~4% CVD).
+  - Implementable in the existing WebGL renderer: add a per-point `a_predicted` float attribute (mirror the existing `a_shape` plumbing) and a ring-only branch reusing the current edge-distance/outline math (`strokeWidth = 0.15`, `webgl-renderer.ts`). No shader rewrite.
+- **Confidence → redundant opacity (and optional size) ramp on imputed points only.** `alpha = lerp(0.25, 0.9, confidence)`; observed points stay at `baseOpacity 0.9`. Optionally scale size by `sqrt(confidence)`. For very low confidence (<0.3), desaturate toward grey (lightweight VSUP). Hooks: `getOpacity`/`getBaseOpacity`/`getPointSize` in `style-getters.ts`.
+
+### 9.2 Tooltip — per-point provenance line
+
+Extend `AnnotationBlock` + `renderAnnotationBlock` (`protein-tooltip.ts`) with an EAT row, distinct from observed values:
+
+> ⚡ **Predicted:** Neurotoxin (82%) — transferred from **P12345** via ProtT5, k=1
+
+with an inline confidence bar and the source id as a **click target** that selects/centres that reference in the scatter. Observed values render exactly as today (no chip).
+
+### 9.3 Legend — a separate "Predicted (transferred)" sub-section
+
+When the active annotation has any imputed cells, render a small group with two swatches — **filled = "Observed"**, **hollow = "Predicted by EAT"** — and a note "Faint = low confidence", plus live counts ("1,204 shown / 380 below threshold"). Add as a new optional block in `legend-renderer.ts` (alongside `renderHeader`). **Do not** merge into the ⚡ header badge (that is Axis A).
+
+### 9.4 Global control — one "Predicted annotations" group near the dropdown/legend
+
+- **Toggle "Show predicted annotations"** (off → imputed cells render neutral/N-A; only the curated layer shows).
+- **Confidence-threshold slider** 0–100% with conventional bands (High >80 / Med 50–80 / Low <50); below-threshold imputed points **fade** (`fadedOpacity 0.15`) rather than vanish, preserving layout context.
+- Feed `showPredicted` + `minConfidence` into `StyleConfig`; persist in `LegendPersistedSettings` so the choice survives reload/export. Keyboard-operable with `aria-valuetext`.
+
+### 9.5 Data-model extension (frontend)
+
+Mirror the existing parallel-array pattern (`annotation_scores`, `annotation_evidence` in `types.ts`):
+
+```ts
+// VisualizationData (optional, populated only when the bundle carries the overlay)
+annotation_predicted?:        Record<string, (PredictedCell | null)[]>;
+// PredictedCell = { confidence: number; sourceId: string; k?: number; method?: string }
+```
+
+Loader (`data-loader/utils/bundle.ts`) pivots the sparse `predicted_annotations` table into these arrays at parse time. Backward compatible: old bundles lack the table → no overlay; the parser already tolerates unknown columns/parts.
+
+### 9.6 Frontend gotchas to respect
+
+- Multi-label cells: treat a cell as imputed **only if all its values were transferred**; otherwise show observed with a tooltip note.
+- Selection opacity must override confidence dimming (a clicked low-confidence point stays visible).
+- Grayscale/PNG export: hollow-vs-filled must be the load-bearing cue (opacity alone is ambiguous in print). The export path renders the same shader, so hollow survives export — verify at 570K points.
+- This is a **separate frontend PR** (depends on the backend emitting the overlay) and warrants its own OpenSpec change in `protspace_web`, building on #272's `annotation-metadata`/`annotation-presentation` capabilities.
+


This is specced in the wrong repo. It belongs in protspace_web, based on the stabilized column API.

Removed the entire frontend section; it belongs in protspace_web. The spec now lists frontend work as a non-goal (7ea9eeb).

t03i · 2026-06-25T09:14:09Z

+def similarity(distance: float, metric: str) -> float:
+    """Per-neighbour distance->similarity (the goPredSim reliability transform)."""
+    if metric == "euclidean":
+        return 0.5 / (0.5 + distance)


This RI computation is unclear. The distances can routinely be very large even for close neighbors, which makes the RI hard to interpret. Also, distance can be negative but similarity has to be in [0, 1]. That is not accounted for here.

Fixed in reliability.py (7ea9eeb): similarity() clamps to [0,1], treats negative distance as 0, and maps non-finite (NaN/inf) to 0 so an invalid neighbour cannot produce a high confidence. Cosine is now the default (naturally bounded). Regression tests added.

Follow-up, to be precise: the backend never actually emits a negative distance (euclidean is a clamped sqrt; cosine distance is in [0,2]), so the negative-distance clamp is purely defensive for direct callers of similarity() — now stated in the docstring (d5023ae). On the pLM-dependence you flagged: the default is now cosine, whose distance is scale-invariant (depends only on direction, in [0,2] for every embedder), so the large-distance problem does not apply to the default path. The euclidean 0.5/(0.5+d) constant stays ProtT5-calibrated and is documented as a monotone-but-uncalibrated ranking; optional per-pLM euclidean calibration is tracked in #62.
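A minimal sketch of the guarded euclidean transform discussed in this thread. It is not the code in reliability.py: it covers only the euclidean branch, and it reads "treats negative distance as 0" as clamping the distance itself to 0, which is an interpretation rather than something the thread spells out.

```python
import math


def similarity(distance: float, metric: str = "euclidean") -> float:
    """Distance -> [0, 1] reliability for one neighbour (euclidean only)."""
    if metric != "euclidean":
        # The cosine formula is not quoted in this thread, so it is omitted.
        raise ValueError(f"unsupported metric in this sketch: {metric!r}")
    if not math.isfinite(distance):
        return 0.0  # NaN/inf: an invalid neighbour must not look confident
    d = max(distance, 0.0)  # defensive: the backend never emits d < 0
    return min(1.0, max(0.0, 0.5 / (0.5 + d)))
```

For example, `similarity(0.5)` is 0.5 and `similarity(float("nan"))` is 0.0, so an invalid neighbour can never outrank a valid one.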

t03i · 2026-06-25T09:19:52Z

    "requests>=2.32.4",
    "typer>=0.24.1",
    "rich>=14.3.3",
+    "scipy>=1.10",


Why? It seems to be unused anywhere in the project and is extremely heavy.

Removed via uv — scipy was unused (only a docstring + a test comment referenced it). The kNN path is pure numpy (7ea9eeb).

…I, protlabel as uv workspace member

Addresses reviewer (t03i) feedback on the EAT backend:

- Default metric for `protspace transfer` is now cosine (bounded, interpretable confidence); euclidean stays opt-in. The protlabel engine keeps goPredSim-canonical euclidean as its primitive default.
- Reliability index clamps to [0,1], guards negative distance, and maps non-finite (NaN/inf) distances to 0 so an invalid neighbour can't yield a high confidence. (NaN->1.0 bug found by our own xhigh review; redundant clamp dropped.)
- Drop the unused, heavy scipy dependency (only a docstring/test comment referenced it).
- Extract protlabel into a uv workspace member (packages/protlabel) with its own pyproject + dependencies (numpy only), published as its own distribution; protspace depends on protlabel>=4.4.0. No-protspace-imports boundary enforced by a test; lock-step versioning via semantic-release; CI + Docker build both packages.
- Move protlabel's engine tests into the member (packages/protlabel/tests); a bare `uv run pytest` covers both via testpaths.
- Rewrite the design spec to as-built reality (cosine default, brute-force + query batching, workspace architecture); drop the frontend (-> protspace_web), the ProtTucker/faiss speculation, and hardware-specific benchmarks.

Verified: 576 tests pass, ruff clean, `uv build --all-packages` produces both wheels with a clean dependency boundary (protlabel requires only numpy).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

…ucible benchmark Substantiates the brute-force-default decision (PR #55 review): an empirical benchmark (packages/protlabel/benchmarks/bench_knn.py) of protlabel's exact chunked-GEMM kNN vs usearch HNSW across n_refs {1K,10K,100K} x dim {320,1024}, plus literature context and a recommendation. Finding: brute-force wins end-to-end for protspace transfer's one-shot/batch usage (exact, no build, sub-ms to low-ms/query through Swiss-Prot scale). usearch only pays off for a persisted index reused across tens of thousands of queries, or as a memory lever (i8/f16 quantization) at full Swiss-Prot on a 4GB box. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

…ependency removal The scipy dependency was removed earlier; backends.py and a test comment still named scipy.cdist as the comparison baseline. Reword to neutral phrasing so no scipy reference remains in the tree (the kNN path is pure numpy). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

…y 4GB deploy fit The cosine path in backends.nearest held the reference matrix twice (raw + a normalized copy), so cosine at full Swiss-Prot / dim 1024 needed ~4.7 GB and would OOM a 4-core/4 GB deployed box. Fold the per-reference norm into the dot product (cos = q.r / (||q|| ||r||)) instead of storing a normalized copy, so cosine holds 1x references like euclidean. Behaviour preserved (existing cosine equivalence + zero-vector tests stay green); _l2_normalize is now unused and removed. Measured in a docker --cpus=4 --memory=4g container (one fresh process per config): full Swiss-Prot (570K x 1024) now fits at ~3 GB peak for both metrics, ~7-10 ms/query on 4 arm64 cores. Adds packages/protlabel/benchmarks/bench_memory.py and folds the results into the research doc. Also clarifies in reliability.py that the backend never emits negative distances (euclidean is a clamped sqrt; cosine distance in [0,2]) — the guard is defensive. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

tsenoner · 2026-06-30T16:36:43Z

Re-review ready — consolidated summary

All 12 inline threads have in-thread replies; here's the whole picture across 4 commits (7ea9eeb, a7792de, bb0cfc2, d5023ae).

Code / behaviour

Default metric → cosine (bounded, scale-invariant, interpretable confidence); euclidean stays opt-in.
Reliability index clamps to [0,1], guards negative distance, and maps non-finite (NaN/inf) → 0, so an invalid neighbour can no longer report high confidence. (The backend never actually emits negative distances — euclidean is a clamped sqrt, cosine is in [0,2] — so that guard is purely defensive.)
Dropped the unused scipy dependency — the kNN path is pure numpy.

Architecture (the uv-workspace point)

protlabel is now a uv workspace member with its own pyproject.toml + dependencies (numpy only), built and published as its own distribution; protspace depends on protlabel>=4.4.0. The no-protspace-imports boundary is enforced by a test; lock-step release; CI + Docker build both packages.
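One way such an import-boundary test could work is to parse each source file and look for absolute imports of the banned package. The sketch below is an assumption about the approach, not the test in the repository; the helper name `forbidden_imports` is hypothetical.

```python
import ast
from typing import List


def forbidden_imports(source: str, banned: str = "protspace") -> List[str]:
    """Return absolute imports in source that reference the banned package."""
    found: List[str] = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            names = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom) and node.level == 0 and node.module:
            names = [node.module]  # level 0 = absolute import
        else:
            continue
        # Match the package itself or any submodule, but not look-alikes
        # such as "protspace_web".
        found += [n for n in names if n == banned or n.startswith(banned + ".")]
    return found
```

A test built on this would read every `.py` file under the protlabel package and assert that the returned list is empty for each of them.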

Memory / the 4-core·4 GB deploy target

Testing against the 4 GB target surfaced a real issue: the cosine path held the reference matrix twice (~4.7 GB at Swiss-Prot/dim-1024 → would OOM). Fixed by folding the per-reference norm into the dot product → cosine holds 1× references like euclidean (behaviour preserved).
Measured in docker --cpus=4 --memory=4g (one fresh process per config): full Swiss-Prot (570K × 1024) now fits at ~3 GB peak for both metrics, ~7–10 ms/query on 4 cores. (dim 2560 references are ~5.8 GB → needs fp16 / a smaller model there.)

Docs / scope / analyses

Design spec rewritten to as-built (cosine default, brute-force + query batching, workspace) — dropped the wrong-repo frontend section, the ProtTucker/faiss speculation, and the misleading Apple-Silicon benchmark table.
Frontend is tracked in [FEATURE] Frontend: render EAT predicted-by-transfer annotations (value-level overlay) protspace_web#277 (updated to as-built — 2 overlay columns, cosine caveat).
usearch vs brute-force study added (docs/superpowers/research/2026-06-29-usearch-vs-bruteforce.md + reproducible benchmarks): brute-force is the right default; usearch only pays off for a persisted index serving ≫10K lookups. faiss stays rejected.
Optional per-pLM euclidean calibration captured as a deferred follow-up: Optional: per-pLM calibration of the euclidean reliability index (EAT) #62.

CI is green and the Docker image is verified to install both packages. Ready for another look 🙏

tsenoner and others added 19 commits June 11, 2026 19:17

chore(docs): add EAT annotation-transfer design spec + backend implem…

355cd3f

…entation plan

chore(protlabel): scaffold EAT engine package + scipy dep

70881d7

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

feat(protlabel): goPredSim reliability index transform

ee482ba

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

feat(protlabel): chunked brute-force kNN backend

4e99e8d

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

fix(protlabel): bound kNN per-chunk memory adaptively; guard k>=1

d494242

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

feat(protlabel): kNN label transfer with reliability index

c07aef5

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

test(protlabel): document RI tie-break and cover nearest-source selec…

4b39cb8

…tion Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

feat(protlabel): persistable Lookup sidecar + public API

796e5b1

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

feat: query/reference classifier for annotation transfer

ae7fcc2

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

test: cover neither-match exclusion and multi-prefix OR in classifier

bc8837e

test: cover empty-predictions and unknown-id overlay edge cases

05194bf

feat: replace annotations part of a parquetbundle in place

5093f66

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

docs: document protspace transfer + prediction overlay columns

0ee1354

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

docs: correct transfer --metric options (euclidean, cosine only)

21d508c

feat(transfer): warn on zero transfers; validate --metric/--k early

a05e977

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

chore(docs): remove EAT build plan + superseded draft; keep design spec

98b42f6

tsenoner mentioned this pull request Jun 12, 2026

[FEATURE] Frontend: render EAT predicted-by-transfer annotations (value-level overlay) tsenoner/protspace_web#277

Open

7 tasks

tsenoner and others added 2 commits June 16, 2026 17:20

tsenoner marked this pull request as draft June 16, 2026 17:37

Merge branch 'main' into feat/eat-transfer-backend

72fa7b7

tsenoner requested review from peymanvahidi and t03i June 17, 2026 13:52

tsenoner marked this pull request as ready for review June 17, 2026 18:26

t03i requested changes Jun 25, 2026


tsenoner and others added 2 commits June 29, 2026 14:04

tsenoner mentioned this pull request Jun 30, 2026

Optional: per-pLM calibration of the euclidean reliability index (EAT) #62

Open

4 tasks

tsenoner requested a review from t03i June 30, 2026 16:35


