feat(research-areas): refresh all ResearchAreas pages for matrix completeness by benjibromberg · Pull Request #35 · tucca-cellag/caail

benjibromberg · 2026-06-02T18:21:09Z

What & why

The ResearchAreas/*.md deep-dives behind the Papers.md matrix had drifted into two tiers: four mature pages and four near-empty stubs that covered only a fraction of their matrix columns (e.g. Cellular Engineering's column spans ~32 papers; its prose covered ~6). This PR brings all eight pages up to the mature template and reconciles each against its matrix column, so every paper assigned to a column is now represented in that column's page.

How it was built (grounding + adversarial review)

Every factual claim is grounded in the paper's full text, pulled from the locally-synced caail Zotero group library (localhost:23119) rather than abstracts — per the repo's AI-agent convention. Every drafted/edited page then passed the repo's two read-only adversarial reviewer subagents before commit (writer ≠ reviewer):

caail-citation-reviewer: every (… ref #N) anchor resolves to the right Papers.md ID with matching author/year, and every Software.md, Databases.md, Datasets/, and Talks.md cross-link resolves.
caail-claim-reviewer: every count, metric, method description, and identity claim is checked against source full text.
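
As a rough illustration of the author/year half of the citation check, the sketch below compares inline citations against a catalogue keyed by paper ID. It is not the reviewer subagent's actual implementation. The citation shape `(Author Year, ref #N)`, the `Paper` tuple, the `check_citations` helper, and the two-entry catalogue excerpt are all assumptions made for the example.

```python
import re
from typing import NamedTuple


class Paper(NamedTuple):
    first_author: str
    year: int


# Assumed citation shape: "(Author ... Year ... ref #N)". The real pages may differ.
CITATION_RE = re.compile(
    r"\((?P<author>[A-Z][\w-]+)[^()]*?(?P<year>\d{4})[^()]*?ref #(?P<id>\d+)\)"
)


def check_citations(text: str, catalogue: dict[int, Paper]) -> list[str]:
    """Return one message per citation whose ID, author, or year disagrees."""
    problems = []
    for m in CITATION_RE.finditer(text):
        paper_id = int(m["id"])
        paper = catalogue.get(paper_id)
        if paper is None:
            problems.append(f"#{paper_id}: no such ID in catalogue")
        elif (m["author"], int(m["year"])) != paper:
            problems.append(
                f"#{paper_id}: cited as {m['author']} {m['year']}, "
                f"catalogue has {paper.first_author} {paper.year}"
            )
    return problems


# Hypothetical catalogue excerpt; the IDs and author/years are the ones named in this PR.
catalogue = {182: Paper("King", 2004), 34: Paper("Andrews", 2025)}
sample = "(King 2004, ref #182) and (Andrews et al. 2024, ref #34) and (Smith 2020, ref #999)"
for problem in check_citations(sample, catalogue):
    print(problem)
```

A check like this only covers what a regular expression can see. The claim review described above still requires reading the source full text.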

The review caught and fixed real issues, e.g. a contradicted CFD cross-validation figure (Bioprocess #29), a GA-objective drift (#31), an unsupported "first" superlative (Media #1), incomplete enumerations (Sensory #171, AIEval #164 SciHorizon-Gene), and a King-2004 headword ("Adam" → "The Robot Scientist").

Per-page coverage

| Page | Work | Matrix cells now covered |
| --- | --- | --- |
| Scaffolding | full rewrite to template | refs #19, #20, #34, #35 |
| Media Optimization | full rewrite | #1–3, #15–18, #21, #23–25, #58, #169, #170 |
| Bioprocess Control | full rewrite | #7, #29–33, #59, #61, #62 |
| Cellular Engineering | full rewrite (~32 papers) | single-cell ML, biological foundation models, perturbation prediction, genetic design, agents |
| Sensory Prediction | +taste/off-flavor + texture | added #102–105, #171, #195 |
| AI Evaluation & Benchmarking | +frontier & gene benchmarks | added #155–159, #164, #196 |
| AI Tooling / Methodology | +13 papers across clusters | added #133, #151–154, #160–163, #166, #167, #182, #197 |
| Metabolic Modeling | verified (no matrix column) | all cross-links confirmed; no changes needed |

Verification

Anchor integrity: all Papers.md#N anchors across the 8 pages resolve to existing IDs (0 dangling).
Column completeness: every matrix-column paper now appears in its page's prose (0 missing across all 7 column-backed pages).
Site build: pnpm --dir site build succeeds — all 8 /research-areas/* pages render.
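
The anchor-integrity and column-completeness checks above reduce to two set differences, which could be sketched as follows. This is a minimal illustration rather than the tooling used in this PR. The `Papers.md#N` link shape follows the wording above, while the function names, the inputs, and the sample page text are hypothetical.

```python
import re

# Assumption: prose cites papers through links whose target ends in "Papers.md#<ID>".
ANCHOR_RE = re.compile(r"Papers\.md#(\d+)")


def cited_ids(page_text: str) -> set[int]:
    """Return every paper ID referenced through a Papers.md#N anchor."""
    return {int(n) for n in ANCHOR_RE.findall(page_text)}


def dangling_anchors(page_text: str, valid_ids: set[int]) -> set[int]:
    """Anchor integrity: IDs cited in the page that do not exist in Papers.md."""
    return cited_ids(page_text) - valid_ids


def missing_from_prose(page_text: str, column_ids: set[int]) -> set[int]:
    """Column completeness: matrix-column IDs the page never cites."""
    return column_ids - cited_ids(page_text)


# Hypothetical page text; the ID set mirrors the Scaffolding row of the table above.
sample_page = "See [ref #19](../Papers.md#19) and [ref #20](../Papers.md#20)."
scaffolding_ids = {19, 20, 34, 35}

print(dangling_anchors(sample_page, scaffolding_ids))
print(sorted(missing_from_prose(sample_page, scaffolding_ids)))
```

Both functions return an empty set when a page passes, which corresponds to the "0 dangling" and "0 missing" results reported above.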

Follow-ups (out of scope here — not modifying `Papers.md`)

Grounding surfaced a few likely matrix-placement nuances to consider in a separate PR: #34 Andrews 2025 sits in the SVM row but uses a genetic algorithm + DL surrogate (no SVM); #6 DNABERT sits in Deep Learning though it is a masked-LM model; #60 Mathieu and #145 MetaGEM are placed among foundation models but are an interactome perspective and a metabolic-model reconstruction respectively. The prose describes each accurately; the matrix cells were left untouched.

🤖 Generated with Claude Code

…ations

Expand the Scaffolding stub into the full deep-dive template: thematic sections (mould/scaffold geometry surrogates; mechanical & print-quality prediction), clickable Papers.md anchors for refs #19/#20/#34/#35, a Tools and data section, an open-challenges analysis, and a Further reading footer. Every claim grounded in the cited papers' full text and passed both adversarial reviewers (citation + claim).

…ded citations

Reconcile the page against its full matrix column (refs #1-3, #15-18, #21, #23-25, #58, #169-170): thematic clusters (Bayesian optimization, hybrid surrogate/evolutionary search, active learning, explainable feature selection), clickable Papers.md anchors, Tools and data and open-challenges sections, and a Further reading footer. Kanda (#16) is framed as protocol optimization rather than media optimization, per its full text. Every claim grounded in source full text via the caail Zotero library; passed both adversarial reviewers (the one flagged 'first' superlative was removed).

…ded citations

Reconcile against the matrix column (refs #7, #29-33, #59, #61-62): thematic sections (soft sensors; CFD-surrogate acceleration; small-data fermentation prediction; reduced metabolic models/digital twins; autonomous experimentation), clickable Papers.md anchors, Tools and data and open-challenges sections, and a Further reading footer. The microbial-vs-mammalian framing is grounded in the corpus. Ref #33's claim is held to its catalogued title (full text paywalled). Passed both adversarial reviewers; #29 (CFD error) and #31 (GA objective) corrected to match source full text.

…unded citations

Expand the 3-paragraph stub to cover the full ~32-paper matrix column: single-cell characterization, biological foundation models (MLM / autoregressive / cross-species / LLM-native), perturbation-response prediction, genetic-part design, and autonomous cell-engineering agents, plus a cross-cutting section. DNABERT (#6) framed as a masked-LM model and Mathieu (#60) surfaced as the one explicitly cultivated-meat study, both per full text. Clickable Papers.md anchors, Software/Datasets/Databases cross-links, Tools and data, open-challenges, Further reading. Passed citation review and both halves of the claim review.

…y Prediction

Weave in the matrix-column papers missing from the page: a new 'Computational prediction of taste and off-flavor' section covering the Niv-lab bitterness lineage (BitterPredict #102, BitterIntense #103, BitterMatch #104, BitterMasS #105), plus texture (#171) and image-based freshness QC (#195) in the applied section. Grounded in source full text; passed citation review and claim review (model-list enumeration corrected to be non-exhaustive).

…I Evaluation

Cover the matrix-column benchmarks missing from the page: a new 'General-purpose frontier benchmarks (capability context)' subsection (SWE-bench #155, GPQA #156, MMLU-Pro #157, Humanity's Last Exam #158, FrontierScience #159) framed honestly as capability context rather than biology evals, plus SciHorizon-Gene #164 (cell-state benchmarks) and MeatScan #196 (domain-specific). Cross-linked to existing Datasets/Benchmarks.md and Databases.md leaderboard entries. Passed citation review and claim review (SciHorizon-Gene axes corrected to all four, 'literature influence' restored).

Weave the missing AI Tooling-column papers into their clusters: discovery agents (Co-Scientist #153, Robin #154, ERA #166), domain-specific agents (Talk2Biomodels/T2KG #167), robot scientists (the original King 2004 Robot Scientist #182), agent infrastructure (BioContextAI #133, Aviary #160, SciAtlas #162), chemistry agents (ether0 #161, MoleCode #163), and the 'other methodology' section (Epicure #197 plus the Gu CHI human-AI-verification studies #151/#152, framed accurately as HCI studies, not agents). Also fix a pre-existing broken Further-reading anchor (Talks section lives in Talks.md, not OtherResources.md). Passed citation review and claim review (King-2004 headword corrected from 'Adam' to 'The Robot Scientist'; agent order fixed).

benjibromberg added 7 commits June 2, 2026 13:16

benjibromberg wants to merge 7 commits into main from worktree-feat+research-areas-refresh
