diff --git a/.claude/skills/cell2location-context/SKILL.md b/.claude/skills/cell2location-context/SKILL.md
new file mode 100644
index 00000000..a1b6791e
--- /dev/null
+++ b/.claude/skills/cell2location-context/SKILL.md
@@ -0,0 +1,166 @@
+---
+name: cell2location-context
+description: "Capture and persist the scientific scope + technical decisions of a cell2location spatial-mapping project. Auto-discovers SPATIAL_MAPPING_CONTEXT.md in cwd / .claude/ / ~/.claude/plans/; if missing, runs a focused interview (--science = goals, reference, populations, success/failure criteria; --technical = Phases 1–8 decision sweep with gap-flagging). Called automatically by /spatial-mapping at the start (--science) and before launch (--technical). Standalone-invocable via /cell2location-context to update goals between runs."
+user-invocable: true
+---
+
+# cell2location-context — persistent scope + decision memory for spatial mapping
+
+This skill owns the **`SPATIAL_MAPPING_CONTEXT.md`** file: discovery, summarization, creation, updates. It is called automatically by [spatial-mapping/SKILL.md](../spatial-mapping/SKILL.md) (twice: once at the start with `--science`, once before launch with `--technical`) and can also be invoked standalone to edit goals without running the full mapping workflow.
+
+The file is the durable project memory the workflow consults on every run: scientific goals, single-cell reference, target populations, success / failure criteria, plus the technical decisions made through Phases 1–8. It is also read by [cell2location-troubleshooting/SKILL.md](../cell2location-troubleshooting/SKILL.md) so symptom-matching can lean on the user-declared failure criteria.
+
+---
+
+## How this skill works
+
+```
+                       ┌─ file found → summarize → ask: Use / Re-interview / Skip
+                       │
+1. Auto-discover ─────┤
+                       │
+                       └─ no file  → ask: Interview / Import handoff / Skip
+                                          │        │           │
+                                          │        │           └→ emit “no context” warning marker; return "skipped"
+                                          │        └→ ask path; persist stub with `## Source:` line; return its path
+                                          └→ ask input-mode (one-time) → ask location → run rubric → persist → return path
+```
+
+The skill always returns one of: a file path the caller should load, or the string `"skipped"`.
+
+## File location resolution (fixed candidate order)
+
+Check, in order, on every invocation:
+
+1. `$CWD/SPATIAL_MAPPING_CONTEXT.md` (top-level in the user’s spatial project — default for new files)
+2. `$CWD/.claude/SPATIAL_MAPPING_CONTEXT.md` (hidden, project-scoped)
+3. `~/.claude/plans/SPATIAL_MAPPING_CONTEXT_<dataset>.md` (centralized; `<dataset>` derived from the spatial AnnData filename, or asked from user)
+
+First match wins. On **first creation**, ask the user via `AskUserQuestion` which of the three locations to save to. Default to option 1.
+
+When called by `spatial-mapping`, the caller passes `$CWD` (the user’s data project) and, if known, the spatial AnnData filename (used for the `<dataset>` slug in option 3).
+
+---
+
+## Mode `--science` — scope interview
+
+Run when invoked at the start of `spatial-mapping`, or standalone with no mode argument.
+
+### Rubric (7 question groups)
+
+Each group has a **primary question** + **grill-me-style follow-up triggers** for incomplete answers. Full text + recommended-answer examples in [reference/scope_interview_rubric.md](reference/scope_interview_rubric.md). Summary here:
+
+| # | Group | Primary question | Follow-up trigger | Maps to |
+|---|---|---|---|---|
+| 1 | **Scientific goal** | Why are you running cell2location on this dataset? What biological question are you answering? | <20 words OR no concrete outcome named | Phase 0, all phases |
+| 2 | **Single-cell reference** | What scRNA reference are you using? Path, source, tissue, cell count, label count. | Source unclear, tissue mismatch with spatial, <40 cells/type, <10 or >200 cell types | Phase 1 |
+| 3 | **Annotation methodology** | How were the reference cell-type labels assigned? (markers / automated / manual / mixed) Confidence? Known weak spots? | "I just used what came with the atlas" — ask which subclusters were validated | Phase 1, troubleshooting |
+| 4 | **Target populations** | Which specific cell-type populations MUST be spatially resolvable for this analysis to be useful? List as many as relevant. | Fewer than 3 populations named, or populations not in the reference labels | Phase 1 granularity check, Phase 3 |
+| 5 | **Granularity check** | For each target population in #4, does the reference’s `labels_key` distinguish it from neighbours? (Agent auto-checks against `adata_ref.obs[labels_key].unique()` if reference is available; user confirms.) | Any target population lumped with others in the reference labels | Phase 1 (re-annotate?), Phase 3 |
+| 6 | **Success criteria (3)** | What 3 concrete, measurable results would let you conclude "the analysis worked"? | Fewer than 3 listed; or any criterion that is vague ("looks reasonable") rather than measurable | Phase 8 QC, troubleshooting |
+| 7 | **Failure criteria (5–10)** | What 5–10 outcomes must NOT happen for the result to be trustworthy? (Examples: "all cell types appear everywhere", "endothelial cells in white matter", "abundance varies 10× across visually similar regions".) | Fewer than 5 listed; or any criterion that does not describe an observable pattern in the output | Phase 8 QC, troubleshooting, refusal logic |
+
+### Skill behavior
+
+1. **Adapts to existing answers.** If the file is present (auto-discovery path) and the user chose "Re-interview", iterate group by group asking "Keep / Edit / Replace" for each.
+2. **Grill follow-ups.** When the trigger fires, ask probing follow-ups one at a time via the chosen input mode — the same pattern as [/grill-me](/Users/kleshcv/.claude/claude-shared-skills/skills/grill-me/SKILL.md) but with the rubric kept inside this skill so questions stay cell2location-specific.
+3. **Maps to workflow importance.** Each persisted answer is annotated with a `_→ affects Phase N_` tag from the table above. The user sees why the question was asked.
+4. **Skill-driven recommendation.** For each question, the skill shows a recommended answer based on what is already known from the data (e.g. for #2 it can pre-fill the path if found in cwd; for #5 it can pre-fill the answer if the reference was inspected).
+
+### Output
+
+Writes the populated `## Scientific scope` block of the context file. Returns the file path.
+
+---
+
+## Mode `--technical` — implementation-completeness sweep
+
+Run when invoked **after Phase 8 / before Phase 9 launch** of `spatial-mapping` (caller passes `--technical`). Sweeps the Phase-1-through-8 decisions, surfaces any unanswered slot, and cross-checks against the `## Scientific scope` block.
+
+### What it does
+
+1. **Loads** `## Scientific scope` (must exist; if not, route back to `--science` first).
+2. **Reads** the in-progress notebook parameter dict the caller passes in (or, if missing, reads the workflow-state markdown the spatial-mapping skill emits between phases).
+3. **Sweeps each technical slot** (one per row in the `## Technical decisions` block of the file schema). For each slot that is empty:
+   - If the slot has a defensible default the skill applies it and notes the default in `## Outstanding gaps`.
+   - If the slot has NO default (e.g. `labels_key` cannot be guessed), asks the user.
+4. **Cross-checks scope vs decisions.** Examples:
+   - User listed "distinguish oligodendrocyte subtypes" in #4 but `labels_key` resolved to a broad-cluster column → flag, link to [issue #395](https://github.com/BayraktarLab/cell2location/issues/395), ask to re-annotate.
+   - Failure criterion "abundance varies 10× across visually similar regions" present but `detection_alpha=200` chosen → warn that `200` will not regularize away that variation, suggest `20`.
+   - Per-location segmentation hinted at in #5 but Phase 6 chose master branch → flag, link to hires_sliding_window install instructions.
+5. **Persists** `## Technical decisions` + `## Outstanding gaps` blocks.
+
+Full cross-check matrix in [reference/technical_completeness_rubric.md](reference/technical_completeness_rubric.md).
+
+### Output
+
+Writes the populated `## Technical decisions` + `## Outstanding gaps` blocks. Returns the file path.
+
+---
+
+## Input-mode toggle (AskUserQuestion vs printed prompts)
+
+On **first run**, ask once:
+
+> How should I ask interview questions?
+> 1. **Use the `AskUserQuestion` tool** — click-button choices, best in VSCode / Claude Desktop.
+> 2. **Print questions in the conversation and pause for your typed response** — better for TUI / SSH / headless Claude Code where the AskUserQuestion UI is awkward.
+
+Persist the choice as the first non-comment line under `## Input mode` at the top of `SPATIAL_MAPPING_CONTEXT.md`. Honor it on subsequent runs without re-asking. The user can edit the file or invoke `/cell2location-context --reset-input-mode` to change.
+
+In **autonomous mode** (no `AskUserQuestion` tool exposed and no human attached), skip both interview modes; emit a notebook markdown cell `# No SPATIAL_MAPPING_CONTEXT.md found; ran with defaults. Provide a context file next time for guided runs.` and return `"skipped"`.
+
+### Printed-prompt mode (mode 2)
+
+Render each question as:
+
+```
+=== Scope question 3 / 7  — Annotation methodology  (→ affects Phase 1, troubleshooting) ===
+
+How were the reference cell-type labels assigned? (markers / automated / manual / mixed)
+What is your confidence in the labels? Are there known weak spots?
+
+Recommended template:
+  Method:      manual + marker-based, validated by ...
+  Confidence:  high for major lineages, medium for subtypes
+  Weak spots:  endothelial subtypes were not separated
+
+>>> Type your answer below, then send the message <<<
+```
+
+Then stop and wait for the user’s next message. Treat the next user message as the answer; parse it; advance to the next question.
+
+---
+
+## Schema of `SPATIAL_MAPPING_CONTEXT.md`
+
+The exact template lives in [templates/SPATIAL_MAPPING_CONTEXT.template.md](templates/SPATIAL_MAPPING_CONTEXT.template.md). Copy it into place on first creation and fill it incrementally as the user answers.
+
+---
+
+## Mapping from scope answers to workflow phases (quick reference)
+
+| Scope answer | Spatial-mapping phase that consumes it |
+|---|---|
+| Goal (#1) | Phase 0 demo-vs-real branch; framing of all warnings |
+| Reference path + source (#2) | Phase 1 `ref_h5ad_path` pre-fill; Phase 1 reference-too-small / too-coarse warnings |
+| Annotation method + confidence (#3) | Phase 1 `labels_key` selection; Phase 1 categorical_covariate_keys suggestion |
+| Target populations (#4) | Phase 1 granularity check; Phase 3 (whether per-location N̂ is needed for the populations of interest) |
+| Granularity check (#5) | Phase 1 warning about re-annotation; Phase 6 (hires branch decision) |
+| Success criteria (#6) | Phase 8 QC plot interpretation; troubleshooting routing |
+| Failure criteria (#7) | Phase 4 `detection_alpha` choice (if abundance-variation criterion present); Phase 6 branch choice; troubleshooting routing |
+
+The technical interview uses this table in reverse: for any technical decision whose corresponding scope answer is missing or stale, it re-asks.
+
+---
+
+## Companion skills
+
+- [spatial-mapping/SKILL.md](../spatial-mapping/SKILL.md) — the caller. Invokes this skill at the start (`--science`) and before launch (`--technical`).
+- [cell2location-troubleshooting/SKILL.md](../cell2location-troubleshooting/SKILL.md) — reads the `## Scientific scope` (esp. failure criteria) before suggesting symptom matches.
+
+## Suggested reading (not auto-loaded)
+
+- [reference/scope_interview_rubric.md](reference/scope_interview_rubric.md) — full question list, recommended-answer examples, grill-me follow-up triggers, examples of well-formed answers.
+- [reference/technical_completeness_rubric.md](reference/technical_completeness_rubric.md) — the slot-by-slot technical sweep + the scope-vs-decision cross-check matrix.
+- [templates/SPATIAL_MAPPING_CONTEXT.template.md](templates/SPATIAL_MAPPING_CONTEXT.template.md) — the empty file schema.
diff --git a/.claude/skills/cell2location-context/reference/scope_interview_rubric.md b/.claude/skills/cell2location-context/reference/scope_interview_rubric.md
new file mode 100644
index 00000000..eeba393d
--- /dev/null
+++ b/.claude/skills/cell2location-context/reference/scope_interview_rubric.md
@@ -0,0 +1,237 @@
+# Scope interview rubric (`/cell2location-context --science`)
+
+This is the full rubric the [SKILL.md](../SKILL.md) summarises in its 7-row table. Read this when running the interview.
+
+For each group: **what to ask** → **recommended-answer template** → **grill-me follow-up triggers** → **how to persist** → **how it maps to the workflow**.
+
+Two rendering modes: `AskUserQuestion` (button choices) and printed-and-pause (typed text). Both modes ask the same primary question; the printed mode also shows the "recommended template" verbatim so the user can edit-and-send. Honor the `## Input mode` line in `SPATIAL_MAPPING_CONTEXT.md`.
+
+---
+
+## 1. Scientific goal — _affects Phase 0, all phases_
+
+### Primary question
+> Why are you running cell2location on this dataset? What biological question are you answering? 2–4 sentences. Concrete is better than abstract.
+
+### Recommended answer template
+```
+We want to map <cell types of interest> across <tissue> sections from <experimental
+groups>. The biological question is whether <population A> is enriched in <region X>
+vs <region Y>, and whether that enrichment changes between <condition 1> and
+<condition 2>. The outcome will feed a <downstream figure / hypothesis / drug
+target validation step>.
+```
+
+### Grill-me follow-up triggers
+- **<20 words.** Ask: "What's the biological question you'd ask a reviewer the result *answers*?"
+- **No populations named.** Ask: "Which cell-type populations would you need to see distributed differently for the result to be interesting?"
+- **No downstream named.** Ask: "What figure or decision in your paper / project does the cell2location output feed into?"
+- **Mentions a clinical decision but no biological mechanism.** Ask: "What spatial pattern would distinguish a true-positive prediction from a false-positive in this clinical context?"
+
+### Persist into
+`## Scientific scope → ### Scientific goal` (free text).
+
+### Maps to
+- Phase 0: helps decide demo-vs-real branch and framing of all warnings.
+- Phase 1 granularity warnings: anchored to "populations of interest" mentioned here.
+
+---
+
+## 2. Single-cell reference — _affects Phase 1_
+
+### Primary question
+> What scRNA-seq reference are you using? Provide path, source, tissue/organism, number of cells, number of cell-type labels.
+
+### Recommended answer template
+```
+- Path:        /path/to/snrna_reference.h5ad   (or URL)
+- Source:      <atlas name / in-house / published — DOI>
+- Tissue:      <e.g. mouse cortex + hippocampus, 8 wk C57BL/6>
+- N cells:     <e.g. 41,000>
+- N labels:    <e.g. 47> in column `<labels_key>`
+```
+
+### Grill-me follow-up triggers
+- **Tissue does not match the spatial tissue.** Ask: "Your reference is <X> but your spatial is <Y>. Is there a tissue-matched reference available? If not, name the cell types you expect to be present in <Y> that aren't in <X>."
+- **Source unclear.** Ask: "Was this reference produced in-house or downloaded? If downloaded, what's the DOI or atlas name?"
+- **<40 cells per type in any target population.** Warn: "Cell types with <40 cells will have noisy signatures and unreliable spatial estimates." Ask if they have a denser version.
+- **<10 cell-type labels.** Warn: "cell2location works best with 20–50 well-characterised types. With <10, all types tend to appear everywhere (issue #395)."
+- **>200 cell-type labels.** Warn: "Many types with overlapping signatures cause identifiability issues. Consider merging clusters."
+
+### Persist into
+`## Scientific scope → ### Single-cell reference` (structured fields).
+
+### Maps to
+- Phase 1: pre-fills `ref_h5ad_path`; drives reference-too-small / too-coarse warnings.
+
+---
+
+## 3. Annotation methodology — _affects Phase 1, troubleshooting_
+
+### Primary question
+> How were the reference cell-type labels assigned? (markers / automated / manual / mixed) What is your confidence in the labels? Are there known weak spots?
+
+### Recommended answer template
+```
+Method:      manual annotation by <person/lab>, validated by <markers / overlap with
+             prior atlas / orthogonal modality>
+Confidence:  high for major lineages (neuronal / glial); medium for subtypes;
+             low for <specific lineage>
+Weak spots:  endothelial subtypes were not separated; clustered as a single label.
+```
+
+### Grill-me follow-up triggers
+- **"I just used what came with the atlas."** Ask: "Which subclusters in this atlas were validated by orthogonal evidence vs assigned by clustering only?"
+- **No weak spots mentioned.** Probe: "Are there cell types in your target list (§4) for which the reference label is the result of a single marker gene? Those tend to be the noisy ones in spatial maps."
+- **Mixed methods but no per-lineage confidence.** Ask: "For each target population, is your confidence high / medium / low? Anything <high should be called out in the failure criteria."
+
+### Persist into
+`## Scientific scope → ### Annotation methodology`.
+
+### Maps to
+- Phase 1: drives `labels_key` selection and the `categorical_covariate_keys` recommendation (low-confidence subtypes argue for keeping the level coarser).
+- Troubleshooting: when a population doesn't appear where expected, this section is the first place to check.
+
+---
+
+## 4. Target populations — _affects Phase 1 granularity check, Phase 3_
+
+### Primary question
+> Which specific cell-type populations MUST be spatially resolvable for this analysis to be useful? List as many as relevant — at least 3.
+
+### Recommended answer template
+```
+- <population A>: why it matters — <e.g. defines the cortical layer boundary>.
+- <population B>: why it matters — <e.g. marks the disease vs control divergence>.
+- <population C>: why it matters — <e.g. negative control: should be absent from white matter>.
+- <population D>: …
+```
+
+### Grill-me follow-up triggers
+- **Fewer than 3 populations named.** Ask: "If only one cell type was correctly mapped and the others were noise, would the analysis still be useful? If not, which 3+ populations do you need correctly mapped?"
+- **Populations not in the reference labels.** Cross-check against `adata_ref.obs[labels_key].unique()` if available. If a population is missing, ask: "<population> is not in the reference's `<labels_key>`. Is it lumped with another label, or absent entirely?"
+- **Only positive populations mentioned.** Probe: "What's a *negative* control population — a cell type whose presence in a specific region would be a sign of a wrong map?"
+
+### Persist into
+`## Scientific scope → ### Target populations` (bulleted list).
+
+### Maps to
+- Phase 1: drives the granularity warnings.
+- Phase 3: if any target population needs sub-spot resolution, that argues for per-location N̂ + hires branch.
+
+---
+
+## 5. Granularity check — _affects Phase 1 re-annotation decision, Phase 6_
+
+### Primary question
+> For each target population in §4, does the reference's `labels_key` distinguish it from neighbours?
+
+### Skill-driven assistance
+If the reference is available (path from §2), inspect:
+
+```python
+import scanpy as sc
+adata_ref = sc.read_h5ad(ref_h5ad_path)
+labels = adata_ref.obs[labels_key].unique().tolist()
+for pop in target_populations:
+    matches = [l for l in labels if pop.lower() in l.lower()]
+    print(pop, "→", matches or "MISSING")
+```
+
+Pre-fill the table the user confirms.
+
+### Grill-me follow-up triggers
+- **Target population lumped with others.** Ask: "<A> is lumped with <X> in `<labels_key>`. Options: (a) accept the lumped label and lose subtype resolution, (b) re-annotate at a finer level using a marker-based or supervised tool (Celltypist / scANVI). Which?"
+- **Target population missing entirely.** Ask: "<A> is not in the reference. Is there a different column in `adata.obs` that has it, or do you need a different reference?"
+
+### Persist into
+`## Scientific scope → ### Granularity check` (table).
+
+### Maps to
+- Phase 1: re-annotation warning if needed.
+- Phase 6: if sub-spot resolution is needed (e.g. distinguishing two subtypes that often co-occupy a spot), the hires branch becomes attractive.
+
+---
+
+## 6. Success criteria — 3 measurable outcomes — _affects Phase 8 QC, troubleshooting_
+
+### Primary question
+> What 3 concrete, measurable results would let you conclude "the analysis worked"? Each criterion should be observable in the output (a number, a spatial pattern, a statistical test).
+
+### Recommended answer template
+```
+1. I can rank cortical layers L1–L6 by relative GABAergic interneuron subtype
+   abundance, with ≥3 subtypes showing layer-specific enrichment at p<0.01.
+2. The known marker-gene-based spatial pattern of <population A> is recovered:
+   q05 abundance map of <A> visually matches the marker gene `<marker>` ISH
+   reference in <atlas link>.
+3. Inter-sample variability in <population B> abundance correlates with the
+   experimental condition (rho > 0.5 across N=<N> samples).
+```
+
+### Grill-me follow-up triggers
+- **Fewer than 3.** Ask one by one until 3 are listed.
+- **Vague criterion** ("looks reasonable" / "matches biology"). Ask: "How would you measure that? What number / pattern would let you decide yes vs no?"
+- **Criterion that depends on cell2location's own output being self-consistent** ("ELBO converges"). Reject: "Convergence is necessary but not sufficient. Restate as a biological pattern that should be visible."
+
+### Persist into
+`## Scientific scope → ### Success criteria` (numbered 1–3).
+
+### Maps to
+- Phase 8 QC: the success criteria become explicit checks in the QC notebook.
+- Troubleshooting: failure to meet a criterion routes to a specific symptom in the corpus.
+
+---
+
+## 7. Failure criteria — 5–10 outcomes that must NOT happen — _affects Phase 4, Phase 6, refusal logic, troubleshooting_
+
+### Primary question
+> What 5–10 outcomes must NOT happen for the result to be trustworthy? Each criterion describes an observable wrong-pattern in the output.
+
+### Recommended answer template
+```
+1. All cell types appear everywhere — "3-fingered glove on a 5-fingered hand"
+   (issue #395). Diagnostic: max(q05 abundance per type per location) < 2× median.
+2. Endothelial cells appear in white matter where there is no vasculature.
+3. Total cell abundance varies >10× across visually similar regions of the same
+   sample (sign that detection_alpha is mis-set).
+4. Negative cell types — types not present in the reference but spatially detected.
+5. Posterior predictive QC log-log plot deviates from y=x by more than one decade.
+6. Per-sample mean abundance of <population A> varies more across replicates than
+   the biological signal of interest (rho < condition-rho).
+7. <…>
+```
+
+### Grill-me follow-up triggers
+- **Fewer than 5.** Probe one by one — show the user the corpus examples above as templates.
+- **Vague criterion** ("looks wrong"). Ask: "Describe the spatial pattern that would make you mistrust the result. What would a reviewer flag?"
+- **No "abundance varies across visually similar regions" criterion.** Ask: "If two replicates of the same tissue type showed 10× different total abundance, would you trust the map? If not, add that as a criterion." (This drives Phase 4 detection_alpha.)
+- **No "cell types appear everywhere" criterion.** Add it as a default — this is the most common failure mode (issue #395).
+
+### Persist into
+`## Scientific scope → ### Failure criteria` (numbered 1–N).
+
+### Maps to
+- Phase 4 `detection_alpha`: if "abundance varies 10×" is a criterion, `detection_alpha=20` is mandatory.
+- Phase 6 branch: if "subtype mixing" is a criterion AND segmentation is available, hires branch is mandatory.
+- Refusal logic: the skill should refuse to launch if any chosen technical decision would obviously violate a stated failure criterion.
+- Troubleshooting: when the user later reports a symptom, match it against this list first.
+
+---
+
+## How the skill picks the next question
+
+After each answer is captured, before moving to the next group:
+
+1. **Update the in-memory context.** Reflect what's now known.
+2. **Re-evaluate the trigger list.** Some triggers reference later groups — e.g. "populations not in the reference labels" is a trigger for group 4 but requires knowing the reference from group 2.
+3. **Adapt subsequent recommended templates.** If group 2 revealed a 41k-cell mouse-brain reference with 47 labels, group 4's template should use the actual label names as examples, not generic placeholders.
+4. **Track group completion.** A group is "done" when (a) the primary question is answered AND (b) no triggers are firing.
+
+When all 7 groups are done, write the file and return.
+
+---
+
+## Examples of well-formed scope blocks
+
+See [examples/](.) — not yet populated; add real examples here as the skill is used in practice.
diff --git a/.claude/skills/cell2location-context/reference/technical_completeness_rubric.md b/.claude/skills/cell2location-context/reference/technical_completeness_rubric.md
new file mode 100644
index 00000000..ae766015
--- /dev/null
+++ b/.claude/skills/cell2location-context/reference/technical_completeness_rubric.md
@@ -0,0 +1,86 @@
+# Technical-completeness rubric (`/cell2location-context --technical`)
+
+Sweeps the Phase-1-through-8 technical decisions, fills any gaps with defaults, asks for the un-default-able ones, and **cross-checks the chosen decisions against the `## Scientific scope` block** (especially failure criteria). Persists into `## Technical decisions` and `## Outstanding gaps`.
+
+Read [scope_interview_rubric.md](scope_interview_rubric.md) first to understand what the scope block contains.
+
+---
+
+## Sweep procedure
+
+For each technical slot in the table below:
+
+1. **Load current value** from (a) the in-progress notebook parameters the caller passed in, or (b) the existing `## Technical decisions` block if re-running, or (c) "EMPTY" if neither.
+2. **If EMPTY and a default exists**: apply the default, record `default applied` in `## Outstanding gaps`.
+3. **If EMPTY and no default**: ask the user. Honor the input-mode preference.
+4. **If set**: confirm with the user (one-tap "keep" in AskUserQuestion mode, or "Press enter to keep" in printed mode).
+5. **Run the cross-check** in the next section. If a check fires, route to the relevant remediation.
+
+---
+
+## Slot table
+
+| Phase | Slot | Default if EMPTY | Source rule |
+|---|---|---|---|
+| 1 | `labels_key` | none — must ask | Phase 1 rubric in spatial-mapping/SKILL.md |
+| 1 | `batch_key` (ref) | `sample` if present | Phase 1 |
+| 1 | `categorical_covariate_keys` | `[]` | Phase 1; ask if multi-tech detected |
+| 1 | `continuous_covariate_keys` | `[]` | Phase 1 |
+| 1 | gene filters | `cell_count_cutoff=15, cell_percentage_cutoff2=0.03, nonz_mean_cutoff=1.12` | Fig S1 |
+| 1 | `max_epochs` (ref) | `min(round(20000/n_cells * 400), 400)` | Phase 1 |
+| 2 | `spatial_h5ad_path` | none — must ask | Phase 2 |
+| 2 | `spatial_technology` | autodetect from `adata.uns['spatial']` etc. | Phase 2 autonomous rules |
+| 2 | `batch_key` (spatial) | `sample` if present | Phase 2 |
+| 2 | `total_counts_min / max` | 1000 / 200000 | Phase 2 |
+| 2 | gene-filter thresholds (spatial) | `cell_count_cutoff=15, cell_percentage_cutoff2=0.15, nonz_mean_cutoff=1.11` | Phase 2 |
+| 3 | `N_cells_per_location` value | Fig S27 cell-size fallback by technology | Phase 3 autonomous tree |
+| 3 | `N_cells_per_location_alpha_prior` | `1` (master) / `1000` (hires) | Phase 6 |
+| 3 | `N_cells_mean_var_ratio` | `1` (global N̂) / `10` (per-location N̂_s) | §1.2 |
+| 4 | `detection_alpha` | autodetect: `20` if any sample 90/10 ratio > 10× OR FFPE/Cytassist/Visium-HD; else `200` | Phase 4 |
+| 5 | `n_chunks` | from `chunk_size_max = available_gpu_memory_bytes / (n_vars × 8 × 4)` | Phase 5 |
+| 5 | stratification | stratified random by `batch_key` | Phase 5 |
+| 6 | branch | `hires_sliding_window` if per-location N̂_s chosen; else `master`; `WTA` if Nanostring | Phase 6 |
+| 6 | `use_proportion_factorisation_prior_on_w_sf` | `True` on hires, else absent | Phase 6 |
+| 6 | `use_n_s_cells_per_location_limit` | `True` on hires, else absent | Phase 6 |
+| 7 | `n_groups` (R) | 50 | §1.2 |
+| 7 | `A_factors_per_location` | 7 | §1.2 |
+| 7 | `B_groups_per_location` | 7 | §1.2 |
+| 7 | `w_sf_mean_var_ratio` | 5 | §1.2 |
+| 7 | `use_per_cell_type_normalisation` | `False` | §1.2 |
+| 8 | `max_epochs` (spatial) | small (<5k loc) → 5000; medium (10k–40k) → 30000; large → 30k–80k per chunk | §1.3 |
+| 8 | `lr` | 0.002 | §1.3 — never override |
+| 8 | `batch_size` | `None` (full-batch) | §1.3 — REFUSE if set |
+| 8 | `use_quantiles` (export) | `True` if `n_obs > 100000` else `True` | issue #278 |
+| 8 | `add_to_obsm` | `['means', 'q05', 'q50', 'q95']` | §1.3 |
+| 9 | backend | autodetect (`bsub` → LSF, `sbatch` → Slurm, `nvidia-smi -L` → local, else error) | Phase 9 |
+
+---
+
+## Cross-check matrix (scope ↔ technical)
+
+For each scope answer, the matching technical-decision check:
+
+| Scope element | Technical check | Action if violated |
+|---|---|---|
+| Target population lumped in `labels_key` (granularity check §5) | Phase 1 `labels_key` resolves to coarse column | Flag in `## Outstanding gaps`: "Target population <X> is lumped with <Y> in chosen `labels_key=<L>`. See issue #395. Recommend: re-annotate to a finer level OR remove <X> from target list." |
+| Failure criterion "all cell types appear everywhere" | Reference has <10 unique labels in `labels_key`, OR labels_key chosen at very coarse level | Flag: "Failure criterion #N risks triggering. Reference has only <K> labels. Re-annotate at finer level (Celltypist / scANVI) before training." |
+| Failure criterion "abundance varies 10× across visually similar regions" | `detection_alpha=200` chosen | Flag + suggest: "Failure criterion #N would not be regularised by `detection_alpha=200`. Set `detection_alpha=20` (per Fig S27 lower flow) to allow the model to absorb the within-sample variation." |
+| Failure criterion "subtype mixing within spots" + segmentation hinted at (§3 mentions n_cell/occupancy) | Phase 6 chose `master` branch | Flag: "Per-location segmentation is available but master branch was selected. Per-location N̂_s is only a soft prior in master. Install hires_sliding_window to make it a deterministic multiplier. See SKILL.md Phase 6." |
+| Success criterion "abundance correlates with experimental condition" | `n_chunks > 1` AND chunk assignment is NOT stratified by `batch_key` | Flag: "Cross-sample comparisons require stratified chunking. Confirm `n_chunks > 1` chunks are stratified so each chunk contains all samples." |
+| Success criterion "marker-gene spatial pattern recovered" | None directly; record as a Phase 8 QC todo | Add to `## Outstanding gaps`: "QC step: overlay q05 abundance of <population> with marker `<gene>` ISH reference; visually confirm match." |
+| Failure criterion "posterior predictive QC log-log deviates from y=x by >1 decade" | Phase 8 `mod.plot_QC()` not in notebook | Flag: "Failure criterion #N requires a QC plot that isn't in the notebook. Add `mod.plot_QC(summary_name='q05')` after training." |
+| Failure criterion "negative cell type appears" | None directly | Add to `## Outstanding gaps`: "QC step: enumerate the top-3 cell types in regions where they should be absent; cross-check against tissue annotation." |
+| Reference has multi-tech (§3 method = "mixed: 10X v2 + v3") | Phase 1 `categorical_covariate_keys` does NOT include the technology column | Flag: "Reference is multi-tech but per-gene tech regression is off. Add the technology column to `categorical_covariate_keys` to enable per-gene detection adjustment (see `_reference_module.py:178`)." |
+| Annotation confidence is "low" on a target population (§3) | None directly | Add to `## Outstanding gaps`: "Annotation confidence is LOW on <population>. Treat the spatial map of this population as preliminary; consider orthogonal validation." |
+
+---
+
+## Output
+
+Three things are written, all into the same `SPATIAL_MAPPING_CONTEXT.md`:
+
+1. `## Technical decisions` — every slot filled with either the user's value, the default, or `<unanswered>` (rare; only when the user explicitly skipped a slot that has no default).
+2. `## Outstanding gaps` — every cross-check that fired, every default that was applied, every QC step the success/failure criteria implicitly require.
+3. A return value to the caller (`spatial-mapping`): the file path, AND a boolean `safe_to_launch` (`False` if any cross-check that has a hard remediation — re-annotation, branch switch, detection_alpha mismatch — fired without being addressed).
+
+If `safe_to_launch=False`, the caller (spatial-mapping Phase 8.5 → Phase 9) must ask the user one more time before launching: "Cross-check flagged N issues. Launch anyway, fix and re-run, or abort?"
diff --git a/.claude/skills/cell2location-context/templates/SPATIAL_MAPPING_CONTEXT.template.md b/.claude/skills/cell2location-context/templates/SPATIAL_MAPPING_CONTEXT.template.md
new file mode 100644
index 00000000..8b66c231
--- /dev/null
+++ b/.claude/skills/cell2location-context/templates/SPATIAL_MAPPING_CONTEXT.template.md
@@ -0,0 +1,134 @@
+# Spatial mapping project context
+_Created: <YYYY-MM-DD> by /cell2location-context_
+_Last updated: <YYYY-MM-DD>_
+
+> This file is owned and maintained by the `cell2location-context` skill. The `spatial-mapping` skill auto-discovers it on every run (see candidate paths in [.claude/skills/cell2location-context/SKILL.md](.claude/skills/cell2location-context/SKILL.md)). Edit by hand or via `/cell2location-context`.
+
+## Input mode
+
+<!-- One of: prefer-asktool | prefer-printed-prompts.  Controls how the skill asks questions on subsequent runs. -->
+prefer-asktool
+
+---
+
+## Scientific scope
+_Filled by `/cell2location-context --science`. Re-run that command to update._
+
+### Scientific goal
+<!-- → affects Phase 0 framing and all phase warnings -->
+<2–4 sentences. What biological question does this analysis answer? Why are you spatially mapping cell types into this tissue?>
+
+### Single-cell reference
+<!-- → affects Phase 1 (`ref_h5ad_path`, reference-too-small / too-coarse warnings) -->
+- **Path:** <absolute path OR URL>
+- **Source:** <atlas name | in-house | published — cite>
+- **Tissue / organism:** <e.g. "mouse brain (cortex + hippocampus), 8 wk C57BL/6">
+- **Number of cells:** <N>
+- **Number of cell-type labels:** <K> (column `<labels_key>` in `adata.obs`)
+
+### Annotation methodology
+<!-- → affects Phase 1 (`labels_key` selection, `categorical_covariate_keys` suggestion) and troubleshooting -->
+- **Method:** <markers | automated (scANVI / Celltypist) | manual | mixed>
+- **Confidence:** <high | medium | low>, possibly per-lineage
+- **Known weak spots:** <e.g. "endothelial subtypes were not separated">
+
+### Target populations
+<!-- → affects Phase 1 granularity check, Phase 3 (per-location N̂?) -->
+Populations that MUST be spatially resolvable for this analysis to be useful:
+
+- **<population A>** — why it matters: <sentence>
+- **<population B>** — why it matters: <sentence>
+- **<population C>** — why it matters: <sentence>
+- …
+
+### Granularity check
+<!-- → affects Phase 1 (re-annotate?), Phase 6 (hires branch decision) -->
+For each target population above, is it distinguished from neighbours in the reference's `labels_key`?
+
+| Target population | In `labels_key`? | Notes |
+|---|---|---|
+| <A> | YES / NO / LUMPED-WITH-<X> | <e.g. "Lumped with B in lvl-3; need lvl-5 re-annotation"> |
+| <B> | YES | |
+| … | | |
+
+### Success criteria (3 measurable outcomes)
+<!-- → affects Phase 8 QC interpretation, troubleshooting routing -->
+The analysis is "done" if all three of these are true:
+
+1. <e.g. "I can rank cortical layers L1–L6 by relative GABAergic interneuron subtype abundance, with at least 3 subtypes showing layer-specific enrichment p<0.01">
+2. <…>
+3. <…>
+
+### Failure criteria (5–10 outcomes that must NOT happen)
+<!-- → affects Phase 4 (detection_alpha), Phase 6 (branch choice), refusal logic, troubleshooting -->
+The result is untrustworthy if any of these is observed:
+
+1. <e.g. "All cell types appear everywhere — \"3-fingered glove on a 5-fingered hand\" pattern (issue #395)">
+2. <e.g. "Endothelial cells appear in white matter where there is no vasculature">
+3. <e.g. "Total cell abundance varies >10× across visually similar regions of the same sample">
+4. <e.g. "Negative cell type — type not present in the reference — appears in tissue">
+5. <e.g. "Posterior predictive QC log-log plot deviates from y=x by more than one decade">
+6. <…>
+7. <…>
+
+---
+
+## Technical decisions
+_Filled by `/cell2location-context --technical`. Re-run that command after each `/spatial-mapping` session to refresh._
+
+### Phase 1 — Reference signatures
+- **labels_key:** <…>
+- **batch_key:** <…>
+- **categorical_covariate_keys:** <…>
+- **continuous_covariate_keys:** <…>
+- **gene filters:** defaults | overridden as <…>
+- **max_epochs:** <…>
+
+### Phase 2 — Spatial QC
+- **spatial_h5ad_path:** <…>
+- **technology:** <visium | visium-hd | cytassist | slide-seq-v2 | stereo-seq | nanostring-wta>
+- **batch_key:** <…>
+- **total_counts_min / max:** <…> / <…>
+- **gene-filter thresholds:** defaults | overridden as <…>
+
+### Phase 3 — N_cells_per_location
+- **S27 path:** Q1=<yes/no> → Q2=<yes/no> → Q3=<yes/no>
+- **N̂ value:** scalar <X> | per-location column `<n_cell_occupancy>`
+- **N_cells_per_location_alpha_prior:** 1 | 1000 (hires)
+- **N_cells_mean_var_ratio (v^n):** 1 | 10
+
+### Phase 4 — detection_alpha
+- **value:** 20 | 200 | per-batch dict <{…}>
+- **rationale:** <e.g. "90/10 percentile ratio = 14× → high variability → 20">
+
+### Phase 5 — Chunking
+- **n_chunks:** <…>
+- **chunk_size:** <…>
+- **stratification:** stratified by `<batch_key>` (each chunk sees all samples)
+
+### Phase 6 — Branch
+- **branch:** master | hires_sliding_window | WTA
+- **use_proportion_factorisation_prior_on_w_sf:** True | False (hires only)
+- **use_n_s_cells_per_location_limit:** True | False (hires only)
+
+### Phase 7 — Other hyperparameters
+- **overrides from defaults:** none | <list>
+
+### Phase 8 — Training + posterior
+- **max_epochs:** <…>
+- **lr:** 0.002
+- **batch_size:** None (full-batch)
+- **use_quantiles:** True | False
+- **posterior add_to_obsm:** ['means', 'q05', 'q50', 'q95']
+
+### Phase 9 — Compute
+- **backend:** LSF (bsub) | Slurm (sbatch) | local | Jupyter interactive
+
+---
+
+## Outstanding gaps
+_Auto-flagged by `/cell2location-context --technical`._
+
+- <e.g. "No paired histology — using Fig S27 fallback N̂=5; flagged as low-confidence. Mitigation: redo with segmentation if absolute abundance estimates are needed for the publication.">
+- <e.g. "Target population 'oligodendrocyte subtypes' (see Scientific scope §Target populations) is LUMPED-WITH-OPC in current `labels_key`; recommend re-annotation before treating subtype maps as biology.">
+- <…>
diff --git a/.claude/skills/cell2location-troubleshooting/README.md b/.claude/skills/cell2location-troubleshooting/README.md
new file mode 100644
index 00000000..1e27d79c
--- /dev/null
+++ b/.claude/skills/cell2location-troubleshooting/README.md
@@ -0,0 +1,5 @@
+# cell2location-troubleshooting skill
+
+Use this skill when the main `spatial-mapping` skill cannot resolve your problem, OR to file a clean cell2location issue with the diagnostic metadata the maintainer normally asks for. Invoke standalone with `/cell2location-troubleshooting`, or load it alongside [../spatial-mapping/SKILL.md](../spatial-mapping/SKILL.md).
+
+Your coding agent loads [SKILL.md](SKILL.md) and walks you through it.
diff --git a/.claude/skills/cell2location-troubleshooting/SKILL.md b/.claude/skills/cell2location-troubleshooting/SKILL.md
new file mode 100644
index 00000000..03ddb2cc
--- /dev/null
+++ b/.claude/skills/cell2location-troubleshooting/SKILL.md
@@ -0,0 +1,212 @@
+---
+name: cell2location-troubleshooting
+description: "Diagnose cell2location problems (convergence failures, OOM, anomalous abundance, biological interpretation). Routes to existing GitHub issues when they match; drafts a clean `gh issue create` body with diagnostic metadata when not. Invoked LAST after main spatial-mapping skill attempts have failed OR when user is dumping an error. Companion to ../spatial-mapping/SKILL.md."
+user-invocable: true
+---
+
+# cell2location-troubleshooting — diagnose problems and file clean issues
+
+This skill helps when something goes wrong with cell2location, or when a user has a question heavy on biological interpretation. It is the companion skill to [../spatial-mapping/SKILL.md](../spatial-mapping/SKILL.md).
+
+**Convention**: when the main `spatial-mapping` skill loads, this skill should be loaded too. But this skill is also usable **standalone** — invoke `/cell2location-troubleshooting` from any project.
+
+## Phase −1 — Load the project context FIRST (if present)
+
+**Before** classifying the symptom, check whether the project has a `SPATIAL_MAPPING_CONTEXT.md` (owned by [../cell2location-context/SKILL.md](../cell2location-context/SKILL.md)). Candidate paths (first hit wins):
+
+1. `$CWD/SPATIAL_MAPPING_CONTEXT.md`
+2. `$CWD/.claude/SPATIAL_MAPPING_CONTEXT.md`
+3. `~/.claude/plans/SPATIAL_MAPPING_CONTEXT_*.md`
+
+If found, **read it and use both blocks**:
+
+- **`## Scientific scope`** — especially the `### Failure criteria` list. The user has already declared what specific outcomes are unacceptable. Match the reported symptom against this list BEFORE going to the generic corpus. If the symptom is "all cell types appear everywhere" and the user listed exactly that as failure #1, the diagnostic context is already half-written.
+- **`## Technical decisions`** — every hyperparameter the user chose, with the rationale stored alongside. Use this to pre-fill the diagnostic template in Phase 3 instead of re-asking the user.
+- **`## Outstanding gaps`** — flags from the technical-completeness cross-check (e.g. "detection_alpha=200 but failure criterion #3 requires 20"). A reported symptom that matches a flagged gap is very likely caused by that gap; jump there first.
+
+If no context file is found, suggest the user create one once the immediate problem is resolved: "Long-term: run `/cell2location-context --science` so future diagnostics have your goals and failure criteria for context."
+
+The context file is read-only here; do not modify it. If a diagnostic conclusion adds to `## Outstanding gaps`, ask the user whether to invoke `/cell2location-context --technical` to persist it.
+
+## What this skill does
+
+1. **Phase 1 — Match the user's symptom** against the harvested issue corpus at [../spatial-mapping/reference/issue_corpus.md](../spatial-mapping/reference/issue_corpus.md). The corpus paraphrases ~25 recurring vitkl answers across the GitHub tracker. Most user-reported problems already have a documented answer.
+
+2. **Phase 2 — Search GitHub** when the symptom doesn't match the local corpus. Uses `gh search issues -R BayraktarLab/cell2location` to find newer issues the corpus may not have.
+
+3. **Phase 3 — Draft a `gh issue create` body** when neither match. Pre-fills the diagnostic metadata vitkl normally asks for: environment, data shape, hyperparameters used, ELBO trajectory, full error trace. Reduces back-and-forth.
+
+The skill **NEVER auto-submits** issues. It drafts; the user reviews and submits.
+
+## When to invoke
+
+- (a) The main `spatial-mapping` skill instructed you to (after exhausting its anti-pattern checks).
+- (b) The user is dumping an error or unexpected result.
+- (c) The user explicitly says "help me file a cell2location issue".
+- (d) The question is heavily about biological interpretation — route to [discourse.scverse.org](https://discourse.scverse.org/c/ecosytem/cell2location/42).
+
+## Phase 0 — Intent classification
+
+Classify the user's question:
+
+- **Convergence / training problem** (ELBO not decreasing, NaN, OOM during training) → Phase 1.
+- **Posterior / export OOM** → Phase 1.
+- **Anomalous abundance** (all cell types everywhere, missing cell type, bad maps) → Phase 1. If no corpus match, route to discourse.
+- **Biological interpretation** ("why does my biology look like X?") → route to discourse directly; do NOT file a GitHub issue.
+- **Novel modality / integration / multi-sample method question** → Phase 1, then Phase 2.
+- **"Help me file an issue"** → Phase 3 directly.
+
+## Phase 1 — Corpus match
+
+Read the issue corpus: [../spatial-mapping/reference/issue_corpus.md](../spatial-mapping/reference/issue_corpus.md). The corpus is organised by topic:
+
+1. Choosing `N_cells_per_location`
+2. Setting `detection_alpha`
+3. Managing large spatial datasets (>40k spots)
+4. Training duration / `max_epochs`
+5. Reference signatures: source and format
+6. Posterior sampling / abundance export OOM
+7. Comparing samples / batch effects
+8. Input data normalization
+9. Handling very low count areas
+10. Cell type granularity / reference annotation level
+11. Merging multiple trained models (chunked)
+12. RegressionModel — reference signature estimation
+13. VisiumHD, Cytassist, non-standard technologies
+14. Saving and loading models
+15. Expected gene expression per cell type
+16. Amortised inference (future / experimental)
+17. ELBO oscillation (should I stop training?)
+
+For each topic, the corpus has: question, default/recommended, decision rule, why it matters, source issue links.
+
+When you match a symptom to a topic:
+- Quote the relevant guidance to the user.
+- Link the source issue numbers so the user can `gh issue view` for full context.
+- DO NOT continue to Phase 2 or 3 — the answer is already public.
+
+## Phase 2 — GitHub fallback
+
+If the symptom doesn't match the local corpus (e.g., a new issue type that postdates the 2026-04-26 corpus snapshot):
+
+```bash
+gh search issues "<symptom keywords>" \
+    -R BayraktarLab/cell2location \
+    --state all \
+    --limit 5 \
+    --json number,title,body,state,author
+```
+
+For high-signal answers, filter by `--author vitkl`.
+
+Present the top 3 matches to the user as a numbered list with titles + URLs. Ask them to confirm whether their issue matches any of these. If yes → done. If no → Phase 3.
+
+## Phase 3 — Issue draft
+
+Compose a `gh issue create` body with the diagnostic metadata vitkl normally has to request. **If `SPATIAL_MAPPING_CONTEXT.md` was loaded in Phase −1, pre-fill the Hyperparameters block from `## Technical decisions` and the Symptom block from `## Scientific scope → ### Failure criteria`** rather than re-asking the user. Use this template:
+
+```markdown
+## Environment
+- cell2location version: <run `pip show cell2location` and paste version>
+- Branch installed from: master / hires_sliding_window / other (specify)
+- scvi-tools version: <run `pip show scvi-tools` and paste version>
+- Python version: <python --version>
+- PyTorch version: <pip show torch>
+- Pyro version: <pip show pyro-ppl>
+- GPU model + memory: <e.g. NVIDIA A100 80GB>
+
+## Data
+- Spatial technology: 10X Visium / Visium-HD / Cytassist / Slide-seq V2 / Stereo-seq / Nanostring WTA/DSP
+- Number of locations (n_obs): ...
+- Number of genes (n_vars): ...
+- Number of batches (samples): ...
+- Reference signature source: scRNA-seq atlas / cluster_averages / user-provided
+- Number of cell types (factors): ...
+- Total counts distribution per sample:
+  - 10th percentile: ...
+  - 50th percentile (median): ...
+  - 90th percentile: ...
+  - 90/10 ratio: ...   (informs detection_alpha choice)
+
+## Hyperparameters used
+- `N_cells_per_location`: <scalar value, OR column name + summary stats if per-location array>
+- `detection_alpha`: <value>
+- `n_groups` (R): ...
+- `A_factors_per_location`: ...
+- `B_groups_per_location`: ...
+- `use_proportion_factorisation_prior_on_w_sf`: <True / False>
+- `use_n_s_cells_per_location_limit`: <True / False>
+- `use_per_cell_type_normalisation`: <True / False>
+- `max_epochs`: ...
+- `batch_size`: None (full-batch) ← MUST be None
+
+## Symptom
+<concise description; paste error trace VERBATIM if applicable>
+
+## ELBO history
+<paste output of `mod.plot_history(5000)` — screenshot or text summary of first / last 1000 epochs>
+- Looks plateaued? <yes / no>
+- Late oscillations? <yes / no — see issue #327: oscillations are expected>
+
+## What I have already tried
+<list anti-patterns from the spatial-mapping skill the user has avoided; reference any relevant SKILL.md phases>
+
+## Reproducer
+<minimal-as-possible code snippet that reproduces the issue, OR pointer to the executed papermill notebook>
+```
+
+Then instruct the user to run:
+
+```bash
+# Create a draft issue (review BEFORE submitting):
+gh issue create \
+    -R BayraktarLab/cell2location \
+    --title "<short title>" \
+    --body-file <(cat <<'BODY'
+<paste the filled-in template above>
+BODY
+)
+```
+
+### Privacy guard
+
+- The diagnostic template asks for SHAPES and SUMMARY STATS, not raw data.
+- Do NOT auto-include `adata.obs.head()`, `adata.X[:10]`, or any user data in the issue body.
+- The user adds biology / dataset names manually if they want; the skill never adds them automatically.
+
+## Routing to discourse instead of GitHub
+
+For these question categories, point the user at [discourse.scverse.org](https://discourse.scverse.org/c/ecosytem/cell2location/42) instead of GitHub:
+
+- "Why does cell type X appear in region Y in my data?" — biological interpretation.
+- "How do I interpret the co-location groups?" — biological interpretation.
+- "Which cell types should be in my reference?" — biology / experimental design.
+
+GitHub issues are for code / model bugs. Biology questions are better-suited to community discussion.
+
+## Anti-patterns this skill REFUSES
+
+- Auto-submitting issues without user review.
+- Including raw user data (`adata.obs.head()`, `adata.X.toarray()`, etc.) in issue bodies.
+- Filing duplicate issues when the corpus or `gh search` already has a match.
+- Filing GitHub issues for pure biology questions (route to discourse).
+
+---
+
+<reference>
+
+## Issue corpus (shared with main skill)
+
+The corpus is at [../spatial-mapping/reference/issue_corpus.md](../spatial-mapping/reference/issue_corpus.md). Snapshot date: 2026-04-26. New issues since then are NOT in the corpus; use Phase 2 (`gh search`) to find them.
+
+## Suggested reading
+
+These are NOT auto-loaded. Use the `Read` tool when needed:
+
+- [../spatial-mapping/reference/issue_corpus.md](../spatial-mapping/reference/issue_corpus.md) — full corpus, by topic.
+- [../spatial-mapping/reference/hyperparameters_extract.md](../spatial-mapping/reference/hyperparameters_extract.md) — supplement §1.2-§1.4 + §2 paraphrase. *Read when:* the user's problem points at a hyperparameter choice and you need the canonical default rationale.
+- [../spatial-mapping/SKILL.md](../spatial-mapping/SKILL.md) — main skill's anti-patterns block. *Read when:* you suspect the user is hitting a known anti-pattern.
+- [../cell2location-context/SKILL.md](../cell2location-context/SKILL.md) — owner of `SPATIAL_MAPPING_CONTEXT.md`. *Read when:* the user has no context file and you need to walk them through creating one, or when they want to add a diagnostic conclusion to `## Outstanding gaps`.
+- [../cell2location-context/reference/technical_completeness_rubric.md](../cell2location-context/reference/technical_completeness_rubric.md) — the cross-check matrix that maps scope-elements ↔ technical-decisions ↔ likely failure modes. *Read when:* you need to explain WHY a chosen hyperparameter would cause the reported symptom.
+
+</reference>
diff --git a/.claude/skills/spatial-mapping/README.md b/.claude/skills/spatial-mapping/README.md
new file mode 100644
index 00000000..a35df633
--- /dev/null
+++ b/.claude/skills/spatial-mapping/README.md
@@ -0,0 +1,5 @@
+# spatial-mapping skill
+
+Do not read this file. Your coding agent (Claude / Cursor / Copilot / Aider) loads [SKILL.md](SKILL.md) and walks you through. If you are reading this without an agent, run your agent in this repo and ask it to do spatial mapping with cell2location.
+
+Companion skill: [cell2location-troubleshooting](../cell2location-troubleshooting/).
diff --git a/.claude/skills/spatial-mapping/SKILL.md b/.claude/skills/spatial-mapping/SKILL.md
new file mode 100644
index 00000000..e5305257
--- /dev/null
+++ b/.claude/skills/spatial-mapping/SKILL.md
@@ -0,0 +1,646 @@
+---
+name: spatial-mapping
+description: "Run cell2location on real spatial transcriptomics data (Visium / Visium-HD / Cytassist / Slide-seq / Stereo-seq / Nanostring WTA). TRIGGER when the user asks to spatially map / deconvolve cell types onto spatial transcriptomics, or to estimate cell type abundance per spot/bin. Forces explicit decisions on N_cells_per_location, detection_alpha, reference signature source, and chunking. Supports both interactive (AskUserQuestion) and autonomous (data-driven defaults from supplementary methods Fig S27 + §1.2) modes. Generates parametrised notebooks + LSF/Slurm/local launchers."
+user-invocable: true
+---
+
+# spatial-mapping — apply cell2location to a real spatial transcriptomics dataset
+
+This skill is the operating manual for running cell2location on the user's own data. It is **single-skill, format-plan-style** (instructions + `<reference>` tag), **dual-mode** (interactive `AskUserQuestion` / autonomous data-driven), and it **forces decisions** the maintainer (vitkl) routinely answers on the issue tracker.
+
+Companion skills (when this skill is loaded, also load both):
+
+- [cell2location-context/SKILL.md](../cell2location-context/SKILL.md) — owns the persistent `SPATIAL_MAPPING_CONTEXT.md` (project goals, reference, target populations, success/failure criteria + technical decisions). Invoked automatically by this skill at **Phase 0a** (`--science`) and **Phase 8.5** (`--technical`). Standalone-invocable via `/cell2location-context` to update goals between runs.
+- [cell2location-troubleshooting/SKILL.md](../cell2location-troubleshooting/SKILL.md) — consults the same issue corpus AND the `SPATIAL_MAPPING_CONTEXT.md` (especially the failure criteria + technical decisions) to match symptoms; helps file `gh issue create` drafts when this skill cannot resolve a problem.
+
+## How this skill works
+
+You walk the user through ten phases, in order. Each phase has TWO branches:
+
+- **`<interactive>`** — when the user is present, ask via `AskUserQuestion`.
+- **`<autonomous>`** — when the user is NOT available (background agent, headless SDK run, biorxiv-style autonomous research agent), inspect the data + tissue + technology and converge on a defaulted value using the rules from [Fig S27](reference/fig_S27_hyperparameters.png) and the supplement [§1.2 hyperparameter rules](reference/hyperparameters_extract.md).
+
+**Hard rule for autonomous mode**: NEVER refuse to proceed for lack of a hyperparameter. Always converge on an informed default and emit a markdown cell into the generated notebook documenting the assumption ("Assumed `detection_alpha=20` because 90/10 percentile ratio of total_counts is 14× > 10×; override if your slides are fresh-frozen single-batch.").
+
+**Hard rule for both modes**: NEVER silently default. In interactive mode, ask. In autonomous mode, document the choice and the inference rule that produced it.
+
+The output of the skill is a **set of generated artifacts**: customised parameter values for the [template notebooks](templates/), a launcher invocation, and a markdown explanation of what was chosen and why. The user can either submit the job via papermill OR open the notebook in Jupyter for interactive use.
+
+---
+
+# Phase 0a — Scientific-scope interview (FIRST STEP)
+
+## Goal
+Capture or load the user's scientific scope **before any technical decision is made**: why are they running cell2location, what reference, which target populations, what counts as success, what must not happen. All downstream phases consult this context.
+
+## Action
+Invoke the [cell2location-context](../cell2location-context/SKILL.md) skill with `--science`:
+
+```
+/cell2location-context --science
+```
+
+(Or call its `SKILL.md` directly when the slash-command channel is not available — read the file, follow its instructions.)
+
+The context skill will:
+
+1. **Auto-discover** `SPATIAL_MAPPING_CONTEXT.md` (candidate paths: `$CWD/`, `$CWD/.claude/`, `~/.claude/plans/SPATIAL_MAPPING_CONTEXT_<dataset>.md`).
+2. **If found**: show the user a summary of the `## Scientific scope` block and ask: **Use** the existing context / **Re-interview** to update / **Skip** for this run.
+3. **If not found**: ask the user: **Run the interview now (recommended)** _"Answering these questions can substantially improve the results from this analysis, get a more useful single-cell reference and spatial map, and lead to new discoveries."_ / **Point me at an existing handoff document** (imports its content as free-form scope) / **Skip — proceed with defaults**.
+4. **Return** either a file path or the string `"skipped"`.
+
+## How to consume the result
+
+- **If a path was returned**: read the file's `## Scientific scope` block. Carry it forward as context for every subsequent phase. In each phase that has a scope-relevant rule (Phase 1 granularity, Phase 3 N̂ choice, Phase 4 detection_alpha, Phase 6 branch), explicitly cite the scope entry that argued for the chosen value.
+- **If `"skipped"` was returned**: continue, but emit a markdown cell into the generated step1 and step2 notebooks:
+  ```markdown
+  # ⚠️ No SPATIAL_MAPPING_CONTEXT.md captured
+  This run proceeded with autonomous defaults because the user skipped the scientific-
+  scope interview. Hyperparameter choices, warnings, and refusals were not grounded in
+  user-declared success/failure criteria. To improve future runs, invoke
+  `/cell2location-context --science` and re-launch the workflow.
+  ```
+
+## Autonomous mode
+If no `AskUserQuestion` channel is exposed, the context skill auto-skips both branches and returns `"skipped"`. This phase records the skip in the notebook (markdown cell above) and proceeds.
+
+## Output
+Either the loaded `## Scientific scope` block (passed forward as context to all phases) or a recorded skip.
+
+---
+
+# Phase 0 — Mode + data-discovery branch
+
+## Goal
+Determine (a) whether the user is available for questions, and (b) whether they already have their own data or need to start from a published demo.
+
+## Interactive
+Ask:
+- **Mode**: am I running in an interactive session with you present (you'll see my AskUserQuestion prompts)? Or am I running autonomously (no user available, agent decides)?
+- **Data status**: (a) "I have my own spatial transcriptomics data" → continue to Phase 1; (b) "I want to test the skill on published data" → run the demo path; (c) "I don't have spatial data yet" → emit data-acquisition guidance.
+
+## Autonomous
+Default to autonomous mode if the skill is invoked without an interactive `AskUserQuestion` channel (e.g. via the Claude Agent SDK without a `permission_mode='askUser'`-equivalent setting). Heuristic: if no `AskUserQuestion` tool is exposed in the current tool list, treat as autonomous.
+
+For data discovery: look for `*.h5ad` files in the workspace root and any `data/` subdirectory. If found → assume "I have my own data". If not → run the demo path with [templates/data/download_mouse_brain.py](templates/data/download_mouse_brain.py).
+
+## Demo path
+For "I want to test" or "no data yet" cases, run:
+```python
+python templates/data/download_mouse_brain.py --output-dir templates/data/
+```
+This downloads the published 5-Visium + 6-8-snRNA mouse-brain dataset from [vitkl/cell2location_paper](https://github.com/vitkl/cell2location_paper). Validates the user's environment and gives them a published-comparable reference.
+
+## Data-acquisition guidance (only emitted on "no data yet" path)
+- For 10X Visium: 10–20 samples typically sufficient for joint modelling; use the maximum samples that share biological/technical conditions of interest.
+- For Visium-HD: choose bin size (2/8/16 μm) based on tissue cell density.
+- Reference scRNA-seq: same tissue ideally; 40k+ cells with 20–50 well-characterised cell types.
+
+---
+
+# Phase 1 — Reference signatures (step 1 notebook)
+
+## Goal
+Set the parameters for the reference-signature `RegressionModel` workflow. Output is a `signatures.csv` (genes × cell_types in linear count scale, batch-corrected) consumed by Phase 7's `Cell2location` model.
+
+Read [reference/hyperparameters_extract.md §2](reference/hyperparameters_extract.md) for the reference-estimation method choice (NB regression vs cluster-averages). The skill defaults to `RegressionModel` (NB regression).
+
+## Interactive
+AskUserQuestion fields:
+- `ref_h5ad_path` — path to scRNA reference AnnData.
+- `labels_key` — column in `adata.obs` for cell-type labels.
+- `batch_key` — column for batch (e.g. `sample` or `sample_id`).
+- `categorical_covariate_keys` — list of extra categorical covariate columns (e.g. `['10x_kit', 'donor']`). These drive per-gene tech regression `detection_tech_gene_tg`. **Use this for multi-technology references** (10X v2 + v3 + Smart-seq, etc.); do NOT lump them all into `batch_key`. See [_reference_module.py:178](../../../cell2location/models/reference/_reference_module.py#L178).
+- `continuous_covariate_keys` — list of continuous covariates (rare).
+- `gene_filter_cell_count_cutoff` — default 15.
+- `gene_filter_cell_percentage_cutoff2` — default 0.03 (Fig S1 rule).
+- `gene_filter_nonz_mean_cutoff` — default 1.12 (Fig S1 rule).
+- `max_epochs` — default formula `min(round(20000/n_cells * 400), 400)`. Surface this number.
+
+## Autonomous
+Inspect `adata_ref.obs.columns` and `adata_ref.obs.dtypes`:
+1. `labels_key` candidates: match against `cell_type`, `cluster`, `annotation`, `celltype`, `cell_type_lvl*`, `leiden`, `louvain`. Pick the first column whose unique-value count is in `[10, 200]` (cell-type cardinality).
+2. `batch_key` candidates: `sample`, `sample_id`, `donor`, `batch`. Pick the first present.
+3. Multi-tech detection: if any obs column has values matching `10x_v2`/`10x_v3`/`smart.?seq.?2?`/`smart_seq2` → add that column to `categorical_covariate_keys`. Also add `donor` if distinct from `batch_key` selection.
+4. Use defaults for gene filters and `max_epochs`.
+5. Emit a notebook markdown cell: `# Auto-detected: labels_key='X' (N=...), batch_key='Y' (N=...). Override if incorrect.`
+
+## Failure-mode warnings (apply to both modes)
+
+- **Reference too coarse**: if `n_unique(labels_key) < 10`, warn user: "Reference has only N cell types. cell2location works best with 20–50 well-characterised types. With too few, all types appear everywhere ('3-fingered glove on a 5-fingered hand' — see issue [#395](https://github.com/BayraktarLab/cell2location/issues/395))."
+- **Reference too small**: if `min_cells_per_type < 40`, warn: "Cell types with <40 cells will have noisy signatures."
+
+## Output
+Customised parameter values for [templates/step1_reference_signatures.ipynb](templates/step1_reference_signatures.ipynb).
+
+---
+
+# Phase 2 — Spatial data inspection + tissue inference + QC
+
+## Goal
+Understand what spatial dataset the user has (technology, sample count, total-counts distribution), filter low-quality spots, decide gene filters.
+
+## Interactive
+AskUserQuestion:
+- `spatial_h5ad_path` — path to spatial AnnData.
+- `spatial_technology` — `visium` / `visium-hd` / `cytassist` / `slide-seq-v2` / `stereo-seq` / `nanostring-wta`.
+- `batch_key` — sample column in `adata.obs` (typically `sample` or `sample_id`).
+- `total_counts_min` — default 1000 (or `np.median(total_counts) / 3` if median is lower).
+- `total_counts_max` — default 200000.
+- `sample_fraction_threshold` — default 0.7 (drop samples where <70% of locations pass QC).
+- Gene-filter thresholds for `filter_genes`: `cell_count_cutoff=15`, `cell_percentage_cutoff2=0.15`, `nonz_mean_cutoff=1.11`.
+
+## Autonomous
+Inspect `adata.uns`, `adata.obsm`, `adata.obs`:
+1. **Technology detection**:
+   - `adata.uns['spatial']` present with `library_id` keys → 10X Visium / Cytassist / Visium-HD. Bin size from `scalefactors_json`.
+   - No `adata.uns['spatial']` but `adata.obsm['spatial']` present → likely Slide-seq / Stereo-seq / general spatial. Inspect spot density: <10 μm spacing → Slide-seq V2 / Stereo-seq; ~55 μm → Visium-like.
+   - `neg_probes` in `adata.obs` columns or `obsm` → Nanostring WTA/DSP → **switch to `Cell2location_WTA` model** (see Phase 7 branch). See [_cell2location_WTA_module.py:247](../../../cell2location/models/_cell2location_WTA_module.py#L247).
+2. **Total-counts variability**: compute 90/10 percentile ratio per `batch_key`-group. Store for Phase 4.
+3. **Tissue inference** (inform Phase 3 cell-size fallback): scan cell-type labels (if reference passed alongside) or look for `tissue` column in `adata.uns`.
+4. Apply default QC thresholds; emit markdown cell documenting them.
+
+## Output
+Customised QC parameter values for [templates/step2_spatial_mapping.ipynb](templates/step2_spatial_mapping.ipynb).
+
+---
+
+# Phase 3 — `N_cells_per_location` (THE MOST IMPORTANT DECISION)
+
+## Goal
+Set `N̂` per the [Fig S27 decision tree](reference/fig_S27_hyperparameters.png). This drives absolute cell abundance estimation; getting it wrong invalidates all downstream comparisons.
+
+Read [reference/hyperparameters_extract.md §1.2 item 1](reference/hyperparameters_extract.md) before this phase.
+
+## Decision tree (Fig S27)
+
+```
+Q1: Paired histology / DAPI image for the same tissue section?
+├── YES:
+│   └── Q2: Per-location nuclei segmentation in adata.obs (e.g. `n_cell`, `nuclei`, `occupancy`)?
+│       ├── YES (ADVANCED) → per-location N̂_s = occupancy × N_nuclei × scaling.
+│       │   * Formula from the embryo workflow: n_cell_occupancy = occupancy * np.quantile(n_cell, 0.99999).
+│       │   * v_n = 10 (high confidence).
+│       │   * EFFECTIVENESS: only deterministic in hires_sliding_window branch
+│       │     with use_proportion_factorisation_prior_on_w_sf=True AND
+│       │     use_n_s_cells_per_location_limit=True. See Phase 6.
+│       │   * Why occupancy × N × scaling > N alone: occupancy accounts for
+│       │     the fraction of the spot covered by tissue, so per-location N̂_s
+│       │     is correctly attenuated for partly-empty spots when used as
+│       │     the hard multiplier in hires.
+│       └── NO → manual count in 10X Loupe browser, 10-20 spots → single tissue-level N̂, v_n = 1
+├── NO → Q3: Same-tissue histology (not paired) available?
+│       ├── YES → same manual-count procedure on similar tissue
+│       └── NO → Fig S27 fallback: cell-size + capture-size formula:
+│                * 10X Visium (55 μm)     → N̂ ≈ 5
+│                * Cytassist Visium       → N̂ ≈ 5 (low confidence)
+│                * Visium-HD              → scale by bin area (2/8/16 μm bin)
+│                * Slide-Seq V2 (10 μm)   → N̂ ≈ 1
+│                * Stereo-seq             → N̂ ≈ 1 (bin-dependent)
+│                * Nanostring WTA/DSP     → use per-region segmentation if available;
+│                                          else WTA model handles via n_nuclei input
+```
+
+## Interactive
+Walk the user through Q1 → Q2 → Q3 via sequential AskUserQuestion blocks. Default to "NO histology" if user doesn't engage.
+
+## Autonomous
+1. Scan `adata.obs.columns` for nuclei segmentation columns: `n_cell`, `nuclei`, `nuclei_count`, `occupancy`, `n_cell_occupancy`, `cell_count`, `n_nuclei`.
+2. If **`n_cell` AND `occupancy` both present**: compute `n_cell_occupancy = adata.obs['occupancy'] * np.quantile(adata.obs['n_cell'], 0.99999)` → set per-location N̂_s; `v_n = 10`; FLAG: user needs hires branch (Phase 6).
+3. If **only `n_cell`** (no occupancy): use per-location `n_cell` as N̂_s; `v_n = 10`; FLAG: occupancy scaling missing → less effective than the occupancy×N formula. User should regenerate segmentation with occupancy.
+4. If **no segmentation columns**: use Fig S27 cell-size fallback by detected technology (Phase 2):
+   - Visium → 5; Cytassist → 5; Visium-HD → adjust by bin area; Slide-seq V2 → 1; Stereo-seq → 1.
+   - `v_n = 1`.
+5. Emit notebook markdown cell documenting the chosen path: `## N_cells_per_location decision\n\nUsed Fig S27 fallback (no nuclei segmentation columns found in adata.obs).\nTechnology detected: Visium → N̂ = 5, v_n = 1.\n\nOverride: provide adata.obs['n_cell_occupancy'] column with per-location nuclei counts, AND install cell2location from the hires_sliding_window branch (Phase 6) to make this constraint deterministic.`
+
+## Output
+Customised `N_cells_per_location_column` OR `N_cells_per_location_scalar` value + `N_cells_per_location_alpha_prior` (1 or 1000) for [templates/step2_spatial_mapping.ipynb](templates/step2_spatial_mapping.ipynb).
+
+---
+
+# Phase 4 — `detection_alpha` (Fig S27 lower flow)
+
+## Goal
+Set the regularisation strength of the per-location detection prior `y_s ~ Gamma(α^y, α^y / y_e)`.
+
+Read [reference/hyperparameters_extract.md §1.2 item 2](reference/hyperparameters_extract.md).
+
+## Decision rule
+
+```
+Strong within-batch RNA-count variation NOT explained by tissue containing more cells?
+├── YES → detection_alpha = 20  (less strict; common for FFPE, Cytassist, Visium-HD, older human)
+└── NO  → detection_alpha = 200 (strict; fresh-frozen single-sample Visium)
+```
+
+Tissue caveat: tissues with intrinsic regions of high cell density (hippocampus, gut lymphoid follicle) explain higher RNA counts by cell density, not by detection variability → `200` may still be appropriate. The model regulariser interprets `y_s` variation as TECHNICAL noise; if it's biological-density-driven, the absolute abundance prior (Phase 3) handles it.
+
+## Interactive
+Show the user a quick histogram of `total_counts` per location, grouped by sample. Ask: "Does within-sample RNA variability look mostly technical (gradients, artifacts) or mostly biological (denser tissue regions)?"
+
+## Autonomous
+1. Compute per-sample 90/10 percentile ratio of `adata.obs['total_counts']`.
+2. If ratio > 10× for any sample → `detection_alpha = 20`.
+3. Technology override: FFPE / Cytassist / Visium-HD → `20` regardless.
+4. Else → `detection_alpha = 200`.
+5. Emit notebook markdown cell with the ratio and the choice.
+
+## Per-batch override
+If user has multiple samples with different qualities (some fresh-frozen, some FFPE), the model accepts a per-batch dict: `detection_alpha = {'sample_a': 20, 'sample_b': 200, ...}`. Both modes should surface this option when a mixed batch is detected.
+
+## Output
+Customised `detection_alpha` (scalar or dict) for [templates/step2_spatial_mapping.ipynb](templates/step2_spatial_mapping.ipynb).
+
+---
+
+# Phase 5 — Chunking strategy (anti-default rule)
+
+## Goal
+Decide whether to train on all data in one chunk, or split into multiple chunks. **Default is 1 chunk if data fits the GPU.** Never silently default to multiple chunks.
+
+Per the user's brief: "not to split data in 5 chunks but to select chunk size that maximises GPU memory use - for many datasets it will be one chunk".
+
+## Sizing formula
+
+```
+chunk_size_max ≈ available_gpu_memory_bytes / (n_vars × 8 bytes/float64 × overhead_factor)
+                                                                            (overhead_factor ≈ 3-5)
+n_chunks = ceil(n_obs / chunk_size_max)
+```
+
+## Interactive
+AskUserQuestion:
+- `gpu_memory_gb` — 40 / 80 / 96 / other.
+
+Then compute and SHOW the user the result before generating code:
+> "Your 312,847-location dataset on an 80GB A100 needs **4 chunks of ~78k each**. Each chunk will train for ~2h via stratified random allocation across your 47 samples. Confirm or override to (1) increase chunks for safety / (2) request more GPU memory."
+
+## Autonomous
+1. Default GPU memory budget: 80 GB (A100/H100).
+2. Compute `chunk_size_max` from formula above with overhead_factor = 4.
+3. If `n_obs ≤ chunk_size_max`: `n_chunks = 1`, `batch_size = None` (full-batch).
+4. Else: `n_chunks = ceil(n_obs / chunk_size_max)`; stratified random allocation across samples (embryo workflow pattern — each chunk sees ALL samples). Emit markdown cell with the chunk plan.
+
+## REFUSALS
+- **Mini-batch training for spatial mapping (`batch_size != None`)** — REFUSE. Cite [issue #356](https://github.com/BayraktarLab/cell2location/issues/356) + supplement §1.3. Mini-batch is dramatically less accurate AND slower in wall-clock (10+ sec/iter → days/weeks vs full-batch ~1–2h per chunk on A100).
+
+## Output
+- `n_chunks`, `training_batch` (0..n_chunks-1, set per launcher invocation), chunk-assignment logic in [templates/step2_spatial_mapping.ipynb](templates/step2_spatial_mapping.ipynb).
+
+---
+
+# Phase 6 — Branch selection (master vs hires_sliding_window vs WTA)
+
+## Goal
+Pick the right cell2location model variant. The choice affects WHETHER `N_cells_per_location` actually constrains the model deterministically (hires) or only as a soft prior (master).
+
+## Decision
+
+```
+Per-location nuclei segmentation provided (Phase 3 chose per-location N̂_s)?
+├── YES → install hires_sliding_window branch + use the proportion-factorisation flags
+└── NO  → use master branch (default)
+
+Nanostring WTA / DSP data?
+└── YES → use Cell2location_WTA (master branch already provides it; takes n_nuclei as forward arg)
+```
+
+## Why this matters — model-mechanism explanation
+
+In **master** ([_cell2location_module.py:275-281](../../../cell2location/models/_cell2location_module.py#L275)):
+
+```python
+n_s_cells_per_location ~ Gamma(N_cells_per_location * N_cells_mean_var_ratio,
+                               N_cells_mean_var_ratio)
+```
+
+This is a **soft Gamma prior**. The variable `n_s_cells_per_location` enters factorization through the `z_sr` rate `n_s_cells_per_location / b_s_groups_per_location`, but `w_sf` can drift away from it. Per-location segmentation counts as soft guidance.
+
+In **hires_sliding_window** (with `use_proportion_factorisation_prior_on_w_sf=True` AND `use_n_s_cells_per_location_limit=True`):
+
+```python
+w_sf ~ Gamma(w_sf_mu * mean_var_ratio, mean_var_ratio)   # samples proportions
+w_sf = w_sf / w_sf.sum(dim=-1, keepdim=True)             # normalize to proportions
+w_sf = w_sf * n_s_cells_per_location                     # multiply by N̂_s
+pyro.deterministic("w_sf", w_sf)                         # deterministic output
+```
+
+`n_s_cells_per_location` is now a **deterministic multiplier** on the final cell abundance. Per-location segmentation directly scales total cell abundance per location.
+
+**The 2022 paper used master-style wiring and reported limited benefit from segmentation.** The hires rewiring is what makes segmentation actually effective. The `occupancy × N_nuclei × scaling` (`n_cell_occupancy`) formula from the embryo workflow further corrects for partly-empty spots by attenuating the multiplier.
+
+## Action
+
+If Phase 3 chose per-location N̂_s:
+```
+Instruct the user (interactive) OR auto-install (autonomous):
+    pip install --force-reinstall git+https://github.com/BayraktarLab/cell2location.git@hires_sliding_window
+Set in step2 parameters:
+    use_proportion_factorisation_prior_on_w_sf = True
+    use_n_s_cells_per_location_limit = True
+    N_cells_per_location_alpha_prior = 1000.0
+```
+
+Else (master is fine):
+```
+pip install cell2location          # standard PyPI install
+Set:
+    use_proportion_factorisation_prior_on_w_sf = False  (or leave unset on master)
+    use_n_s_cells_per_location_limit = False
+```
+
+For Nanostring WTA/DSP:
+```
+Use cell2location.models.Cell2location_WTA(adata, ...)
+Pass n_nuclei as a per-observation column via setup_anndata's "nuclei" key.
+```
+
+## Output
+Branch install command + step2 parameter values reflecting branch features.
+
+---
+
+# Phase 7 — Model hyperparameters (supplement §1.2 defaults)
+
+## Goal
+Set the remaining model hyperparameters. All have defaults from supplement §1.2; users rarely override.
+
+Read [reference/hyperparameters_extract.md "Other priors"](reference/hyperparameters_extract.md).
+
+| Hyperparameter | Default | When to override |
+|---|---|---|
+| `n_groups` (R) | 50 | Almost never |
+| `A_factors_per_location` (Â) | 7 | Almost never |
+| `B_groups_per_location` (B̂) | 7 | Almost never |
+| `w_sf_mean_var_ratio` (v^w) | 5 | Almost never |
+| `N_cells_mean_var_ratio` (v^n) | 1 (global N̂) / 10 (per-location N̂_s) | Per Phase 3 |
+| `N_cells_per_location_alpha_prior` | None (master) / 1000.0 (hires) | Per Phase 6 |
+| `use_per_cell_type_normalisation` | False | Advanced; per-CT library-size adjustment |
+| `detection_hyp_prior['mean_alpha']` | 10 | Almost never |
+
+## Auto-derived hyperpriors (always computed by the template; not user-set)
+
+Per supplement §1.2 item 3 + embryo step2 cells 36-39:
+
+```python
+expected_y_e = (adata_vis.obs[['sample', 'total_counts']].groupby('sample').mean()
+                / (inf_aver.sum(0) * np.mean(N_cells_per_location)).mean())
+mean_alpha_prior = np.round(((expected_y_e.mean() ** 2) / expected_y_e.var()).values[0] / 3, 2)
+detection_cell_type_prior_alpha = np.round(((inf_aver.sum(0).mean() ** 2)
+                                             / inf_aver.sum(0).var()) * 20, 2)
+```
+
+These are passed as `detection_hyp_prior={'mean_alpha': mean_alpha_prior}` and `detection_cell_type_prior_alpha=detection_cell_type_prior_alpha` to the `Cell2location` constructor.
+
+## Interactive
+Surface the defaults table; ask if any override.
+
+## Autonomous
+Use defaults verbatim. Emit a markdown cell listing the chosen values.
+
+## Output
+Hyperparameter values for [templates/step2_spatial_mapping.ipynb](templates/step2_spatial_mapping.ipynb).
+
+---
+
+# Phase 8 — Training + posterior export (supplement §1.3)
+
+## Goal
+Set `max_epochs`, posterior-export options. Both have data-size-dependent defaults.
+
+## `max_epochs` tier
+
+| Dataset size | `max_epochs` |
+|---|---|
+| Small (<5k locations) | 5,000 – 10,000 |
+| Medium (10k – 40k) | 20,000 – 30,000 |
+| Large (>100k, chunked) | 30,000 – 80,000 per chunk |
+
+ELBO oscillations late in training are normal and OK — do not stop on first oscillation. Train to full `max_epochs`. [Issue #327](https://github.com/BayraktarLab/cell2location/issues/327).
+
+## `train()` defaults (verbatim from supplement §1.3)
+
+```python
+mod.train(
+    max_epochs=max_epochs,
+    batch_size=None,           # FULL BATCH — never override
+    lr=0.002,                  # ADAM, fixed per supplement
+    train_size=1,              # use all data
+    accelerator='gpu',
+)
+```
+
+## Posterior export (supplement §1.3 + issues #278/#360)
+
+```python
+adata_vis = mod.export_posterior(
+    adata_vis,
+    use_quantiles=True,        # MANDATORY for n_obs > 100k; default-on otherwise
+    add_to_obsm=['means', 'q05', 'q50', 'q95'],
+    sample_kwargs={
+        'batch_size': int(np.ceil(adata_vis.n_obs / 4)),
+        'accelerator': 'gpu',
+        'return_observed': False,
+    },
+    exclude_vars=['data_target'],
+)
+```
+
+The paper uses `q05` in all figures (slightly more accurate than mean for absolute abundance).
+
+## QC after training
+
+1. `mod.plot_history(5000)` — ELBO curve. Skip first 5k for visibility.
+2. `mod.plot_QC(summary_name='q05')` — posterior predictive log-log plot (per supplement §1.3).
+3. Spatial plot of `y_s` — should resemble total RNA count distribution.
+
+## Interactive
+Surface the tier-chosen `max_epochs`; ask if user wants to override.
+
+## Autonomous
+Use the tier from Phase 5's `n_chunks` × `chunk_size`. Emit markdown cell with the chosen value and the rationale ("`n_obs=78k`, medium tier → `max_epochs=30000`").
+
+## REFUSALS
+
+- `batch_size != None` → REFUSE (already in Phase 5).
+- `num_samples=1000` on `n_obs > 100k` WITHOUT `use_quantiles=True` → REFUSE. OOM. [Issue #278](https://github.com/BayraktarLab/cell2location/issues/278).
+- Log-transforming the spatial counts before model fitting → REFUSE. Breaks NB likelihood. [Issue #386](https://github.com/BayraktarLab/cell2location/issues/386).
+
+## Output
+`max_epochs`, posterior-export sample_kwargs for [templates/step2_spatial_mapping.ipynb](templates/step2_spatial_mapping.ipynb).
+
+---
+
+# Phase 8.5 — Implementation-completeness check (BEFORE LAUNCH)
+
+## Goal
+Sweep every technical decision made in Phases 1–8, confirm all slots are filled, persist them into `SPATIAL_MAPPING_CONTEXT.md`, and **cross-check them against the user's `## Scientific scope`** (especially failure criteria). Block the launch in Phase 9 if any cross-check fires a hard violation that the user has not acknowledged.
+
+## Action
+Invoke the [cell2location-context](../cell2location-context/SKILL.md) skill with `--technical`:
+
+```
+/cell2location-context --technical
+```
+
+Pass it the current notebook parameter dict (or, if running through the formal workflow-state markdown, the per-phase summary the skill has been emitting).
+
+The context skill will:
+
+1. **Sweep each slot** in the Phase 1–8 decision table (see [reference/technical_completeness_rubric.md](../cell2location-context/reference/technical_completeness_rubric.md)). Fill any EMPTY slot with the defensible default OR ask the user when no default is sensible.
+2. **Cross-check** the chosen decisions against the user's failure criteria. Example fires:
+   - Failure criterion "abundance varies 10× across visually similar regions" + `detection_alpha=200` chosen → flag.
+   - Failure criterion "subtype mixing" + per-location segmentation hinted in scope + Phase 6 chose `master` → flag.
+   - Target population lumped in chosen `labels_key` → flag, route to issue #395.
+3. **Persist** the `## Technical decisions` and `## Outstanding gaps` blocks.
+4. **Return** the file path AND a boolean `safe_to_launch`. If `False`, ask the user: "N cross-checks flagged. Launch anyway / fix and re-run / abort?" Default to **fix and re-run**.
+
+## Autonomous mode
+Cross-checks still run; defaults are applied silently. `safe_to_launch=False` becomes a warning markdown cell in the launcher invocation log; the workflow proceeds (autonomous runs cannot block on user input).
+
+## Output
+Updated `SPATIAL_MAPPING_CONTEXT.md` with the technical decisions; a go/no-go signal for Phase 9.
+
+---
+
+# Phase 9 — Compute infrastructure (launch)
+
+## Goal
+Submit the job. Pick the right launcher for the user's compute.
+
+## Choices
+
+- **LSF cluster (Sanger, Crick LSF)** → [templates/bsub.sh](templates/bsub.sh)
+- **Slurm cluster (Crick Slurm, NIH, cloud)** → [templates/sbatch.sh](templates/sbatch.sh)
+- **Single local GPU (laptop / workstation)** → [templates/run_local.sh](templates/run_local.sh)
+- **Interactive Jupyter** (user wants to step through cells manually) → open `templates/step2_spatial_mapping.ipynb` in Jupyter Lab after setting the parameters cell at top.
+
+## Interactive
+Ask which compute the user has.
+
+## Autonomous
+Detect scheduler from environment:
+- `bsub --help` exits 0 → LSF.
+- `sbatch --help` exits 0 → Slurm.
+- `nvidia-smi -L` returns ≥1 GPU → local.
+- Else → emit error: "No compute backend detected; please run the templates manually."
+
+## Output
+Launcher invocation command with all parameters filled. Example for LSF:
+```bash
+bash templates/bsub.sh \
+    --training-batch 0 \
+    --seed 0 \
+    --max-epochs 30000 \
+    --signatures-csv ./signatures_output/ref_signatures/signatures.csv \
+    --spatial-h5ad /path/to/spatial.h5ad \
+    --output-name my_run
+```
+
+For chunked runs (`n_chunks > 1`), emit one invocation per `training_batch` index (0..n_chunks-1).
+
+---
+
+# Phase 10 — Aggregation (combine chunked outputs)
+
+## Goal
+If `n_chunks > 1`, combine per-chunk results into a single AnnData.
+
+Skip if `n_chunks == 1`.
+
+## Interactive
+Confirm chunk outputs are present (`{output_dir}/{output_name}_chunk*/sp.h5ad`). Generate the aggregation notebook invocation.
+
+## Autonomous
+Same; if any chunk is missing → emit error with the missing chunk index.
+
+## Aggregation logic (from [templates/step2_aggregate_chunks.ipynb](templates/step2_aggregate_chunks.ipynb))
+
+```python
+adata_full = sc.read_h5ad(spatial_h5ad_path)
+for key in ['means', 'q05', 'q50', 'q95']:
+    adata_full.obsm[key] = np.zeros((adata_full.n_obs, n_cell_types))
+for i in range(n_chunks):
+    adata_chunk = sc.read_h5ad(f"{output_dir}/{output_name}_chunk{i}/sp.h5ad")
+    idx = adata_full.obs.index.get_indexer(adata_chunk.obs.index)
+    for key in ['means', 'q05', 'q50', 'q95']:
+        adata_full.obsm[key][idx] = adata_chunk.obsm[key]
+    adata_full.uns[f'mod_batch{i}'] = adata_chunk.uns.get('mod', None)
+adata_full.write(output_path)
+```
+
+Reference: [issue #356](https://github.com/BayraktarLab/cell2location/issues/356), [#375](https://github.com/BayraktarLab/cell2location/issues/375).
+
+---
+
+# Common errors — route to troubleshooting skill
+
+The companion [cell2location-troubleshooting/SKILL.md](../cell2location-troubleshooting/SKILL.md) handles symptom-based debugging. Common patterns it covers:
+
+- "ELBO is oscillating" → expected; keep training. [Issue #327](https://github.com/BayraktarLab/cell2location/issues/327).
+- "OOM on `export_posterior`" → `use_quantiles=True`, reduce `sample_kwargs['batch_size']`. [Issue #278](https://github.com/BayraktarLab/cell2location/issues/278).
+- "All cell types appear everywhere" → reference granularity too coarse. [Issue #395](https://github.com/BayraktarLab/cell2location/issues/395).
+- "Comparing two samples" → train ONE joint model with `batch_key`. [Issues #389](https://github.com/BayraktarLab/cell2location/issues/389), [#396](https://github.com/BayraktarLab/cell2location/issues/396).
+- "Model load fails / Pyro params not initialised" → `mod.train(max_epochs=1)` after `load()`. [Issues #365](https://github.com/BayraktarLab/cell2location/issues/365), [#404](https://github.com/BayraktarLab/cell2location/issues/404), [#421](https://github.com/BayraktarLab/cell2location/issues/421).
+
+Load the troubleshooting skill alongside this one. When you can't resolve an issue, instruct the user to invoke `/cell2location-troubleshooting` to draft a clean issue.
+
+---
+
+# Anti-patterns the skill REFUSES (re-stated)
+
+| Anti-pattern | Refusal | Citation |
+|---|---|---|
+| Mini-batch training for spatial mapping (`batch_size != None`) | REFUSE | [#356](https://github.com/BayraktarLab/cell2location/issues/356), supplement §1.3 |
+| Log-transforming reference or spatial counts | REFUSE | [#386](https://github.com/BayraktarLab/cell2location/issues/386), supplement §2 |
+| `num_samples=1000` on `n_obs > 100k` without `use_quantiles=True` | REFUSE | [#278](https://github.com/BayraktarLab/cell2location/issues/278), [#360](https://github.com/BayraktarLab/cell2location/issues/360) |
+| Running [docs/notebooks/cell2location_tutorial.ipynb](../../../docs/notebooks/cell2location_tutorial.ipynb) on real-sized data | REFUSE; point to this skill | design rule |
+| Training separate models per sample to compare conditions | REFUSE; use joint model with `batch_key` | [#389](https://github.com/BayraktarLab/cell2location/issues/389), [#396](https://github.com/BayraktarLab/cell2location/issues/396) |
+| Computing per-cell-type per-gene per-location dense tensor on `n_obs > 100k` | REFUSE; compute per-chunk | [#375](https://github.com/BayraktarLab/cell2location/issues/375) |
+
+---
+
+<reference>
+
+## Always-loaded reference
+
+### Hyperparameter defaults (from supplement §1.2)
+
+See [reference/fig_S27_hyperparameters.png](reference/fig_S27_hyperparameters.png) for the canonical decision tree. See [reference/fig_S1_workflow.png](reference/fig_S1_workflow.png) for the workflow overview.
+
+| Hyperparameter | Default | Source |
+|---|---|---|
+| `N_cells_per_location` | Fig S27 (scalar from histology, or per-location N̂_s from segmentation, or cell-size fallback) | §1.2 item 1 |
+| `detection_alpha` | 200 (low var) / 20 (high var) | §1.2 item 2 + Fig S27 |
+| `n_groups` | 50 | §1.2 |
+| `A_factors_per_location` | 7 | §1.2 |
+| `B_groups_per_location` | 7 | §1.2 |
+| `w_sf_mean_var_ratio` | 5 | §1.2 |
+| `N_cells_mean_var_ratio` (v^n) | 1 (global) / 10 (per-location) | §1.2 |
+| `N_cells_per_location_alpha_prior` | None (master) / 1000.0 (hires + segmentation) | hires-branch convention |
+| `max_epochs` | 5k (small) / 30k (medium) / 30-80k (large per chunk) | §1.3 |
+| `batch_size` | None (full-batch) | §1.3 — never override |
+| `lr` | 0.002 | §1.3 |
+| `use_quantiles` (export) | True for n_obs > 100k | issue #278 |
+
+### Training (§1.3)
+
+- ADVI in Pyro; ADAM `lr=0.002`.
+- 20,000–50,000 iterations; stop at ELBO plateau.
+- Late ELBO oscillations are normal — don't stop early.
+- 1000 posterior samples; q05 is the paper's preferred point estimate.
+- QC: `mod.plot_history(5000)` + posterior predictive log-log + spatial `y_s` consistency check.
+
+## Suggested reading (for when you need more context)
+
+These are NOT auto-loaded. Use the `Read` tool when the user asks "why" or you need a deeper answer.
+
+- [reference/hyperparameters_extract.md](reference/hyperparameters_extract.md) — full paraphrase of supplement §1.2 + §1.3 + §1.4 + §2. *Read when:* you need the full default-by-default rationale, or when a user asks "why this default?".
+- [reference/issue_corpus.md](reference/issue_corpus.md) — paraphrased vitkl-guidance harvest. *Read when:* user reports a symptom that doesn't match an anti-pattern above.
+- [reference/c2l_supplement.pdf](reference/c2l_supplement.pdf) — full supplementary methods (17 pages text + figures). *Read when:* user asks about model features, new modalities, integration across experiments, multiple samples, mathematical guarantees — anything beyond hyperparameter defaults.
+- [../../../cell2location/models/_cell2location_module.py:242](../../../cell2location/models/_cell2location_module.py#L242) — master-branch `forward()`. *Read when:* user asks why per-location `N_cells_per_location` doesn't seem to "stick" (answer: it's a soft prior in master).
+- `cell2location/models/_cell2location_module.py:898` on `origin/hires_sliding_window` (fetch via `git show origin/hires_sliding_window:cell2location/models/_cell2location_module.py | sed -n '898,1100p'`) — hires-branch `forward()`. *Read when:* user is on hires with segmentation and wants to understand exactly how `n_s_cells_per_location` becomes a deterministic multiplier on `w_sf`.
+- [../../../cell2location/models/_cell2location_WTA_module.py:247](../../../cell2location/models/_cell2location_WTA_module.py#L247) — WTA `forward()`. *Read when:* user is on Nanostring WTA/DSP and asks about negative-probe-binding or per-experiment gene calibration.
+- [../../../cell2location/models/reference/_reference_module.py:178](../../../cell2location/models/reference/_reference_module.py#L178) — reference `RegressionModel` `forward()`. *Read when:* user asks why `batch_key` vs `categorical_covariate_keys` matter (answer: `batch_key` drives `detection_mean_y_e` + `s_g_gene_add`; `categorical_covariate_keys` drive per-gene tech regression `detection_tech_gene_tg`).
+
+</reference>
diff --git a/.claude/skills/spatial-mapping/reference/c2l_supplement.pdf b/.claude/skills/spatial-mapping/reference/c2l_supplement.pdf
new file mode 100644
index 00000000..4e1e232f
Binary files /dev/null and b/.claude/skills/spatial-mapping/reference/c2l_supplement.pdf differ
diff --git a/.claude/skills/spatial-mapping/reference/fig_S1_workflow.png b/.claude/skills/spatial-mapping/reference/fig_S1_workflow.png
new file mode 100644
index 00000000..9905e4b3
Binary files /dev/null and b/.claude/skills/spatial-mapping/reference/fig_S1_workflow.png differ
diff --git a/.claude/skills/spatial-mapping/reference/fig_S27_hyperparameters.png b/.claude/skills/spatial-mapping/reference/fig_S27_hyperparameters.png
new file mode 100644
index 00000000..ed19c8f5
Binary files /dev/null and b/.claude/skills/spatial-mapping/reference/fig_S27_hyperparameters.png differ
diff --git a/.claude/skills/spatial-mapping/reference/hyperparameters_extract.md b/.claude/skills/spatial-mapping/reference/hyperparameters_extract.md
new file mode 100644
index 00000000..7862faba
--- /dev/null
+++ b/.claude/skills/spatial-mapping/reference/hyperparameters_extract.md
@@ -0,0 +1,172 @@
+# cell2location hyperparameters — paraphrase of supplementary methods §1.2–§1.4 + §2
+
+This file is the structured paraphrase of the cell2location 2022 paper's supplementary methods. It is the authoritative reference for hyperparameter choice in the [SKILL.md](../SKILL.md) operating manual. Full supplement is at [c2l_supplement.pdf](c2l_supplement.pdf).
+
+For human readers: every default below is from Kleshchevnikov 2022 supplement; the skill applies them verbatim unless a user override is recorded.
+
+---
+
+## §1.2 — Three data-dependent hyperparameters
+
+There are exactly **three** hyperparameters that depend on the user's data. The remaining priors keep their defaults (table below).
+
+### 1. `N̂` — Expected cell abundance per location (`N_cells_per_location`)
+
+Tissue-level global estimate informing the prior on cell-count per spot. **The single most important user-provided value**. See [Fig S27](fig_S27_hyperparameters.png) for the decision tree.
+
+Decision tree (verbatim from Fig S27):
+
+```
+Paired histology / DAPI image available?
+├── YES → Can you segment nuclei per location (e.g. CNN / DAPI segmentation)?
+│        ├── YES (advanced) → Provide per-location N̂_s, with N̂_s + 0.1 pseudocount;
+│        │                    v^n = 10 (high confidence in segmentation).
+│        └── NO  → Estimate per-tissue average by manually counting nuclei in
+│                  10–20 locations (10X Loupe browser). Single tissue-level N̂.
+└── NO  → Same-tissue histology (not paired) available?
+         ├── YES → Same manual-count procedure on similar tissue.
+         └── NO  → Use cell-size + capture-size formula:
+                   • 10X Visium (55 μm capture)    → N̂ ≈ 5
+                   • Slide-Seq V2 (10 μm bead)     → N̂ ≈ 1
+                   • Visium-HD (2/8/16 μm bin)     → scale by area
+                   • Cytassist Visium              → ≈ 5 (low confidence)
+                   • Nanostring WTA/DSP            → per-region segmentation
+```
+
+For all paper analyses, a single tissue-level estimate was used. The orange-highlighted path in Fig S27 (manual counting → single scalar) is applicable to most 10X Visium users.
+
+### 2. `α^y` — Regularisation of within-experiment RNA detection variation (`detection_alpha`)
+
+Controls the prior on per-location detection efficiency variability:
+
+```
+y_s ~ Gamma(α^y, α^y / y_e)
+```
+
+where `y_e` is the per-batch mean sensitivity (latent).
+
+Decision rule (Fig S27, lower flow):
+
+```
+Strong within-batch variation in total RNA count NOT explained by tissue containing more cells?
+├── YES → α^y = 20  (less strict regularisation; allows y_s to vary widely per spot)
+│         Common in:
+│           - FFPE Visium
+│           - Cytassist
+│           - Visium-HD
+│           - Older / degraded human samples
+│         Note: tissues with intrinsic regions of high cell density
+│         (hippocampus, gut lymphoid follicle) explain higher RNA by cell density,
+│         not by detection variability — α^y = 20 still appropriate.
+└── NO  → α^y = 200 (strict; y_s tightly clustered around y_e)
+          Common in:
+            - Fresh-frozen single-sample Visium
+            - High-quality single-batch experiments
+```
+
+If unsure: inspect the per-location `total_counts` distribution; if its 90/10 percentile ratio > 10× → use `α^y = 20`.
+
+### 3. `μ^y` — Expected detection sensitivity (auto-derived)
+
+NOT a user-set value. Computed automatically from data by the model wrapper:
+
+```
+μ^y = (Σ_s Σ_g d_{s,g} / S) / (N̂ × Σ_f Σ_g g_{f,g} / F)
+```
+
+Where `S` = total locations, `F` = total cell types, `d_{s,g}` = spatial counts, `g_{f,g}` = reference signatures. The skill does not surface this to the user.
+
+---
+
+## Other priors — keep at defaults (per §1.2)
+
+| Hyperparameter | Symbol | Default | What it controls |
+|---|---|---|---|
+| Cell types per location | `Â` = `A_factors_per_location` | **7** | Average # of cell types contributing per spot |
+| Tissue zones per location | `B̂` = `B_groups_per_location` | **7** | Average # of co-located cell-type compartments per spot |
+| Co-located cell type groups (R) | `n_groups` | **50** | Latent dimensionality of factorisation; model finds 3–7 substantive groups |
+| Co-abundance prior strength | `v^w` = `w_sf_mean_var_ratio` | **5** | Medium; increasing → over-smoothing; decreasing → cell types independent |
+| N̂ prior strength | `v^n` = `N_cells_mean_var_ratio` | **1** (global N̂) / **10** (per-location N̂_s) | How tightly N̂ constrains the model |
+
+Notes:
+- The model is **robust** to a range of `Â` and `B̂` (sensitivity analysis Fig S5). Default 7 is in the middle of the safe range.
+- The embryo workflow used `B̂ = 5`. This is a project-specific deviation; do not adopt it as a default for other tissues without a stated reason.
+- For laser-capture microscopy (LCM) or other methods with substantially varying location size, provide per-location `N̂_s` with `v^n = 10`.
+
+---
+
+## §1.3 — Inference
+
+- **Framework**: Pyro / ADVI (Automatic Differentiation Variational Inference). Mean-field Gaussian over softplus-transformed parameters.
+- **Optimiser**: ADAM, `lr = 0.002`.
+- **Iterations**: 20,000–50,000 typically. **Stop at ELBO plateau** (`mod.plot_history(5000)` — skip first 5k for visibility).
+- **ELBO oscillations late in training are expected and OK** — do not stop on first oscillation. Some parameters have intrinsic uncertainty; oscillations reflect that, not optimization failure. [GitHub issue #327](https://github.com/BayraktarLab/cell2location/issues/327).
+- **Posterior samples**: 1,000 from the variational distribution → mean, std, q05, q95.
+- The paper used `q05` for all figures (slightly more accurate than mean for absolute cell abundance).
+
+### Recommended QC after training
+
+1. **Posterior predictive check**: `log10(μ_{s,g} + 1)` vs `log10(d_{s,g} + 1)` — fit should be near-diagonal across the count range.
+2. **`y_s` consistency**: spatial distribution of estimated `y_s` should resemble total RNA count `Σ_g d_{s,g}` across locations; AND total cell abundance `Σ_f w_{s,f}` should be consistent with histology (areas with more cells → higher abundance).
+
+### Prior predictive check (optional)
+
+For new tissues/technologies, optionally sample from the prior (without observed data) to verify the priors give plausible synthetic counts before fitting. Implemented in pymc3 backend.
+
+---
+
+## §1.4 — Nuclei segmentation (optional, advanced)
+
+When per-location `N̂_s` is desired:
+
+1. H&E or DAPI image of the spatial tissue section.
+2. CNN ensemble (32 nets, Unet/FPN, trained on dsb2018) — [yozhikoff/segmentation](https://github.com/yozhikoff/segmentation).
+3. Predicted nuclei masks → kd-tree-assigned to spatial-array locations.
+4. Per-location nuclei count column in `adata.obs`.
+
+**Effectiveness caveat** (not in the paper, but documented in the skill): per-location N̂_s is only **deterministically effective** in the `hires_sliding_window` branch with both `use_proportion_factorisation_prior_on_w_sf=True` AND `use_n_s_cells_per_location_limit=True`. In master, N̂_s acts as a soft prior that the model can drift from. The embryo workflow's `n_cell_occupancy = occupancy × N_nuclei × scaling` formula corrects for partly-empty spots and is required for accurate per-location abundance. See [../SKILL.md](../SKILL.md) Phase 6 for branch selection.
+
+---
+
+## §2 — Reference signature estimation
+
+Two approved methods + a user-provided fallback:
+
+### Method 1 (recommended): NB regression via `RegressionModel`
+
+`cell2location.models.RegressionModel(adata, ...)` — full Bayesian regression accounting for batch + per-gene technology effects. Robust across multi-batch, multi-technology references.
+
+User inputs:
+- `labels_key` — cell-type column (drives `per_cluster_mu_fg`, the output signatures).
+- `batch_key` — batch column (drives per-batch detection `detection_mean_y_e` + additive background `s_g_gene_add`).
+- `categorical_covariate_keys` — list of additional categorical covariates (e.g. `donor`, `10x_kit`); drives per-gene per-tech regression `detection_tech_gene_tg`. **Use this for multi-technology references** (e.g. 10X v2 + v3 + Smart-seq); do NOT lump them all into `batch_key`.
+
+### Method 2 (Smart-seq 2 fallback): hard-coded cluster averages
+
+`cell2location.cluster_averages.get_cluster_averages(adata, labels_key)` — fast, no batch correction. Use when:
+- Reference is from one batch / one technology (minimal technical heterogeneity).
+- Smart-seq 2 — distributional assumptions of Method 1 (NB) don't fit Smart-seq 2 well.
+
+### Method 3: user-provided signatures
+
+Any `pd.DataFrame` with shape (genes × cell_types) in **linear count scale** (not log). The model's NB likelihood expects linear counts.
+
+### Gene-selection rule (Fig S1 panel)
+
+Two-cutoff selection at the start of reference estimation:
+1. Selecting genes detected at count > 0 in many cells (> 5% of cells), AND
+2. Selecting genes detected at count > 0 in a few cells (5% > cells > 10) but with mean expression across non-zero cells slightly > 1.0 (e.g. > 1.12).
+
+Defaults: `cell_count_cutoff=15`, `cell_percentage_cutoff2=0.03`, `nonz_mean_cutoff=1.12`. Adjust for very small or very sparse references.
+
+---
+
+## Joint multi-sample modelling — benefits
+
+Three reasons cell2location is run on all samples jointly rather than per-sample:
+
+1. **Detection-sensitivity normalisation across batches** — `y_e` hierarchical prior on `y_s` allows direct comparison of cell abundance across tissue sections with varying sequencing depth.
+2. **Sharing gene-technology effect `m_g`** — improves ability to distinguish low sensitivity from zero abundance.
+3. **Sharing co-location factorisation `x_{r,f}`** — increases sensitivity to which cell types co-locate across experiments.
+
+**Never** train separate models per sample to compare conditions. Use one joint model with `batch_key='sample'`.
diff --git a/.claude/skills/spatial-mapping/reference/issue_corpus.md b/.claude/skills/spatial-mapping/reference/issue_corpus.md
new file mode 100644
index 00000000..588e6b68
--- /dev/null
+++ b/.claude/skills/spatial-mapping/reference/issue_corpus.md
@@ -0,0 +1,352 @@
+# cell2location recurring-issue corpus — paraphrased guidance from the maintainer
+
+This file paraphrases the maintainer's (vitkl) recurring answers across the GitHub issue tracker. It is the canonical fallback the [spatial-mapping skill](../SKILL.md) and [cell2location-troubleshooting skill](../../cell2location-troubleshooting/SKILL.md) consult when a user reports a problem.
+
+**Source**: systematic harvest across [BayraktarLab/cell2location](https://github.com/BayraktarLab/cell2location/issues) and [vitkl/cell2location_paper](https://github.com/vitkl/cell2location_paper). Snapshot date: 2026-04-26. New issues since then are NOT here; an agent matching a symptom should also run `gh search issues -R BayraktarLab/cell2location "<keywords>"` as a freshness check.
+
+**Format**: each topic has: question, default/recommended, decision rule, why it matters, source issues.
+
+---
+
+## 1. Choosing `N_cells_per_location`
+
+**Question**: How many cells in each spatial location?
+
+**Default**: tissue-dependent; typically 5–30 cells per 10X Visium spot. For Visium-HD, similar but scale by bin size.
+
+**How to choose**:
+1. Count nuclei in histology (H&E or DAPI) across 10–20 regions at the same magnification as spots → average.
+2. Provide as a scalar (per-tissue) OR per-location array (`shape: n_obs × 1`) from segmentation.
+3. For early exploration: a rough estimate is fine; the model is robust to similar values.
+4. Without histology: cell-size + capture-size formula (Visium 55 μm → 5; Slide-seq V2 → 1).
+
+**Why it matters**: guides ABSOLUTE cell abundance estimation. Too low → underestimates; too high → overestimates. Essential for cross-section comparability.
+
+**Sources**: [#344](https://github.com/BayraktarLab/cell2location/issues/344), [#383](https://github.com/BayraktarLab/cell2location/issues/383), [#394](https://github.com/BayraktarLab/cell2location/issues/394), supplement eq. 7 + Fig S27.
+
+---
+
+## 2. Setting `detection_alpha`
+
+**Question**: How much does RNA detection sensitivity vary across locations within a batch?
+
+**Default**:
+- `detection_alpha = 200` — low within-slide variability (fresh-frozen single-sample Visium).
+- `detection_alpha = 20` — high within-slide variability (FFPE, Cytassist, Visium-HD, older/degraded, multi-batch).
+
+**How to choose**:
+1. Inspect total-count distribution per location: histogram or violin plot per sample.
+2. Uniform → `200`. Wide / many low-count regions → `20`.
+3. Per-batch dict allowed: `{sample_a: 20, sample_b: 200}`.
+
+**Why it matters**: controls per-location detection prior `y_s ~ Gamma(α, α/y_e)`. Wrong choice masks real technical effects (too high) or misattributes biology to noise (too low).
+
+**Sources**: [#327](https://github.com/BayraktarLab/cell2location/issues/327), [#344](https://github.com/BayraktarLab/cell2location/issues/344), [#381](https://github.com/BayraktarLab/cell2location/issues/381), README + supplement §1.2.
+
+---
+
+## 3. Managing large spatial datasets (>40k spots)
+
+**Question**: How to fit cell2location when data exceeds GPU memory?
+
+**Default**: stratified chunks of ~40–72k spots; train independent models per chunk; merge in post-processing.
+
+**How to choose**:
+1. Estimate memory: `~18k genes × n_locations × 80GB A100 rule`.
+2. If `n_obs > chunk_size_max`: split stratified-by-sample (random assignment across chunks; each chunk sees all samples).
+3. Train each chunk independently (parallel jobs).
+4. Merge: `adata_full.obsm[key][chunk_idx] = adata_chunk.obsm[key]`. Save chunk models as `adata.uns['mod_batch0']`, `mod_batch1`, etc.
+5. **Critical**: each chunk uses `batch_size=None` (full-batch). **Never** mini-batch.
+
+**Why it matters**:
+- Prevents OOM crashes.
+- Full-batch is dramatically more accurate AND faster in wall-clock than mini-batch (full-batch: 1–2h per chunk on A100; mini-batch: 10+ sec/iteration → days/weeks of training).
+
+**Sources**: [#356](https://github.com/BayraktarLab/cell2location/issues/356), [#375](https://github.com/BayraktarLab/cell2location/issues/375), [#365](https://github.com/BayraktarLab/cell2location/issues/365), [#372](https://github.com/BayraktarLab/cell2location/issues/372), [#404](https://github.com/BayraktarLab/cell2location/issues/404), [#421](https://github.com/BayraktarLab/cell2location/issues/421).
+
+---
+
+## 4. Training duration / `max_epochs`
+
+**Question**: How long to train?
+
+**Default**:
+- Moderate dataset (10k–40k locations): `max_epochs = 30000`.
+- Very large (chunked, >100k): may need fewer per chunk (5k–10k).
+- Very small (<5k): may need fewer (1k–5k).
+- Convergence is data-dependent — inspect ELBO.
+
+**How to choose**:
+1. `mod.plot_history(1000)` (skip first 1k for visibility).
+2. Stop when ELBO plateaus.
+3. **Late-stage oscillations are normal and DO NOT harm results**. Higher stochasticity = a few parameters with intrinsic uncertainty.
+4. Don't stop early on oscillations alone.
+5. For smaller/cleaner data: 500–2000 may suffice.
+
+**Why it matters**: oscillations reflect posterior uncertainty, not optimisation failure. Stopping early misses final parameter refinement.
+
+**Sources**: [#327](https://github.com/BayraktarLab/cell2location/issues/327), [#375](https://github.com/BayraktarLab/cell2location/issues/375), supplement §1.3.
+
+---
+
+## 5. Reference signatures: source and format
+
+**Question**: Where should reference signatures come from?
+
+**Default ranking**:
+- **Best**: scRNA-seq from same tissue (e.g. Allen Brain Atlas).
+- **Good**: scRNA-seq from similar tissue/organism.
+- **Poor**: MERFISH/imaging (lower genome coverage).
+- **Avoid**: bulk RNA-seq (too averaged).
+
+**Format**: pd.DataFrame, genes × cell_types, **linear count scale** (not log).
+
+**Cell-count requirements**:
+- Whole-tissue reference: ~40k cells.
+- Per cell type: 100s of cells; minimum 40–50 for rare types.
+- Genome-wide (not just marker genes).
+
+**Critical**: do NOT pre-normalize. The `RegressionModel` handles batch/library-size correction. Pre-normalisation breaks the NB likelihood.
+
+**Sources**: [#292](https://github.com/BayraktarLab/cell2location/issues/292), [#298](https://github.com/BayraktarLab/cell2location/issues/298), [#216](https://github.com/BayraktarLab/cell2location/issues/216), [#219](https://github.com/BayraktarLab/cell2location/issues/219), supplement §2.
+
+---
+
+## 6. Posterior sampling and abundance export
+
+**Question**: How to summarize the posterior for downstream analysis?
+
+**Default**: `use_quantiles=True` (NOW DEFAULT for large datasets).
+
+**Critical**: **NEVER use `num_samples=1000` for n_obs > 100k**. Creates a dense matrix of `n_cell_types × n_obs × 1000` → OOM.
+
+**How to invoke**:
+```python
+adata_vis = mod.export_posterior(
+    adata_vis,
+    use_quantiles=True,
+    add_to_obsm=['means', 'q05', 'q50', 'q95'],
+    sample_kwargs={
+        'batch_size': mod.adata.n_obs // 4,
+        'accelerator': 'gpu',
+        'return_observed': False,
+    },
+    exclude_vars=['data_target'],
+)
+```
+
+**Defaults**:
+- Point estimate: `means` (most commonly used).
+- Confidence: `q05` (lower), `q95` (upper). The paper used `q05` in all figures.
+- Median: `q50` for non-symmetric posteriors.
+
+**Sources**: [#278](https://github.com/BayraktarLab/cell2location/issues/278), [#281](https://github.com/BayraktarLab/cell2location/issues/281), [#360](https://github.com/BayraktarLab/cell2location/issues/360).
+
+---
+
+## 7. Comparing samples / batch effects
+
+**Question**: How to compare cell abundance between two conditions?
+
+**Default**: train ONE joint model with ALL samples concatenated; use `batch_key='sample'` in `setup_anndata`.
+
+**How to choose**:
+1. Concatenate into one AnnData; train one model.
+2. **Never** train separate models per sample for comparison — each model estimates detection differently → abundance estimates not comparable.
+3. For statistical testing: use quantiles from the single joint model.
+
+**Why it matters**:
+- Separate models break comparability.
+- Joint model shares prior on batch effects → meaningful comparisons.
+- Detection variation across batches is REAL and must be jointly accounted for.
+
+**Sources**: [#389](https://github.com/BayraktarLab/cell2location/issues/389), [#396](https://github.com/BayraktarLab/cell2location/issues/396).
+
+---
+
+## 8. Input data normalization
+
+**Question**: Should I normalize/log-transform before cell2location?
+
+**Default**: **NO**. Raw counts only. Cell2location uses NB likelihood, which requires linear counts.
+
+**How to choose**:
+1. Start with raw filtered count matrices (e.g. from CellRanger / SpaceRanger).
+2. Remove cells/spots with <100 UMIs, genes with <5 cells expressing them.
+3. Keep raw counts.
+4. For reference (RegressionModel): include `batch_key`; the model estimates batch effects automatically.
+5. **Do NOT**: log-transform, CPM/TPM normalize, library-size scale.
+
+**Sources**: [#386](https://github.com/BayraktarLab/cell2location/issues/386), supplement §2 + tutorial.
+
+---
+
+## 9. Handling very low count areas
+
+**Question**: Exclude spots with very low RNA?
+
+**Default**: yes — exclude `total_counts < 1000` (or dataset-median / 3).
+
+**How to choose**:
+1. `adata.obs['total_counts'] = adata.X.sum(1)`.
+2. Histogram. If 50–70% of locations <1000 → consider replicating with higher-quality tissue.
+3. If only outlier 5–10% locations low → filter out.
+
+**Why it matters**: low RNA areas cannot distinguish cell types reliably. Including them adds noise without biological signal.
+
+**Sources**: [#344](https://github.com/BayraktarLab/cell2location/issues/344), [#372](https://github.com/BayraktarLab/cell2location/issues/372).
+
+---
+
+## 10. Cell type granularity / reference annotation level
+
+**Question**: How many cell types in the reference?
+
+**Default**: 20–50 well-characterised cell types for comprehensive mapping.
+
+**Anti-pattern — the "3-fingered glove on a 5-fingered hand"**:
+- Too few types (e.g. 7) → all types appear everywhere; model has nowhere to put data not explained by the (too coarse) signatures.
+- Rare types with <40–50 reference cells → noisy signatures → poor mapping.
+
+**How to choose**:
+1. Use biologically motivated subtypes.
+2. Ensure 40–50 cells minimum per type.
+3. If abundance maps don't match biology → may indicate insufficient granularity.
+
+**Sources**: [#395](https://github.com/BayraktarLab/cell2location/issues/395).
+
+---
+
+## 11. Merging multiple trained models (chunked datasets)
+
+**Question**: How to combine results from chunk-trained models?
+
+**Default**: concatenate; copy `obsm` keys back to full adata at chunk indices.
+
+**How to invoke**:
+```python
+# Create full-size adata with zeros for obsm
+adata_full = sc.read_h5ad(original_path)
+for key in ['means', 'q05', 'q50', 'q95']:
+    adata_full.obsm[key] = np.zeros((adata_full.n_obs, n_cell_types))
+
+# Copy chunk results back
+for i, chunk_path in enumerate(chunk_paths):
+    adata_chunk = sc.read_h5ad(chunk_path)
+    idx = adata_full.obs.index.get_indexer(adata_chunk.obs.index)
+    for key in ['means', 'q05', 'q50', 'q95']:
+        adata_full.obsm[key][idx] = adata_chunk.obsm[key]
+    # Save chunk model
+    adata_full.uns[f'mod_batch{i}'] = adata_chunk.uns['mod']
+```
+
+**Don't**: try to merge `adata.uns['mod']` directly — model parameters are chunk-specific (location-dependent params like `w_sf`, `y_s`).
+
+**Sources**: [#356](https://github.com/BayraktarLab/cell2location/issues/356), [#375](https://github.com/BayraktarLab/cell2location/issues/375).
+
+---
+
+## 12. RegressionModel — reference signature estimation
+
+**Question**: How to estimate batch-corrected signatures from scRNA?
+
+**Default**: `cell2location.models.RegressionModel`.
+
+**How to invoke**:
+```python
+RegressionModel.setup_anndata(adata_ref, batch_key='sample', labels_key='cell_type')
+# For multi-technology references, add:
+RegressionModel.setup_anndata(adata_ref, batch_key='sample', labels_key='cell_type',
+                                categorical_covariate_keys=['10x_kit', 'donor'])
+mod_ref = RegressionModel(adata_ref)
+mod_ref.train(max_epochs=300, accelerator='gpu')
+inf_aver = mod_ref.export_posterior(adata_ref, select_quantiles=[0.5])
+# Extract signatures (linear scale, batch-corrected)
+signatures = mod_ref.adata.var[[c for c in mod_ref.adata.var.columns if 'q05_' in c]]
+```
+
+**Critical**: `batch_key` removes coarse batch effect; `categorical_covariate_keys` remove per-gene tech effects. Don't lump all tech covariates into `batch_key`.
+
+**Sources**: [#216](https://github.com/BayraktarLab/cell2location/issues/216), [#219](https://github.com/BayraktarLab/cell2location/issues/219).
+
+---
+
+## 13. VisiumHD, Cytassist, and non-standard technologies
+
+**Question**: Different settings for newer 10X Visium variants?
+
+**Default**: `detection_alpha = 20` (less strict) for all of these.
+
+**Why**: these technologies have higher technical variability than fresh-frozen Visium.
+
+**Per-technology guidance**:
+- **VisiumHD**: estimate `N_cells_per_location` from bin size (2/8/16 μm) and known cell density. Mandatory chunking (150k–650k bins per slide).
+- **Cytassist**: high technical variability → `detection_alpha = 20`.
+- **FFPE**: same as Cytassist.
+- **Xenium, seqFISH+**: similar guidance; inspect total count distribution.
+
+**Sources**: [#356](https://github.com/BayraktarLab/cell2location/issues/356), [#401](https://github.com/BayraktarLab/cell2location/issues/401), README.
+
+---
+
+## 14. Saving and loading models
+
+**Question**: How to save / reload a trained model?
+
+**Default**:
+- Save: `mod.save('model_name', overwrite=True)`.
+- Load: `mod = Cell2location.load('model_name', adata_vis)`.
+
+**Critical caveat**: after `load()`, run `mod.train(max_epochs=1)` to instantiate Pyro parameters. **This is expected behaviour**, not a bug — Pyro params are lazily initialised on first training step. The single epoch is discarded.
+
+**Save the adata separately**: `adata_vis.write('adata.h5ad')` — preserves abundance estimates in obsm.
+
+**Sources**: [#365](https://github.com/BayraktarLab/cell2location/issues/365), [#402](https://github.com/BayraktarLab/cell2location/issues/402), [#404](https://github.com/BayraktarLab/cell2location/issues/404), [#421](https://github.com/BayraktarLab/cell2location/issues/421).
+
+---
+
+## 15. Expected gene expression per cell type (NCEM downstream)
+
+**Question**: Can cell2location output per-cell-type expression per location for downstream analysis?
+
+**Default**: yes, via `mod.compute_expected_nb_param_m_s_g()` or similar (version-dependent).
+
+**For chunked models**: run per-chunk; concatenate as with abundance.
+
+**Use case**: NCEM (neighborhood cell-type-conditioned expression), NMF colocation, custom downstream methods.
+
+**Memory caveat**: per-cell-type per-gene per-location is a 3D tensor; large for big datasets. Compute per-chunk.
+
+**Sources**: [#375](https://github.com/BayraktarLab/cell2location/issues/375), tutorial.
+
+---
+
+## 16. Amortised inference (future / experimental)
+
+**Question**: Can I do fast inference on new data without retraining?
+
+**Default — current state**: no. Full inference required per dataset; no out-of-sample prediction.
+
+**Roadmap**: amortised (VAE-style) inference for 100k–1M+ locations in development. Experimental JAX backend.
+
+**Sources**: README "Future development", experimental JAX branch.
+
+---
+
+## 17. ELBO is oscillating — should I stop?
+
+**Quick answer**: **NO**. Late-stage ELBO oscillations are expected. Train to full `max_epochs`.
+
+See topic #4 (Training duration).
+
+---
+
+## Anti-patterns the skill REFUSES (no exceptions)
+
+These are baked-in refusals in the [SKILL.md](../SKILL.md):
+
+1. Mini-batch training for spatial mapping (`batch_size != None`). Less accurate AND slower wall-clock than full-batch. [#356](https://github.com/BayraktarLab/cell2location/issues/356), supplement §1.3.
+2. Log-transforming counts before cell2location. Breaks NB likelihood. [#386](https://github.com/BayraktarLab/cell2location/issues/386).
+3. `num_samples=1000` on `n_obs > 100k` without `use_quantiles=True`. OOM. [#278](https://github.com/BayraktarLab/cell2location/issues/278).
+4. Running the tutorial notebook on real-sized data.
+5. Training separate models per sample to compare conditions. Use joint model with `batch_key`. [#389](https://github.com/BayraktarLab/cell2location/issues/389).
diff --git a/.claude/skills/spatial-mapping/templates/bsub.sh b/.claude/skills/spatial-mapping/templates/bsub.sh
new file mode 100755
index 00000000..29e8d272
--- /dev/null
+++ b/.claude/skills/spatial-mapping/templates/bsub.sh
@@ -0,0 +1,63 @@
+#!/usr/bin/env bash
+# LSF launcher for cell2location step2_spatial_mapping.ipynb
+# Based on the cell2state_embryo papermill workflow.
+#
+# Usage:
+#   bash bsub.sh <training_batch> <seed> <max_epochs> <signatures_csv> <spatial_h5ad> <output_name> [<n_chunks>]
+#
+# Example (1-chunk run on 80GB A100):
+#   bash bsub.sh 0 0 30000 /path/to/signatures.csv /path/to/spatial.h5ad my_run
+#
+# Example (4-chunk run, submit one per chunk):
+#   for i in 0 1 2 3; do bash bsub.sh $i 0 30000 sig.csv spatial.h5ad my_run 4; done
+
+set -euo pipefail
+
+# === CONFIG (edit for your cluster) ===
+QUEUE="${C2L_QUEUE:-gpu-normal}"
+MEM_MB="${C2L_MEM_MB:-100000}"          # 100 GB host RAM
+GPU_MEM_MB="${C2L_GPU_MEM_MB:-80000}"   # 80 GB GPU
+NCPU="${C2L_NCPU:-4}"
+
+# === PARSE ARGS ===
+training_batch="${1:?usage: bash bsub.sh <training_batch> <seed> <max_epochs> <signatures_csv> <spatial_h5ad> <output_name> [<n_chunks>]}"
+seed="${2:?seed}"
+max_epochs="${3:?max_epochs}"
+signatures_csv="${4:?signatures_csv}"
+spatial_h5ad="${5:?spatial_h5ad}"
+output_name="${6:?output_name}"
+n_chunks="${7:-1}"
+
+# === BUILD PAPERMILL COMMAND ===
+template="$(dirname "$0")/step2_spatial_mapping.ipynb"
+output_dir="${C2L_OUTPUT_DIR:-./spatial_mapping_output}"
+output_nb="${output_dir}/${output_name}_chunk${training_batch}.ipynb"
+mkdir -p "$output_dir"
+
+pmcmd="papermill ${template} ${output_nb} \
+    -p training_batch ${training_batch} \
+    -p n_chunks ${n_chunks} \
+    -p seed ${seed} \
+    -p max_epochs ${max_epochs} \
+    -p signatures_csv ${signatures_csv} \
+    -p spatial_h5ad_path ${spatial_h5ad} \
+    -p output_dir ${output_dir} \
+    -p output_name ${output_name}"
+
+echo "Submitting:"
+echo "  training_batch=${training_batch} / ${n_chunks}"
+echo "  seed=${seed}"
+echo "  max_epochs=${max_epochs}"
+echo "  signatures_csv=${signatures_csv}"
+echo "  spatial_h5ad=${spatial_h5ad}"
+echo "  output_dir=${output_dir}"
+echo "  output_name=${output_name}"
+echo ""
+echo "Command: ${pmcmd}"
+
+# === LSF SUBMISSION ===
+bsub -q "${QUEUE}" -n${NCPU} -M${MEM_MB} \
+    -R"select[mem>${MEM_MB}] rusage[mem=${MEM_MB}] span[hosts=1]" \
+    -gpu "mode=shared:j_exclusive=yes:gmem=${GPU_MEM_MB}:num=1" \
+    -e "${output_dir}/%J.gpu.err" -o "${output_dir}/%J.gpu.out" \
+    ${pmcmd}
diff --git a/.claude/skills/spatial-mapping/templates/data/download_mouse_brain.py b/.claude/skills/spatial-mapping/templates/data/download_mouse_brain.py
new file mode 100644
index 00000000..74825c7e
--- /dev/null
+++ b/.claude/skills/spatial-mapping/templates/data/download_mouse_brain.py
@@ -0,0 +1,134 @@
+#!/usr/bin/env python3
+"""Download the published mouse-brain dataset from Kleshchevnikov et al. 2022.
+
+This is the canonical cell2location demo dataset: 5 Visium sections + paired
+single-nucleus RNA-seq reference from `tutorial/mouse_brain_*` on the
+Sanger cell2location object store.
+
+Use this dataset to validate your environment / pipeline against published
+results before applying cell2location to your own data.
+
+Usage:
+    python download_mouse_brain.py --output-dir ./data
+    python download_mouse_brain.py --output-dir ./data --components snrna
+    python download_mouse_brain.py --output-dir ./data --components visium
+"""
+import argparse
+import sys
+import urllib.request
+from pathlib import Path
+
+BASE = "https://cell2location.cog.sanger.ac.uk/tutorial"
+
+VISIUM_SAMPLES = [
+    "ST8059048",  # Visium-28C
+    "ST8059049",  # Visium-28D
+    "ST8059050",  # Visium-28E
+    "ST8059051",  # Visium-29B
+    "ST8059052",  # Visium-29C
+]
+
+VISIUM_PER_SAMPLE_FILES = [
+    "filtered_feature_bc_matrix.h5",
+    "filtered_feature_bc_matrix/barcodes.tsv.gz",
+    "filtered_feature_bc_matrix/features.tsv.gz",
+    "filtered_feature_bc_matrix/matrix.mtx.gz",
+    "spatial/tissue_lowres_image.png",
+    "spatial/tissue_hires_image.png",
+    "spatial/scalefactors_json.json",
+    "spatial/tissue_positions_list.csv",
+]
+
+SNRNA_FILES = [
+    ("mouse_brain_snrna/all_cells_20200625.h5ad", "snrna/all_cells.h5ad"),
+    (
+        "mouse_brain_snrna/regression_model/"
+        "RegressionGeneBackgroundCoverageTorch_65covariates_40532cells_12819genes/sc.h5ad",
+        "snrna/sc.h5ad",
+    ),
+    (
+        "mouse_brain_snrna/snRNA_annotation_astro_subtypes_refined59_20200823.csv",
+        "snrna/annotation.csv",
+    ),
+]
+
+INDEX_FILES = [
+    ("mouse_brain_visium_data/Visium_mouse.csv", "visium/Visium_mouse.csv"),
+]
+
+
+def download_one(url: str, dest: Path) -> None:
+    dest.parent.mkdir(parents=True, exist_ok=True)
+    if dest.exists() and dest.stat().st_size > 0:
+        print(f"  exists: {dest.relative_to(dest.parent.parent.parent)} ({dest.stat().st_size:,} bytes)")
+        return
+    print(f"  downloading: {url}")
+    try:
+        urllib.request.urlretrieve(url, dest)
+    except Exception as e:
+        raise RuntimeError(f"failed to download {url}: {e}") from e
+    print(f"    wrote {dest.relative_to(dest.parent.parent.parent)} ({dest.stat().st_size:,} bytes)")
+
+
+def download_visium(output_dir: Path) -> None:
+    print(f"Downloading Visium spatial data ({len(VISIUM_SAMPLES)} sections)...")
+    for src, dst in INDEX_FILES:
+        download_one(f"{BASE}/{src}", output_dir / dst)
+    for sample in VISIUM_SAMPLES:
+        print(f"  sample: {sample}")
+        for f in VISIUM_PER_SAMPLE_FILES:
+            url = f"{BASE}/mouse_brain_visium_data/rawdata/{sample}/{f}"
+            dest = output_dir / "visium" / "rawdata" / sample / f
+            try:
+                download_one(url, dest)
+            except RuntimeError as e:
+                # Some auxiliary spatial files are optional; warn but continue.
+                if "spatial/" in f:
+                    print(f"    SKIPPED ({e})")
+                else:
+                    raise
+
+
+def download_snrna(output_dir: Path) -> None:
+    print("Downloading snRNA-seq reference data...")
+    for src, dst in SNRNA_FILES:
+        download_one(f"{BASE}/{src}", output_dir / dst)
+
+
+def main() -> int:
+    p = argparse.ArgumentParser(description=__doc__)
+    p.add_argument(
+        "--output-dir",
+        type=Path,
+        default=Path(__file__).resolve().parent,
+        help="Directory to download into (default: this script's directory).",
+    )
+    p.add_argument(
+        "--components",
+        choices=["all", "visium", "snrna"],
+        default="all",
+        help="Which components to download (default: all).",
+    )
+    args = p.parse_args()
+    out = args.output_dir.resolve()
+    out.mkdir(parents=True, exist_ok=True)
+
+    if args.components in ("all", "snrna"):
+        download_snrna(out)
+    if args.components in ("all", "visium"):
+        download_visium(out)
+
+    print("")
+    print("Downloaded to:", out)
+    print("")
+    print("Usage from the spatial-mapping skill:")
+    print(f"  spatial_h5ad_path = (build adata from {out / 'visium/rawdata/'} via sc.read_visium)")
+    print(f"  ref_h5ad_path     = '{out / 'snrna/all_cells.h5ad'}'")
+    print("  signatures_csv    = (run step1_reference_signatures.ipynb on the reference)")
+    print("")
+    print("Reference paper: Kleshchevnikov et al. 2022. doi:10.1038/s41587-021-01139-4")
+    return 0
+
+
+if __name__ == "__main__":
+    sys.exit(main())
diff --git a/.claude/skills/spatial-mapping/templates/run_local.sh b/.claude/skills/spatial-mapping/templates/run_local.sh
new file mode 100755
index 00000000..2d2093fd
--- /dev/null
+++ b/.claude/skills/spatial-mapping/templates/run_local.sh
@@ -0,0 +1,42 @@
+#!/usr/bin/env bash
+# Local single-GPU launcher for cell2location step2_spatial_mapping.ipynb
+#
+# Usage:
+#   bash run_local.sh <training_batch> <seed> <max_epochs> <signatures_csv> <spatial_h5ad> <output_name> [<n_chunks>]
+#
+# For 1-chunk runs (default):
+#   bash run_local.sh 0 0 30000 /path/to/signatures.csv /path/to/spatial.h5ad my_run
+
+set -euo pipefail
+
+# === CONFIG ===
+export CUDA_VISIBLE_DEVICES="${CUDA_VISIBLE_DEVICES:-0}"
+
+# === PARSE ARGS ===
+training_batch="${1:?usage: bash run_local.sh <training_batch> <seed> <max_epochs> <signatures_csv> <spatial_h5ad> <output_name> [<n_chunks>]}"
+seed="${2:?seed}"
+max_epochs="${3:?max_epochs}"
+signatures_csv="${4:?signatures_csv}"
+spatial_h5ad="${5:?spatial_h5ad}"
+output_name="${6:?output_name}"
+n_chunks="${7:-1}"
+
+template="$(cd "$(dirname "$0")" && pwd)/step2_spatial_mapping.ipynb"
+output_dir="${C2L_OUTPUT_DIR:-./spatial_mapping_output}"
+output_nb="${output_dir}/${output_name}_chunk${training_batch}.ipynb"
+mkdir -p "$output_dir"
+
+echo "Running locally (GPU=${CUDA_VISIBLE_DEVICES}):"
+echo "  template:       ${template}"
+echo "  output:         ${output_nb}"
+echo "  training_batch: ${training_batch} / ${n_chunks}"
+
+papermill "${template}" "${output_nb}" \
+    -p training_batch "${training_batch}" \
+    -p n_chunks "${n_chunks}" \
+    -p seed "${seed}" \
+    -p max_epochs "${max_epochs}" \
+    -p signatures_csv "${signatures_csv}" \
+    -p spatial_h5ad_path "${spatial_h5ad}" \
+    -p output_dir "${output_dir}" \
+    -p output_name "${output_name}"
diff --git a/.claude/skills/spatial-mapping/templates/sbatch.sh b/.claude/skills/spatial-mapping/templates/sbatch.sh
new file mode 100755
index 00000000..bf5ef708
--- /dev/null
+++ b/.claude/skills/spatial-mapping/templates/sbatch.sh
@@ -0,0 +1,60 @@
+#!/usr/bin/env bash
+# Slurm launcher for cell2location step2_spatial_mapping.ipynb
+#
+# Usage:
+#   bash sbatch.sh <training_batch> <seed> <max_epochs> <signatures_csv> <spatial_h5ad> <output_name> [<n_chunks>]
+#
+# Example (1-chunk run):
+#   bash sbatch.sh 0 0 30000 /path/to/signatures.csv /path/to/spatial.h5ad my_run
+#
+# Example (4-chunk run, submit one job array):
+#   for i in 0 1 2 3; do bash sbatch.sh $i 0 30000 sig.csv spatial.h5ad my_run 4; done
+
+set -euo pipefail
+
+# === CONFIG (edit for your cluster) ===
+PARTITION="${C2L_PARTITION:-gpu}"
+MEM="${C2L_MEM:-100G}"
+GPU_MEM_GB="${C2L_GPU_MEM_GB:-80}"
+NCPU="${C2L_NCPU:-4}"
+TIME="${C2L_TIME:-08:00:00}"          # 8h default; bump for large datasets
+GRES="${C2L_GRES:-gpu:1}"              # use `gpu:a100:1` etc. on partitioned clusters
+
+# === PARSE ARGS ===
+training_batch="${1:?usage: bash sbatch.sh <training_batch> <seed> <max_epochs> <signatures_csv> <spatial_h5ad> <output_name> [<n_chunks>]}"
+seed="${2:?seed}"
+max_epochs="${3:?max_epochs}"
+signatures_csv="${4:?signatures_csv}"
+spatial_h5ad="${5:?spatial_h5ad}"
+output_name="${6:?output_name}"
+n_chunks="${7:-1}"
+
+# === BUILD PAPERMILL COMMAND ===
+template="$(cd "$(dirname "$0")" && pwd)/step2_spatial_mapping.ipynb"
+output_dir="${C2L_OUTPUT_DIR:-./spatial_mapping_output}"
+output_nb="${output_dir}/${output_name}_chunk${training_batch}.ipynb"
+mkdir -p "$output_dir"
+
+# === SLURM SUBMISSION (heredoc with sbatch directives) ===
+sbatch <<EOF
+#!/usr/bin/env bash
+#SBATCH --job-name=c2l_${output_name}_chunk${training_batch}
+#SBATCH --partition=${PARTITION}
+#SBATCH --cpus-per-task=${NCPU}
+#SBATCH --mem=${MEM}
+#SBATCH --gres=${GRES}
+#SBATCH --time=${TIME}
+#SBATCH --output=${output_dir}/%j.gpu.out
+#SBATCH --error=${output_dir}/%j.gpu.err
+
+set -euo pipefail
+papermill ${template} ${output_nb} \\
+    -p training_batch ${training_batch} \\
+    -p n_chunks ${n_chunks} \\
+    -p seed ${seed} \\
+    -p max_epochs ${max_epochs} \\
+    -p signatures_csv ${signatures_csv} \\
+    -p spatial_h5ad_path ${spatial_h5ad} \\
+    -p output_dir ${output_dir} \\
+    -p output_name ${output_name}
+EOF
diff --git a/.claude/skills/spatial-mapping/templates/step1_reference_signatures.ipynb b/.claude/skills/spatial-mapping/templates/step1_reference_signatures.ipynb
new file mode 100644
index 00000000..e076ce78
--- /dev/null
+++ b/.claude/skills/spatial-mapping/templates/step1_reference_signatures.ipynb
@@ -0,0 +1,219 @@
+{
+ "cells": [
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "# cell2location \u2014 Step 1: estimate reference cell-type signatures\n",
+    "\n",
+    "This notebook is generated from the cell2location `spatial-mapping` skill.\n",
+    "See [../SKILL.md](../SKILL.md) for the full workflow.\n",
+    "\n",
+    "It runs `cell2location.models.RegressionModel` on an scRNA-seq reference AnnData\n",
+    "to produce a `signatures.csv` (genes x cell_types, batch-corrected, linear scale)\n",
+    "consumed by step 2 ([step2_spatial_mapping.ipynb](step2_spatial_mapping.ipynb)).\n",
+    "\n",
+    "Inputs are configured via the papermill parameters cell below. See\n",
+    "[../reference/hyperparameters_extract.md \u00a72](../reference/hyperparameters_extract.md)\n",
+    "for the gene-filter and signature-estimation rules.\n"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "metadata": {
+    "tags": [
+     "parameters"
+    ]
+   },
+   "source": [
+    "# === PARAMETERS (papermill) ===\n",
+    "ref_h5ad_path = \"\"                              # path to scRNA reference AnnData\n",
+    "batch_key = \"sample\"                            # column for batch (drives detection_mean_y_e + s_g_gene_add)\n",
+    "labels_key = \"cell_type\"                        # column for cell-type labels (drives per_cluster_mu_fg)\n",
+    "categorical_covariate_keys = []                 # extra categorical covariates (e.g. ['10x_kit', 'donor'])\n",
+    "                                                # drive per-gene tech regression detection_tech_gene_tg.\n",
+    "                                                # Use this for multi-technology references.\n",
+    "continuous_covariate_keys = []                  # rare\n",
+    "gene_filter_cell_count_cutoff = 15\n",
+    "gene_filter_cell_percentage_cutoff2 = 0.03\n",
+    "gene_filter_nonz_mean_cutoff = 1.12\n",
+    "max_epochs = None                               # None -> auto: min(round(20000/n_cells * 400), 400)\n",
+    "output_dir = \"./signatures_output\"\n",
+    "output_name = \"ref_signatures\"\n"
+   ],
+   "execution_count": null,
+   "outputs": []
+  },
+  {
+   "cell_type": "code",
+   "metadata": {},
+   "source": [
+    "import os\n",
+    "import numpy as np\n",
+    "import pandas as pd\n",
+    "import scanpy as sc\n",
+    "import anndata\n",
+    "import cell2location\n",
+    "from cell2location.utils.filtering import filter_genes\n",
+    "from cell2location.models import RegressionModel\n"
+   ],
+   "execution_count": null,
+   "outputs": []
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## Load reference data"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "metadata": {},
+   "source": [
+    "adata_ref = sc.read_h5ad(ref_h5ad_path)\n",
+    "print(f\"Loaded reference: {adata_ref.n_obs} cells, {adata_ref.n_vars} genes\")\n",
+    "print(f\"labels_key='{labels_key}' has {adata_ref.obs[labels_key].nunique()} unique values\")\n"
+   ],
+   "execution_count": null,
+   "outputs": []
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## Gene filtering (supplement Fig S1 rule)"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "metadata": {},
+   "source": [
+    "# Two-cutoff gene selection (Fig S1):\n",
+    "#   1. genes with count > 0 in >5% of cells, OR\n",
+    "#   2. genes with count > 0 in 5% > cells > 10 cells AND mean expression > 1.12\n",
+    "selected = filter_genes(\n",
+    "    adata_ref,\n",
+    "    cell_count_cutoff=gene_filter_cell_count_cutoff,\n",
+    "    cell_percentage_cutoff2=gene_filter_cell_percentage_cutoff2,\n",
+    "    nonz_mean_cutoff=gene_filter_nonz_mean_cutoff,\n",
+    ")\n",
+    "adata_ref = adata_ref[:, selected].copy()\n",
+    "print(f\"After filtering: {adata_ref.n_vars} genes\")\n"
+   ],
+   "execution_count": null,
+   "outputs": []
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## Register data + train RegressionModel"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "metadata": {},
+   "source": [
+    "# Register data with cell2location's reference model.\n",
+    "# - batch_key drives per-batch detection (detection_mean_y_e) and additive\n",
+    "#   background (s_g_gene_add).\n",
+    "# - categorical_covariate_keys drive per-gene technology regression\n",
+    "#   (detection_tech_gene_tg). Use this for multi-technology references\n",
+    "#   (e.g. 10x v2 + v3 + Smart-seq); do not lump tech into batch_key.\n",
+    "RegressionModel.setup_anndata(\n",
+    "    adata=adata_ref,\n",
+    "    batch_key=batch_key,\n",
+    "    labels_key=labels_key,\n",
+    "    categorical_covariate_keys=categorical_covariate_keys if categorical_covariate_keys else None,\n",
+    "    continuous_covariate_keys=continuous_covariate_keys if continuous_covariate_keys else None,\n",
+    ")\n"
+   ],
+   "execution_count": null,
+   "outputs": []
+  },
+  {
+   "cell_type": "code",
+   "metadata": {},
+   "source": [
+    "# Auto max_epochs if not set: min(round(20000/n_cells * 400), 400)\n",
+    "if max_epochs is None:\n",
+    "    max_epochs = min(round(20000 / adata_ref.n_obs * 400), 400)\n",
+    "print(f\"Training for {max_epochs} epochs\")\n",
+    "\n",
+    "mod_ref = RegressionModel(adata_ref)\n",
+    "mod_ref.view_anndata_setup()\n",
+    "mod_ref.train(max_epochs=max_epochs, accelerator='gpu', train_size=1, batch_size=2500)\n"
+   ],
+   "execution_count": null,
+   "outputs": []
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## Export posterior + save signatures"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "metadata": {},
+   "source": [
+    "# Export posterior\n",
+    "adata_ref = mod_ref.export_posterior(\n",
+    "    adata_ref,\n",
+    "    sample_kwargs={'num_samples': 1000, 'batch_size': 2500, 'accelerator': 'gpu'},\n",
+    ")\n",
+    "\n",
+    "# QC: ELBO history\n",
+    "mod_ref.plot_history(20)\n"
+   ],
+   "execution_count": null,
+   "outputs": []
+  },
+  {
+   "cell_type": "code",
+   "metadata": {},
+   "source": [
+    "# Save model + signatures CSV\n",
+    "run_dir = os.path.join(output_dir, output_name)\n",
+    "os.makedirs(run_dir, exist_ok=True)\n",
+    "mod_ref.save(run_dir, overwrite=True)\n",
+    "adata_ref.write(os.path.join(run_dir, \"ref_signatures.h5ad\"))\n",
+    "\n",
+    "# Extract per-cell-type signature means (linear scale, batch-corrected)\n",
+    "# inf_aver is the input to step2's Cell2location model.\n",
+    "if 'means_per_cluster_mu_fg' in adata_ref.varm.keys():\n",
+    "    inf_aver = adata_ref.varm['means_per_cluster_mu_fg'][[\n",
+    "        f'means_per_cluster_mu_fg_{i}' for i in adata_ref.uns['mod']['factor_names']\n",
+    "    ]].copy()\n",
+    "else:\n",
+    "    inf_aver = adata_ref.var[[\n",
+    "        f'means_per_cluster_mu_fg_{i}' for i in adata_ref.uns['mod']['factor_names']\n",
+    "    ]].copy()\n",
+    "inf_aver.columns = adata_ref.uns['mod']['factor_names']\n",
+    "\n",
+    "signatures_csv = os.path.join(run_dir, \"signatures.csv\")\n",
+    "inf_aver.to_csv(signatures_csv)\n",
+    "print(f\"Wrote signatures to {signatures_csv}: {inf_aver.shape}\")\n"
+   ],
+   "execution_count": null,
+   "outputs": []
+  }
+ ],
+ "metadata": {
+  "kernelspec": {
+   "display_name": "Python 3",
+   "language": "python",
+   "name": "python3"
+  },
+  "language_info": {
+   "name": "python",
+   "version": "3.10",
+   "mimetype": "text/x-python",
+   "file_extension": ".py"
+  }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 5
+}
\ No newline at end of file
diff --git a/.claude/skills/spatial-mapping/templates/step2_aggregate_chunks.ipynb b/.claude/skills/spatial-mapping/templates/step2_aggregate_chunks.ipynb
new file mode 100644
index 00000000..d7de9a5f
--- /dev/null
+++ b/.claude/skills/spatial-mapping/templates/step2_aggregate_chunks.ipynb
@@ -0,0 +1,117 @@
+{
+ "cells": [
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "# cell2location \u2014 Step 2 aggregation: combine chunked outputs\n",
+    "\n",
+    "Combine per-chunk results from `step2_spatial_mapping.ipynb` runs into a\n",
+    "single AnnData. Only needed if `n_chunks > 1`; otherwise skip.\n",
+    "\n",
+    "The aggregation pattern follows the cell2state_embryo workflow:\n",
+    "- load the original full spatial AnnData;\n",
+    "- copy each chunk's `obsm` keys back at the chunk's indices;\n",
+    "- optionally save chunk models under `uns['mod_batch{i}']`.\n",
+    "\n",
+    "See [issue #356](https://github.com/BayraktarLab/cell2location/issues/356) and\n",
+    "[issue #375](https://github.com/BayraktarLab/cell2location/issues/375) for the\n",
+    "rationale.\n"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "metadata": {
+    "tags": [
+     "parameters"
+    ]
+   },
+   "source": [
+    "# === PARAMETERS (papermill) ===\n",
+    "spatial_h5ad_path = \"\"              # original full spatial AnnData (BEFORE chunking)\n",
+    "chunk_h5ad_glob = \"\"                # glob pattern, e.g. \"./output/c2l_run_chunk*/sp.h5ad\"\n",
+    "n_chunks = 1\n",
+    "output_path = \"./spatial_mapping_output/full_aggregated.h5ad\"\n",
+    "obsm_keys = [\"means\", \"q05\", \"q50\", \"q95\"]\n",
+    "include_chunk_models = False        # if True, attach each chunk's mod to uns (large!)\n"
+   ],
+   "execution_count": null,
+   "outputs": []
+  },
+  {
+   "cell_type": "code",
+   "metadata": {},
+   "source": [
+    "import os\n",
+    "import glob as _glob\n",
+    "import numpy as np\n",
+    "import scanpy as sc\n"
+   ],
+   "execution_count": null,
+   "outputs": []
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## Aggregate chunk outputs"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "metadata": {},
+   "source": [
+    "adata_full = sc.read_h5ad(spatial_h5ad_path)\n",
+    "print(f\"Full AnnData: {adata_full.n_obs} locations, {adata_full.n_vars} genes\")\n",
+    "\n",
+    "chunk_paths = sorted(_glob.glob(chunk_h5ad_glob))\n",
+    "if len(chunk_paths) != n_chunks:\n",
+    "    raise ValueError(f\"Found {len(chunk_paths)} chunks at {chunk_h5ad_glob!r}, expected {n_chunks}\")\n",
+    "\n",
+    "# Initialise obsm slots\n",
+    "for i, chunk_path in enumerate(chunk_paths):\n",
+    "    adata_chunk = sc.read_h5ad(chunk_path)\n",
+    "    # Discover number of cell types from the first chunk's first key\n",
+    "    if i == 0:\n",
+    "        first_key = obsm_keys[0]\n",
+    "        n_cell_types = adata_chunk.obsm[first_key].shape[1]\n",
+    "        for k in obsm_keys:\n",
+    "            adata_full.obsm[k] = np.zeros((adata_full.n_obs, n_cell_types), dtype='float32')\n",
+    "\n",
+    "    # Map chunk indices to full-adata indices\n",
+    "    chunk_idx = adata_full.obs.index.get_indexer(adata_chunk.obs.index)\n",
+    "    if (chunk_idx < 0).any():\n",
+    "        raise ValueError(f\"Chunk {i} has locations not present in the full AnnData. \"\n",
+    "                         f\"Did the chunk h5ad come from a different parent dataset?\")\n",
+    "    for k in obsm_keys:\n",
+    "        if k in adata_chunk.obsm:\n",
+    "            adata_full.obsm[k][chunk_idx] = adata_chunk.obsm[k]\n",
+    "\n",
+    "    if include_chunk_models and 'mod' in adata_chunk.uns:\n",
+    "        adata_full.uns[f'mod_batch{i}'] = adata_chunk.uns['mod']\n",
+    "    print(f\"Aggregated chunk {i}: {len(chunk_idx)} locations\")\n",
+    "\n",
+    "os.makedirs(os.path.dirname(output_path) or '.', exist_ok=True)\n",
+    "adata_full.write(output_path)\n",
+    "print(f\"Wrote aggregated output: {output_path}\")\n"
+   ],
+   "execution_count": null,
+   "outputs": []
+  }
+ ],
+ "metadata": {
+  "kernelspec": {
+   "display_name": "Python 3",
+   "language": "python",
+   "name": "python3"
+  },
+  "language_info": {
+   "name": "python",
+   "version": "3.10",
+   "mimetype": "text/x-python",
+   "file_extension": ".py"
+  }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 5
+}
\ No newline at end of file
diff --git a/.claude/skills/spatial-mapping/templates/step2_spatial_mapping.ipynb b/.claude/skills/spatial-mapping/templates/step2_spatial_mapping.ipynb
new file mode 100644
index 00000000..dab75c5b
--- /dev/null
+++ b/.claude/skills/spatial-mapping/templates/step2_spatial_mapping.ipynb
@@ -0,0 +1,448 @@
+{
+ "cells": [
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "# cell2location \u2014 Step 2: spatial mapping (per chunk)\n",
+    "\n",
+    "This notebook is generated from the cell2location `spatial-mapping` skill.\n",
+    "See [../SKILL.md](../SKILL.md) for the full workflow.\n",
+    "\n",
+    "It is based on the `cell2state_embryo` papermill notebook (the only published\n",
+    "notebook with correct settings for stratified per-sample chunking and the\n",
+    "nuclei-count model), simplified for general use.\n",
+    "\n",
+    "The notebook runs ONE training chunk. For `n_chunks > 1`, submit one instance\n",
+    "per `training_batch` index (0..n_chunks-1) using the launchers in this\n",
+    "directory ([bsub.sh](bsub.sh) for LSF, [sbatch.sh](sbatch.sh) for Slurm,\n",
+    "[run_local.sh](run_local.sh) for local GPU), then aggregate chunk outputs\n",
+    "with [step2_aggregate_chunks.ipynb](step2_aggregate_chunks.ipynb).\n",
+    "\n",
+    "See [../reference/hyperparameters_extract.md \u00a71.2](../reference/hyperparameters_extract.md)\n",
+    "and [../reference/fig_S27_hyperparameters.png](../reference/fig_S27_hyperparameters.png)\n",
+    "for hyperparameter defaults and decision rules.\n"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "metadata": {
+    "tags": [
+     "parameters"
+    ]
+   },
+   "source": [
+    "# === PARAMETERS (papermill) ===\n",
+    "# Set via:\n",
+    "#   papermill step2_spatial_mapping.ipynb out.ipynb -p spatial_h5ad_path ... -p signatures_csv ...\n",
+    "# The spatial-mapping skill walks the user through each value. Reference:\n",
+    "# - decision tree:  ../reference/fig_S27_hyperparameters.png\n",
+    "# - defaults:       ../reference/hyperparameters_extract.md\n",
+    "\n",
+    "# ---- I/O ----\n",
+    "spatial_h5ad_path = \"\"        # spatial AnnData (Visium / Visium-HD / Cytassist / Slide-seq / Stereo-seq)\n",
+    "signatures_csv = \"\"           # step1 output (genes x cell_types, linear scale, batch-corrected)\n",
+    "output_dir = \"./spatial_mapping_output\"\n",
+    "output_name = \"c2l_run\"\n",
+    "\n",
+    "# ---- Chunking ----\n",
+    "training_batch = 0            # 0..n_chunks-1; one notebook execution per chunk\n",
+    "n_chunks = 1                  # 1 = no chunking (full-batch on one GPU); skill computes per Phase 5\n",
+    "seed = 0\n",
+    "\n",
+    "# ---- Training ----\n",
+    "max_epochs = 30000            # medium-tier default; see SKILL.md Phase 8\n",
+    "\n",
+    "# ---- Detection sensitivity (Phase 4 / Fig S27 lower flow) ----\n",
+    "detection_alpha = None        # MUST be set: 20 (high variability) or 200 (low)\n",
+    "\n",
+    "# ---- N_cells_per_location (Phase 3 / Fig S27 upper flow) ----\n",
+    "# Set EXACTLY ONE of the column / scalar options.\n",
+    "N_cells_per_location_column = None   # column in adata.obs with per-location N_s\n",
+    "                                     #   (e.g. \"n_cell_occupancy\" = occupancy * N_nuclei * scaling).\n",
+    "                                     #   REQUIRES installing cell2location from the\n",
+    "                                     #   hires_sliding_window branch (see SKILL.md Phase 6).\n",
+    "N_cells_per_location_scalar = None   # tissue-level scalar from Fig S27 manual count\n",
+    "                                     #   or cell-size fallback (Visium=5, Slide-seq V2=1, ...).\n",
+    "N_cells_per_location_alpha_prior = None  # v^n: 1 (global N) / 1000 (per-location N_s + hires)\n",
+    "\n",
+    "# ---- Branch features (only effective with hires_sliding_window installed) ----\n",
+    "use_proportion_factorisation_prior_on_w_sf = False  # True with per-location N_s\n",
+    "use_n_s_cells_per_location_limit = False            # True with per-location N_s\n",
+    "A_B_per_location_alpha_prior = None\n",
+    "use_per_cell_type_normalisation = False             # advanced\n",
+    "\n",
+    "# ---- Factorisation (supplement \u00a71.2) ----\n",
+    "n_groups = 50\n",
+    "A_factors_per_location = 7.0\n",
+    "B_groups_per_location = 7.0   # NB: embryo used 5; supplement default is 7\n",
+    "\n",
+    "# ---- QC + gene filtering ----\n",
+    "total_counts_min = 1000\n",
+    "total_counts_max = 200000\n",
+    "sample_fraction_threshold = 0.7\n",
+    "gene_filter_cell_count_cutoff = 15\n",
+    "gene_filter_cell_percentage_cutoff2 = 0.15\n",
+    "gene_filter_nonz_mean_cutoff = 1.11\n",
+    "\n",
+    "# ---- Posterior export ----\n",
+    "use_quantiles = True          # MANDATORY for n_obs > 100k; see issue #278\n",
+    "compute_expected = False      # set True if you need per-cell-type per-gene expression layers\n"
+   ],
+   "execution_count": null,
+   "outputs": []
+  },
+  {
+   "cell_type": "code",
+   "metadata": {},
+   "source": [
+    "import os\n",
+    "import sys\n",
+    "import numpy as np\n",
+    "import pandas as pd\n",
+    "import scanpy as sc\n",
+    "import anndata\n",
+    "import scvi\n",
+    "import cell2location\n",
+    "from cell2location.utils.filtering import filter_genes\n",
+    "import matplotlib.pyplot as plt\n",
+    "import pyro\n",
+    "\n",
+    "scvi.settings.seed = seed\n",
+    "np.random.seed(seed)\n"
+   ],
+   "execution_count": null,
+   "outputs": []
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## Load spatial data + step 1 signatures"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "metadata": {},
+   "source": [
+    "# Load spatial AnnData and step1 reference signatures\n",
+    "adata_vis = sc.read_h5ad(spatial_h5ad_path)\n",
+    "inf_aver = pd.read_csv(signatures_csv, index_col=0)\n",
+    "print(f\"Spatial:     n_obs={adata_vis.n_obs}, n_vars={adata_vis.n_vars}\")\n",
+    "print(f\"Signatures:  {inf_aver.shape[0]} genes x {inf_aver.shape[1]} cell types\")\n",
+    "print(f\"Samples:     {adata_vis.obs['sample'].nunique() if 'sample' in adata_vis.obs.columns else 'no sample column'}\")\n"
+   ],
+   "execution_count": null,
+   "outputs": []
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## QC filtering (per-spot + gene filtering)"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "metadata": {},
+   "source": [
+    "# Per-spot QC filtering (embryo workflow QC, parameterised)\n",
+    "# - drop locations outside [total_counts_min, total_counts_max]\n",
+    "# - drop samples where <sample_fraction_threshold of locations pass QC\n",
+    "ind = (\n",
+    "    (adata_vis.obs['total_counts'] > total_counts_min).values\n",
+    "    & (adata_vis.obs['total_counts'] < total_counts_max).values\n",
+    ")\n",
+    "\n",
+    "# If user has nuclei segmentation column, also require at least one nucleus.\n",
+    "if N_cells_per_location_column is not None and N_cells_per_location_column in adata_vis.obs.columns:\n",
+    "    ind = ind & (adata_vis.obs[N_cells_per_location_column] > 0).values\n",
+    "\n",
+    "# Sample-fraction filter: drop samples where <70% of locations pass\n",
+    "fraction_selected = (\n",
+    "    adata_vis.obs['sample'][ind].value_counts()\n",
+    "    / adata_vis.obs['sample'].value_counts()\n",
+    ").sort_values()\n",
+    "ind = ind & adata_vis.obs['sample'].isin(\n",
+    "    fraction_selected[fraction_selected > sample_fraction_threshold].index\n",
+    ").values\n",
+    "adata_vis = adata_vis[ind, :].copy()\n",
+    "print(f\"After QC: {adata_vis.n_obs} locations, {adata_vis.obs['sample'].nunique()} samples\")\n",
+    "\n",
+    "# Gene filter (cell2location.utils.filter_genes)\n",
+    "selected = filter_genes(\n",
+    "    adata_vis,\n",
+    "    cell_count_cutoff=gene_filter_cell_count_cutoff,\n",
+    "    cell_percentage_cutoff2=gene_filter_cell_percentage_cutoff2,\n",
+    "    nonz_mean_cutoff=gene_filter_nonz_mean_cutoff,\n",
+    ")\n",
+    "# NB: embryo workflow computed filter but did NOT subset. Match that behaviour;\n",
+    "# the cell2location model handles non-selected genes via background prior.\n",
+    "# Uncomment the next line to subset.\n",
+    "# adata_vis = adata_vis[:, selected].copy()\n",
+    "\n",
+    "# Find shared genes with the reference signatures\n",
+    "intersect = np.intersect1d(adata_vis.var_names, inf_aver.index)\n",
+    "adata_vis = adata_vis[:, intersect].copy()\n",
+    "inf_aver = inf_aver.loc[intersect, :].copy()\n",
+    "print(f\"After intersecting with signatures: {adata_vis.n_vars} genes\")\n"
+   ],
+   "execution_count": null,
+   "outputs": []
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## Stratified chunk assignment"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "metadata": {},
+   "source": [
+    "# Stratified chunk assignment (Phase 5):\n",
+    "# Each chunk gets a random sample of locations from EVERY sample, so all samples\n",
+    "# are present in every training chunk (joint multi-sample modelling preserved).\n",
+    "if n_chunks > 1:\n",
+    "    if 'training_batch' not in adata_vis.obs.columns:\n",
+    "        adata_vis.obs['training_batch'] = 0\n",
+    "        for sample in adata_vis.obs['sample'].unique():\n",
+    "            mask = adata_vis.obs['sample'] == sample\n",
+    "            adata_vis.obs.loc[mask, 'training_batch'] = np.random.choice(\n",
+    "                list(range(n_chunks)), size=int(mask.sum()), replace=True\n",
+    "            )\n",
+    "    ind = adata_vis.obs['training_batch'] == training_batch\n",
+    "    adata_vis = adata_vis[ind, :].copy()\n",
+    "    print(f\"Chunk {training_batch}/{n_chunks - 1}: {adata_vis.n_obs} locations\")\n"
+   ],
+   "execution_count": null,
+   "outputs": []
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## Resolve N_cells_per_location + sanity-check detection_alpha"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "metadata": {},
+   "source": [
+    "# Resolve N_cells_per_location:\n",
+    "# - If a column was provided, use per-location array (and the hires-branch flags MUST be set).\n",
+    "# - Else use the scalar (Fig S27 fallback).\n",
+    "# - If both None, fail loudly.\n",
+    "if N_cells_per_location_column is not None:\n",
+    "    if N_cells_per_location_column not in adata_vis.obs.columns:\n",
+    "        raise ValueError(\n",
+    "            f\"N_cells_per_location_column={N_cells_per_location_column!r} not in adata_vis.obs. \"\n",
+    "            f\"Available: {list(adata_vis.obs.columns)}\"\n",
+    "        )\n",
+    "    N_cells_per_location = adata_vis.obs[[N_cells_per_location_column]].values.astype('float32')\n",
+    "    # Embryo formula reminder: n_cell_occupancy = occupancy * N_nuclei * scaling.\n",
+    "    # Effectiveness REQUIRES installing cell2location from hires_sliding_window\n",
+    "    # branch AND setting use_proportion_factorisation_prior_on_w_sf=True AND\n",
+    "    # use_n_s_cells_per_location_limit=True.\n",
+    "elif N_cells_per_location_scalar is not None:\n",
+    "    N_cells_per_location = float(N_cells_per_location_scalar)\n",
+    "else:\n",
+    "    raise ValueError(\n",
+    "        \"Set either N_cells_per_location_column (per-location array) or \"\n",
+    "        \"N_cells_per_location_scalar (tissue-level scalar; e.g. 5 for Visium, \"\n",
+    "        \"1 for Slide-seq V2). See Fig S27 in ../reference/.\"\n",
+    "    )\n",
+    "\n",
+    "# Sanity check on detection_alpha\n",
+    "if detection_alpha is None:\n",
+    "    raise ValueError(\n",
+    "        \"Set detection_alpha to 20 (high within-batch variability \u2014 FFPE / Cytassist / \"\n",
+    "        \"Visium-HD / older human samples) or 200 (low variability \u2014 fresh-frozen single-sample Visium). \"\n",
+    "        \"See Fig S27 in ../reference/.\"\n",
+    "    )\n"
+   ],
+   "execution_count": null,
+   "outputs": []
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## Auto-derived hyperpriors"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "metadata": {},
+   "source": [
+    "# Auto-derived hyperpriors (supplement \u00a71.2 item 3 + embryo cells 36-39).\n",
+    "# These are computed from data; user does not set them directly.\n",
+    "if isinstance(N_cells_per_location, np.ndarray):\n",
+    "    N_mean = float(np.mean(N_cells_per_location))\n",
+    "else:\n",
+    "    N_mean = float(N_cells_per_location)\n",
+    "\n",
+    "expected_y_e = (\n",
+    "    adata_vis.obs[['sample', 'total_counts']].groupby('sample').mean()\n",
+    "    / (inf_aver.sum(0) * N_mean).mean()\n",
+    ")\n",
+    "mean_alpha_prior = float(np.round(\n",
+    "    ((expected_y_e.mean() ** 2) / expected_y_e.var()).values[0] / 3, 2\n",
+    "))\n",
+    "detection_cell_type_prior_alpha = float(np.round(\n",
+    "    ((inf_aver.sum(0).mean() ** 2) / inf_aver.sum(0).var()) * 20, 2\n",
+    "))\n",
+    "print(f\"Auto-derived: mean_alpha_prior={mean_alpha_prior}, \"\n",
+    "      f\"detection_cell_type_prior_alpha={detection_cell_type_prior_alpha}\")\n"
+   ],
+   "execution_count": null,
+   "outputs": []
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## Setup + train Cell2location"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "metadata": {},
+   "source": [
+    "# Setup + train Cell2location (per supplement \u00a71.3 + embryo cell 44).\n",
+    "# REFUSALS (enforced by skill, surfaced as comments here):\n",
+    "#   - batch_size != None         (mini-batch is REFUSED for spatial mapping)\n",
+    "#   - log-transformed input       (REFUSED; cell2location needs linear counts)\n",
+    "#   - num_samples=1000 on n_obs > 100k without use_quantiles=True (REFUSED, OOM)\n",
+    "cell2location.models.Cell2location.setup_anndata(adata=adata_vis, batch_key=\"sample\")\n",
+    "\n",
+    "# Build kwargs dynamically because some are only valid on hires_sliding_window.\n",
+    "model_kwargs = dict(\n",
+    "    cell_state_df=inf_aver,\n",
+    "    amortised=False,\n",
+    "    N_cells_per_location=N_cells_per_location,\n",
+    "    detection_alpha=float(detection_alpha),\n",
+    "    detection_hyp_prior={\"mean_alpha\": mean_alpha_prior},\n",
+    "    A_factors_per_location=A_factors_per_location,\n",
+    "    B_groups_per_location=B_groups_per_location,\n",
+    "    n_groups=n_groups,\n",
+    ")\n",
+    "# hires-branch-only kwargs\n",
+    "if use_proportion_factorisation_prior_on_w_sf or use_n_s_cells_per_location_limit:\n",
+    "    model_kwargs.update(dict(\n",
+    "        use_proportion_factorisation_prior_on_w_sf=use_proportion_factorisation_prior_on_w_sf,\n",
+    "        use_n_s_cells_per_location_limit=use_n_s_cells_per_location_limit,\n",
+    "        N_cells_per_location_alpha_prior=N_cells_per_location_alpha_prior,\n",
+    "        A_B_per_location_alpha_prior=A_B_per_location_alpha_prior,\n",
+    "        use_per_cell_type_normalisation=use_per_cell_type_normalisation,\n",
+    "        detection_cell_type_prior_alpha=detection_cell_type_prior_alpha,\n",
+    "    ))\n",
+    "\n",
+    "mod = cell2location.models.Cell2location(adata_vis, **model_kwargs)\n",
+    "mod.view_anndata_setup()\n",
+    "\n",
+    "mod.train(\n",
+    "    max_epochs=max_epochs,\n",
+    "    batch_size=None,                 # FULL BATCH \u2014 DO NOT CHANGE (supplement \u00a71.3, issue #356)\n",
+    "    plan_kwargs={'optim': pyro.optim.Adam(optim_args={'lr': 0.002})},\n",
+    "    train_size=1,\n",
+    "    scale_elbo=1 / (adata_vis.n_obs * adata_vis.n_vars),\n",
+    "    accelerator='gpu',\n",
+    ")\n",
+    "\n",
+    "run_dir = os.path.join(output_dir, f\"{output_name}_chunk{training_batch}\")\n",
+    "os.makedirs(run_dir, exist_ok=True)\n",
+    "mod.save(run_dir, overwrite=True)\n"
+   ],
+   "execution_count": null,
+   "outputs": []
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## Posterior export"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "metadata": {},
+   "source": [
+    "# Posterior export (supplement \u00a71.3 + issues #278/#360)\n",
+    "adata_vis = mod.export_posterior(\n",
+    "    adata_vis,\n",
+    "    sample_kwargs={\n",
+    "        'batch_size': int(np.ceil(adata_vis.n_obs / 4)),\n",
+    "        'accelerator': 'gpu',\n",
+    "        'return_observed': False,\n",
+    "    },\n",
+    "    add_to_obsm=['means', 'q05', 'q50', 'q95'],\n",
+    "    use_quantiles=use_quantiles,\n",
+    "    exclude_vars=['data_target'],\n",
+    ")\n",
+    "\n",
+    "if compute_expected:\n",
+    "    # Optional per-cell-type expression layers (memory-intensive; per-chunk only)\n",
+    "    expected_dict = mod.module.model.compute_expected_per_cell_type(\n",
+    "        mod.samples[\"post_sample_q05\"], mod.adata_manager\n",
+    "    )\n",
+    "    for i, n in enumerate(mod.factor_names_):\n",
+    "        adata_vis.layers[n] = expected_dict['mu'][i]\n",
+    "\n",
+    "# Save chunk results\n",
+    "adata_file = os.path.join(run_dir, \"sp.h5ad\")\n",
+    "adata_vis.write(adata_file)\n",
+    "print(f\"Wrote chunk output: {adata_file}\")\n"
+   ],
+   "execution_count": null,
+   "outputs": []
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## QC plots"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "metadata": {},
+   "source": [
+    "# QC plots (supplement \u00a71.3 recommendations)\n",
+    "# 1. ELBO history \u2014 should plateau; late oscillations are normal (issue #327)\n",
+    "mod.plot_history(5000)\n",
+    "plt.savefig(os.path.join(run_dir, \"training_ELBO_history_minus5k.png\"), bbox_inches='tight')\n",
+    "plt.show()\n",
+    "plt.close()\n",
+    "mod.plot_history(0)\n",
+    "plt.savefig(os.path.join(run_dir, \"training_ELBO_history_all.png\"), bbox_inches='tight')\n",
+    "plt.show()\n",
+    "plt.close()\n",
+    "\n",
+    "# 2. Posterior predictive check \u2014 log10(mu+1) vs log10(d+1) should be near-diagonal\n",
+    "mod.plot_QC(summary_name='q05')\n",
+    "plt.savefig(os.path.join(run_dir, \"reconstruction_accuracy_histogram.png\"), bbox_inches='tight')\n",
+    "plt.show()\n",
+    "plt.close()\n"
+   ],
+   "execution_count": null,
+   "outputs": []
+  }
+ ],
+ "metadata": {
+  "kernelspec": {
+   "display_name": "Python 3",
+   "language": "python",
+   "name": "python3"
+  },
+  "language_info": {
+   "name": "python",
+   "version": "3.10",
+   "mimetype": "text/x-python",
+   "file_extension": ".py"
+  }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 5
+}
\ No newline at end of file
diff --git a/.gitignore b/.gitignore
index b3f54007..c0ac579e 100644
--- a/.gitignore
+++ b/.gitignore
@@ -3,6 +3,9 @@ doc
 # Resilio Sync
 .sync
 
+# Build-time mirror of .claude/skills/ -- see setup.py:_mirror_bundled_skills
+cell2location/_bundled_skills/
+
 # MAC
 *Icon*
 .DS_Store
diff --git a/AGENTS.md b/AGENTS.md
new file mode 100644
index 00000000..7a8ade3e
--- /dev/null
+++ b/AGENTS.md
@@ -0,0 +1,89 @@
+# AGENTS.md — cell2location
+
+This file is the entry point for coding agents (Claude Code, Cursor, Aider, Continue, Copilot, Codex) working with or near cell2location. Humans should read [README.md](README.md) instead.
+
+## What this is
+
+cell2location is a Bayesian model for **spatial deconvolution / cell type mapping** in spatial transcriptomics. It decomposes per-location mRNA counts into cell-type abundance using single-cell RNA-seq reference signatures. Supports 10X Visium, Visium-HD, Cytassist, Slide-seq V2, Stereo-seq, and Nanostring WTA/DSP.
+
+## When the user says X, use cell2location
+
+Reach for cell2location when the user's request includes any of these phrases:
+
+- spatial mapping
+- spatial deconvolution
+- cell type mapping
+- map cell types onto Visium / Visium-HD / Slide-seq / Stereo-seq / spatial transcriptomics
+- deconvolve spots / bins
+- estimate cell abundance per spot / per bin
+- cell composition of spatial locations
+
+## When NOT to use cell2location, and where to go instead
+
+- **Pure scRNA-seq (no spatial dimension)** → scanpy + scVI.
+- **Subcellular imaging where each transcript has x,y coordinates** (Xenium, MERSCOPE individual transcripts) → Baysor / Sopa for segmentation. cell2location is only meaningful AFTER aggregation to cells/bins.
+- **Cell-cell communication inference** → CellChat, Squidpy, cell2cell.
+- **Clone-aware spatial deconvolution** (clonal heterogeneity in spatial transcriptomics) → BaSISS [gerstung-lab/BaSISS](https://github.com/gerstung-lab/BaSISS) (Lomakin et al. 2022, Nature) for ISS-based clonal mapping.
+- **RNA velocity / cell fate dynamics in spatial data** → cell2fate [BayraktarLab/cell2fate](https://github.com/BayraktarLab/cell2fate).
+- **Complex confounded reference signatures** (multiple technologies, multiple donors, strong biological variance to regress out) → use regularizedvi [vitkl/regularizedvi](https://github.com/vitkl/regularizedvi) for the *signature step*, then cell2location for spatial mapping. regularizedvi handles purpose-specific covariate keys (library / dataset / technical / ambient / dispersion) better than the default `RegressionModel` for multi-technology atlases.
+- **Nanostring WTA / DSP**: cell2location ships a dedicated `Cell2location_WTA` model (negative-probe-binding layer + experiment-specific gene scaling). Also consider SpaceJam [vitkl/SpaceJam](https://github.com/vitkl/SpaceJam) for DSP-specific workflows.
+
+## The skills
+
+cell2location ships three bundled Claude Code skills (also readable by Cursor / Aider / other agents that respect `.claude/skills/`):
+
+- **Operating manual**: [.claude/skills/spatial-mapping/SKILL.md](.claude/skills/spatial-mapping/SKILL.md) — load BEFORE writing any code. Single skill that walks the user through reference signatures, spatial QC, hyperparameter choice (Fig S27 decision tree), chunking, branch selection (master vs `hires_sliding_window`), training, posterior export, aggregation. Supports both interactive (`AskUserQuestion`) and autonomous (data-driven defaults from supplementary methods §1.2 + Fig S27) modes.
+- **Project context (scope + technical decisions)**: [.claude/skills/cell2location-context/SKILL.md](.claude/skills/cell2location-context/SKILL.md) — owns the persistent `SPATIAL_MAPPING_CONTEXT.md` file. Called automatically by the operating manual at **Phase 0a** (`--science`: scientific goal, reference, target populations, success/failure criteria) and **Phase 8.5** (`--technical`: Phase 1–8 decision sweep + scope↔decision cross-check). Standalone-invocable via `/cell2location-context` to update goals between runs. The file is auto-discovered on each run so future sessions inherit the goals.
+- **Troubleshooting**: [.claude/skills/cell2location-troubleshooting/SKILL.md](.claude/skills/cell2location-troubleshooting/SKILL.md) — load when (a) the main skill instructed you to, (b) the user is dumping an error or unexpected result, or (c) the question is heavy on biological interpretation. Reads the user's declared failure criteria from `SPATIAL_MAPPING_CONTEXT.md` to match symptoms; drafts `gh issue create` bodies with the diagnostic checklist the maintainer normally asks for.
+
+**Convention**: when the main skill loads, all three should be read by the agent. Context and troubleshooting are also usable standalone.
+
+### Installing the skills into your agent
+
+If you `pip install cell2location`, the skills are shipped inside the wheel but not auto-registered with Claude Code / Cursor / Aider — the agents only auto-discover `.claude/skills/` in your **current working directory** or in `~/.claude/skills/`. Run once:
+
+```bash
+cell2location install-skills           # copy to ~/.claude/skills/
+cell2location install-skills --symlink # symlink instead (updates flow through pip upgrade)
+cell2location list-skills              # see what's bundled and where it would go
+cell2location uninstall-skills         # remove the copies/symlinks
+```
+
+After install, `/spatial-mapping`, `/cell2location-context`, and `/cell2location-troubleshooting` appear in the slash-command menu across all your projects. Default mode is **copy** (re-run `install-skills` to refresh after a cell2location upgrade). `--symlink` makes upgrades flow through automatically — pick that if you trust cell2location's release cadence not to surprise you mid-analysis.
+
+If you're working inside a clone of this repo, the skills auto-load from `.claude/skills/` — no install step needed.
+
+## Quick API map
+
+- [cell2location/models/_cell2location_model.py](cell2location/models/_cell2location_model.py) — `Cell2location` (spatial mapping; `setup_anndata`, `__init__`, `train`, `export_posterior`).
+- [cell2location/models/_cell2location_WTA_model.py](cell2location/models/_cell2location_WTA_model.py) — `Cell2location_WTA` (Nanostring WTA / DSP; uses per-observation `n_nuclei` as prior mean).
+- [cell2location/models/reference/_reference_model.py](cell2location/models/reference/_reference_model.py) — `RegressionModel` (NB-regression reference signatures; preferred for multi-batch/multi-tech atlases).
+- [cell2location/cluster_averages/__init__.py](cell2location/cluster_averages/__init__.py) — `get_cluster_averages` (hard-coded signature averaging; Smart-seq 2 fallback).
+- [cell2location/utils/__init__.py](cell2location/utils/__init__.py) — `filter_genes` (cell2location's gene-selection rule from supplement Fig S1).
+
+## Hyperparameters you MUST surface to the user — never silently default
+
+The skill enforces these explicitly. Outside the skill, never proceed without user input on:
+
+1. **`N_cells_per_location`** — expected cells per spot. Ask the user; or follow Fig S27 (10X Loupe manual count → tissue-level scalar; nuclei segmentation column → per-location array; no histology → cell-size formula by technology).
+2. **`detection_alpha`** — within-batch RNA variability regulariser. `20` for high variability (FFPE, Cytassist, Visium-HD, older human samples); `200` for low variability (fresh-frozen single-sample Visium).
+3. **Reference signature source** — which scRNA-seq atlas, which `labels_key`, which `batch_key`, which `categorical_covariate_keys` for per-gene tech regression.
+4. **Chunk size** — default to **1 chunk** if `n_obs` fits the GPU (full-batch training, `batch_size=None`). Only chunk if forced by memory; never silently switch to mini-batch (it's both slower and less accurate).
+
+## Tutorials vs real data
+
+[docs/notebooks/cell2location_tutorial.ipynb](docs/notebooks/cell2location_tutorial.ipynb) is for learning the API on 1–2k locations. **Do not** run it on real datasets > 10k locations — it uses defaults that are wrong for most realistic data (no nuclei segmentation, no chunking strategy, tutorial-specific `detection_alpha`). For any real dataset, load the skill and walk through it.
+
+## NO-CODE block — read this BEFORE writing any cell2location code
+
+```python
+# DO NOT write cell2location code yet.
+# Load .claude/skills/spatial-mapping/SKILL.md and walk the user through Phases 0-10.
+# The skill REFUSES to proceed without explicit decisions on:
+#   1. N_cells_per_location  (nuclei prior — segmentation column OR scalar estimate OR Fig S27 fallback)
+#   2. detection_alpha       (20 for high within-batch variability, 200 for low)
+#   3. reference signature source (which scRNA atlas, which labels_key, which batch_key)
+#   4. chunk size            (default to 1 chunk if data fits the GPU; mini-batch is REFUSED for spatial mapping)
+# Both interactive (AskUserQuestion) and autonomous-agent modes are supported.
+# See SKILL.md sections "Interactive Mode" and "Autonomous Mode" inside each phase.
+```
diff --git a/CITATION.cff b/CITATION.cff
new file mode 100644
index 00000000..8e572e18
--- /dev/null
+++ b/CITATION.cff
@@ -0,0 +1,99 @@
+cff-version: 1.2.0
+message: "If you use cell2location, please cite the following paper."
+title: "cell2location: Cell2location maps fine-grained cell types in spatial transcriptomics"
+authors:
+  - family-names: Kleshchevnikov
+    given-names: Vitalii
+  - family-names: Shmatko
+    given-names: Artem
+  - family-names: Dann
+    given-names: Emma
+  - family-names: Aivazidis
+    given-names: Alexander
+  - family-names: King
+    given-names: Hamish W.
+  - family-names: Li
+    given-names: Tong
+  - family-names: Elmentaite
+    given-names: Rasa
+  - family-names: Lomakin
+    given-names: Artem
+  - family-names: Kedlian
+    given-names: Veronika
+  - family-names: Gayoso
+    given-names: Adam
+  - family-names: Sarkin Jain
+    given-names: Mika
+  - family-names: Park
+    given-names: Jun Sung
+  - family-names: Ramona
+    given-names: Lauma
+  - family-names: Tuck
+    given-names: Elizabeth
+  - family-names: Arutyunyan
+    given-names: Anna
+  - family-names: Vento-Tormo
+    given-names: Roser
+  - family-names: Gerstung
+    given-names: Moritz
+  - family-names: James
+    given-names: Louisa
+  - family-names: Stegle
+    given-names: Oliver
+  - family-names: Bayraktar
+    given-names: Omer Ali
+repository-code: "https://github.com/BayraktarLab/cell2location"
+url: "https://cell2location.readthedocs.io/"
+license: Apache-2.0
+preferred-citation:
+  type: article
+  title: "Cell2location maps fine-grained cell types in spatial transcriptomics"
+  authors:
+    - family-names: Kleshchevnikov
+      given-names: Vitalii
+    - family-names: Shmatko
+      given-names: Artem
+    - family-names: Dann
+      given-names: Emma
+    - family-names: Aivazidis
+      given-names: Alexander
+    - family-names: King
+      given-names: Hamish W.
+    - family-names: Li
+      given-names: Tong
+    - family-names: Elmentaite
+      given-names: Rasa
+    - family-names: Lomakin
+      given-names: Artem
+    - family-names: Kedlian
+      given-names: Veronika
+    - family-names: Gayoso
+      given-names: Adam
+    - family-names: Sarkin Jain
+      given-names: Mika
+    - family-names: Park
+      given-names: Jun Sung
+    - family-names: Ramona
+      given-names: Lauma
+    - family-names: Tuck
+      given-names: Elizabeth
+    - family-names: Arutyunyan
+      given-names: Anna
+    - family-names: Vento-Tormo
+      given-names: Roser
+    - family-names: Gerstung
+      given-names: Moritz
+    - family-names: James
+      given-names: Louisa
+    - family-names: Stegle
+      given-names: Oliver
+    - family-names: Bayraktar
+      given-names: Omer Ali
+  journal: "Nature Biotechnology"
+  year: 2022
+  volume: 40
+  issue: 5
+  start: 661
+  end: 671
+  doi: "10.1038/s41587-021-01139-4"
+  url: "https://www.nature.com/articles/s41587-021-01139-4"
diff --git a/MANIFEST.in b/MANIFEST.in
new file mode 100644
index 00000000..0564b27e
--- /dev/null
+++ b/MANIFEST.in
@@ -0,0 +1,3 @@
+recursive-include cell2location/_bundled_skills *
+include AGENTS.md
+include CITATION.cff
diff --git a/README.md b/README.md
index 40973a6b..592efd21 100644
--- a/README.md
+++ b/README.md
@@ -4,6 +4,8 @@
 
 ### Comprehensive mapping of tissue cell architecture via integrated single cell and spatial transcriptomics (cell2location model)
 
+***Spatial mapping / spatial deconvolution of cell types onto spatial transcriptomics data (Visium, Visium-HD, Cytassist, Slide-seq, Stereo-seq, Nanostring WTA).***
+
 [![Stars](https://img.shields.io/github/stars/BayraktarLab/cell2location?logo=GitHub&color=yellow)](https://github.com/BayraktarLab/cell2location/stargazers)
 ![Build Status](https://github.com/BayraktarLab/cell2location/actions/workflows/test.yml/badge.svg?event=push)
 [![Documentation Status](https://readthedocs.org/projects/cell2location/badge/?version=latest)](https://cell2location.readthedocs.io/en/stable/?badge=latest)
@@ -11,6 +13,12 @@
 [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/BayraktarLab/cell2location/blob/master/docs/notebooks/cell2location_tutorial.ipynb)
 [![Docker image on quay.io](https://img.shields.io/badge/container-quay.io/vitkl/cell2location-brightgreen "Docker image on quay.io")](https://quay.io/vitkl/cell2location) 
 
+## For coding agents (Claude / Cursor / Copilot / Aider)
+
+Load [AGENTS.md](AGENTS.md) first, then the bundled skill at [.claude/skills/spatial-mapping/SKILL.md](.claude/skills/spatial-mapping/SKILL.md) before writing any cell2location code. The skill forces explicit decisions on `N_cells_per_location`, `detection_alpha`, reference signature source, and chunking — and supports both interactive and autonomous-agent modes. Two companion skills are also bundled: [.claude/skills/cell2location-context/SKILL.md](.claude/skills/cell2location-context/SKILL.md) captures the user's scientific scope (goals, target populations, success/failure criteria) and persists technical decisions in a `SPATIAL_MAPPING_CONTEXT.md` that future runs auto-discover; [.claude/skills/cell2location-troubleshooting/SKILL.md](.claude/skills/cell2location-troubleshooting/SKILL.md) drafts `gh issue create` bodies and routes to recurring answers from the issue tracker.
+
+After `pip install cell2location`, run `cell2location install-skills` once to copy the bundled skills into `~/.claude/skills/` so `/spatial-mapping`, `/cell2location-context`, and `/cell2location-troubleshooting` appear as slash commands across all your projects. Use `--symlink` if you want skill updates to flow through `pip install -U cell2location`. See [AGENTS.md](AGENTS.md#installing-the-skills-into-your-agent) for the full installer reference.
+
 If you use cell2location please cite our paper: 
 
 Kleshchevnikov, V., Shmatko, A., Dann, E. et al. Cell2location maps fine-grained cell types in spatial transcriptomics. Nat Biotechnol (2022). https://doi.org/10.1038/s41587-021-01139-4
diff --git a/cell2location/_cli.py b/cell2location/_cli.py
new file mode 100644
index 00000000..4cf60d0c
--- /dev/null
+++ b/cell2location/_cli.py
@@ -0,0 +1,208 @@
+"""Command-line entry point for cell2location.
+
+Subcommands:
+  list-skills        — show the bundled Claude / coding-agent skills.
+  install-skills     — copy (or --symlink) the bundled skills into ~/.claude/skills/.
+                       After running once, /spatial-mapping, /cell2location-context, and
+                       /cell2location-troubleshooting appear in the slash-command menu of
+                       any agent that respects ~/.claude/skills/ (Claude Code, Cursor,
+                       Aider, Continue, Codex).
+  uninstall-skills   — remove the previously installed copies/symlinks.
+
+The bundled skills live next to the package at cell2location/_bundled_skills/, copied
+into the wheel at build time from the repository's top-level .claude/skills/ directory.
+"""
+
+from __future__ import annotations
+
+import argparse
+import shutil
+import sys
+from pathlib import Path
+
+_PACKAGE_DIR = Path(__file__).resolve().parent
+_BUNDLED_DIR = _PACKAGE_DIR / "_bundled_skills"
+# Editable-install fallback: when the package is installed via `pip install -e .` and
+# the build-time mirror in setup.py has not run, the source .claude/skills/ tree is one
+# level up from the package directory.
+_DEV_FALLBACK_DIR = _PACKAGE_DIR.parent / ".claude" / "skills"
+_INSTALL_PREFIX = "cell2location-"  # collision-avoidance + provenance
+
+
+def _user_skills_dir() -> Path:
+    """Resolved at call time so tests can patch $HOME without reloading the module."""
+    return Path.home() / ".claude" / "skills"
+
+
+def _resolve_skills_source() -> Path | None:
+    if _BUNDLED_DIR.is_dir():
+        return _BUNDLED_DIR
+    if _DEV_FALLBACK_DIR.is_dir():
+        return _DEV_FALLBACK_DIR
+    return None
+
+
+def _discover_bundled_skills() -> list[Path]:
+    """Return one Path per bundled skill (each containing SKILL.md)."""
+    source = _resolve_skills_source()
+    if source is None:
+        return []
+    return sorted(p for p in source.iterdir() if p.is_dir() and (p / "SKILL.md").is_file())
+
+
+def _installed_name(skill_dir: Path) -> str:
+    """Return the destination directory name under ~/.claude/skills/."""
+    name = skill_dir.name
+    if name.startswith("cell2location-"):
+        return name  # already prefixed, e.g. cell2location-context, cell2location-troubleshooting
+    return _INSTALL_PREFIX + name  # e.g. spatial-mapping -> cell2location-spatial-mapping
+
+
+def _cmd_list_skills(_args: argparse.Namespace) -> int:
+    skills = _discover_bundled_skills()
+    source = _resolve_skills_source()
+    if not skills:
+        print(
+            f"No bundled skills found. Checked:\n  - {_BUNDLED_DIR}\n  - {_DEV_FALLBACK_DIR}\n"
+            "If you are running from an editable install (`pip install -e .`), the\n"
+            "fallback path should exist; if it does not, you may be installing from a\n"
+            "wheel that was built without the skills tree.",
+            file=sys.stderr,
+        )
+        return 1
+    user_skills_dir = _user_skills_dir()
+    print(f"Bundled skills (source: {source}):")
+    for skill in skills:
+        dest = user_skills_dir / _installed_name(skill)
+        status = "installed" if dest.exists() else "not installed"
+        print(f"  - {skill.name:30s} -> {dest}  [{status}]")
+    return 0
+
+
+def _cmd_install_skills(args: argparse.Namespace) -> int:
+    skills = _discover_bundled_skills()
+    if not skills:
+        print(
+            f"No bundled skills found (looked in {_BUNDLED_DIR} and {_DEV_FALLBACK_DIR}).",
+            file=sys.stderr,
+        )
+        return 1
+
+    user_skills_dir = _user_skills_dir()
+    if not args.dry_run:
+        user_skills_dir.mkdir(parents=True, exist_ok=True)
+    installed: list[str] = []
+    for skill in skills:
+        dest = user_skills_dir / _installed_name(skill)
+        if dest.exists() or dest.is_symlink():
+            if not args.force:
+                print(f"Skipping {dest} (already exists; use --force to overwrite).")
+                continue
+            if dest.is_symlink() or dest.is_file():
+                dest.unlink()
+            else:
+                shutil.rmtree(dest)
+
+        if args.dry_run:
+            mode = "symlink" if args.symlink else "copy"
+            print(f"[dry-run] would {mode} {skill} -> {dest}")
+            installed.append(dest.name)
+            continue
+
+        if args.symlink:
+            dest.symlink_to(skill, target_is_directory=True)
+        else:
+            shutil.copytree(skill, dest)
+        installed.append(dest.name)
+
+    if installed:
+        mode = "Linked" if args.symlink else "Installed"
+        print(
+            f"{mode} {len(installed)} skill(s) into {user_skills_dir}.\n"
+            "Restart your coding agent (Claude Code, Cursor, Aider, …) to see the new\n"
+            "slash commands: /spatial-mapping, /cell2location-context, "
+            "/cell2location-troubleshooting."
+        )
+    else:
+        print("Nothing installed. Use --force to overwrite existing entries.")
+    return 0
+
+
+def _cmd_uninstall_skills(args: argparse.Namespace) -> int:
+    skills = _discover_bundled_skills()
+    if not skills:
+        print(
+            f"No bundled skills found (looked in {_BUNDLED_DIR} and {_DEV_FALLBACK_DIR}); " "nothing to remove.",
+            file=sys.stderr,
+        )
+        return 1
+
+    user_skills_dir = _user_skills_dir()
+    removed: list[str] = []
+    for skill in skills:
+        dest = user_skills_dir / _installed_name(skill)
+        if not (dest.exists() or dest.is_symlink()):
+            continue
+        if args.dry_run:
+            print(f"[dry-run] would remove {dest}")
+            removed.append(dest.name)
+            continue
+        if dest.is_symlink() or dest.is_file():
+            dest.unlink()
+        else:
+            shutil.rmtree(dest)
+        removed.append(dest.name)
+
+    print(f"Removed {len(removed)} skill(s) from {user_skills_dir}.")
+    return 0
+
+
+def _build_parser() -> argparse.ArgumentParser:
+    parser = argparse.ArgumentParser(
+        prog="cell2location",
+        description="cell2location command-line utilities.",
+    )
+    sub = parser.add_subparsers(dest="command", required=True)
+
+    p_list = sub.add_parser("list-skills", help="List the bundled coding-agent skills.")
+    p_list.set_defaults(func=_cmd_list_skills)
+
+    p_install = sub.add_parser(
+        "install-skills",
+        help="Install the bundled skills into ~/.claude/skills/.",
+    )
+    p_install.add_argument(
+        "--symlink",
+        action="store_true",
+        help="Symlink instead of copy (updates flow through `pip install -U cell2location`).",
+    )
+    p_install.add_argument(
+        "--force",
+        action="store_true",
+        help="Overwrite existing entries in ~/.claude/skills/.",
+    )
+    p_install.add_argument(
+        "--dry-run",
+        action="store_true",
+        help="Print what would be installed, without writing.",
+    )
+    p_install.set_defaults(func=_cmd_install_skills)
+
+    p_uninstall = sub.add_parser(
+        "uninstall-skills",
+        help="Remove the previously installed cell2location-* skill entries.",
+    )
+    p_uninstall.add_argument("--dry-run", action="store_true", help="Print what would be removed.")
+    p_uninstall.set_defaults(func=_cmd_uninstall_skills)
+
+    return parser
+
+
+def main(argv: list[str] | None = None) -> int:
+    parser = _build_parser()
+    args = parser.parse_args(argv)
+    return args.func(args)
+
+
+if __name__ == "__main__":
+    sys.exit(main())
diff --git a/setup.cfg b/setup.cfg
index c662c30f..7356a564 100644
--- a/setup.cfg
+++ b/setup.cfg
@@ -8,6 +8,30 @@ license = Apache License, Version 2.0
 url = https://github.com/BayraktarLab/cell2location
 author = Vitalii Kleshchevnikov, Artem Shmatko, Emma Dann, Artem Lomakin, Alexander Aivazidis
 author_email = vitalii.kleshchevnikov@sanger.ac.uk
+keywords =
+    spatial-transcriptomics
+    spatial-mapping
+    spatial-deconvolution
+    cell-type-mapping
+    cell-type-deconvolution
+    visium
+    visium-hd
+    cytassist
+    slide-seq
+    stereo-seq
+    nanostring-wta
+    bayesian
+    pyro
+    scvi-tools
+    scverse
+    agent-friendly
+classifiers =
+    Development Status :: 5 - Production/Stable
+    Intended Audience :: Science/Research
+    Topic :: Scientific/Engineering :: Bio-Informatics
+    License :: OSI Approved :: Apache Software License
+    Programming Language :: Python :: 3
+    Operating System :: OS Independent
 
 
 [options]
@@ -18,6 +42,13 @@ install_requires =
     pyro-ppl>=1.8.0
     scvi-tools>=1.3.0
     torch>=1.9.0
+    # CVE-2026-44484 / GHSA-w37p-236h-pfx3: pytorch-lightning 2.6.2 and 2.6.3 were
+    # compromised on PyPI (credential-harvesting payload). The versions have been
+    # yanked from PyPI but the explicit exclusion protects users with stale mirrors
+    # or cached wheels. scvi-tools pulls lightning transitively; mirroring the pin
+    # on both names is defensive.
+    lightning != 2.6.2, != 2.6.3
+    pytorch-lightning != 2.6.2, != 2.6.3
     numpy
     pandas
     scanpy
@@ -40,3 +71,12 @@ docs =
     sphinx_rtd_theme==0.5.2
     myst-parser==0.15.2
     jinja2<3.1
+
+[options.entry_points]
+console_scripts =
+    cell2location = cell2location._cli:main
+
+[options.package_data]
+cell2location =
+    _bundled_skills/**/*
+    _bundled_skills/**/.*
diff --git a/setup.py b/setup.py
index 1abbd068..2675ee3c 100644
--- a/setup.py
+++ b/setup.py
@@ -1,4 +1,31 @@
+"""setup.py — mirrors the repo's .claude/skills/ tree into cell2location/_bundled_skills/
+at build time so the skills travel inside the wheel.
+
+The mirror runs unconditionally before setuptools.setup() so it covers every install
+path: `pip install .`, `pip install -e .`, `python -m build`, `python setup.py
+bdist_wheel`, `python setup.py sdist`. It is idempotent and a no-op when the source
+.claude/skills/ tree is absent (e.g. when the user is installing from a published
+wheel, in which case _bundled_skills/ already exists).
+"""
+
+import shutil
+from pathlib import Path
+
 import setuptools
 
+_REPO_ROOT = Path(__file__).resolve().parent
+_SOURCE_SKILLS = _REPO_ROOT / ".claude" / "skills"
+_BUNDLED_TARGET = _REPO_ROOT / "cell2location" / "_bundled_skills"
+
+
+def _mirror_bundled_skills() -> None:
+    if not _SOURCE_SKILLS.is_dir():
+        return
+    if _BUNDLED_TARGET.exists():
+        shutil.rmtree(_BUNDLED_TARGET)
+    shutil.copytree(_SOURCE_SKILLS, _BUNDLED_TARGET)
+
+
 if __name__ == "__main__":
+    _mirror_bundled_skills()
     setuptools.setup()
diff --git a/tests/test_cli.py b/tests/test_cli.py
new file mode 100644
index 00000000..9a886469
--- /dev/null
+++ b/tests/test_cli.py
@@ -0,0 +1,73 @@
+"""Smoke tests for the cell2location command-line entry points.
+
+`_cli.py` is loaded directly from its source path so the test can run in a CI matrix
+where the surrounding cell2location package import would pull in heavy ML deps. The
+CLI itself has no ML imports.
+"""
+
+from __future__ import annotations
+
+import importlib.util
+import io
+import os
+from contextlib import redirect_stderr, redirect_stdout
+from pathlib import Path
+from unittest.mock import patch
+
+import pytest
+
+_CLI_PATH = Path(__file__).resolve().parents[1] / "cell2location" / "_cli.py"
+_spec = importlib.util.spec_from_file_location("cell2location_cli_under_test", _CLI_PATH)
+_cli = importlib.util.module_from_spec(_spec)
+_spec.loader.exec_module(_cli)
+
+
+def _run(*argv: str) -> tuple[int, str, str]:
+    out, err = io.StringIO(), io.StringIO()
+    with redirect_stdout(out), redirect_stderr(err):
+        code = _cli.main(list(argv))
+    return code, out.getvalue(), err.getvalue()
+
+
+def test_list_skills_finds_bundled_or_dev_fallback() -> None:
+    code, stdout, stderr = _run("list-skills")
+    if code != 0:
+        pytest.skip(f"No bundled skills available: {stderr}")
+    assert "spatial-mapping" in stdout
+    assert "cell2location-context" in stdout
+    assert "cell2location-troubleshooting" in stdout
+
+
+def test_install_skills_dry_run(tmp_path: Path) -> None:
+    fake_home = tmp_path / "home"
+    fake_home.mkdir()
+    with patch.dict(os.environ, {"HOME": str(fake_home)}):
+        code, stdout, stderr = _run("install-skills", "--dry-run")
+    if code != 0:
+        pytest.skip(f"No bundled skills available: {stderr}")
+    assert "cell2location-spatial-mapping" in stdout
+    # Nothing should actually be written in dry-run mode.
+    assert not (fake_home / ".claude" / "skills").exists() or not any((fake_home / ".claude" / "skills").iterdir())
+
+
+def test_install_skills_copy_and_uninstall(tmp_path: Path) -> None:
+    fake_home = tmp_path / "home"
+    fake_home.mkdir()
+    with patch.dict(os.environ, {"HOME": str(fake_home)}):
+        code, _, err = _run("install-skills")
+        if code != 0:
+            pytest.skip(f"No bundled skills available: {err}")
+        installed_root = fake_home / ".claude" / "skills"
+        assert installed_root.is_dir()
+        installed_dirs = sorted(p.name for p in installed_root.iterdir() if p.is_dir())
+        assert any(name.startswith("cell2location-") for name in installed_dirs)
+
+        # Second install without --force should be a no-op (every entry already there).
+        code2, stdout2, _ = _run("install-skills")
+        assert code2 == 0
+        assert "Skipping" in stdout2 or "Nothing installed" in stdout2
+
+        code3, _, _ = _run("uninstall-skills")
+        assert code3 == 0
+        remaining = [p for p in installed_root.iterdir() if p.name.startswith("cell2location-")]
+        assert remaining == []