Add agent-discoverable metadata and spatial-mapping skill for coding agents #442
Open
vitkl wants to merge 7 commits into
Make cell2location easier to find and use for coding agents (Claude Code, Cursor, Aider, Copilot, Codex). Adds: AGENTS.md (single agent landing page with trigger phrases, related-tools routing to sibling packages, dual-skill pointer, NO-CODE refusal block); README tagline + "For coding agents" pointer subsection; PyPI keywords/classifiers for cross-repo discovery; CITATION.cff for the 2022 Nature Biotechnology paper.
Main operating skill at .claude/skills/spatial-mapping/SKILL.md walks users through ten phases (mode + data, reference signatures, spatial QC, N_cells_per_location Fig S27 decision tree, detection_alpha, chunking, branch selection master vs hires_sliding_window, model hyperparameters, training + posterior export, launch, aggregation). Supports both interactive AskUserQuestion mode and autonomous data-driven mode. Forces explicit decisions on the four hyperparameters the maintainer routinely answers on the issue tracker. Reference materials bundled: Fig S1 + Fig S27 PNG extracts, paraphrased supplement §1.2-§1.4 + §2, paraphrased issue corpus from ~25 recurring vitkl answers, full supplement PDF.
…brain data helper
Three papermill-parametrised templates based on the cell2state_embryo
workflow (the only published cell2location workflow with correct stratified
per-sample chunking and nuclei-occupancy model wiring), simplified for
general use and stripped of embryo specifics:
- step1_reference_signatures.ipynb (RegressionModel)
- step2_spatial_mapping.ipynb (per-chunk Cell2location; supports both master
and hires_sliding_window via runtime-conditional kwargs)
- step2_aggregate_chunks.ipynb (combine per-chunk outputs)
Three launchers (LSF bsub, Slurm sbatch, local single-GPU) with the same
parameter contract — bsub.sh is derived from the embryo bsub.sh.
download_mouse_brain.py fetches the published Kleshchevnikov 2022 mouse-brain
dataset (5 Visium + paired snRNA reference) from the public Sanger object
store, letting users validate against published results before applying to
their own data.
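Papermill injects parameters after the cell tagged `parameters`, so each template needs exactly one such cell. A minimal sketch of that check, assuming standard notebook JSON; the helper name `count_parameters_cells` is hypothetical and not part of this PR:

```python
import json
from pathlib import Path


def count_parameters_cells(notebook_path):
    """Count the cells in a notebook that carry the papermill `parameters` tag."""
    nb = json.loads(Path(notebook_path).read_text(encoding="utf-8"))
    return sum(
        1
        for cell in nb.get("cells", [])
        # Tags live under cell metadata; cells without tags are skipped.
        if "parameters" in cell.get("metadata", {}).get("tags", [])
    )
```

A template is well-formed for papermill when this returns 1; a result of 0 or 2+ suggests the tag is missing or duplicated.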
Companion to the main spatial-mapping skill. Three-phase workflow: (1) match user's symptom against the harvested vitkl-guidance corpus shared with the main skill; (2) gh search GitHub issues for matches the corpus snapshot may have missed; (3) draft (not submit) a clean `gh issue create` body with the diagnostic checklist vitkl normally asks for — environment, data shape, hyperparameters used, ELBO trajectory, error trace. Routes biology-interpretation questions to discourse.scverse.org instead of the issue tracker. Refuses to auto-submit; refuses to include raw user data in drafted bodies.
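The draft-only behaviour described above can be sketched as follows. This is an illustration under assumptions, not the skill's actual logic: the section titles mirror the checklist in the commit message, while `draft_issue_body`, `draft_command`, and the `OWNER/REPO` placeholder are hypothetical. The command is returned as text and never executed.

```python
import shlex

# Checklist sections taken from the commit message above.
CHECKLIST = (
    "Environment",
    "Data shape",
    "Hyperparameters used",
    "ELBO trajectory",
    "Error trace",
)


def draft_issue_body(answers):
    """Build a markdown issue body; missing sections are flagged, not invented."""
    lines = []
    for section in CHECKLIST:
        lines.append(f"### {section}")
        lines.append(answers.get(section, "_not provided_"))
        lines.append("")
    return "\n".join(lines)


def draft_command(title, body_path, repo):
    """Return a `gh issue create` command line as a string. Nothing is run."""
    return shlex.join(
        ["gh", "issue", "create", "--repo", repo, "--title", title, "--body-file", body_path]
    )
```

For example, `draft_command("NaN loss during training", "issue.md", "OWNER/REPO")` yields a command the user can review and run themselves.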
…private GBMspace link
Three fixes raised by the initial PR's CI run + post-merge text review:
1. download_mouse_brain.py: remove unused `hashlib`/`os` imports (F401) and the trailing `f""` without placeholder (F541); flake8 now clean.
2. setup.cfg: pin `lightning != 2.6.2, != 2.6.3` (and mirror on `pytorch-lightning`) per the supply-chain attack disclosed 2026-04-30 (CVE-2026-44484 / GHSA-w37p-236h-pfx3). The compromised versions have been yanked from PyPI but the explicit exclusion protects users with stale mirrors or cached wheels. scvi-tools pulls lightning transitively; the cell2location pin is defensive.
3. AGENTS.md: drop the `BayraktarLab/GBMspace` link from the related-tools routing. The repo is private, so an external user clicking the link from public AGENTS.md would 404. BaSISS stays as the cancer-clone routing target.
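The `lightning` exclusion in the `setup.cfg` fix only takes effect at install time, so an existing environment may still hold an excluded version. A minimal sketch of a local check, assuming exact string comparison is good enough (it does not apply full PEP 440 normalisation); both function names are hypothetical:

```python
from importlib import metadata

# Versions named in the setup.cfg exclusion described above.
EXCLUDED = {"2.6.2", "2.6.3"}


def is_excluded(version):
    """True if the version string matches one of the excluded releases exactly."""
    return version.strip() in EXCLUDED


def installed_lightning_is_excluded():
    """True/False for an installed `lightning`, or None if it is not installed."""
    try:
        return is_excluded(metadata.version("lightning"))
    except metadata.PackageNotFoundError:
        return None
```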
Inserts a scientific-scope interview as the very first step of the spatial-mapping workflow and a technical-completeness sweep just before launch. Persists answers in SPATIAL_MAPPING_CONTEXT.md so future runs (and the troubleshooting skill) inherit the user's goal, reference, target populations, and failure criteria.
- New skill .claude/skills/cell2location-context/ owns the persistent file (auto-discovery across cwd / .claude/ / ~/.claude/plans/; first-creation asks where to save). Two modes: --science (7-group rubric) and --technical (Phase 1-8 slot sweep + scope-vs-decision cross-check).
- spatial-mapping/SKILL.md: new Phase 0a invokes --science before any technical decision; new Phase 8.5 invokes --technical before Phase 9 launch and can block launch on hard cross-check failures (e.g. detection_alpha=200 vs failure criterion about 10x within-sample variation).
- cell2location-troubleshooting/SKILL.md: new Phase -1 reads BOTH the scope (especially declared failure criteria) AND the technical-decisions block before classifying the symptom; Phase 3 pre-fills the gh-issue diagnostic template from the context file instead of re-asking the user.
- Skip path: users can opt out (recommended copy explains why answering helps) or import a prior handoff document as free-form scope.
- Autonomous mode: skips the interview and emits a notebook markdown cell documenting the missing scope.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
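The auto-discovery of SPATIAL_MAPPING_CONTEXT.md across cwd, .claude/ and ~/.claude/plans/ could look roughly like this. The search order, the reading of `.claude/` as relative to cwd, and the name `find_context_file` are assumptions for illustration; the skill itself is prose instructions, not this code.

```python
from pathlib import Path

CONTEXT_NAME = "SPATIAL_MAPPING_CONTEXT.md"


def find_context_file(cwd, home):
    """Return the first existing context file, or None if none is found.

    A None result is the point where the skill would ask the user
    where to create the file.
    """
    candidates = [
        Path(cwd) / CONTEXT_NAME,                         # project root
        Path(cwd) / ".claude" / CONTEXT_NAME,             # project-local agent dir (assumed)
        Path(home) / ".claude" / "plans" / CONTEXT_NAME,  # user-level plans dir
    ]
    for candidate in candidates:
        if candidate.is_file():
            return candidate
    return None
```

Passing `home` explicitly rather than calling `Path.home()` keeps the lookup easy to test against a temporary directory.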
Pip-installed cell2location previously left the bundled Claude / Cursor / Aider
skills inaccessible -- agents only auto-discover skills in cwd or in
~/.claude/skills/. This change makes /spatial-mapping, /cell2location-context,
and /cell2location-troubleshooting slash commands available across all projects
after a one-time:
cell2location install-skills # copy
cell2location install-skills --symlink # or symlink so pip -U flows through
How it works:
- setup.py mirrors .claude/skills/ -> cell2location/_bundled_skills/ at build
time so the wheel/sdist always ships the skills.
- cell2location/_cli.py resolves the bundled dir first, falls back to the
source .claude/skills/ tree for editable installs.
- New console_scripts entry point cell2location -> cell2location._cli:main
exposes list-skills / install-skills (--symlink --force --dry-run) /
uninstall-skills.
- Installed entries are namespaced cell2location-<skill> in ~/.claude/skills/
to avoid collisions and make provenance obvious.
- MANIFEST.in + setup.cfg [options.package_data] put _bundled_skills/ in the
wheel; .gitignore keeps the build-time mirror out of the repo.
- AGENTS.md and README.md document the install flow.
- tests/test_cli.py covers list / dry-run install / install+reinstall+uninstall
against a temp $HOME (loads _cli.py via importlib so the test runs even when
the surrounding cell2location import is broken in the local env).
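The resolve-then-install flow in the list above can be sketched as below. This is not the code in `cell2location/_cli.py`: the function names and signatures are hypothetical, and the choice to leave names that already start with `cell2location-` unprefixed is an assumption.

```python
import shutil
from pathlib import Path

PREFIX = "cell2location-"


def resolve_skills_dir(bundled, source):
    """Prefer the bundled copy shipped in the wheel; fall back to the source tree."""
    bundled, source = Path(bundled), Path(source)
    if bundled.is_dir():
        return bundled
    if source.is_dir():
        return source
    raise FileNotFoundError("no skills directory found")


def install_skills(skills_dir, target_root, symlink=False, force=False, dry_run=False):
    """Copy or symlink each skill into target_root under a namespaced name.

    Returns the destination paths (planned paths when dry_run is True).
    """
    target_root = Path(target_root)
    destinations = []
    for skill in sorted(p for p in Path(skills_dir).iterdir() if p.is_dir()):
        # Assumption: avoid a double prefix for skills already namespaced.
        name = skill.name if skill.name.startswith(PREFIX) else PREFIX + skill.name
        dest = target_root / name
        destinations.append(dest)
        if dry_run:
            continue  # report only, touch nothing
        if dest.is_symlink() or dest.exists():
            if not force:
                raise FileExistsError(f"{dest} exists; pass force=True to replace it")
            if dest.is_symlink() or dest.is_file():
                dest.unlink()
            else:
                shutil.rmtree(dest)
        target_root.mkdir(parents=True, exist_ok=True)
        if symlink:
            dest.symlink_to(skill.resolve(), target_is_directory=True)
        else:
            shutil.copytree(skill, dest)
    return destinations
```

Symlinking keeps the installed skills pointing at the package directory, which is why the commit message notes that `pip -U` flows through in that mode.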
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
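Loading `_cli.py` through importlib, as the test description above mentions, bypasses the package's `__init__.py`. A minimal sketch of that technique using the standard library; the helper name is hypothetical, and it only works when the target file has no relative imports that need the parent package:

```python
import importlib.util
from pathlib import Path


def load_module_from_path(name, path):
    """Import one .py file as a standalone module, without importing its parent package."""
    spec = importlib.util.spec_from_file_location(name, Path(path))
    if spec is None or spec.loader is None:
        raise ImportError(f"cannot load {path}")
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)  # runs the file's top-level code
    return module
```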
Summary
Make cell2location easier for coding agents (Claude Code, Cursor, Aider, Copilot, Codex) to find and use correctly. Adds four layers of agent-friendliness, all in `.claude/skills/` and repo-root metadata files. No Python source under `cell2location/` or `tests/` is touched. Zero CI risk.
Four layers
- `setup.cfg` PyPI keywords (16) and classifiers (6); `CITATION.cff` for the 2022 Nature Biotechnology paper.
- `.claude/skills/spatial-mapping/`: main operating manual. Single skill, format-plan-style (instructions + `<reference>` tag), dual-mode (interactive `AskUserQuestion` / autonomous data-driven). Walks the user through 10 phases (mode + data, reference signatures, spatial QC, `N_cells_per_location` Fig S27 decision tree, `detection_alpha`, chunking, branch selection master vs `hires_sliding_window`, model hyperparameters, training + posterior export, launch, aggregation). Bundled reference materials: Fig S1 + Fig S27 PNG extracts, paraphrased supplement §1.2-§1.4 + §2, paraphrased issue corpus from ~25 recurring maintainer answers, full supplement PDF for deeper questions.
- `.claude/skills/cell2location-troubleshooting/`: companion skill. Matches user symptoms against the harvested issue corpus, `gh search` fallback for newer issues, drafts (does NOT submit) `gh issue create` bodies with the diagnostic metadata vitkl normally asks for. Routes biology-interpretation questions to discourse.scverse.org.
Bundled templates
`.claude/skills/spatial-mapping/templates/` contains three papermill-parametrised notebooks based on the cell2state_embryo workflow (the only published workflow with correct stratified per-sample chunking and nuclei-occupancy model wiring), simplified for general use and stripped of embryo specifics:
- `step1_reference_signatures.ipynb`: `RegressionModel` for batch-corrected signatures.
- `step2_spatial_mapping.ipynb`: per-chunk `Cell2location`; supports both `master` and `hires_sliding_window` via runtime-conditional kwargs.
- `step2_aggregate_chunks.ipynb`: combine per-chunk outputs.
Plus three launchers (LSF `bsub.sh`, Slurm `sbatch.sh`, local `run_local.sh`) with the same parameter contract, and `data/download_mouse_brain.py` that fetches the published Kleshchevnikov 2022 mouse-brain dataset (5 Visium + paired snRNA reference) from the public Sanger object store.
Commit structure
- `AGENTS.md`, `README.md` tagline, `setup.cfg` keywords/classifiers, `CITATION.cff`.
- `SKILL.md` + skill `README.md` + Fig S1+S27 PNGs + paraphrase markdowns + supplement PDF.
- `SKILL.md` + `README.md`.
Test plan
- No files under `cell2location/` or `tests/` modified (CI flake8/black/isort/pytest pass trivially).
- Notebooks: nbformat 4.5 JSON, exactly one `parameters`-tagged cell each.
- Launchers: `bash -n` clean.
- `download_mouse_brain.py`: clean Python compile.
- `CITATION.cff`: parses, contains 2022 Nature Biotechnology DOI.
- `setup.cfg`: parses, 16 keywords + 6 classifiers added; existing fields untouched.
- `grep -rE "cell_type_lvl7|FFPE_Cytassist|/nfs/team283|/nemo/lab|/lustre|sectionsRef|suspensionRef|FraqLim|CS17" .claude/skills/` returns no matches.
- `python templates/data/download_mouse_brain.py && papermill templates/step1_reference_signatures.ipynb out.ipynb -p ref_h5ad_path .../sc.h5ad -p max_epochs 100`: slow (~20 min on CPU), documented here rather than added to CI to keep CI fast.
- `packages/cell2location/meta.yaml` description with the skill mention and add tags (`spatial-mapping`, `spatial-deconvolution`, `visium`, `visium-hd`, `agent-friendly`).
What this does NOT change
- `cell2location/`.
- `tests/`.
🤖 Generated with Claude Code