Human Unified Brain Spatial Transcriptomic Analysis: Human Fetal Brain Development Atlas Pipeline
A comprehensive spatial transcriptomics analysis pipeline for human fetal brain development, built on Stereo-seq (Spatial Enhanced Resolution Omics-sequencing) data. This repository covers the full workflow from cell segmentation to downstream biological interpretation: deep learning-based embedding, GPU-accelerated clustering, region-specific differentially expressed gene (DEG) analysis, ligand-receptor interaction inference, spatial gene regulatory network (GRN) inference, pseudotime analysis, and 3D visualization.
| Region / Module | Directory | Key Analyses |
|---|---|---|
| Hippocampus | 05.hip_analysis/ | DEG, Ligand-Receptor, Pseudotime, Tracks plots, Spatial gene plots |
| Thalamus | 06.thalamus_analysis/ | DEG, Ligand-Receptor, Spatial gene plots, Tangram mapping |
| Mid-Hindbrain (MHB) | 07.mhb_analysis/ | DEG, Ligand-Receptor, Tracks plots |
| Cerebellum | 08.cerebellum_analysis/ | DEG, Single-cell annotation, Spatial annotation, Pseudotime, Ligand-Receptor, Circos plots |
| Spatial GRN | 09.grn_analysis/ | GPU SpaGRN submodule, whole-brain and region-specific GRN notebooks, TF/regulon visualization |
| gsMap Analysis | 10.gsmap_analysis/ | Spatial genetic enrichment across bin100, cell-bin, single-cell, and pseudobulk representations |
| Cortex | 12.3D_ply_plot/ | 3D PLY mesh gene visualization |
┌─────────────────────────┐
│  01. Cell Segmentation  │ ← Stereo-seq raw data (.gem, .h5ad)
│    (Stereopy + ONNX)    │
└────────────┬────────────┘
             │
             ▼
┌─────────────────────────┐
│ 02. FuseMap Integration │ ← Spatial data integration & embedding
└────────────┬────────────┘
             │
             ▼
┌─────────────────────────┐
│  03. DMT-HI Latent      │ ← Deep Manifold Transformation
│      Representation     │   with Hyperbolic embedding
└────────────┬────────────┘
             │
             ▼
┌─────────────────────────┐
│  04. RAPIDS GPU         │ ← GPU-accelerated Leiden clustering
│      Clustering         │   on DMT embeddings
└────────────┬────────────┘
             │
             ▼
┌─────────────────────────┐
│  05-08. Region Analysis │ ← Per-region downstream analyses
│                         │   · DEG / Gene expression plots
│                         │   · Ligand-Receptor (CCI) inference
│                         │   · Pseudotime trajectory
│                         │   · Spatial tracks visualization
│                         │   · Cell type annotation
└────────────┬────────────┘
             │
             ▼
┌─────────────────────────┐
│  09. Spatial GRN        │ ← GPU SpaGRN inference and
│      Analysis           │   TF/regulon visualization
└────────────┬────────────┘
             │
             ▼
┌─────────────────────────┐
│  10. gsMap Analysis     │ ← Spatial GWAS enrichment on
│                         │   bin100/cell-bin/SC/pseudobulk
└────────────┬────────────┘
             │
             ▼
┌─────────────────────────┐
│  12. 3D PLY Plot        │ ← 3D mesh visualization of
│      3D Visualization   │   gene expression on brain models
└─────────────────────────┘
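The stage order in the diagram can be sketched as a small Python driver. The commands are the ones given in this README's usage steps for stages 01 to 04. The `STAGES` table, the `plan` and `run` helpers, and the dry-run default are illustrative assumptions, not files or functions shipped with the repository.

```python
import shlex
import subprocess
from pathlib import Path

# Stage directory -> commands, copied from the usage steps in this README.
# The table itself is a hypothetical convenience, not part of the repository.
STAGES = [
    ("01.cell_segmentation", ["python 01.cell_segmentation.py",
                              "python 02.cell_correct.py",
                              "python 03.make_cell_mask.py"]),
    ("02.Fusemap/FuseMap", ["python run.thalamus.py"]),
    ("03.DMT-HI", ["python main.py fit -c=conf_new/transf_cond_mnist.yaml"]),
    ("04.rapids_cluster", ["bash run.thalamus.sh"]),
]


def plan(repo_root="."):
    """Return (working_directory, argv) pairs in pipeline order."""
    root = Path(repo_root)
    return [(root / subdir, shlex.split(cmd))
            for subdir, cmds in STAGES
            for cmd in cmds]


def run(repo_root=".", dry_run=True):
    """Print each step; execute it only when dry_run is False."""
    for cwd, argv in plan(repo_root):
        print(f"[{cwd}] {' '.join(argv)}")
        if not dry_run:
            # check=True stops the pipeline at the first failing stage.
            subprocess.run(argv, cwd=cwd, check=True)


if __name__ == "__main__":
    run(dry_run=True)
```

Running each command with its own working directory avoids chaining `cd` calls, so every stage path stays relative to the repository root.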
HUBSTA_Analysis/
├── 01.cell_segmentation/            # Cell segmentation (Stereopy + ONNX model)
│   ├── 01.cell_segmentation.py
│   ├── 02.cell_correct.py
│   ├── 03.make_cell_mask.py
│   └── cell_segmetation_v3.0.onnx
│
├── 02.Fusemap/                      # FuseMap: spatial data integration framework
│   └── FuseMap/
│       ├── config.py                # Configuration
│       ├── dataset.py               # Data loading
│       ├── model.py                 # Core model architecture
│       ├── train.py                 # Training pipeline
│       ├── loss.py                  # Loss functions
│       ├── preprocess.py            # Preprocessing
│       ├── run.*.py                 # Region-specific run scripts
│       └── paper_code/              # Paper reproduction code
│           ├── benchmark/           # Benchmarks (cell integration, gene imputation)
│           ├── cell_cell_interation/
│           ├── reference_mapping/
│           ├── universal_cell_type/
│           ├── universal_gene_embedding/
│           └── universal_tissue_region/
│
├── 03.DMT-HI/                       # Deep Manifold Transformation (Hyperbolic)
│   ├── main.py                      # Main entry point
│   ├── model/                       # Model definitions
│   ├── conf_new/                    # YAML config files
│   ├── dataloader/                  # Data loaders
│   ├── manifolds/                   # Hyperbolic manifold utilities
│   ├── sweep/                       # Hyperparameter sweep configs
│   └── run*.sh                      # Run scripts
│
├── 04.rapids_cluster/               # GPU-accelerated clustering (RAPIDS)
│   ├── *_single_1.py                # Single-cell embedding clustering
│   ├── *_spatial_1.py               # Spatial embedding clustering
│   └── run.*.sh                     # Region-specific run scripts
│
├── 05.hip_analysis/                 # Hippocampus analysis
│   ├── for_deg_plot.py/R/sh         # DEG analysis
│   ├── Ligand-receptor_interaction_inference.py
│   ├── tracks_single.py / tracks_plot.py
│   └── *.ipynb                      # Analysis notebooks
│
├── 06.thalamus_analysis/            # Thalamus analysis
│   ├── for_deg_plot.py/R/sh         # DEG analysis
│   ├── Ligand-receptor_interaction_inference.py
│   ├── tangram_yes.py               # Tangram mapping
│   └── *.ipynb                      # Analysis notebooks
│
├── 07.mhb_analysis/                 # Mid-Hindbrain analysis
│   ├── for_deg_plot.py/R/sh         # DEG analysis
│   ├── Ligand-receptor_interaction_inference.py
│   └── *.ipynb                      # Analysis notebooks
│
├── 08.cerebellum_analysis/          # Cerebellum analysis
│   ├── *read_process_deg.ipynb
│   ├── *single_anno.py              # Single-cell annotation
│   ├── *spatial_anno.py             # Spatial annotation
│   ├── *tracks*.py                  # Tracks visualization
│   ├── *Ligand-receptor*.py         # CCI inference
│   ├── *pseudotime_plot.ipynb       # Pseudotime trajectory
│   ├── *circosplot*.ipynb           # Circos plots
│   └── *.ipynb
│
├── 09.grn_analysis/                 # Spatial GRN analysis and GPU SpaGRN submodule
│   ├── SpaGRN/                      # Git submodule: https://github.com/DBinary/SpaGRN
│   ├── notebooks/                   # Fetal brain GRN analysis notebooks
│   ├── docs/                        # Notebook and output inventories
│   ├── _3D_plot.py                  # K3D 3D expression helper
│   └── requirements.txt
│
├── 10.gsmap_analysis/               # gsMap spatial genetic enrichment analysis
│   ├── bin100_3d/                   # Whole-brain bin100 3D gsMap run
│   ├── cell_bin/                    # Cell-bin gsMap runs and Cauchy plots
│   ├── single_cell/                 # Single-cell gsMap run
│   ├── pseudobulk/                  # Pseudobulk construction and gsMap run
│   ├── summary/                     # Final comparison figures
│   └── shared/                      # Shared paths and annotation helpers
│
├── 12.3D_ply_plot/                  # 3D PLY mesh visualization
│   ├── 2_mhb_gene_plot_*.py         # MHB 3D gene plot
│   ├── 3_cortex_gene_plot_*.py      # Cortex 3D gene plot
│   ├── 04_brain_gene_plot_*.py      # Whole brain 3D gene plot
│   ├── 4_makemesh.ipynb             # Mesh generation
│   └── 5_thalamus_gene_plot_*.py    # Thalamus 3D gene plot
│
└── README.md
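The cell_bin/ entry in the tree mentions Cauchy plots. Assuming these refer to the Cauchy combination test, which merges many per-spot p-values into one p-value per region or annotation, a minimal sketch of the equal-weight form looks like the following. The function name and the clamping constant are hypothetical; the scripts in 10.gsmap_analysis/ may aggregate differently.

```python
import math


def cauchy_combine(p_values):
    """Combine p-values with the equal-weight Cauchy combination test."""
    if not p_values:
        raise ValueError("need at least one p-value")
    # Clamp away from 0 and 1 so tan() stays finite (assumed tolerance).
    eps = 1e-15
    clipped = [min(max(p, eps), 1 - eps) for p in p_values]
    # Each p-value maps to a standard Cauchy variate; average them.
    t = sum(math.tan((0.5 - p) * math.pi) for p in clipped) / len(clipped)
    # Convert the statistic back with the Cauchy survival function.
    return 0.5 - math.atan(t) / math.pi


if __name__ == "__main__":
    # One strong signal among weak ones still yields a small combined p-value.
    print(cauchy_combine([0.001, 0.8]))
```

A single input is returned essentially unchanged, which is a quick sanity check on the transform and its inverse.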
- Stereo-seq cell segmentation using the Stereopy framework
- ONNX-accelerated deep learning model (v3.0) for cell boundary detection
- Cell mask generation and correction
- Universal gene embedding for spatial transcriptomics
- Tissue region identification and harmonization
- Reference mapping (MERFISH, STARmap PLUS, Slide-seq V2, Stereo-seq)
- Cell-cell interaction analysis
- Gene imputation and targeted gene panel selection
- Benchmark suite for cell integration and gene imputation
- Hyperbolic embedding for spatial transcriptomics data
- Configurable transformer and MLP backbones
- YAML-based experiment configuration
- Weights & Biases (wandb) integration for experiment tracking
- Hyperparameter sweep support
- RAPIDS cuML accelerated Leiden clustering
- Processes both single-cell and spatial embeddings
- Covers: Cerebellum, Cortex (P10), Hippocampus, Mid-Hindbrain, Thalamus, Whole Brain
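As a rough illustration of the clustering step, the sketch below builds a neighbor graph on the DMT embedding and runs Leiden on it. It assumes the embedding is stored in obsm['X_dmt'] and that rapids-singlecell mirrors the scanpy call pattern for `pp.neighbors` and `tl.leiden`. The function names, the default parameters, and the `leiden_r<resolution>` column naming are hypothetical and not taken from the scripts in 04.rapids_cluster/.

```python
def leiden_key(resolution):
    """Column name for one clustering run (hypothetical naming scheme)."""
    return f"leiden_r{resolution}"


def cluster_on_dmt(adata, resolution=1.0, n_neighbors=15, use_gpu=True):
    """Build a kNN graph on obsm['X_dmt'] and run Leiden clustering in place."""
    if use_gpu:
        import rapids_singlecell as backend  # needs an NVIDIA GPU + RAPIDS
    else:
        import scanpy as backend  # CPU fallback with the same call pattern
    key = leiden_key(resolution)
    # use_rep points the neighbor search at the DMT embedding instead of PCA.
    backend.pp.neighbors(adata, n_neighbors=n_neighbors, use_rep="X_dmt")
    backend.tl.leiden(adata, resolution=resolution, key_added=key)
    return key
```

The backend is imported inside the function so the module can still be loaded on machines without a GPU stack; `use_gpu=False` is meant only for small test runs.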
| Analysis | Description |
|---|---|
| DEG Analysis | Differential expression with Python + R (for_deg_plot.py/R/sh) |
| Ligand-Receptor (CCI) | Cell-cell interaction inference between spatial regions |
| Pseudotime | Trajectory analysis of cell lineages |
| Tracks Plot | Spatial gene expression tracks across tissue slices |
| Cell Annotation | Single-cell and spatial annotation of cell types |
| Tangram Mapping | Spatial mapping of cell types |
| Circos Plots | Circos-style visualization of cell-cell interactions |
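To make the ligand-receptor row concrete, here is a toy scoring sketch. It uses one common and simple choice, the product of mean ligand expression in a sender group and mean receptor expression in a receiver group. This is an assumption for illustration only: the method actually used in Ligand-receptor_interaction_inference.py is not described in this README, and the gene names and data below are made up.

```python
from statistics import mean


def lr_score(expr, labels, ligand, receptor, sender, receiver):
    """Mean ligand expression in `sender` times mean receptor expression in `receiver`.

    expr:   dict mapping gene name -> list of per-cell expression values
    labels: list of per-cell group labels, in the same cell order
    """
    lig = [v for v, g in zip(expr[ligand], labels) if g == sender]
    rec = [v for v, g in zip(expr[receptor], labels) if g == receiver]
    if not lig or not rec:
        return 0.0  # a group with no cells contributes no interaction
    return mean(lig) * mean(rec)


# Toy data: four cells in two clusters; "LIG" and "REC" are placeholder genes.
expr = {"LIG": [2.0, 4.0, 0.0, 0.0], "REC": [0.0, 0.0, 1.0, 3.0]}
labels = ["A", "A", "B", "B"]
score_ab = lr_score(expr, labels, "LIG", "REC", sender="A", receiver="B")
score_ba = lr_score(expr, labels, "LIG", "REC", sender="B", receiver="A")
```

The score is directional: swapping sender and receiver generally changes the result, which is why interaction plots distinguish source from target.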
- Fetal brain spatial gene regulatory network analysis notebooks
- GPU-rewritten SpaGRN inference code included as a git submodule
- Whole-brain and region-specific TF/regulon summaries
- GRN pathway, heatmap, and 3D visualization helpers
- 3D mesh (PLY format) rendering of brain regions
- Gene expression mapped onto 3D brain surfaces
- Supports Cortex, Thalamus, Mid-Hindbrain, and whole brain
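The idea of mapping expression onto a mesh can be sketched without any 3D library by writing an ASCII PLY file with per-vertex colors. The helper names and the linear blue-to-red color ramp below are illustrative assumptions; the scripts in 12.3D_ply_plot/ may use different colormaps and tooling.

```python
def expression_to_rgb(value, vmin, vmax):
    """Map an expression value to a blue-to-red colour (linear, clipped)."""
    if vmax <= vmin:
        return (0, 0, 255)
    t = min(max((value - vmin) / (vmax - vmin), 0.0), 1.0)
    return (round(255 * t), 0, round(255 * (1 - t)))


def to_ascii_ply(vertices, faces, expression):
    """Serialize a mesh with per-vertex colours as an ASCII PLY string."""
    vmin, vmax = min(expression), max(expression)
    lines = [
        "ply",
        "format ascii 1.0",
        f"element vertex {len(vertices)}",
        "property float x",
        "property float y",
        "property float z",
        "property uchar red",
        "property uchar green",
        "property uchar blue",
        f"element face {len(faces)}",
        "property list uchar int vertex_indices",
        "end_header",
    ]
    for (x, y, z), value in zip(vertices, expression):
        r, g, b = expression_to_rgb(value, vmin, vmax)
        lines.append(f"{x} {y} {z} {r} {g} {b}")
    for face in faces:
        # Each face line starts with its vertex count.
        lines.append(f"{len(face)} " + " ".join(str(i) for i in face))
    return "\n".join(lines) + "\n"


# One triangle whose vertices carry low, medium, and high expression.
ply = to_ascii_ply(
    vertices=[(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)],
    faces=[(0, 1, 2)],
    expression=[0.0, 0.5, 1.0],
)
```

Writing `ply` to a `.ply` file should give something that general mesh viewers can open, with the color gradient interpolated across the triangle.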
The pipeline relies on several key Python packages:
- Spatial Transcriptomics: stereopy, spateo
- Single-cell / Spatial: scanpy, anndata, squidpy
- GPU Clustering: rapids-singlecell (requires NVIDIA GPU + RAPIDS)
- Deep Learning: pytorch, pytorch-lightning
- FuseMap: See 02.Fusemap/FuseMap/fusemap_environment.yaml
- DMT-HI: See 03.DMT-HI/requirements.txt and install_env.sh
- Visualization: matplotlib, seaborn, plotly
- GRN inference: spagrn, pyscenic, arboreto, omicverse, gseapy, k3d
- 3D: open3d, trimesh
For detailed environment setup, refer to:
- 01.cell_segmentation/readme.md: Stereopy installation
- 02.Fusemap/FuseMap/fusemap_environment.yaml: FuseMap conda environment
- 03.DMT-HI/readme.md: DMT-HI environment
- 09.grn_analysis/requirements.txt: spatial GRN analysis dependencies
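Before running a stage, it can help to check which of these packages are importable in the active environment. The sketch below uses only the standard library. The mapping from package name to import name is an assumption based on the usual import names (for example, stereopy is typically imported as `stereo` and pytorch as `torch`), and it covers only a subset of the packages listed.

```python
from importlib.util import find_spec

# Package name -> import name. This mapping is an assumption based on the
# usual import names; adjust it to match the installed versions.
IMPORT_NAMES = {
    "stereopy": "stereo",
    "scanpy": "scanpy",
    "anndata": "anndata",
    "squidpy": "squidpy",
    "rapids-singlecell": "rapids_singlecell",
    "pytorch": "torch",
    "open3d": "open3d",
    "trimesh": "trimesh",
}


def missing_packages(import_names=IMPORT_NAMES):
    """Return the package names whose import module cannot be found."""
    return sorted(name for name, module in import_names.items()
                  if find_spec(module) is None)


if __name__ == "__main__":
    missing = missing_packages()
    print("missing:", ", ".join(missing) if missing else "none")
```

`find_spec` only locates a module without importing it, so the check stays fast and does not initialize any GPU libraries.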
cd 01.cell_segmentation
python 01.cell_segmentation.py
python 02.cell_correct.py
python 03.make_cell_mask.py

cd 02.Fusemap/FuseMap
python run.thalamus.py  # or run.cerebellum.py / run.hip.py / run.mhb.py / run.ctx.py

cd 03.DMT-HI
python main.py fit -c=conf_new/transf_cond_mnist.yaml

cd 04.rapids_cluster
bash run.thalamus.sh  # or run.cerebellum.sh / run.Hippocampus.sh etc.

# Navigate to the desired brain region directory (e.g., Hippocampus)
cd 05.hip_analysis
# Run DEG analysis
bash for_deg_plot.sh
# Run ligand-receptor inference
python Ligand-receptor_interaction_inference.py
# Open notebooks for downstream visualization
jupyter lab 1.ipynb

git submodule update --init --recursive
cd 09.grn_analysis
jupyter lab notebooks/01_bin100_preprocessing_grn/01_03_Bin100_GRN_Inference.ipynb

cd 12.3D_ply_plot
python 04_brain_gene_plot_20260306.py  # Whole brain 3D gene expression

The pipeline expects Stereo-seq data in .h5ad (AnnData) format. Each brain region has multiple tissue slices (identified by slice_code), each containing:
- Spatial coordinates (obsm['align_spatial_2d'])
- Gene expression matrix
- DMT latent embeddings (obsm['X_dmt'], obsm['X_dmt_highdim'])
- Leiden cluster assignments
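A quick pre-flight check for these fields might look like the sketch below. It relies only on `in` checks against `.obsm` and `.obs`, so it should work on an AnnData object as well as on the stand-in used here. The obsm keys and `slice_code` come from this README; the `leiden` column name and the `missing_fields` helper are assumptions for illustration.

```python
from types import SimpleNamespace

REQUIRED_OBSM = ("align_spatial_2d", "X_dmt", "X_dmt_highdim")
# 'slice_code' comes from this README; 'leiden' is an assumed column name.
REQUIRED_OBS = ("slice_code", "leiden")


def missing_fields(adata, obsm_keys=REQUIRED_OBSM, obs_columns=REQUIRED_OBS):
    """List expected entries that are absent from adata.obsm and adata.obs."""
    problems = [f"obsm['{k}']" for k in obsm_keys if k not in adata.obsm]
    problems += [f"obs['{c}']" for c in obs_columns if c not in adata.obs]
    return problems


# Stand-in object so the sketch runs without anndata installed; a real call
# would pass the result of anndata.read_h5ad(path) instead.
toy = SimpleNamespace(
    obsm={"align_spatial_2d": [[0, 0]], "X_dmt": [[0.1, 0.2]]},
    obs={"slice_code": ["S1"], "leiden": ["0"]},
)
print(missing_fields(toy))
```

An empty list means the object carries every field the downstream steps are described as needing; anything else names the missing entry directly.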
The spatial GRN notebooks in 09.grn_analysis/ also expect GRN resources and intermediate AnnData outputs such as GRN_resource/, Process_Data/, Output/, and Figure/. Large data and generated outputs are intentionally excluded from the repository.
If you use this pipeline or any of its components in your research, please cite the relevant tools:
- Stereopy: https://stereopy.readthedocs.io/
- FuseMap: See 02.Fusemap/FuseMap/README.md for citation
- RAPIDS: https://rapids.ai/
This project is for research purposes. See individual component directories for specific licenses (e.g., 02.Fusemap/FuseMap/LICENSE).
For questions or collaboration inquiries, please open an issue or contact the repository maintainer.