It takes a long time to run on both Windows and the SCC, and in practice it only uses a single core. #46

@YifanWang-BU

Hi developers,

Thank you for developing this amazing method. It has been very helpful for our analysis.

I got a message saying that CHOIR can only use a single core on Windows. However, even when I submit the job to our SCC (a Linux-based HPC cluster), it still appears to use only about one core in practice.

In addition, Step 5 (building the clustering tree) has been running for over a week, which makes me wonder whether this runtime is expected for large datasets or whether the process is stuck.

Do you have any recommended ways to speed up the process? And is it possible to enable true multi-core usage on Windows, or when running CHOIR in an HPC environment?
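One thing worth ruling out on the cluster side is whether the job was actually granted more than one core. The following is a minimal sketch, not part of CHOIR, that reports what a process started inside the job can see. The function name `visible_cores` is hypothetical, and `NSLOTS` is assumed to be the slot-count variable set by the scheduler; other schedulers use different variable names.

```python
import os


def visible_cores(env=None):
    """Report the core counts a process started inside the job can see."""
    env = os.environ if env is None else env
    report = {"logical_cpus": os.cpu_count()}
    # sched_getaffinity exists only on Linux; it reflects any CPU pinning
    # applied to this process.
    if hasattr(os, "sched_getaffinity"):
        report["affinity_cpus"] = len(os.sched_getaffinity(0))
    # Assumption: the scheduler exports its slot count as NSLOTS.
    slots = env.get("NSLOTS")
    report["scheduler_slots"] = int(slots) if slots and slots.isdigit() else None
    return report


if __name__ == "__main__":
    print(visible_cores())
```

If the reported slot or affinity count is 1 while 4 cores were requested in R, the limit is likely coming from the job submission rather than from CHOIR itself.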

Here is the output, showing where it takes a long time:
=== Xenium: Read10X -> Seurat -> NormalizeData -> CHOIR -> UMAP/plot ===
Normalizing layer: counts
Performing log-normalization
0% 10 20 30 40 50 60 70 80 90 100%
[----|----|----|----|----|----|----|----|----|----|
**************************************************|

  • CHOIR - Part 1: Build clustering tree

2026-02-22 11:34:32 : (Step 1/5) Checking inputs and preparing object..

Input data:

  • Object type: Seurat (v5)
  • # of cells: 421363
  • # of batches: 1
  • # of modalities: 1
  • ATAC data: FALSE
  • Countsplitting: FALSE
  • Assay: RNA
  • Layer used to build tree: data
  • Layer used to prune tree: data

Proceeding with the following parameters:

  • Intermediate data stored under key: CHOIR
  • Alpha: 0.05
  • Multiple comparison adjustment: bonferroni
  • Features to train RF: var
  • # of excluded features: 0
  • # of permutations: 100
  • # of RF trees: 50
  • Use variance: TRUE
  • Minimum accuracy: 0.5
  • Minimum connections: 1
  • Maximum repeated errors: 20
  • Distance approximation: TRUE
  • Maximum cells sampled: Inf
  • Downsampling rate: 0.1316
  • Minimum reads: >0 reads
  • Maximum clusters: 120
  • Minimum cluster depth: 2000
  • Normalization method: none
  • Subtree dimensionality reductions: TRUE
  • Dimensionality reduction method: Default
  • Dimensionality reduction parameters provided: No
  • # of variable features: Default
  • Batch correction method: none
  • Batch correction parameters provided: No
  • Nearest neighbor parameters provided:
    • verbose: FALSE
  • Clustering parameters provided:
    • algorithm: 1
    • group.singletons: TRUE
    • verbose: FALSE
  • # of cores: 4
  • Random seed: 1

2026-02-22 11:34:32 : (Step 2/5) Running initial dimensionality reduction..
2026-02-22 11:34:32 : Preparing matrix using 'RNA' assay and 'data' slot..
2026-02-22 11:34:33 : Running PCA with 2000 variable features..
2026-02-22 11:39:38 : (Step 3/5) Generating initial nearest neighbors graph..
2026-02-22 11:42:44 : (Step 4/5) Identify starting clustering resolution..
Starting resolution: 1e-05
2026-02-22 12:01:03 : (Step 5/5) Building clustering tree..
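
For reference, the step timestamps in this log can be turned into per-step durations. The sketch below is a small standalone helper, not part of CHOIR; the name `step_durations` is hypothetical, and the log lines are copied from the output above. Steps 1 to 4 together account for under 27 minutes, so essentially all of the runtime is in Step 5, which has no closing timestamp yet.

```python
from datetime import datetime

# Step lines copied verbatim from the log above.
LOG = """\
2026-02-22 11:34:32 : (Step 1/5) Checking inputs and preparing object..
2026-02-22 11:34:32 : (Step 2/5) Running initial dimensionality reduction..
2026-02-22 11:39:38 : (Step 3/5) Generating initial nearest neighbors graph..
2026-02-22 11:42:44 : (Step 4/5) Identify starting clustering resolution..
2026-02-22 12:01:03 : (Step 5/5) Building clustering tree..
"""


def step_durations(log_text):
    """Return (step message, seconds until the next step started) pairs."""
    stamps = []
    for line in log_text.splitlines():
        if "(Step " not in line:
            continue
        stamp, _, message = line.partition(" : ")
        stamps.append((datetime.strptime(stamp, "%Y-%m-%d %H:%M:%S"), message))
    # The last step has no following timestamp, so it gets no duration.
    return [
        (msg, int((nxt - cur).total_seconds()))
        for (cur, msg), (nxt, _) in zip(stamps, stamps[1:])
    ]


for label, seconds in step_durations(LOG):
    print(f"{label:<60} {seconds:>5} s")
```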

Thanks
Yifan
