It takes a long time to run on both Windows and the SCC, and in practice it only uses a single core. #46

Hi developers, 

Thank you for developing this amazing method. It has been very helpful for our analysis. 

I got a message saying that CHOIR can only use a single core on Windows. However, even when I submit the job to our SCC (a Linux-based HPC cluster), it still appears to use only about one core in practice, although the log below reports `# of cores: 4`.
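
One thing I can still rule out on the cluster is whether the job was actually granted more than one core. Below is a minimal diagnostic sketch, assuming Python 3 is available on the compute node and is run inside the same job script. The helper name `core_report` is my own, and the list of scheduler variables is an assumption, since which one is set (if any) depends on the scheduler. It does not inspect R or CHOIR itself.

```python
import os

# Environment variables that some common schedulers use to report granted cores
# (SGE, Slurm, PBS/Torque). Assumption: which one applies depends on the cluster.
SCHEDULER_VARS = ("NSLOTS", "SLURM_CPUS_PER_TASK", "PBS_NP")


def core_report(env=None):
    """Summarize how many cores this process can see and use."""
    env = os.environ if env is None else env
    report = {"logical_cpus": os.cpu_count()}
    # os.sched_getaffinity exists only on some platforms (e.g. Linux).
    if hasattr(os, "sched_getaffinity"):
        report["usable_cpus"] = len(os.sched_getaffinity(0))
    else:
        report["usable_cpus"] = None
    # Keep only the scheduler variables that are actually set.
    report["scheduler"] = {k: env[k] for k in SCHEDULER_VARS if k in env}
    return report


if __name__ == "__main__":
    for key, value in core_report().items():
        print(f"{key}: {value}")
```

If `usable_cpus` comes back as 1 while I requested 4, the limit would be in the job submission rather than in CHOIR.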

In addition, Step 5 (building the clustering tree) has been running for over a week, which makes me wonder whether this runtime is expected for a dataset of this size (421,363 cells) or whether the process is stuck.

Do you have any recommended ways to speed up the process? Also, is it possible to enable true multi-core usage on Windows, or when running CHOIR in an HPC environment?


Here is the output up to the step that takes a long time:
=== Xenium: Read10X -> Seurat -> NormalizeData -> CHOIR -> UMAP/plot ===
Normalizing layer: counts
Performing log-normalization
0%   10   20   30   40   50   60   70   80   90   100%
[----|----|----|----|----|----|----|----|----|----|
**************************************************|
----------------------------------------
- CHOIR - Part 1: Build clustering tree
----------------------------------------
2026-02-22 11:34:32 : (Step 1/5) Checking inputs and preparing object..

Input data:
 - Object type: Seurat (v5)
 - # of cells: 421363
 - # of batches: 1
 - # of modalities: 1
 - ATAC data: FALSE
 - Countsplitting: FALSE
 - Assay: RNA
 - Layer used to build tree: data
 - Layer used to prune tree: data

Proceeding with the following parameters:
 - Intermediate data stored under key: CHOIR
 - Alpha: 0.05
 - Multiple comparison adjustment: bonferroni
 - Features to train RF: var
 - # of excluded features: 0
 - # of permutations: 100
 - # of RF trees: 50
 - Use variance: TRUE
 - Minimum accuracy: 0.5
 - Minimum connections: 1
 - Maximum repeated errors: 20
 - Distance approximation: TRUE
 - Maximum cells sampled: Inf
 - Downsampling rate: 0.1316
 - Minimum reads: >0 reads
 - Maximum clusters: 120
 - Minimum cluster depth: 2000
 - Normalization method: none
 - Subtree dimensionality reductions: TRUE
 - Dimensionality reduction method: Default
 - Dimensionality reduction parameters provided: No
 - # of variable features: Default
 - Batch correction method: none
 - Batch correction parameters provided: No
 - Nearest neighbor parameters provided: 
     - verbose: FALSE
 - Clustering parameters provided: 
     - algorithm: 1
     - group.singletons: TRUE
     - verbose: FALSE
 - # of cores: 4
 - Random seed: 1

2026-02-22 11:34:32 : (Step 2/5) Running initial dimensionality reduction..
2026-02-22 11:34:32 : Preparing matrix using 'RNA' assay and 'data' slot..
2026-02-22 11:34:33 : Running PCA with 2000 variable features..
2026-02-22 11:39:38 : (Step 3/5) Generating initial nearest neighbors graph..
2026-02-22 11:42:44 : (Step 4/5) Identify starting clustering resolution..
                      Starting resolution: 1e-05
2026-02-22 12:01:03 : (Step 5/5) Building clustering tree..
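
To make it easier to see where the time goes, here is a small sketch that computes how long each completed step took from the timestamps in the log above. The function name `step_durations` is my own, and the log lines are copied from the output.

```python
from datetime import datetime

# Step lines copied from the CHOIR output above.
LOG = """\
2026-02-22 11:34:32 : (Step 1/5) Checking inputs and preparing object..
2026-02-22 11:34:32 : (Step 2/5) Running initial dimensionality reduction..
2026-02-22 11:39:38 : (Step 3/5) Generating initial nearest neighbors graph..
2026-02-22 11:42:44 : (Step 4/5) Identify starting clustering resolution..
2026-02-22 12:01:03 : (Step 5/5) Building clustering tree..
"""


def step_durations(log_text):
    """Return (step label, seconds until the next step started) pairs."""
    stamps = []
    for line in log_text.splitlines():
        if "(Step " not in line:
            continue  # skip sub-messages that are not step headers
        stamp, _, message = line.partition(" : ")
        label = message[1:message.index(")")]  # e.g. "Step 1/5"
        stamps.append((label, datetime.strptime(stamp, "%Y-%m-%d %H:%M:%S")))
    # The last step has no following timestamp, so it gets no duration.
    return [
        (label, int((nxt - start).total_seconds()))
        for (label, start), (_, nxt) in zip(stamps, stamps[1:])
    ]


if __name__ == "__main__":
    for label, seconds in step_durations(LOG):
        print(f"{label}: {seconds} s")
```

Steps 1 to 4 together took about 26.5 minutes (0 s, 306 s, 186 s and 1099 s), so essentially all of the runtime is in Step 5.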


Thanks 
Yifan 
