Hi developers,
Thank you for developing this amazing method. It has been very helpful for our analysis.
I got a message saying that CHOIR can only use a single core on Windows. However, even when I submit the job to our SCC (a Linux-based HPC cluster), it still appears to use only about one core in practice.
In addition, Step 5 (building the clustering tree) has been running for over a week, which makes me wonder whether this runtime is expected for large datasets or whether the process might be stuck.
Do you have any recommended ways to speed up the process? And is it possible to enable true multi-core usage on Windows, or when running CHOIR in an HPC environment?
Here is the output, showing the step that takes a long time:
=== Xenium: Read10X -> Seurat -> NormalizeData -> CHOIR -> UMAP/plot ===
Normalizing layer: counts
Performing log-normalization
0% 10 20 30 40 50 60 70 80 90 100%
[----|----|----|----|----|----|----|----|----|----|
**************************************************|
- CHOIR - Part 1: Build clustering tree
2026-02-22 11:34:32 : (Step 1/5) Checking inputs and preparing object..
Input data:
- Object type: Seurat (v5)
- # of cells: 421363
- # of batches: 1
- # of modalities: 1
- ATAC data: FALSE
- Countsplitting: FALSE
- Assay: RNA
- Layer used to build tree: data
- Layer used to prune tree: data
Proceeding with the following parameters:
- Intermediate data stored under key: CHOIR
- Alpha: 0.05
- Multiple comparison adjustment: bonferroni
- Features to train RF: var
- # of excluded features: 0
- # of permutations: 100
- # of RF trees: 50
- Use variance: TRUE
- Minimum accuracy: 0.5
- Minimum connections: 1
- Maximum repeated errors: 20
- Distance approximation: TRUE
- Maximum cells sampled: Inf
- Downsampling rate: 0.1316
- Minimum reads: >0 reads
- Maximum clusters: 120
- Minimum cluster depth: 2000
- Normalization method: none
- Subtree dimensionality reductions: TRUE
- Dimensionality reduction method: Default
- Dimensionality reduction parameters provided: No
- # of variable features: Default
- Batch correction method: none
- Batch correction parameters provided: No
- Nearest neighbor parameters provided:
- Clustering parameters provided:
- algorithm: 1
- group.singletons: TRUE
- verbose: FALSE
- # of cores: 4
- Random seed: 1
2026-02-22 11:34:32 : (Step 2/5) Running initial dimensionality reduction..
2026-02-22 11:34:32 : Preparing matrix using 'RNA' assay and 'data' slot..
2026-02-22 11:34:33 : Running PCA with 2000 variable features..
2026-02-22 11:39:38 : (Step 3/5) Generating initial nearest neighbors graph..
2026-02-22 11:42:44 : (Step 4/5) Identify starting clustering resolution..
Starting resolution: 1e-05
2026-02-22 12:01:03 : (Step 5/5) Building clustering tree..
Thanks
Yifan