Pruning stuck checking for underclustering #29

@doliv071

Hi All,

I was testing CHOIR and everything went smoothly until the last step (pruning), where it got stuck trying to resolve underclustering in 1 cluster. It repeated the same check 20 times over the course of about an hour before I killed it. Is there a parameter to control this?

> clusSCE <- CHOIR::CHOIR(clusSCE)
----------------------------------------
- CHOIR - Part 1: Build clustering tree
----------------------------------------
2025-04-09 12:00:10 PM : (Step 1/7) Checking inputs and preparing object..

Input data:
 - Object type: SingleCellExperiment
 - # of cells: 13731
 - # of batches: 1
 - # of modalities: 1
 - ATAC data: FALSE
 - Countsplitting: FALSE
 - Assay used to build tree: logcounts
 - Assay used to prune tree: logcounts

Proceeding with the following parameters:
 - Intermediate data stored under key: CHOIR
 - Alpha: 0.05
 - Multiple comparison adjustment: bonferroni
 - Features to train RF: var
 - # of excluded features: 0
 - # of permutations: 100
 - # of RF trees: 50
 - Use variance: TRUE
 - Minimum accuracy: 0.5
 - Minimum connections: 1
 - Maximum repeated errors: 20
 - Distance approximation: TRUE
 - Maximum cells sampled: Inf
 - Downsampling rate: 0.3689
 - Minimum reads: >0 reads
 - Maximum clusters: auto
 - Minimum cluster depth: 2000
 - Normalization method: none
 - Subtree dimensionality reductions: TRUE
 - Dimensionality reduction method: Default
 - Dimensionality reduction parameters provided: No
 - # of variable features: Default
 - Batch correction method: none
 - Batch correction parameters provided: No
 - Nearest neighbor parameters provided: 
     - verbose: FALSE
 - Clustering parameters provided: 
     - algorithm: 1
     - group.singletons: TRUE
     - verbose: FALSE
 - # of cores: 22
 - Random seed: 1

2025-04-09 12:00:10 PM : (Step 2/7) Running initial dimensionality reduction..
2025-04-09 12:00:10 PM : Preparing input matrix using 'logcounts' assay..
2025-04-09 12:00:18 PM : Running PCA with 2000 variable features..
2025-04-09 12:00:37 PM : (Step 3/7) Generating initial nearest neighbors graph..
2025-04-09 12:00:42 PM : (Step 4/7) Identify starting clustering resolution..
                      [[ Current tree: 6 iterations in 14s ]] 
                      Starting resolution: 0.001
2025-04-09 12:01:05 PM : (Step 5/7) Building root clustering tree..
                      [[ Current tree: 8 iterations in 22s ]] 
                      
                      Identified 2 clusters in root tree.
2025-04-09 12:01:30 PM : (Step 6/7) Subclustering root tree..
2025-04-09 12:01:52 PM : 10% (Subtree 1/2, 13708 cells), 2 total clusters.                             
2025-04-09 12:01:56 PM : 15% (Subtree 1/2, 13708 cells), 2 total clusters.                             
2025-04-09 12:08:35 PM : 27% (Subtree 1/2, 13708 cells), 58 total clusters.                            
2025-04-09 12:09:10 PM : 35% (Subtree 1/2, 13708 cells), 64 total clusters.                            
2025-04-09 12:11:37 PM : 42% (Subtree 1/2, 13708 cells), 90 total clusters.                            
2025-04-09 12:12:04 PM : 57% (Subtree 1/2, 13708 cells), 99 total clusters.                            
2025-04-09 12:12:06 PM : 65% (Subtree 1/2, 13708 cells), 100 total clusters.                           
2025-04-09 12:12:15 PM : 72% (Subtree 1/2, 13708 cells), 104 total clusters.                           
2025-04-09 12:12:23 PM : 87% (Subtree 1/2, 13708 cells), 108 total clusters.                           
2025-04-09 12:12:28 PM : 95% (Subtree 1/2, 13708 cells), 112 total clusters.                           
2025-04-09 12:12:30 PM : 100% (Subtree 2/2, 23 cells), 112 total clusters.                             
2025-04-09 12:12:30 PM : 100% (Subtree 2/2, 23 cells), 113 total clusters.                             
Generating subtrees.. [==============================================================] 100% in 00:10:59

2025-04-09 12:12:30 PM : (Step 7/7) Compiling full clustering tree..
                      Full tree has 75 levels and 111 clusters.

----------------------------------------
- CHOIR - Part 2: Prune clustering tree
----------------------------------------
2025-04-09 12:12:32 PM : (Step 1/2) Checking inputs and preparing object..

Input data:
 - Object type: SingleCellExperiment
 - # of cells: 13731
 - # of batches: 1
 - # of modalities: 1
 - # of subtrees: 3
 - # of levels: 75
 - # of starting clusters: 111
 - Countsplitting: FALSE
 - Assay used to build tree: logcounts
 - Assay used to prune tree: logcounts

Proceeding with the following parameters:
 - Intermediate data stored under key: CHOIR
 - Alpha: 0.05
 - Multiple comparison adjustment: bonferroni
 - Features to train RF: var
 - # of excluded features: 0
 - # of permutations: 100
 - # of RF trees: 50
 - Use variance: TRUE
 - Minimum accuracy: 0.5
 - Minimum connections: 1
 - Maximum repeated errors: 20
 - Distance approximation: TRUE
 - Distance awareness: 2
 - All metrics collected: FALSE
 - Maximum cells sampled: Inf
 - Downsampling rate: 0.3689
 - Minimum reads: >0 reads
 - Normalization method: none
 - Batch correction method: none
 - Clustering parameters provided: 
     - algorithm: 1
     - group.singletons: TRUE
     - verbose: FALSE
 - # of cores: 22
 - Random seed: 1

2025-04-09 12:12:33 PM : (Step 2/2) Iterating through clustering tree..
2025-04-09 12:13:46 PM : 10% (12/75 levels) in 1.21 min. 101 clusters remaining.                       
2025-04-09 12:15:32 PM : 20% (23/75 levels) in 2.97 min. 79 clusters remaining.                        
2025-04-09 12:17:49 PM : 30% (34/75 levels) in 5.26 min. 74 clusters remaining.                        
2025-04-09 12:19:18 PM : 40% (46/75 levels) in 6.74 min. 56 clusters remaining.                        
2025-04-09 12:21:04 PM : 50% (57/75 levels) in 8.51 min. 41 clusters remaining.                        
2025-04-09 12:26:18 PM : 60% (68/75 levels) in 13.75 min. 40 clusters remaining.                       
2025-04-09 12:28:01 PM : 70% (70/75 levels) in 15.47 min. 40 clusters remaining.                       
2025-04-09 12:33:43 PM : 81% (72/75 levels) in 21.16 min. 38 clusters remaining.                       
2025-04-09 12:34:00 PM : 90% (74/75 levels) in 21.44 min. 37 clusters remaining.                       
2025-04-09 12:34:58 PM : Additional comparisons necessary. 36 clusters remaining.                      
2025-04-09 12:35:52 PM : Additional comparisons necessary. 35 clusters remaining.                      
2025-04-09 12:36:54 PM : Additional comparisons necessary. 34 clusters remaining.                      
2025-04-09 12:37:51 PM : Additional comparisons necessary. 33 clusters remaining.                      
2025-04-09 12:37:57 PM : Checking for underclustering in 7 clusters.                                   
2025-04-09 12:38:14 PM : Additional comparisons necessary. 27 clusters remaining.                      
2025-04-09 12:39:32 PM : Checking for underclustering in 1 clusters.                                   
2025-04-09 12:41:30 PM : Additional comparisons necessary. 27 clusters remaining.                      
2025-04-09 12:42:45 PM : Checking for underclustering in 1 clusters.                                   
2025-04-09 12:44:45 PM : Additional comparisons necessary. 27 clusters remaining.                      
2025-04-09 12:46:09 PM : Checking for underclustering in 1 clusters.                                   
2025-04-09 12:48:13 PM : Additional comparisons necessary. 27 clusters remaining.                      
2025-04-09 12:49:29 PM : Checking for underclustering in 1 clusters.                                   
2025-04-09 12:51:33 PM : Additional comparisons necessary. 27 clusters remaining.                      
2025-04-09 12:52:49 PM : Checking for underclustering in 1 clusters.                                   
2025-04-09 12:54:46 PM : Additional comparisons necessary. 27 clusters remaining.                      
2025-04-09 12:56:04 PM : Checking for underclustering in 1 clusters.                                   
2025-04-09 12:58:15 PM : Additional comparisons necessary. 27 clusters remaining.                      
2025-04-09 12:59:39 PM : Checking for underclustering in 1 clusters.                                   
2025-04-09 01:01:48 PM : Additional comparisons necessary. 27 clusters remaining.                      
2025-04-09 01:03:12 PM : Checking for underclustering in 1 clusters.                                   
2025-04-09 01:05:25 PM : Additional comparisons necessary. 27 clusters remaining.                      
2025-04-09 01:06:51 PM : Checking for underclustering in 1 clusters.                                   
2025-04-09 01:09:04 PM : Additional comparisons necessary. 27 clusters remaining.                      
2025-04-09 01:10:26 PM : Checking for underclustering in 1 clusters.                                   
2025-04-09 01:12:34 PM : Additional comparisons necessary. 27 clusters remaining.                      
2025-04-09 01:14:02 PM : Checking for underclustering in 1 clusters.                                   
2025-04-09 01:16:13 PM : Additional comparisons necessary. 27 clusters remaining.                      
2025-04-09 01:17:37 PM : Checking for underclustering in 1 clusters.                                   
2025-04-09 01:19:42 PM : Additional comparisons necessary. 27 clusters remaining.                      
2025-04-09 01:21:05 PM : Checking for underclustering in 1 clusters.                                   
2025-04-09 01:22:57 PM : Additional comparisons necessary. 27 clusters remaining.                      
2025-04-09 01:24:14 PM : Checking for underclustering in 1 clusters.                                   
2025-04-09 01:26:09 PM : Additional comparisons necessary. 27 clusters remaining.                      
2025-04-09 01:27:25 PM : Checking for underclustering in 1 clusters.                                   
2025-04-09 01:29:25 PM : Additional comparisons necessary. 27 clusters remaining.                      
2025-04-09 01:30:43 PM : Checking for underclustering in 1 clusters.                                   
2025-04-09 01:32:39 PM : Additional comparisons necessary. 27 clusters remaining.                      
2025-04-09 01:33:54 PM : Checking for underclustering in 1 clusters.                                   
2025-04-09 01:35:48 PM : Additional comparisons necessary. 27 clusters remaining.                      
2025-04-09 01:37:06 PM : Checking for underclustering in 1 clusters.                                   
2025-04-09 01:39:17 PM : Additional comparisons necessary. 27 clusters remaining.                      
2025-04-09 01:40:48 PM : Checking for underclustering in 1 clusters.                                   
2025-04-09 01:42:59 PM : Additional comparisons necessary. 27 clusters remaining.                      
2025-04-09 01:44:25 PM : Checking for underclustering in 1 clusters. 
