Error when max_clusters != "auto" #47

Description

@sian-wood

Hello dev team,

Because of memory limits I have been hitting when applying CHOIR to a particularly large dataset, I have been experimenting with the sample_max and max_clusters parameters. However, setting max_clusters to any value other than "auto" causes an error, which I traced as follows:

  • Line 1296 of buildTree.R sets subtree_sizes to NULL whenever max_clusters is not "auto": "subtree_sizes" = if(max_clusters == "auto", c(n_cells, subtree_sizes), NULL)
  • Line 409 of pruneTree.R does the following: subtree_names_filtered <- subtree_names[subtree_sizes > 3]. Because subtree_sizes is NULL, the comparison returns logical(0), so subtree_names_filtered becomes character(0).
  • Line 411 of pruneTree.R sets n_subtrees_filtered <- length(subtree_names_filtered), i.e. n_subtrees_filtered <- 0.
  • Lines 512-516 of pruneTree.R: if buildTree_parameters[["subtree_reductions"]] == TRUE, then n_input_matrices <- n_subtrees_filtered, i.e. n_input_matrices is 0.
  • On line 523 of pruneTree.R, we then have the following for-loop: for (subtree in 1:n_input_matrices). Since 1:0 evaluates to c(1, 0) rather than an empty sequence, the loop runs twice. In the second iteration, subtree = 0, which leads to the following error on line 584:
    Error in input_matrices[[subtree]] <- input_matrix :
    attempt to select less than one element in integerOneIndex
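
The chain above can be reproduced outside CHOIR with a few lines of base R. This is only a minimal sketch, not the package's code: the variable names mirror the ones quoted from buildTree.R and pruneTree.R, while the subtree labels and the placeholder matrix are made up for illustration.

```r
# Stand-in values; in CHOIR these are produced by buildTree()/pruneTree().
subtree_names <- c("P0", "P1", "P2")  # hypothetical subtree labels
subtree_sizes <- NULL                 # reported value when max_clusters != "auto"

# Comparing NULL with a number yields logical(0), so the subset is empty.
subtree_names_filtered <- subtree_names[subtree_sizes > 3]
n_subtrees_filtered <- length(subtree_names_filtered)

# 1:0 counts down and produces c(1, 0) instead of an empty sequence.
loop_indices <- 1:n_subtrees_filtered

# seq_len(0) is empty, so a for-loop over it would not run at all.
safe_indices <- seq_len(n_subtrees_filtered)

# Assigning into index 0 of a list raises an error; capture its message.
input_matrices <- vector("list", n_subtrees_filtered)
result <- tryCatch({
  input_matrices[[0]] <- matrix(0, 1, 1)  # placeholder matrix
  "no error"
}, error = function(e) conditionMessage(e))

print(loop_indices)
print(safe_indices)
print(result)
```

Iterating over seq_len(n_input_matrices) would avoid the out-of-range index, but it would also skip the loop body entirely, so it does not address subtree_sizes being NULL in the first place. Whether that is the intended behaviour when max_clusters is set manually is a question for the maintainers.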

On another note, I am applying CHOIR to flow cytometry data, which is a big contributor to the memory and runtime issues I am experiencing, and is why I am playing around with these parameters. If anyone would be willing to discuss with me the implications of applying this method to cytometry data, that would be wonderful. I have seen that this type of data was not discussed in your paper, and I assume that this is because CHOIR is optimised to deal with far higher dimensionality (I only have 27 variables) and fewer cells (some of my datasets contain up to 6.5 million cells). I would love to hear if the potential use of CHOIR for cytometry data was ever considered, and what other challenges you would expect in this scenario.

Thank you for your time!
