Adding new method: scMerge2 #63
|
@lazappi Hi! This is Seo :) |
Co-authored-by: Luke Zappia <lazappi@users.noreply.github.com>
|
@mumichae Could you take a look at this PR? |
|
@mumichae I fixed up the code with the help of your feedback!
Let me know if there is anything else that could be improved :) |
mumichae left a comment
Already looking a lot better!
There are still some computational bottlenecks that are worth solving (given that this method uses the complete count matrix and densifying is expensive).
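To give a rough sense of why densifying is expensive, here is a back-of-the-envelope sketch. The helper name `dense_matrix_gib` and the dataset size are made up for illustration; the only assumption is that a dense R matrix of doubles takes 8 bytes per entry.

```r
# Hypothetical helper: memory needed to hold a dense double-precision
# matrix, in GiB (8 bytes per entry, zeros included).
dense_matrix_gib <- function(n_cells, n_genes) {
  n_cells * n_genes * 8 / 2^30
}

# Illustrative dataset size only, not taken from this PR
dense_gib <- dense_matrix_gib(n_cells = 1e5, n_genes = 2e4)
print(round(dense_gib, 1))  # roughly 14.9 GiB for a single dense copy
```

Each `as.matrix()` call on a layer of that size would allocate another copy of this magnitude, whereas a sparse representation only stores the non-zero entries.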
| adata <- anndata::read_h5ad(par$input) | ||
|
|
||
| anndataToUnsupervisedScMerge2 <- function(adata, top_n = 1000, verbose = TRUE) { | ||
| counts <- t(as.matrix(adata$layers[["counts"]])) |
Could you preserve the sparsity of the data? According to the documentation, scMerge should be able to deal with sparse matrices. If the matrix is already a sparse dgCMatrix, you can avoid the conversion entirely.
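A minimal sketch of what keeping the layer sparse could look like, using only the Matrix package. The helper name `layer_to_sparse` and the toy data are hypothetical, and whether every downstream scMerge call accepts the result unchanged still needs to be checked against the package documentation.

```r
library(Matrix)

# Hypothetical helper: turn a cells x genes layer into a genes x cells
# sparse matrix without ever calling as.matrix().
layer_to_sparse <- function(layer, gene_names, cell_names) {
  # Coerce dense, row-compressed or column-compressed input to
  # column-compressed sparse form; an existing dgCMatrix keeps its class.
  mat <- methods::as(layer, "CsparseMatrix")
  # Transposing a sparse matrix keeps it sparse
  mat <- Matrix::t(mat)
  dimnames(mat) <- list(as.character(gene_names), as.character(cell_names))
  mat
}

# Toy stand-in for adata$layers[["counts"]]: 2 cells x 3 genes
toy_layer <- Matrix::sparseMatrix(
  i = c(1, 2, 2), j = c(1, 2, 3), x = c(5, 1, 7), dims = c(2, 3)
)
counts <- layer_to_sparse(toy_layer, c("g1", "g2", "g3"), c("c1", "c2"))
```

The same helper could be reused for the normalized layer, so that neither matrix is densified before it reaches scMerge2.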
|
|
||
| cat("Run unsupervised scMerge2\n") | ||
|
|
||
| scMerge2_res <- anndataToUnsupervisedScMerge2(adata, top_n = 1000L, verbose = TRUE) |
Could you make top_n for the control genes a parameter that can be adjusted in the config.vsh.yaml?
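On the script side, this could look roughly like the sketch below. It assumes config.vsh.yaml declares an integer argument (for example `--top_n` with a default of 1000) so that Viash exposes it as `par$top_n`; the argument name and the helper `resolve_top_n` are suggestions, not existing code.

```r
# Hypothetical helper: read top_n from the Viash-provided `par` list and
# fall back to the previously hard-coded value when it is not set.
resolve_top_n <- function(par, default = 1000L) {
  top_n <- par$top_n
  if (is.null(top_n)) {
    return(as.integer(default))
  }
  top_n <- as.integer(top_n)
  if (is.na(top_n) || top_n < 1L) {
    stop("top_n must be a positive integer")
  }
  top_n
}

# Stand-in for the `par` list that Viash injects into the script
par <- list(input = "input.h5ad", top_n = 500)
top_n <- resolve_top_n(par)
```

The resolved value can then be passed on in place of the literal `1000L` in the call to `anndataToUnsupervisedScMerge2()`.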
| batch <- as.character(adata$obs$batch) | ||
| cellTypes <- as.character(adata$obs$cell_type) | ||
|
|
||
| scMerge2_res <- scMerge2( | ||
| exprsMat = exprsMat, | ||
| batch = batch, | ||
| ctl = ctl, | ||
| verbose = verbose | ||
| ) |
You can simplify the code a bit
| batch <- as.character(adata$obs$batch) | |
| cellTypes <- as.character(adata$obs$cell_type) | |
| scMerge2_res <- scMerge2( | |
| exprsMat = exprsMat, | |
| batch = batch, | |
| ctl = ctl, | |
| verbose = verbose | |
| ) | |
| scMerge2_res <- scMerge2( | |
| exprsMat = exprsMat, | |
| batch = as.character(adata$obs$batch), | |
| ctl = ctl, | |
| verbose = verbose | |
| ) |
| seg_df <- seg_df[order(seg_df$segIdx, decreasing = TRUE), , drop = FALSE] | ||
| ctl <- rownames(seg_df)[seq_len(min(top_n, nrow(seg_df)))] | ||
|
|
||
| exprsMat <- t(as.matrix(adata$layers[["normalized"]])) |
Is densification via as.matrix really necessary here?
| embedding <- prcomp(t(corrected_mat))$x[, 1:10, drop = FALSE] | ||
| rownames(embedding) <- colnames(corrected_mat) |
PCA computation is not necessary for feature methods, because the "embedding" will be computed after the integration in a post-processing step. See ComBat for an example.
| embedding <- prcomp(t(corrected_mat))$x[, 1:10, drop = FALSE] | |
| rownames(embedding) <- colnames(corrected_mat) |
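Without the PCA step, the output only needs the corrected matrix aligned to the input cells and genes. A small sketch with toy data follows; the layer name `corrected_counts` and the variables `obs_names` and `var_names` are assumptions for illustration and should be checked against the task's output specification.

```r
# Toy stand-in for corrected_mat: genes x cells, cells not in input order
corrected_mat <- matrix(
  c(1, 2, 3, 4, 5, 6),
  nrow = 3,
  dimnames = list(c("g1", "g2", "g3"), c("c2", "c1"))
)

# Stand-ins for as.character(adata$obs_names) and as.character(adata$var_names)
obs_names <- c("c1", "c2")
var_names <- c("g1", "g2", "g3")

# Transpose back to cells x genes and reorder to match the input AnnData
corrected_layer <- t(corrected_mat)[obs_names, var_names, drop = FALSE]

# Assumed layer name; no obsm / X_emb entry is created for a feature method
layers <- list(corrected_counts = corrected_layer)
```

Indexing by name rather than by position keeps the "match input cells" behaviour of the original `X_emb` line while dropping the embedding itself.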
| obsm = list( | ||
| X_emb = embedding[as.character(adata$obs_names), , drop = FALSE] # match input cells | ||
| ), |
PCA not needed here
| obsm = list( | |
| X_emb = embedding[as.character(adata$obs_names), , drop = FALSE] # match input cells | |
| ), |
| batch <- as.character(adata$obs$batch) | ||
| cellTypes <- as.character(adata$obs$cell_type) | ||
|
|
||
| scMerge2_res <- scMerge2( | ||
| exprsMat = exprsMat, | ||
| batch = batch, | ||
| cellTypes = cellTypes, | ||
| ctl = ctl, | ||
| verbose = verbose | ||
| ) |
Nitpick: simplify code
| batch <- as.character(adata$obs$batch) | |
| cellTypes <- as.character(adata$obs$cell_type) | |
| scMerge2_res <- scMerge2( | |
| exprsMat = exprsMat, | |
| batch = batch, | |
| cellTypes = cellTypes, | |
| ctl = ctl, | |
| verbose = verbose | |
| ) | |
| scMerge2_res <- scMerge2( | |
| exprsMat = exprsMat, | |
| batch = as.character(adata$obs$batch), | |
| cellTypes = as.character(adata$obs$cell_type), | |
| ctl = ctl, | |
| verbose = verbose | |
| ) |
Comments from unsupervised scMerge2 apply here as well
| rownames(counts) <- as.character(adata$var_names) | ||
| colnames(counts) <- as.character(adata$obs_names) | ||
|
|
||
| seg_df <- scSEGIndex(exprs_mat = counts) |
Add a comment to document what you are doing here.
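For instance, the control-gene selection could be documented along these lines. This is a sketch on a toy data frame rather than a real `scSEGIndex()` result; it only assumes, as the PR code does, that the result has a `segIdx` column and gene names as row names, and the wrapper name `select_control_genes` is hypothetical.

```r
# Hypothetical wrapper around the selection logic from the PR.
select_control_genes <- function(seg_df, top_n) {
  # Rank genes by their stably-expressed-gene index, highest first
  seg_df <- seg_df[order(seg_df$segIdx, decreasing = TRUE), , drop = FALSE]
  # Use the top_n highest-ranked genes (or all genes, if fewer are
  # available) as the control genes passed to scMerge2 via `ctl`
  rownames(seg_df)[seq_len(min(top_n, nrow(seg_df)))]
}

# Toy stand-in for the data frame returned by scSEGIndex(exprs_mat = counts)
seg_df <- data.frame(
  segIdx = c(0.2, 0.9, 0.5),
  row.names = c("g1", "g2", "g3")
)
ctl <- select_control_genes(seg_df, top_n = 2)
```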
Describe your changes
This PR is for a new method scMerge2.
Checklist before requesting a review
I have performed a self-review of my code
Check the correct box. Does this PR contain:
Proposed changes are described in the CHANGELOG.md
CI Tests succeed and look good!