Coming back to the longForm generic discussed here, MultiAssayExperiment (MAE) now defines the following methods:
> suppressPackageStartupMessages(library(MultiAssayExperiment))
Warning message:
replacing previous import ‘S4Arrays::read_block’ by ‘DelayedArray::read_block’ when loading ‘SummarizedExperiment’
> showMethods("longForm")
Function: longForm (package BiocGenerics)
object="ANY"
object="ExperimentList"
object="MultiAssayExperiment"
ANY implicitly defines additional methods ...
> getMethod("longForm", "ANY")
Method Definition:
function (object, ...)
{
    .local <- function (object, colDataCols, i = 1L, ...)
    {
        rowNAMES <- rownames(object)
        if (is.null(rowNAMES))
            rowNames <- as.character(seq_len(nrow(object)))
        if (is(object, "ExpressionSet"))
            object <- Biobase::exprs(object)
        if (is(object, "SummarizedExperiment") || is(object,
            "RaggedExperiment"))
            object <- assay(object, i = i)
        BiocBaseUtils::checkInstalled("reshape2")
        res <- reshape2::melt(object, varnames = c("rowname",
            "colname"), value.name = "value")
        if (!is.character(res[["rowname"]]))
            res[["rowname"]] <- as.character(res[["rowname"]])
        res
    }
    .local(object, ...)
}
<bytecode: 0x59175b2aefc8>
<environment: namespace:MultiAssayExperiment>
Signatures:
object
target "ANY"
defined "ANY"
... including one for a SummarizedExperiment:
> nrows <- 5; ncols <- 2
> counts <- matrix(runif(nrows * ncols, 1, 1e4), nrows)
> colData <- DataFrame(Treatment=c("ChIP", "Input"), row.names=LETTERS[1:2])
> se0 <- SummarizedExperiment(assays=SimpleList(counts=counts), colData=colData)
> longForm(se0)
rowname colname value
1 1 A 1888.095
2 2 A 3194.072
3 3 A 7372.889
4 4 A 2488.492
5 5 A 7293.829
6 1 B 5895.799
7 2 B 9025.518
8 3 B 1884.100
9 4 B 3057.519
10 5 B 5762.292
> showMethods("longForm")
Function: longForm (package BiocGenerics)
object="ANY"
object="ExperimentList"
object="MultiAssayExperiment"
object="SummarizedExperiment"
(inherited from: object="ANY")
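The "inherited from" line above can also be checked programmatically. This is a minimal sketch using only functions from the methods package; it assumes the same package versions as in this session, where longForm is a BiocGenerics generic and MultiAssayExperiment supplies the ANY method.

```r
suppressPackageStartupMessages(library(MultiAssayExperiment))

# hasMethod() counts inherited methods as well as directly defined ones.
found <- hasMethod("longForm", "SummarizedExperiment")

# existsMethod() only counts a method defined directly for this signature.
direct <- existsMethod("longForm", "SummarizedExperiment")

# The 'defined' slot records the signature the selected method was written for.
m <- selectMethod("longForm", "SummarizedExperiment")
defined_for <- as.character(m@defined)

print(list(found = found, direct = direct, defined_for = defined_for))
```

If `direct` is FALSE while `found` is TRUE, the SummarizedExperiment method is purely the ANY fallback, which is the situation Suggestion 1 is about.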
Suggestion 1
I would suggest implementing a longForm,SummarizedExperiment method in the SummarizedExperiment package.
Suggestion 2
I would also suggest allowing all assays to be returned as a single long table, ideally by default.
Current behaviour:
> assay(se0, "counts2") <- assay(se0) * 10
> longForm(se0, i = 1)
rowname colname value
1 1 A 1888.095
2 2 A 3194.072
3 3 A 7372.889
4 4 A 2488.492
5 5 A 7293.829
6 1 B 5895.799
7 2 B 9025.518
8 3 B 1884.100
9 4 B 3057.519
10 5 B 5762.292
> longForm(se0, i = 2)
rowname colname value
1 1 A 18880.95
2 2 A 31940.72
3 3 A 73728.89
4 4 A 24884.92
5 5 A 72938.29
6 1 B 58957.99
7 2 B 90255.18
8 3 B 18841.00
9 4 B 30575.19
10 5 B 57622.92
I would find it useful to have:
> longForm(se0)
DataFrame with 20 rows and 4 columns
rowname colname value assayName
<integer> <factor> <numeric> <character>
1 1 A 1888.10 counts
2 2 A 3194.07 counts
3 3 A 7372.89 counts
4 4 A 2488.49 counts
5 5 A 7293.83 counts
... ... ... ... ...
16 1 B 58958.0 counts2
17 2 B 90255.2 counts2
18 3 B 18841.0 counts2
19 4 B 30575.2 counts2
20 5 B 57622.9 counts2
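To make the idea concrete, here is a minimal sketch of how all assays could be stacked into one long table. The helper names `meltAssay` and `longFormAllAssays` are hypothetical and not part of any package; the sketch assumes every assay is an ordinary two-dimensional matrix and avoids the reshape2 dependency by melting with base R.

```r
suppressPackageStartupMessages(library(SummarizedExperiment))

# Hypothetical helper (not part of any package): melt a single assay.
# Assumes the assay is an ordinary two-dimensional matrix.
meltAssay <- function(se, k, assayName) {
    m <- as.matrix(assay(se, k))
    rn <- rownames(se)
    if (is.null(rn)) rn <- as.character(seq_len(nrow(se)))
    cn <- colnames(se)
    if (is.null(cn)) cn <- as.character(seq_len(ncol(se)))
    DataFrame(
        rowname = rep(rn, times = ncol(m)),  # rows cycle fastest (column-major)
        colname = rep(cn, each = nrow(m)),
        value = as.vector(m),
        assayName = rep(assayName, length(m))
    )
}

# Hypothetical helper: melt every assay and stack the pieces row-wise.
longFormAllAssays <- function(se) {
    nms <- assayNames(se)
    if (is.null(nms)) nms <- as.character(seq_along(assays(se)))
    pieces <- lapply(seq_along(nms), function(k) meltAssay(se, k, nms[k]))
    do.call(rbind, pieces)
}

set.seed(1)
counts <- matrix(runif(10, 1, 1e4), nrow = 5)
se <- SummarizedExperiment(
    assays = SimpleList(counts = counts, counts2 = counts * 10),
    colData = DataFrame(Treatment = c("ChIP", "Input"), row.names = LETTERS[1:2])
)
long <- longFormAllAssays(se)
long
```

Unlike the output shown above, this sketch returns rowname and colname as character columns; the column types would need to be settled in a real implementation.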
Suggestion 3
I also think these long tables should incorporate colData and rowData columns.
Here's an example for a colData variable:
> longFormSE(se0, colvars = "Treatment")
DataFrame with 20 rows and 5 columns
rowname colname value assayName Treatment
<integer> <factor> <numeric> <character> <character>
1 1 A 1888.10 counts ChIP
2 2 A 3194.07 counts ChIP
3 3 A 7372.89 counts ChIP
4 4 A 2488.49 counts ChIP
5 5 A 7293.83 counts ChIP
... ... ... ... ... ...
16 1 B 58958.0 counts2 Input
17 2 B 90255.2 counts2 Input
18 3 B 18841.0 counts2 Input
19 4 B 30575.2 counts2 Input
20 5 B 57622.9 counts2 Input
A rowData variable:
> rowData(se0)$X <- letters[1:5]
> longFormSE(se0, rowvars = "X")
DataFrame with 20 rows and 5 columns
rowname colname value assayName X
<integer> <factor> <numeric> <character> <character>
1 1 A 9418.8870 counts a
2 2 A 6657.9743 counts b
3 3 A 1240.3003 counts c
4 4 A 1278.6833 counts d
5 5 A 27.7678 counts e
... ... ... ... ... ...
16 1 B 10652.9 counts2 a
17 2 B 34444.3 counts2 b
18 3 B 48373.1 counts2 c
19 4 B 21214.1 counts2 d
20 5 B 85826.3 counts2 e
And both, of course:
> longFormSE(se0, colvars = "Treatment", rowvars = "X")
DataFrame with 20 rows and 6 columns
rowname colname value assayName Treatment X
<integer> <factor> <numeric> <character> <character> <character>
1 1 A 9418.8870 counts ChIP a
2 2 A 6657.9743 counts ChIP b
3 3 A 1240.3003 counts ChIP c
4 4 A 1278.6833 counts ChIP d
5 5 A 27.7678 counts ChIP e
... ... ... ... ... ... ...
16 1 B 10652.9 counts2 Input a
17 2 B 34444.3 counts2 Input b
18 3 B 48373.1 counts2 Input c
19 4 B 21214.1 counts2 Input d
20 5 B 85826.3 counts2 Input e
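One possible way to attach the annotation columns is to match each long-table row back to its sample and feature, then index colData and rowData accordingly. The sketch below is only an illustration under that assumption; `addAnnotations` is a hypothetical helper, and its `colvars` and `rowvars` arguments simply mirror the ones used in the examples above.

```r
suppressPackageStartupMessages(library(SummarizedExperiment))

# Hypothetical helper (not part of any package): append colData/rowData columns
# to a long table that already has 'rowname' and 'colname' columns.
addAnnotations <- function(long, se, colvars = character(), rowvars = character()) {
    long <- as(long, "DataFrame")
    rn <- rownames(se)
    if (is.null(rn)) rn <- as.character(seq_len(nrow(se)))
    cn <- colnames(se)
    if (is.null(cn)) cn <- as.character(seq_len(ncol(se)))
    # Position of each long-table row in the original object.
    ci <- match(as.character(long$colname), cn)
    ri <- match(as.character(long$rowname), rn)
    stopifnot(!anyNA(ci), !anyNA(ri))
    cd <- colData(se)[ci, colvars, drop = FALSE]
    rd <- rowData(se)[ri, rowvars, drop = FALSE]
    rownames(cd) <- NULL  # drop the repeated sample names
    rownames(rd) <- NULL
    cbind(long, cd, rd)
}

set.seed(1)
counts <- matrix(runif(10, 1, 1e4), nrow = 5)
se <- SummarizedExperiment(
    assays = SimpleList(counts = counts),
    colData = DataFrame(Treatment = c("ChIP", "Input"), row.names = LETTERS[1:2])
)
rowData(se)$X <- letters[1:5]

# A hand-built long table for the single assay, in column-major order.
long <- DataFrame(
    rowname = rep(as.character(1:5), times = 2),
    colname = rep(c("A", "B"), each = 5),
    value = as.vector(assay(se, "counts"))
)
annotated <- addAnnotations(long, se, colvars = "Treatment", rowvars = "X")
annotated
```

Matching on names rather than relying on row order keeps the helper independent of how the long table was produced, at the cost of requiring unique row and column names.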
I would be happy to provide an initial implementation and unit test.