Related to #27, though I note that the following now works:
```r
library(DelayedArray)
DelayedArray(
  matrix(
    IntegerList(
      c(list(c(1L, 1L)), list(c(1L, 1L)), list(c(1L, 1L)), list(c(2L, 2L)))
    ),
    nrow = 2, ncol = 2
  )
)
#> <2 x 2> matrix of class DelayedMatrix and type "list":
#>      [,1] [,2]
#> [1,] 1, 1 1, 1
#> [2,] 1, 1 2, 2
```

Created on 2020-02-28 by the reprex package (v0.3.0)
Is there a motivation to support RleListMatrix? For the same use case as above, I'm using VariantAnnotation to build a CollapsedVCF object, which contains matrices of lists. In many cases the list elements are NA, so it may be efficient to store these as an Rle-derived object. I haven't verified that such a structure would actually benefit from Rle: would the elements be sufficiently contiguous?
My workaround at the moment is to collapse each list element into a single delimited string, in which case DelayedArray or RleMatrix work out of the box. The string concatenation alone shrinks the matrix object by a factor of ~8 (possibly due to R's global string pool). Converting to RleMatrix shrinks it by a further factor of ~16, so the total compression from matrix of lists to character RleMatrix is roughly 128x. If RleListMatrix were able to provide a comparable benefit without converting to strings, that could be very useful.
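For illustration, here is a minimal base-R sketch of the collapsing step and a quick way to check how contiguous the values are. The `geno` matrix is made-up toy data, not taken from a real VCF, and base `rle()` is used here as a stand-in for the run-length encoding an Rle-backed matrix would apply. It assumes values are flattened in column-major order.

```r
# Toy stand-in for a VCF-style matrix of lists (hypothetical data).
geno <- matrix(
  list(c(1L, 1L), NA, NA, NA, NA, c(2L, 2L)),
  nrow = 3, ncol = 2
)

# Collapse each list element into one comma-delimited string.
# Note that paste() turns a missing value into the literal string "NA".
collapsed <- vapply(geno, function(x) paste(x, collapse = ","), character(1))
dim(collapsed) <- dim(geno)

# as.vector() flattens column by column; rle() then finds runs of equal strings.
runs <- rle(as.vector(collapsed))

# Fewer runs than elements means run-length encoding has something to compress.
n_elements <- length(collapsed)
n_runs <- length(runs$lengths)
cat("elements:", n_elements, " runs:", n_runs, "\n")
```

On real data, a ratio of elements to runs close to 1 would suggest that the values are not contiguous enough for an Rle representation to help much.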
I'll link another issue to this one specific to VariantAnnotation, but I thought I'd check if this was a) possible; b) useful; and c) of interest.
Ping @lawremi who first proposed investigating support for this structure.