New code matrix packet types

Currently, the Par3 spec supports three types of code matrices:

  1. Cauchy Matrix Packet
  2. Sparse Random Matrix Packet (generated from a specific PRNG)
  3. Explicit Matrix Packet

In practice, these require an O(N^3) matrix inversion for recovery, and an O(N^2) matrix multiplication for both generation and recovery. (The sparse random matrix is more efficient during generation; not sure if this also holds for recovery.)

However it looks like the state of the art has improved, with some fast implementations of Reed-Solomon codes being proposed:

  * [FastECC](https://github.com/Bulat-Ziganshin/FastECC) (O(N log n) RS)
  * [Leopard-RS](https://github.com/catid/leopard) (O(N log n) MDS RS)
  * [Wirehair](https://github.com/catid/wirehair) “O(N) Fountain Code for Large Data”
  * [Raptor codes](https://datatracker.ietf.org/doc/html/rfc6330)

The first two provide the same redundancy as the Cauchy matrix, but can be support generation and recovery in O(N log N) time, making them much faster for large number of blocks. Leopard-RS claims to encode at 450 MB/s and decode at 190 MB/s, which is really fast.

The latter two should be even faster, though like sparse random matrices, they might need slightly more recovery blocks than the number of errors in a cohort.

It might be possible to support some of these coding schemes with an explicit matrix packet (I haven't delved deeply enough into the implementation to figure it out; the math is a bit intimidating), but even if true, that leaves two problems:

  1. The explicit matrix packet takes up a lot of space, especially when interleaving is involved (see #11).
  2. It may not be trivial to recover an efficient algorithm from an explicit packet.

Which brings me to the **main question:** should Par3 support any of these coding schemes explicitly? If so, which ones? And if so, should we have have explicit support for interleaving?

It's worth noting that @Yutaka-Sawada already implemented a Leopard matrix packet:

    The FFT matrix packet has a type value of "PAR FFT\0" (ASCII). The packet's body contains the following:
    
    *Table: FFT Packet Body Contents*
    
    | Length (bytes) | Type         | Description                          |
    |---------------:|:-------------|:-------------------------------------|
    |        8       | unsigned int | Index of first input block           |
    |        8       | unsigned int | Index of last input block plus 1     |
    |        1       | signed int   | Max number of blocks (as power of 2) |
    |      0 ~ 8     | byte[]       | Number of interleaving blocks        |


One solution is to just add this to the spec. But that leads to a side question: if Leopard-RS is an option, how useful are the _other_ coding schemes? Are e.g. the sparse random codes sufficiently faster to justify using them instead?

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

New code matrix packet types #21

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

New code matrix packet types #21

Description

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions