Should differential encoding be used?

I'm not sure what the general attitude in the CG is toward compression, but I remember that early on there was a strong focus on minimizing file size. With that in mind, standard compression algorithms like DEFLATE don't handle arbitrary numbers well: they cannot, for instance, predict that an increasing sequence will continue to increase. A common workaround is to store numbers as differences, e.g. instead of [5, 8, 11, 21], store [5, 3, 3, 10]. This makes the numbers smaller, which tends to improve the compression ratio because it concentrates probability mass toward small values, which entropy coders (e.g. arithmetic or Huffman coding) can exploit.
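To make the transformation concrete, here is a minimal Python sketch of the idea. The function names `delta_encode` and `delta_decode` are made up for illustration and are not part of any proposal text.

```python
from itertools import accumulate


def delta_encode(values):
    """Keep the first value; replace each later value with its
    difference from the value before it."""
    previous = [0] + values[:-1]
    return [cur - prev for prev, cur in zip(previous, values)]


def delta_decode(deltas):
    """Invert delta_encode by taking running sums."""
    return list(accumulate(deltas))


offsets = [5, 8, 11, 21]
deltas = delta_encode(offsets)          # [5, 3, 3, 10]
assert delta_decode(deltas) == offsets  # the transform is lossless
```

The round trip is lossless, so the only cost is a running sum on the decoder side.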

The current format is quite natural:

> the u32 byte offset of the hinted instruction from the first instruction of the function.

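But since the list must be sorted, it would be straightforward to use differential encoding instead: each entry would store the distance from the previous hinted instruction rather than from the first instruction of the function.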
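As a rough illustration of what this could look like, the sketch below compares the two layouts, assuming each offset is serialized as an unsigned LEB128 integer (the way the Wasm binary format encodes u32 values). The helper names `uleb128`, `encode_absolute`, and `encode_differential` are hypothetical, and the sample offsets are invented.

```python
def uleb128(n):
    """Encode a non-negative integer as unsigned LEB128 bytes."""
    out = bytearray()
    while True:
        byte = n & 0x7F
        n >>= 7
        if n:
            out.append(byte | 0x80)  # more bytes follow
        else:
            out.append(byte)         # final byte
            return bytes(out)


def encode_absolute(offsets):
    """Each entry is the offset from the start of the function."""
    return b"".join(uleb128(o) for o in offsets)


def encode_differential(offsets):
    """Each entry is the distance from the previous entry (assumes sorted input)."""
    out = bytearray()
    prev = 0
    for o in offsets:
        if o < prev:
            raise ValueError("offsets must be sorted in ascending order")
        out += uleb128(o - prev)
        prev = o
    return bytes(out)


offsets = [100, 200, 300, 400]  # made-up sample data
absolute = encode_absolute(offsets)
differential = encode_differential(offsets)
```

With these sample offsets the differential form is both shorter and more repetitive, which is the property a general-purpose compressor would benefit from. Real savings would depend on how hints are actually distributed within functions.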

I haven't been very active in the Wasm community, but I remember that in the beginning there was an idea of having higher-level Wasm binary encodings, outside the core spec and implemented by third parties, that could be used to optimize file size. Did that ever actually happen? If not, optimizing items in the core spec for compressibility starts to look more worthwhile.
