Vbq #45
Open
robamler wants to merge 95 commits into main from vbq
This will be used internally by `constriction::quant::EmpiricalDistribution`.
This enforces a discipline on private fields.
This avoids confusion because both `pos` and `accum` can be used as a "key" when searching for an entry. Accordingly, the type parameter for `pos` is now `P`. Also, rename `amount` to `count` to indicate that this should probably be an integer (so that tree restructuring operations don't lead to rounding errors).
Now the tests pass under Miri.
Includes unit tests, which pass (also under Miri).
Add tests for `left_cumulative` and `quantile_function` in empty and almost empty trees.
Removes the parent pointers from all nodes in an `AugmentedBTree` as these turn out to be unnecessary for our use cases. This greatly simplifies the implementation because it removes aliasing, so we can use normal owned pointers (`Box`) and no longer need to deal with raw pointers. It should also make the implementation trivially unwind safe, which wasn't obviously the case before.
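To make the resulting ownership structure concrete, here is a simplified, hypothetical sketch of what the node types can look like once parent pointers are gone (type and field names are invented for illustration and are not the crate's actual definitions):

```rust
// Simplified, hypothetical node layout without parent pointers: every node
// owns its children through `Box`, so there is no aliasing, no raw pointers,
// and dropping the root recursively drops the whole tree.
enum Node<P, C> {
    NonLeaf(NonLeafNode<P, C>),
    Leaf(LeafNode<P, C>),
}

struct NonLeafNode<P, C> {
    // Separator positions together with the accumulated counts maintained by
    // the augmentation.
    separators: Vec<(P, C)>,
    // Owned child pointers; no back-pointers to the parent are needed.
    children: Vec<Box<Node<P, C>>>,
}

struct LeafNode<P, C> {
    entries: Vec<(P, C)>,
}
```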
This is actually a no-op for the types for which we've used `BoundedVec` so far, but it wouldn't be for a `BoundedVec<T, CAP>` where `T` implements `Drop`.
Implements `BoundedVec<T, CAP>` as a wrapper around `BoundedPairOfVecs<T, (), CAP>`, and combines the fields `separators` and `first_child` of `NonLeafNode` into a single field whose type enforces that the two always have the same length. This makes it a bit easier to reason about the implementation of `NonLeafNode::insert`.
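A rough, self-contained sketch of the wrapper idea (the storage is simplified to a plain `Vec` and the method names are assumed; this is not the crate's actual code):

```rust
// Hypothetical sketch: a single bounded vec implemented as a pair-of-vecs
// whose second element type is the zero-sized `()`, so it adds no storage.
struct BoundedPairOfVecs<A, B, const CAP: usize> {
    // Simplified: the real type keeps fixed-capacity inline buffers.
    items: Vec<(A, B)>,
}

impl<A, B, const CAP: usize> BoundedPairOfVecs<A, B, CAP> {
    fn new() -> Self {
        Self { items: Vec::with_capacity(CAP) }
    }

    fn try_push(&mut self, a: A, b: B) -> Result<(), (A, B)> {
        if self.items.len() < CAP {
            self.items.push((a, b));
            Ok(())
        } else {
            Err((a, b))
        }
    }

    fn len(&self) -> usize {
        self.items.len()
    }
}

// The wrapper: every element is paired with a `()`, which occupies no space.
struct BoundedVec<T, const CAP: usize>(BoundedPairOfVecs<T, (), CAP>);

impl<T, const CAP: usize> BoundedVec<T, CAP> {
    fn new() -> Self {
        Self(BoundedPairOfVecs::new())
    }

    fn try_push(&mut self, value: T) -> Result<(), T> {
        self.0.try_push(value, ()).map_err(|(value, ())| value)
    }

    fn len(&self) -> usize {
        self.0.len()
    }
}

fn main() {
    let mut v = BoundedVec::<u32, 4>::new();
    assert!(v.try_push(7).is_ok());
    assert_eq!(v.len(), 1);
}
```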
Many functions in `constriction` check assertions that involve generic bit-length parameters. So far, these checks were implemented with simple `assert!` macros. This approach did not incur any run-time cost since the checks can be trivially evaluated at compile time and are therefore pretty much guaranteed to be optimized away as dead code during monomorphization (assuming that the assertions are satisfied). However, checking assertions at run time still leaves room for accidental misuse of a function that can only be detected once control flow actually reaches an incorrect function call during testing.

With this commit, we now check most assertions at compile time using const evaluation and a trick discussed at <https://morestina.net/blog/1940>. This leads to compile-time errors for all incorrect function calls, even if an incorrect function call is not reachable from any unit tests.

This is a *breaking change* as it breaks correct code in edge cases where users validate generic parameters at run time. For example, assume a consumer of `constriction` contains the following function to encode a sequence of dice throws:

```rust
use constriction::{stream::model::UniformModel, UnwrapInfallible};

fn encode_dice<const MAX_PRECISION: usize>(
    dice: &[u32],
) -> Result<Vec<u32>, Box<dyn std::error::Error>> {
    let mut coder = constriction::stream::stack::DefaultAnsCoder::new();
    if MAX_PRECISION <= 32 {
        let model = UniformModel::<u32, MAX_PRECISION>::new(6);
        coder.encode_iid_symbols_reverse(dice, model)?;
        Ok(coder.into_compressed().unwrap_infallible())
    } else {
        let model = UniformModel::<u32, 32>::new(6);
        coder.encode_iid_symbols_reverse(dice, model)?;
        Ok(coder.into_compressed().unwrap_infallible())
    }
}

fn caller() {
    let dice = [3, 2, 4, 0, 1];
    dbg!(encode_dice::<24>(&dice).unwrap()); // no problem here
    dbg!(encode_dice::<33>(&dice).unwrap()); // <-- compiler error
}
```

Here, the function `encode_dice` has a const generic parameter `MAX_PRECISION`, which allows the caller to set the precision with which probabilities are represented for encoding. The function prevents misuse by checking whether `MAX_PRECISION` is higher than its highest allowed value of 32, in which case it uses precision 32 instead. The function `caller` calls `encode_dice` with precisions 24 and 33. Before this commit, this would have worked, and Rust would have emitted two different monomorphized variants of `encode_dice` that each contain only the respective relevant branch of the `if` statement. While the method `encode_iid_symbols_reverse` contained an `assert!` macro that would have panicked for a precision of 33, this statement was unreachable. But with this commit, the example won't compile anymore because the assertion is now checked at compile time. These compile-time checks occur before dead code elimination. They are therefore also performed on the branch that implements encoding with a precision of 33, even though this branch would never be reached at run time.

The above example shows that this commit is technically a breaking change and should therefore warrant incrementing the leading version number. In practice, however, it seems very unlikely that the illustrated issue would arise in real-world code.
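For readers unfamiliar with this kind of const-evaluation trick, here is a minimal, self-contained sketch of one way such a check can be expressed. The names `AssertLeq32` and `encode_with_precision` are invented for illustration, and this is not necessarily the exact pattern used in `constriction`:

```rust
// Hypothetical illustration of a compile-time assertion via const evaluation;
// not constriction's actual internals.
struct AssertLeq32<const PRECISION: usize>;

impl<const PRECISION: usize> AssertLeq32<PRECISION> {
    // Evaluating this constant fails at compile time if the check is violated.
    const OK: () = assert!(PRECISION <= 32, "PRECISION must not exceed 32");
}

fn encode_with_precision<const PRECISION: usize>() {
    // Referencing the associated constant forces it to be evaluated for this
    // monomorphization, so an invalid `PRECISION` becomes a compile error even
    // if this code path would never be reached at run time.
    let () = AssertLeq32::<PRECISION>::OK;
    // ... actual encoding would go here ...
}

fn main() {
    encode_with_precision::<24>(); // compiles
    // encode_with_precision::<33>(); // uncommenting this fails to compile
}
```

Because the check runs per monomorphization, before dead branches are eliminated, the `MAX_PRECISION = 33` instantiation in the commit message's example fails for the same reason.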
The current version compiles but is not yet complete. Our strategy (a rough sketch of the decision logic follows below):

- steal only from one neighbor (but still plan to merge three siblings into two if stealing doesn't resolve the underflow);
- require `CAP >= 4` so we don't have to care about quite as many edge cases for now (we can always relax this constraint later).
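The sketch below shows only the choice between stealing and merging, not the actual data movement; the names, the minimum-occupancy threshold, and the `Option`-based sibling handling are assumptions for illustration:

```rust
// Hypothetical illustration of the planned underflow handling after a removal.
enum Rebalance {
    StealFromLeftNeighbor,
    StealFromRightNeighbor,
    // Fallback planned in the commit message: merge three siblings into two.
    MergeThreeSiblingsIntoTwo,
}

fn resolve_underflow(
    left_neighbor_len: Option<usize>,
    right_neighbor_len: Option<usize>,
    cap: usize,
) -> Rebalance {
    // Minimum occupancy of a non-root node (assumed to be CAP / 2 here).
    let min_len = cap / 2;
    if left_neighbor_len.map_or(false, |len| len > min_len) {
        Rebalance::StealFromLeftNeighbor
    } else if right_neighbor_len.map_or(false, |len| len > min_len) {
        Rebalance::StealFromRightNeighbor
    } else {
        // Stealing from a single neighbor cannot resolve the underflow.
        Rebalance::MergeThreeSiblingsIntoTwo
    }
}

fn main() {
    // With CAP = 4, a neighbor holding 3 entries can spare one.
    assert!(matches!(
        resolve_underflow(Some(3), Some(2), 4),
        Rebalance::StealFromLeftNeighbor
    ));
}
```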
This will allow us to reduce the amount of code duplication, e.g., for
the implementations of `{NonLeafNode, LeafNode}::remove`.
Use `PairOfBoundedVecs` for both leaf and non-leaf nodes. For non-leaf nodes, the pair of vecs holds the list of separators and their right child pointers. For leaf nodes, the second bounded vec contains `()` instead of child pointers. This should completely compile away.
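The "compiles away" claim rests on `()` being a zero-sized type; a quick standalone check (not taken from the crate) makes that concrete:

```rust
use core::mem::size_of;

fn main() {
    // An array of unit values occupies no memory at all, ...
    assert_eq!(size_of::<[(); 128]>(), 0);
    // ... and pairing a value with `()` does not change its size, so the
    // leaf nodes' second vec of `()` "children" costs nothing.
    assert_eq!(size_of::<(u32, ())>(), size_of::<u32>());
}
```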
Tests pass, also under Miri.
Tests pass, also under Miri.
These benchmarks currently need to be run with `cargo bench --features benchmark-internals`.
This method combines `AugmentedBTree::remove` and `AugmentedBTree::insert` into a single operation that avoids repeating the tree traversal when the remove and insert positions are close to each other. It was motivated by the fact that shifts by small amounts are expected to come up frequently in VBQ, so avoiding the second traversal might speed things up.

Unfortunately, the implementation ended up much more complicated than expected because there turn out to be lots of edge cases, and benchmarks don't show any speed advantage over simply calling `remove` followed by `insert`. Therefore, I'll remove this combined method again in an upcoming commit.
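For reference, a minimal stand-in for the plain remove-then-insert pattern that benchmarked just as fast, using `std::collections::BTreeMap` instead of the crate's `AugmentedBTree` and with assumed count semantics:

```rust
use std::collections::BTreeMap;

/// Moves `count` occurrences from `old_pos` to `new_pos` as two separate map
/// operations. On the real `AugmentedBTree` this costs two tree traversals,
/// which benchmarks showed to be no slower than a fused "shift" operation.
fn shift_entry(counts: &mut BTreeMap<u32, u64>, old_pos: u32, new_pos: u32, count: u64) {
    if let Some(c) = counts.get_mut(&old_pos) {
        *c = c.saturating_sub(count);
        if *c == 0 {
            counts.remove(&old_pos);
        }
    }
    *counts.entry(new_pos).or_insert(0) += count;
}

fn main() {
    let mut counts = BTreeMap::from([(10, 3), (20, 1)]);
    shift_entry(&mut counts, 10, 12, 2);
    assert_eq!(counts, BTreeMap::from([(10, 1), (12, 2), (20, 1)]));
}
```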
This will allow the pybindings for `rate_distortion_quantization` to reuse most of the code of the pybindings for `vbq`.
Needs documentation.
These should always have been generic over `CAP`; it was just an oversight that they were only implemented for the default `CAP`.
Includes unit tests for both Rust and Python, as well as Python API documentation.
Suppresses lots of "unused code" warnings when running Miri tests.
As I understand it, this should be better because it keeps transferring ownership instead of creating temporary references, which would temporarily increase the reference count.
Implement a fast Variational Bayesian Quantization (VBQ) algorithm with a dynamically adjusting empirical prior.
See:
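For orientation only, here is a rough, self-contained sketch of the kind of quantization loop this describes. All names, the candidate set, and the exact objective below are illustrative assumptions, not the API introduced by this PR (which lives in `constriction::quant` and its pybindings):

```rust
use std::collections::BTreeMap;

// Rough conceptual sketch (not the PR's actual implementation): quantize each
// value to a nearby grid point by trading off distortion against an estimated
// rate, where the rate estimate comes from an empirical prior over the already
// quantized values. The prior adjusts dynamically: whenever a value is
// quantized, its mass moves from its old position to its new one. The real
// implementation backs this prior with an `AugmentedBTree`-based
// `EmpiricalDistribution` so that updates and quantile queries stay cheap.
fn quantize_in_place(values: &mut [f64], beta: f64) {
    let mut prior: BTreeMap<i64, u64> = BTreeMap::new();
    for &x in values.iter() {
        *prior.entry(x.round() as i64).or_insert(0) += 1;
    }
    let total = values.len() as f64;

    for x in values.iter_mut() {
        let old_key = x.round() as i64;

        // Candidate grid points near x (a crude stand-in for candidates that a
        // real implementation would derive from the prior's quantile function).
        let best = (old_key - 2..=old_key + 2)
            .map(|q| {
                let count = *prior.get(&q).unwrap_or(&0) as f64;
                // Rate term (with add-one smoothing): frequent grid points are
                // cheaper to encode.
                let rate = -((count + 1.0) / (total + 1.0)).log2();
                let distortion = (*x - q as f64).powi(2);
                (q, distortion + beta * rate)
            })
            .min_by(|a, b| a.1.partial_cmp(&b.1).unwrap())
            .map(|(q, _)| q)
            .unwrap();

        // Dynamically adjust the empirical prior: move this value's mass from
        // its old position to the chosen grid point.
        if let Some(c) = prior.get_mut(&old_key) {
            *c -= 1;
            if *c == 0 {
                prior.remove(&old_key);
            }
        }
        *prior.entry(best).or_insert(0) += 1;

        *x = best as f64;
    }
}

fn main() {
    let mut values = vec![0.1, 0.2, 1.9, 2.1, 2.0, -0.05];
    quantize_in_place(&mut values, 0.1);
    println!("{values:?}");
}
```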