feat(iter): Add next_many() for batch iteration #346
Zeutschler wants to merge 6 commits into RoaringBitmap:main
Kerollmops
left a comment
Hey @Zeutschler 👋
That's a great addition to the crate 🎉
I took a look at the usage, and it seems to me that returning a &[u32] would be preferable: I often see the result used like &[..n], which looks like a source of errors to me. Anyone who wants the count can use the []::len() method, or the []::is_empty() method to check whether the batch is empty.
Would you mind changing the signature of the Iter::next_many method, and of the internal methods too? Would it also be possible to add more proptests that check the interaction between the classic Iterator::next and the new next_many methods, please?
I am also wondering whether we want to introduce a trait rather than raw methods: we could implement it on the different structs, and people could use trait constraints instead 🤔 What do you think?
Thank you and have a nice day 🌵
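To illustrate the review point about the return type, here is a minimal sketch contrasting a count-returning signature with a slice-returning one, and interleaving both with Iterator::next. ToyIter and next_many_count are hypothetical stand-ins written for this example; they are not the crate's types or the PR's implementation.

```rust
/// Hypothetical stand-in for a bitmap iterator: yields the values of a sorted Vec.
struct ToyIter {
    values: Vec<u32>,
    pos: usize,
}

impl Iterator for ToyIter {
    type Item = u32;

    fn next(&mut self) -> Option<u32> {
        let v = self.values.get(self.pos).copied()?;
        self.pos += 1;
        Some(v)
    }
}

impl ToyIter {
    /// Count-returning variant: the caller has to remember to slice `dst[..n]`,
    /// otherwise stale values past `n` leak into the result.
    fn next_many_count(&mut self, dst: &mut [u32]) -> usize {
        let remaining = &self.values[self.pos..];
        let n = remaining.len().min(dst.len());
        dst[..n].copy_from_slice(&remaining[..n]);
        self.pos += n;
        n
    }

    /// Slice-returning variant: only the filled prefix of `dst` is exposed.
    fn next_many<'a>(&mut self, dst: &'a mut [u32]) -> &'a [u32] {
        let n = self.next_many_count(dst);
        &dst[..n]
    }
}

fn main() {
    let mut it = ToyIter { values: vec![1, 5, 9, 12, 40], pos: 0 };
    let mut buf = [0u32; 3];

    // Single-step and batch calls share the same cursor.
    assert_eq!(it.next(), Some(1));
    assert_eq!(it.next_many(&mut buf), [5, 9, 12]);
    assert_eq!(it.next(), Some(40));

    // Exhaustion is visible through the slice itself.
    assert!(it.next_many(&mut buf).is_empty());
}
```

With the slice-returning form, the length and emptiness checks come from the returned slice, so there is no separate count to keep in sync with the buffer.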
Add next_many() method to RoaringBitmap iterators (both Iter and IntoIter) for efficient batch extraction of values into a buffer.

This method is significantly faster than calling next() repeatedly:
- 2.5-3.1x speedup for dense bitmaps (bitmap storage)
- 10.5x speedup for sparse arrays (array storage)

Implementation details:
- Store::Iter: Direct slice copy for Array/Vec, bit extraction for Bitmap, run expansion for Interval stores
- Container::Iter: Uses u16 buffer internally, combines with container key
- Bitmap::Iter/IntoIter: Handles front/back iterators and container transitions

The API mirrors the next_many() method available in CRoaring and the Go implementation of RoaringBitmap.

Benchmarks (1M dense values):
  next(): 19.02ms
  next_many(): 6.16ms (3.09x faster)

Benchmarks (10K sparse values, every 100th):
  next(): 1.67ms
  next_many(): 159.58µs (10.46x faster)
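Two of the steps named in the commit message, bit extraction for bitmap stores and combining u16 values with the container key, can be sketched roughly as follows. The function names extract_word and combine_with_key are assumptions made for this illustration and are not taken from the PR's code; the sketch assumes the usual roaring layout where the key holds the high 16 bits of each value.

```rust
/// Extract the set bits of one 64-bit bitmap word into `dst` as u16 indices.
/// `base` is the container index of the word's lowest bit (word_index * 64).
/// Returns how many values were written and the bits that did not fit,
/// so a caller could resume from the leftover word on the next batch.
fn extract_word(mut word: u64, base: u16, dst: &mut [u16]) -> (usize, u64) {
    let mut n = 0;
    while word != 0 && n < dst.len() {
        let bit = word.trailing_zeros() as u16;
        dst[n] = base + bit;
        n += 1;
        word &= word - 1; // clear the lowest set bit
    }
    (n, word)
}

/// Combine a container key (high 16 bits) with low 16-bit values into u32s.
/// Returns the number of values written to `dst`.
fn combine_with_key(key: u16, lows: &[u16], dst: &mut [u32]) -> usize {
    let n = lows.len().min(dst.len());
    for (out, &low) in dst[..n].iter_mut().zip(lows) {
        *out = (u32::from(key) << 16) | u32::from(low);
    }
    n
}

fn main() {
    // Second word of a container (base 64) with bits 0, 5 and 7 set.
    let mut lows = [0u16; 8];
    let (n, leftover) = extract_word(0b1010_0001, 64, &mut lows);
    assert_eq!(&lows[..n], [64, 69, 71]);
    assert_eq!(leftover, 0);

    // Container key 2 contributes 2 << 16 = 131072 to every value.
    let mut out = [0u32; 8];
    let m = combine_with_key(2, &lows[..n], &mut out);
    assert_eq!(&out[..m], [131136, 131141, 131143]);
}
```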
871283d to 9e54232
9e54232 to 2655748
For CRoaring-rs, I did next_many(&mut self, dst: &mut [u32]) -> usize. I used
I think this should return
Hey @Dr-Emann 👋 Long time no see! Thanks again for the review.
I changed the API based on your feedback and modified the implementation of the
After these improvements, @Zeutschler, you would probably want to rerun your quick benchmarks 👀
Have a nice day 🌵
Summary
Add next_many(&mut self, dst: &mut [u32]) -> &[u32] method to Iter and IntoIter for efficient batch extraction of bitmap values into a user-provided buffer.
Motivation
When iterating over large bitmaps, calling next() repeatedly incurs significant per-element overhead. The next_many() method extracts multiple values at once, enabling:
This API mirrors the next_many() method available in CRoaring and the Go implementation of RoaringBitmap.
Performance
next() next_many()
API
Returns an immutable slice of the dst buffer. Returns an empty slice when the iterator is exhausted.
Implementation Details
DoubleEndedIterator and container transitions
Checklist
cargo test)
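The contract stated in the API section (a slice of dst is returned, and it is empty once the iterator is exhausted) suggests a simple drain loop on the caller's side. The sketch below uses a hypothetical RangeBatches type as a stand-in for the real iterator, with the same signature as in the summary; the buffer size of 256 is an arbitrary choice.

```rust
/// Hypothetical stand-in that yields the values next..end in batches.
/// Assumes `next <= end`.
struct RangeBatches {
    next: u32,
    end: u32,
}

impl RangeBatches {
    /// Same shape as the signature in the summary: fills a prefix of `dst`
    /// and returns it; the returned slice is empty once exhausted.
    fn next_many<'a>(&mut self, dst: &'a mut [u32]) -> &'a [u32] {
        let n = ((self.end - self.next) as usize).min(dst.len());
        for (i, slot) in dst[..n].iter_mut().enumerate() {
            *slot = self.next + i as u32;
        }
        self.next += n as u32;
        &dst[..n]
    }
}

/// Drain the iterator through a fixed-size, caller-owned buffer.
fn sum_all(mut it: RangeBatches) -> u64 {
    let mut buf = [0u32; 256];
    let mut total = 0u64;
    loop {
        let batch = it.next_many(&mut buf);
        if batch.is_empty() {
            break; // exhausted
        }
        total += batch.iter().map(|&v| u64::from(v)).sum::<u64>();
    }
    total
}

fn main() {
    // 0 + 1 + ... + 999 = 499500
    assert_eq!(sum_all(RangeBatches { next: 0, end: 1000 }), 499_500);
}
```

Because the buffer is reused across iterations, the loop allocates nothing after setup, which is the usage pattern the batch API is meant to serve.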