feat(iter): Add next_many() for batch iteration #346

Open
Zeutschler wants to merge 6 commits into RoaringBitmap:main from Zeutschler:feature/next-many

@Zeutschler Zeutschler commented Jan 18, 2026

Summary

Add a next_many(&mut self, dst: &mut [u32]) -> &[u32] method to Iter and IntoIter for efficient batch extraction of bitmap values into a user-provided buffer.

Motivation

When iterating over large bitmaps, calling next() repeatedly incurs significant per-element overhead. The next_many() method extracts multiple values at once, enabling:

  • Reduced function call overhead
  • Better cache locality with contiguous buffer writes
  • ILP-friendly processing of batched results

This API mirrors the next_many() method available in CRoaring and the Go implementation of RoaringBitmap.

Performance

| Benchmark | next() | next_many() | Speedup |
|---|---|---|---|
| Dense (1M values, bitmap storage) | 19.02ms | 6.16ms | 3.09x |
| Sparse (10K values, array storage) | 1.67ms | 159.58µs | 10.46x |

API

let mut iter = bitmap.iter();
let mut buf = [0u32; 1024];
loop {
    let out = iter.next_many(&mut buf);
    if out.is_empty() { break; }
    // Process out
}

Returns an immutable slice of the dst buffer holding the values written by this call. Returns an empty slice when the iterator is exhausted.

Implementation Details

  • Store::Iter: Direct slice copy for Array/Vec storage, bit extraction for Bitmap storage, run expansion for Interval stores
  • Container::Iter: Uses u16 buffer internally, combines with container key to produce u32 values
  • Bitmap::Iter/IntoIter: Handles front/back iterators from DoubleEndedIterator and container transitions
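The store-level strategies and the key combination listed above could look roughly like the sketch below. The helper names (`join`, `copy_array`, `extract_bits`, `expand_run`) and their signatures are assumptions made for illustration and are not taken from this PR's code. For brevity, none of them remembers where it stopped; a real iterator would have to keep that state between calls.

```rust
/// Hypothetical helper: combine a container key (high 16 bits) with a
/// low 16-bit value to form the final u32.
fn join(key: u16, low: u16) -> u32 {
    (u32::from(key) << 16) | u32::from(low)
}

/// Array-style storage: values are already sorted u16s, so a batch is a
/// plain slice copy. Returns how many values were written.
fn copy_array(values: &[u16], dst: &mut [u16]) -> usize {
    let n = values.len().min(dst.len());
    dst[..n].copy_from_slice(&values[..n]);
    n
}

/// Bitmap-style storage: walk the set bits of each 64-bit word.
/// Returns how many values were written.
fn extract_bits(words: &[u64], dst: &mut [u16]) -> usize {
    let mut n = 0;
    for (i, &word) in words.iter().enumerate() {
        let mut w = word;
        while w != 0 {
            if n == dst.len() {
                return n; // destination buffer is full
            }
            let bit = w.trailing_zeros() as usize;
            dst[n] = (i * 64 + bit) as u16;
            n += 1;
            w &= w - 1; // clear the lowest set bit
        }
    }
    n
}

/// Run-style storage: expand the inclusive run [start, end] into values.
/// Returns how many values were written.
fn expand_run(start: u16, end: u16, dst: &mut [u16]) -> usize {
    let mut n = 0;
    for (slot, v) in dst.iter_mut().zip(start..=end) {
        *slot = v;
        n += 1;
    }
    n
}
```

A container-level iterator would then map each extracted `u16` through `join(key, low)` to produce the `u32` values handed back to the caller.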

Checklist

  • Tests pass (cargo test)
  • No breaking changes
  • Backward compatible

Member

@Kerollmops Kerollmops left a comment

Hey @Zeutschler 👋

That's a great addition to the crate 🎉

I took a look at the usage, and it seems to me that returning a &[u32] would be preferable, as I often see the result used like &[..n], which looks like a source of errors to me. If anyone wants the number, they can use the []::len() method, or the []::is_empty() method to check whether it's empty.

Would you mind changing the signature of the Iter::next_many method and of the internal methods, too? Would it also be possible to add more proptests to check the interaction between the classic Iterator::next and the new next_many methods, please?
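One way such an interaction check could be shaped is sketched below. It avoids external crates, so `ToyIter` is a stand-in for the crate's iterator and a small LCG replaces the generators that proptest would provide; both are assumptions for illustration only. The property is that any interleaving of `next()` and `next_many()` yields exactly the same sequence as plain iteration.

```rust
/// Toy stand-in for a bitmap iterator; NOT the crate's `Iter` type.
struct ToyIter {
    values: Vec<u32>,
    pos: usize,
}

impl Iterator for ToyIter {
    type Item = u32;

    fn next(&mut self) -> Option<u32> {
        let v = self.values.get(self.pos).copied()?;
        self.pos += 1;
        Some(v)
    }
}

impl ToyIter {
    /// Copies up to `dst.len()` remaining values and returns the written prefix.
    fn next_many<'a>(&mut self, dst: &'a mut [u32]) -> &'a [u32] {
        let rest = &self.values[self.pos..];
        let n = rest.len().min(dst.len());
        dst[..n].copy_from_slice(&rest[..n]);
        self.pos += n;
        &dst[..n]
    }
}

/// Interleaves `next()` and `next_many()` following a deterministic
/// pseudo-random schedule and collects everything that was yielded.
fn interleaved(values: &[u32], mut seed: u64) -> Vec<u32> {
    let mut iter = ToyIter { values: values.to_vec(), pos: 0 };
    let mut buf = [0u32; 7];
    let mut out = Vec::new();
    loop {
        // Small LCG so the sketch needs no external crates.
        seed = seed
            .wrapping_mul(6364136223846793005)
            .wrapping_add(1442695040888963407);
        let choice = (seed >> 33) as usize % 8;
        if choice == 0 {
            // Single-element step.
            match iter.next() {
                Some(v) => out.push(v),
                None => break,
            }
        } else {
            // Batch step with a buffer of 1 to 7 elements.
            let got = iter.next_many(&mut buf[..choice]);
            if got.is_empty() {
                break;
            }
            out.extend_from_slice(got);
        }
    }
    out
}
```

In a real proptest, the input values and the schedule of calls would come from generated strategies instead of a fixed seed loop.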

I am wondering whether we should introduce a trait rather than raw methods: we could implement it on the different structs, and people could use trait bounds instead 🤔 What do you think?
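To make the trait idea concrete, here is a minimal sketch of what it might look like. The trait name `NextMany`, the `Counter` implementor, and the `sum_all` consumer are all hypothetical and exist only for this illustration; nothing here reflects an API the crate has agreed on.

```rust
/// Hypothetical trait; the name and shape are illustrative only.
trait NextMany {
    /// Fills a prefix of `dst` and returns it; empty once exhausted.
    fn next_many<'a>(&mut self, dst: &'a mut [u32]) -> &'a [u32];
}

/// Toy implementor yielding `next..end`; stands in for a bitmap iterator.
struct Counter {
    next: u32,
    end: u32,
}

impl NextMany for Counter {
    fn next_many<'a>(&mut self, dst: &'a mut [u32]) -> &'a [u32] {
        let mut n = 0;
        while n < dst.len() && self.next < self.end {
            dst[n] = self.next;
            self.next += 1;
            n += 1;
        }
        &dst[..n]
    }
}

/// Generic consumer written against the trait bound instead of a
/// concrete iterator type.
fn sum_all<I: NextMany>(mut iter: I) -> u64 {
    let mut buf = [0u32; 256];
    let mut total = 0u64;
    loop {
        let chunk = iter.next_many(&mut buf);
        if chunk.is_empty() {
            return total;
        }
        total += chunk.iter().map(|&v| u64::from(v)).sum::<u64>();
    }
}
```

With this shape, Iter and IntoIter would each implement the trait, and callers could write code such as `sum_all` once for both.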

Thank you and have a nice day 🌵

Add next_many() method to RoaringBitmap iterators (both Iter and IntoIter)
for efficient batch extraction of values into a buffer.

This method is significantly faster than calling next() repeatedly:
- 2.5-3.1x speedup for dense bitmaps (bitmap storage)
- 10.5x speedup for sparse arrays (array storage)

Implementation details:
- Store::Iter: Direct slice copy for Array/Vec, bit extraction for Bitmap,
  run expansion for Interval stores
- Container::Iter: Uses u16 buffer internally, combines with container key
- Bitmap::Iter/IntoIter: Handles front/back iterators and container transitions

The API mirrors the next_many() method available in CRoaring and the Go
implementation of RoaringBitmap.

Benchmarks (1M dense values):
  next():      19.02ms
  next_many(): 6.16ms (3.09x faster)

Benchmarks (10K sparse values, every 100th):
  next():      1.67ms
  next_many(): 159.58µs (10.46x faster)
@Dr-Emann
Member

For CRoaring-rs, I did next_many(&mut self, dst: &mut [u32]) -> usize.

I used -> usize for consistency with the Read trait, but thinking about the differences, here's what I came up with:

  • Before NLL, returning a slice could be more painful, but that's no longer a concern
  • In a trait, especially one often used so close to unsafe, it's nice that the API requires that the bytes be written to the provided slice, without an offset, etc. This doesn't really apply in this case.

I think this should return -> &'a mut [u32] instead of an immutable slice, though, so that we don't unnecessarily lose mutability.
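A minimal sketch of the mutable-slice variant, assuming the returned lifetime is tied to `dst` rather than to the iterator. `Evens` and `collect_shifted` are toy names invented for this example and are not part of either crate.

```rust
/// Toy batch source yielding even numbers below `end`; not the crate's
/// iterator type.
struct Evens {
    next: u32,
    end: u32,
}

impl Evens {
    /// Returns the written prefix of `dst` as a mutable slice, so the
    /// caller can transform the batch in place.
    fn next_many<'a>(&mut self, dst: &'a mut [u32]) -> &'a mut [u32] {
        let mut n = 0;
        while n < dst.len() && self.next < self.end {
            dst[n] = self.next;
            self.next += 2;
            n += 1;
        }
        &mut dst[..n]
    }
}

/// Collects every value plus `offset`, mutating each batch in place
/// before copying it out.
fn collect_shifted(end: u32, offset: u32) -> Vec<u32> {
    let mut iter = Evens { next: 0, end };
    let mut buf = [0u32; 4];
    let mut out = Vec::new();
    loop {
        let chunk = iter.next_many(&mut buf);
        if chunk.is_empty() {
            return out;
        }
        // Only possible because the returned slice is mutable.
        for v in chunk.iter_mut() {
            *v += offset;
        }
        out.extend_from_slice(chunk);
    }
}
```

Callers that only need to read the batch are unaffected, since a `&mut [u32]` can be used wherever a `&[u32]` is expected.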

@Kerollmops
Member

Kerollmops commented Feb 26, 2026

Hey @Dr-Emann 👋

Long time no see! Thanks again for the review. I changed the API based on your feedback and modified the implementation of the next_many container methods to avoid copying too much between the u16 arrays and the u32 array output. I used the util::join method. What do you think?

After these improvements, @Zeutschler, you would probably want to rerun your quick benchmarks 👀

Have a nice day 🌵

@Kerollmops Kerollmops self-assigned this Feb 26, 2026
@Kerollmops Kerollmops self-requested a review February 26, 2026 11:13