Skip to content

feat: dynamic stream buffer#79

Merged
mdsteele merged 5 commits intomdsteele:masterfrom
francisdb:feat/auto_grow_buffer
Feb 4, 2026
Merged

feat: dynamic stream buffer#79
mdsteele merged 5 commits intomdsteele:masterfrom
francisdb:feat/auto_grow_buffer

Conversation

@francisdb
Copy link
Contributor

@francisdb francisdb commented Feb 2, 2026

This pull request introduces a new dynamically growing buffer implementation for streams, replacing the previous fixed-size buffer approach.

This introduces a breaking change as the options on the Stream have been moved to the CompundFile. Also you now configuring max_buffer_size instead of buffer_size with a default of 1 MiB.

I ran the original benchmark and with buffer size ignored there were only improvements, no regressions. So I removed the specific benchmarks per buffer size.

@francisdb
Copy link
Contributor Author

francisdb commented Feb 2, 2026

@mdsteele that max_buffer_size would probably make more sense to be configured for the whole cfb file instead of per stream. This for memory-constrained environments. What are your thoughts?

@mdsteele
Copy link
Owner

mdsteele commented Feb 2, 2026

Thanks for working on this!

If this is a better/more-performant design, then I think a breaking change here makes sense; the API that would get broken was only just recently published, so probably not a lot of dependencies on it yet, and anyway this is a pre-1.0 crate.

Making this a per-CompoundFile setting seems reasonable to me too. In that case, my initial gut is we should:

  1. Just remove CreateStreamOptions and OpenStreamOptions (and their associated methods) entirely
  2. Add a new OpenOptions for CompoundFile, with max_buffer_size and strict settings, leaving CompoundFile::open and CompoundFile::open_strict as convenience methods
  3. Ideally, make cfb::OpenOptions be as similar as possible to std::fs::OpenOptions, which I guess would mean putting the open() method on cfb::OpenOptions rather than adding a CompoundFile::open_with_options method? Although maybe we'd want two methods, one that takes an F and a convenience one that takes an AsRef<Path>, not sure.

But I'm totally open to something else if you suggest otherwise.

@francisdb
Copy link
Contributor Author

@mdsteele want me to split up to OpenOptions and CreateOptions or do you think it's ok to do open_options.create() and having the strict being ignored?

@francisdb francisdb force-pushed the feat/auto_grow_buffer branch from 63617b2 to 8467e5e Compare February 2, 2026 20:58
@francisdb francisdb force-pushed the feat/auto_grow_buffer branch from 8467e5e to d0357f5 Compare February 3, 2026 09:02
@francisdb
Copy link
Contributor Author

francisdb commented Feb 3, 2026

Looks like this caused a regression on the read side. Adding a read benchmark.

@francisdb
Copy link
Contributor Author

All done. Question on the single or split options remains.

@francisdb francisdb force-pushed the feat/auto_grow_buffer branch from 005e1f7 to 69a8c72 Compare February 3, 2026 17:03
@mdsteele
Copy link
Owner

mdsteele commented Feb 3, 2026

@mdsteele want me to split up to OpenOptions and CreateOptions or do you think it's ok to do open_options.create() and having the strict being ignored?

Oh, hmm. I think effectively ignoring the strict setting on create() is probably fine; the crate always tries to be strict when creating files anyway. Maybe there could be other options we'd add in the future that would only make sense for one or the other? I guess if necessary we could just make it an error to call, say, open() when using an option that really only makes sense for create().

@francisdb
Copy link
Contributor Author

Want me to squash everything?

@mdsteele
Copy link
Owner

mdsteele commented Feb 4, 2026

Looks great, thanks

@mdsteele mdsteele merged commit 317049d into mdsteele:master Feb 4, 2026
4 checks passed
@francisdb francisdb deleted the feat/auto_grow_buffer branch February 4, 2026 17:49
@francisdb
Copy link
Contributor Author

@mdsteele mind releasing this?
I'm switching to other projects for now. Further optimizations will be for later. (tree balancing, tree node name caching, iteration improvements).

@mdsteele
Copy link
Owner

Sounds good, thanks. Published as v0.14.0.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants