-
Notifications
You must be signed in to change notification settings - Fork 0
Migrate mmap serialization from bincode to rkyv with zero-copy deserialization #21
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
- Add rkyv 0.8.14 dependency with Archive/Serialize/Deserialize support - Create ScanResultRkyv type with zero-copy optimizations - Fix Duration conversion (u64 instead of u128 to avoid truncation) - Improve timestamp error handling (expect instead of unwrap_or) - Update mmap_writer with improved error handling and alignment docs - Update mmap_reader with zero-copy deserialization and error logging - Add LENGTH_PREFIX for proper entry boundary detection - All mmap tests passing (13/13) Co-authored-by: doublegate <6858123+doublegate@users.noreply.github.com>
- Add clamping to u64::MAX to prevent overflow - Document that u64 can represent up to ~584 years - Add safety comment for Duration::from_nanos - Address code review feedback on potential data loss Co-authored-by: doublegate <6858123+doublegate@users.noreply.github.com>
Codecov Report❌ Patch coverage is 📢 Thoughts on this report? Let us know! |
|
@copilot fix the failing 'Security Audit' check ... it looks like it's still trying to use bincode |
Remove bincode from workspace and prtip-scanner Cargo.toml files since it has been fully replaced by rkyv. This fixes the Security Audit CI check that was still detecting the unused bincode dependency. Co-authored-by: doublegate <6858123+doublegate@users.noreply.github.com>
doublegate
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed / Approved - final removal of bincode entries -- DG 1/24
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
This PR migrates the memory-mapped scan result serialization from bincode to rkyv 0.8.14, enabling zero-copy use of the mmap buffer (no intermediate copy for alignment) and removing the deprecated bincode dependency from the workspace. It introduces a dedicated ScanResultRkyv archive type, adds a length-prefixed fixed-size entry format for mmap, and wires rkyv into the core and scanner crates.
Changes:
- Replace
bincodewithrkyvfor mmap serialization/deserialization inMmapResultWriterandMmapResultReader, including length-prefix handling, improved error mapping/logging, and explicit alignment documentation/assertions. - Introduce
ScanResultRkyvinprtip-corewith safe conversions to/fromScanResult, including clampedDurationnanoseconds and stricter timestamp handling, and re-export it fromprtip-core. - Remove
bincodefrom the workspace and crate manifests and addrkyv(withstdandallocfeatures), updatingCargo.lockaccordingly.
Reviewed changes
Copilot reviewed 7 out of 8 changed files in this pull request and generated 1 comment.
Show a summary per file
| File | Description |
|---|---|
crates/prtip-scanner/src/output/mmap_writer.rs |
Switches writer to ScanResultRkyv + rkyv::to_bytes, adds an 8-byte length prefix per entry, documents and asserts alignment, and keeps zero-filling within the fixed 512-byte entry size. |
crates/prtip-scanner/src/output/mmap_reader.rs |
Switches reader to use a length prefix and rkyv::from_bytes (no extra Vec copy), with validation and detailed error logging for invalid lengths or deserialization failures. |
crates/prtip-scanner/Cargo.toml |
Replaces the bincode dependency with a workspace-scoped rkyv dependency for the scanner crate. |
crates/prtip-core/src/types.rs |
Adds the ScanResultRkyv archive struct and From conversions to/from ScanResult, handling IP address encoding, PortState mapping, clamped Duration to u64 nanos, and strict nanosecond timestamps. |
crates/prtip-core/src/lib.rs |
Re-exports ScanResultRkyv so scanner and other crates can depend on it via the prtip-core public API. |
crates/prtip-core/Cargo.toml |
Adds rkyv as a dependency for the core crate to support deriving Archive/Serialize/Deserialize on ScanResultRkyv. |
Cargo.toml |
Removes the workspace bincode dependency and adds workspace-level rkyv = 0.8.14 with std and alloc features for zero-copy serialization support. |
Cargo.lock |
Removes bincode and adds rkyv and its transitive dependencies (bytecheck, rancor, rend, etc.), reflecting the new serialization stack. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| // Read length prefix (u64 in little-endian) | ||
| let len = u64::from_le_bytes( | ||
| entry_bytes[..LENGTH_PREFIX_SIZE] | ||
| .try_into() | ||
| .expect("LENGTH_PREFIX_SIZE is 8 bytes"), | ||
| ) as usize; | ||
|
|
||
| // Validate length | ||
| if len == 0 || len + LENGTH_PREFIX_SIZE > self.entry_size { | ||
| eprintln!( | ||
| "MmapResultReader: invalid entry length {} at index {}", | ||
| len, index | ||
| ); | ||
| return None; | ||
| } | ||
|
|
||
| // Use zero-copy deserialization without unnecessary allocation | ||
| let data_bytes = &entry_bytes[LENGTH_PREFIX_SIZE..LENGTH_PREFIX_SIZE + len]; | ||
| match rkyv::from_bytes::<ScanResultRkyv, rkyv::rancor::Error>(data_bytes) { | ||
| Ok(rkyv_result) => Some(ScanResult::from(rkyv_result)), |
Copilot
AI
Jan 25, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Switching the per-entry encoding from raw bincode bytes to rkyv with a length prefix changes the on-disk format, but the header version field is still hard-coded to 1 and the reader only checks for version == 1 before assuming the new layout. Any existing mmap files written by the previous bincode-based implementation (also with version 1 and the same entry size) will now be misinterpreted: get_entry will read a bogus length from the old data and then either log deserialization errors or silently return None for valid old entries. To avoid breaking compatibility, please either (a) bump the header version and teach the reader to reject old files with a clear error, or (b) keep version 1 as the bincode layout and introduce a new version for the rkyv+length-prefix layout, with conditional decoding logic for both formats.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@copilot apply changes based on this feedback
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed in commit 0fbf687. Bumped file format version from 1 to 2 to distinguish the new rkyv+length-prefix encoding from the old bincode format. Version 1 files are now explicitly rejected with a clear error message: "Incompatible file format: version 1 (bincode) is no longer supported..." This prevents silent data corruption. Added test to verify the rejection.
Increment file format version from 1 to 2 to distinguish the new rkyv+length-prefix encoding from the old bincode format. Version 1 files are now explicitly rejected with a clear error message directing users to regenerate their scan results. This prevents silent data corruption when attempting to read old bincode files with the new rkyv deserialization code. - Writer: Changed header version from 1 to 2 with explanatory comment - Reader: Added version check that rejects v1 files with helpful error message - Added test to verify version 1 files are properly rejected Co-authored-by: doublegate <6858123+doublegate@users.noreply.github.com>
rkyv Migration with Review Comments Applied ✅
Summary
Successfully migrated memory-mapped scan result serialization from bincode to rkyv (v0.8.14) for improved performance and zero-copy deserialization. All review comments from PR #20 have been proactively addressed and validated.
Checklist
Review Comments Addressed
Recent Fixes
Version Compatibility
Test Results
Security
Performance Benefits
Original prompt
This pull request was created from Copilot chat.
💬 We'd love your input! Share your thoughts on Copilot coding agent in our 2 minute survey.