Skip to content

Conversation

Copy link
Contributor

Copilot AI commented Jan 24, 2026

rkyv Migration with Review Comments Applied ✅

Summary

Successfully migrated memory-mapped scan result serialization from bincode to rkyv (v0.8.14) for improved performance and zero-copy deserialization. All review comments from PR #20 have been proactively addressed and validated.

Checklist

  • Add rkyv dependency to workspace Cargo.toml
  • Create ScanResultRkyv type in prtip-core/src/types.rs with proper conversions
  • Update mmap_writer.rs to use rkyv serialization with improved error handling
  • Update mmap_reader.rs to use zero-copy rkyv deserialization with error logging
  • Fix Duration conversion to avoid truncation (u64 instead of u128)
  • Improve timestamp error handling (expect instead of unwrap_or)
  • Document alignment requirements in mmap_writer.rs
  • Verify IpAddr Archive support (manual byte array conversion used)
  • Address code review feedback on Duration conversion safety
  • Remove bincode dependency completely (fixes Security Audit)
  • Bump format version to prevent bincode compatibility issues
  • Run comprehensive tests (all passing including version check)
  • Security checks (no vulnerabilities found)
  • Code formatting verification (passed)
  • Clippy checks (passed with no warnings)

Review Comments Addressed

  1. Zero-copy deserialization - No to_vec allocation, using rkyv::from_bytes directly
  2. Error logging - Added eprintln with index, length, and error details
  3. Alignment documentation - Added docs and compile-time assertions
  4. Duration conversion - Using u64 with clamping to prevent overflow/truncation
  5. Timestamp error handling - Using expect with descriptive message
  6. IpAddr Archive support - Manual byte array conversion (16 bytes for IPv6 compat)
  7. Error conversion - Using debug formatting for detailed error messages
  8. Format version compatibility - Bumped to v2 to reject old bincode files

Recent Fixes

  • Removed bincode dependency - Cleaned up workspace and scanner Cargo.toml files
  • Updated Cargo.lock - Removed bincode v1.3.3 from dependency tree
  • Version bump to 2 - Prevents silent corruption when reading old v1 (bincode) files
  • Clear error messages - Version 1 files rejected with helpful migration guidance

Version Compatibility

  • Version 2 (current): rkyv format with 8-byte length prefix (new)
  • Version 1 (deprecated): bincode format (no longer supported)
  • Files created with version 1 are explicitly rejected with error:

    "Incompatible file format: version 1 (bincode) is no longer supported.
    This file was created with an older version of the scanner.
    Please regenerate scan results with the current version (rkyv format, version 2)."

Test Results

  • ✅ All 14 mmap tests passing (including new version 1 rejection test)
  • ✅ All 16 output tests passing
  • ✅ 208 network tests passing
  • ✅ Workspace builds successfully
  • ✅ Clippy passes with no warnings
  • ✅ Code formatting verified
  • ✅ Bincode completely removed from dependency tree

Security

  • ✅ No vulnerabilities found in rkyv 0.8.14 dependency
  • ✅ Bincode dependency removed (no longer referenced)
  • ✅ Security Audit passes
  • ✅ Version mismatch prevents silent data corruption

Performance Benefits

  • Zero-copy deserialization eliminates unnecessary allocations
  • rkyv's direct memory access improves read performance
  • Maintains 20-50% RAM reduction from mmap design
  • Fixed-size entries (512 bytes) enable O(1) random access
Original prompt

Address Review Comments from PR #20 - rkyv Migration

Apply the following improvements based on code review feedback for the bincode to rkyv migration:

1. Improve deserialization performance (mmap_reader.rs)

Issue: The to_vec() copy undermines rkyv's zero-copy design.

Change in crates/prtip-scanner/src/output/mmap_reader.rs around line 88-95:

  • Remove the unnecessary aligned_data: Vec<u8> = entry_bytes.to_vec() allocation
  • Use direct deserialization: rkyv::from_bytes::<ScanResultRkyv, rkyv::rancor::Error>(entry_bytes)
  • This maintains zero-copy benefits and improves performance

2. Add error logging for deserialization failures (mmap_reader.rs)

Issue: Errors are silently swallowed, making debugging difficult.

Change in crates/prtip-scanner/src/output/mmap_reader.rs around line 95:
Replace Err(_) => None with:

Err(e) => {
    eprintln!(
        "MmapResultReader: failed to deserialize entry at index {} with length {}: {}",
        index,
        len,
        e
    );
    None
}

3. Document and verify alignment requirements (mmap_writer.rs)

Issue: The 8-byte LENGTH_PREFIX_SIZE may be insufficient for rkyv's alignment needs.

Change in crates/prtip-scanner/src/output/mmap_writer.rs around line 15:

  • Add a compile-time assertion or runtime check to verify ENTRY_SIZE provides adequate alignment
  • Add documentation about alignment requirements
  • Consider ensuring ENTRY_SIZE is a multiple of 16 bytes for rkyv's typical alignment needs

4. Fix Duration conversion truncation (types.rs)

Issue: Converting u128 to u64 in Duration::from_nanos() can cause data loss.

Change in crates/prtip-core/src/types.rs around line 517:

  • Store response_time_nanos as u64 instead of u128 in ScanResultRkyv
  • Update the conversion in From<&ScanResult> to cast to u64: response_time_nanos: result.response_time.as_nanos() as u64
  • Update the conversion in From<ScanResultRkyv> to use the u64 directly without casting

5. Improve timestamp error handling (types.rs)

Issue: unwrap_or(0) silently converts invalid timestamps to epoch zero.

Change in crates/prtip-core/src/types.rs around line 502:
Replace:

timestamp_nanos: result.timestamp.timestamp_nanos_opt().unwrap_or(0),

With:

timestamp_nanos: result
    .timestamp
    .timestamp_nanos_opt()
    .expect("timestamp out of range for nanosecond representation"),

6. Verify IpAddr Archive support (types.rs)

Issue: IpAddr may not implement Archive trait without additional features.

Action:

  • Verify the code compiles and IpAddr is properly supported by rkyv 0.8.14
  • If compilation fails, implement a manual conversion using a byte array representation
  • Check if additional rkyv features are needed in Cargo.toml

7. Improve error conversion (mmap_writer.rs)

Issue: Error details are lost when converting rkyv errors to io::Error.

Change in crates/prtip-scanner/src/output/mmap_writer.rs around line 64:
Replace:

.map_err(|e| io::Error::new(io::ErrorKind::InvalidData, e.to_string()))?;

With:

.map_err(|e| {
    let msg = format!("rkyv serialization error (rkyv::rancor::Error): {e:?}");
    io::Error::new(io::ErrorKind::InvalidData, msg)
})?;

Testing

After making these changes:

  1. Verify the code compiles successfully
  2. Ensure all existing tests pass
  3. Test that deserialization works correctly without the Vec allocation
  4. Verify error messages are properly logged

This pull request was created from Copilot chat.


💬 We'd love your input! Share your thoughts on Copilot coding agent in our 2 minute survey.

Copilot AI and others added 2 commits January 24, 2026 23:06
- Add rkyv 0.8.14 dependency with Archive/Serialize/Deserialize support
- Create ScanResultRkyv type with zero-copy optimizations
- Fix Duration conversion (u64 instead of u128 to avoid truncation)
- Improve timestamp error handling (expect instead of unwrap_or)
- Update mmap_writer with improved error handling and alignment docs
- Update mmap_reader with zero-copy deserialization and error logging
- Add LENGTH_PREFIX for proper entry boundary detection
- All mmap tests passing (13/13)

Co-authored-by: doublegate <6858123+doublegate@users.noreply.github.com>
- Add clamping to u64::MAX to prevent overflow
- Document that u64 can represent up to ~584 years
- Add safety comment for Duration::from_nanos
- Address code review feedback on potential data loss

Co-authored-by: doublegate <6858123+doublegate@users.noreply.github.com>
Copilot AI changed the title [WIP] Improve deserialization performance and error logging in mmap_reader.rs Migrate mmap serialization from bincode to rkyv with zero-copy deserialization Jan 24, 2026
Copilot AI requested a review from doublegate January 24, 2026 23:22
@codecov
Copy link

codecov bot commented Jan 24, 2026

Codecov Report

❌ Patch coverage is 0% with 64 lines in your changes missing coverage. Please review.

Files with missing lines Patch % Lines
crates/prtip-core/src/types.rs 0.00% 38 Missing ⚠️
crates/prtip-scanner/src/output/mmap_reader.rs 0.00% 14 Missing ⚠️
crates/prtip-scanner/src/output/mmap_writer.rs 0.00% 12 Missing ⚠️

📢 Thoughts on this report? Let us know!

@doublegate
Copy link
Owner

@copilot fix the failing 'Security Audit' check ... it looks like it's still trying to use bincode

Remove bincode from workspace and prtip-scanner Cargo.toml files since
it has been fully replaced by rkyv. This fixes the Security Audit CI check
that was still detecting the unused bincode dependency.

Co-authored-by: doublegate <6858123+doublegate@users.noreply.github.com>
Copy link
Contributor Author

Copilot AI commented Jan 25, 2026

@copilot fix the failing 'Security Audit' check ... it looks like it's still trying to use bincode

Fixed in commit 6c360eb. Removed bincode from workspace and prtip-scanner Cargo.toml files - it was still declared as a dependency even though the code had migrated to rkyv.

Copy link
Owner

@doublegate doublegate left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed / Approved - final removal of bincode entries -- DG 1/24

@doublegate doublegate marked this pull request as ready for review January 25, 2026 01:01
Copilot AI review requested due to automatic review settings January 25, 2026 01:02
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR migrates the memory-mapped scan result serialization from bincode to rkyv 0.8.14, enabling zero-copy use of the mmap buffer (no intermediate copy for alignment) and removing the deprecated bincode dependency from the workspace. It introduces a dedicated ScanResultRkyv archive type, adds a length-prefixed fixed-size entry format for mmap, and wires rkyv into the core and scanner crates.

Changes:

  • Replace bincode with rkyv for mmap serialization/deserialization in MmapResultWriter and MmapResultReader, including length-prefix handling, improved error mapping/logging, and explicit alignment documentation/assertions.
  • Introduce ScanResultRkyv in prtip-core with safe conversions to/from ScanResult, including clamped Duration nanoseconds and stricter timestamp handling, and re-export it from prtip-core.
  • Remove bincode from the workspace and crate manifests and add rkyv (with std and alloc features), updating Cargo.lock accordingly.

Reviewed changes

Copilot reviewed 7 out of 8 changed files in this pull request and generated 1 comment.

Show a summary per file
File Description
crates/prtip-scanner/src/output/mmap_writer.rs Switches writer to ScanResultRkyv + rkyv::to_bytes, adds an 8-byte length prefix per entry, documents and asserts alignment, and keeps zero-filling within the fixed 512-byte entry size.
crates/prtip-scanner/src/output/mmap_reader.rs Switches reader to use a length prefix and rkyv::from_bytes (no extra Vec copy), with validation and detailed error logging for invalid lengths or deserialization failures.
crates/prtip-scanner/Cargo.toml Replaces the bincode dependency with a workspace-scoped rkyv dependency for the scanner crate.
crates/prtip-core/src/types.rs Adds the ScanResultRkyv archive struct and From conversions to/from ScanResult, handling IP address encoding, PortState mapping, clamped Duration to u64 nanos, and strict nanosecond timestamps.
crates/prtip-core/src/lib.rs Re-exports ScanResultRkyv so scanner and other crates can depend on it via the prtip-core public API.
crates/prtip-core/Cargo.toml Adds rkyv as a dependency for the core crate to support deriving Archive/Serialize/Deserialize on ScanResultRkyv.
Cargo.toml Removes the workspace bincode dependency and adds workspace-level rkyv = 0.8.14 with std and alloc features for zero-copy serialization support.
Cargo.lock Removes bincode and adds rkyv and its transitive dependencies (bytecheck, rancor, rend, etc.), reflecting the new serialization stack.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +81 to +100
// Read length prefix (u64 in little-endian)
let len = u64::from_le_bytes(
entry_bytes[..LENGTH_PREFIX_SIZE]
.try_into()
.expect("LENGTH_PREFIX_SIZE is 8 bytes"),
) as usize;

// Validate length
if len == 0 || len + LENGTH_PREFIX_SIZE > self.entry_size {
eprintln!(
"MmapResultReader: invalid entry length {} at index {}",
len, index
);
return None;
}

// Use zero-copy deserialization without unnecessary allocation
let data_bytes = &entry_bytes[LENGTH_PREFIX_SIZE..LENGTH_PREFIX_SIZE + len];
match rkyv::from_bytes::<ScanResultRkyv, rkyv::rancor::Error>(data_bytes) {
Ok(rkyv_result) => Some(ScanResult::from(rkyv_result)),
Copy link

Copilot AI Jan 25, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Switching the per-entry encoding from raw bincode bytes to rkyv with a length prefix changes the on-disk format, but the header version field is still hard-coded to 1 and the reader only checks for version == 1 before assuming the new layout. Any existing mmap files written by the previous bincode-based implementation (also with version 1 and the same entry size) will now be misinterpreted: get_entry will read a bogus length from the old data and then either log deserialization errors or silently return None for valid old entries. To avoid breaking compatibility, please either (a) bump the header version and teach the reader to reject old files with a clear error, or (b) keep version 1 as the bincode layout and introduce a new version for the rkyv+length-prefix layout, with conditional decoding logic for both formats.

Copilot uses AI. Check for mistakes.
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@copilot apply changes based on this feedback

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in commit 0fbf687. Bumped file format version from 1 to 2 to distinguish the new rkyv+length-prefix encoding from the old bincode format. Version 1 files are now explicitly rejected with a clear error message: "Incompatible file format: version 1 (bincode) is no longer supported..." This prevents silent data corruption. Added test to verify the rejection.

Increment file format version from 1 to 2 to distinguish the new rkyv+length-prefix
encoding from the old bincode format. Version 1 files are now explicitly rejected
with a clear error message directing users to regenerate their scan results.

This prevents silent data corruption when attempting to read old bincode files
with the new rkyv deserialization code.

- Writer: Changed header version from 1 to 2 with explanatory comment
- Reader: Added version check that rejects v1 files with helpful error message
- Added test to verify version 1 files are properly rejected

Co-authored-by: doublegate <6858123+doublegate@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants