Skip to content

refactor(message_pack_format): make module concrete with DTOs + codec trait#47

Closed
GordonYuanyc wants to merge 5 commits into
mainfrom
refactor/restructure-modules
Closed

refactor(message_pack_format): make module concrete with DTOs + codec trait#47
GordonYuanyc wants to merge 5 commits into
mainfrom
refactor/restructure-modules

Conversation

@GordonYuanyc
Copy link
Copy Markdown
Collaborator

Stacked on #46. Review #46 first; this branch's diff against main will collapse to just the new commit once #46 merges.

Summary

  • Move all wire-format DTOs (WireFormat, CmsData, CountMinSketchWithHeapSerialized, HydraKllSketchData, KllSketchData, StringSet, DeltaResult) out of individual wrappers and into message_pack_format::dto as the single source of truth.
  • Add a unified Error enum and a MessagePackCodec trait (static-dispatch, zero cost) implemented by every wire-format-aligned wrapper.
  • Add tests/msgpack_compat.rs with round-trip + DTO-shape coverage and #[ignore]-d slots reserved for cross-language golden-bytes fixtures from sketchlib-go.

Backwards compatibility

The legacy inherent serialize_msgpack / deserialize_msgpack methods are preserved with their original signatures as thin shims over the trait — callers and FFI bindings need no changes. KllSketchData and DeltaResult remain pub via re-exports from their wrapper modules.

Trade-offs considered

  • Trait vs. inherent only: kept both. Trait is canonical; inherent shims preserve API. Static dispatch means no runtime cost.
  • DTO extraction for CountSketch / DdSketch / HllSketch: declined — their public fields ARE the wire fields, so the MessagePackCodec impl serializes them directly. Documented in message_pack_format::mod.
  • Error type: hand-rolled two-variant enum (Encode / Decode) — no new dependency on thiserror.

Test plan

  • cargo build
  • cargo test (394 lib + 12 round-trip + 5 ignored compat tests pass)
  • cargo clippy --all-targets
  • cargo fmt --check

🤖 Generated with Claude Code

GordonYuanyc and others added 2 commits May 8, 2026 19:11
Move PR #36#45 wire-format-aligned variants out of src/sketches/ into a
new src/wrapper/ module so the in-process algorithms and the
sketchlib-go-aligned producer types live in separate trees. Add a
src/message_pack_format.rs placeholder noting that asap_sketchlib (Rust)
and sketchlib-go each maintain their own MessagePack representation,
byte-compatible at the envelope level.

Files moved:
- src/wrapper/{ddsketch,countminsketch,countsketch,hll,kll,countminsketch_topk}.rs
  carry the wire-format types appended after the per-file banner in PR #36+.
- src/wrapper/{hydra_kll,set_aggregator,delta_set_aggregator}.rs are
  whole-file moves (they were entirely wire-format).

CountL2HH stays in src/sketches/countsketch_topk.rs — it's an in-process
top-k algorithm used by sketch_framework, not a wire-format type.

No semantic changes; cargo test --lib --all-features still 447 passed.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@GordonYuanyc GordonYuanyc marked this pull request as draft May 8, 2026 20:38
GordonYuanyc and others added 3 commits May 8, 2026 20:46
The `[`CANONICAL_HASH_SEED_TABLE`]` and `[`CANONICAL_HASH_SEED`]`
shortcut links in the module-level `//!` docs failed to resolve under
`cargo doc --all-features`, breaking the Rust Docs CI workflow with
`-D warnings`. Both items are still defined in this module; the doc
text simply uses inline-code formatting now instead of intra-doc
links so docgen succeeds.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
… trait

Move all wire-format DTOs (WireFormat, CmsData, CountMinSketchWithHeapSerialized,
HydraKllSketchData, KllSketchData, StringSet, DeltaResult) out of individual
wrappers and into message_pack_format::dto as the single source of truth.
Add a unified Error enum and a MessagePackCodec trait (static-dispatch, zero
cost) that every wire-format-aligned wrapper implements.

The legacy inherent serialize_msgpack / deserialize_msgpack methods are
preserved with their original signatures as thin shims over the trait, so
callers and FFI bindings need no changes. KllSketchData and DeltaResult
remain pub via re-exports from their wrapper modules.

Add tests/msgpack_compat.rs with round-trip and DTO-shape coverage for every
type, plus #[ignore]-d slots reserved for cross-language golden-bytes
fixtures from sketchlib-go.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
`[`Error`]`, `[`MessagePackCodec`]`, `[`dto`]` shortcut links inside
the module-level `//!` block failed to resolve under
`cargo doc --all-features` even though the items are defined in the
same module — likely a rustdoc quirk around `pub use` shadowing the
shortcut-link namespace. Switch to inline-code formatting so docgen
succeeds; the items are still discoverable via the usual rustdoc
sidebar.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@GordonYuanyc GordonYuanyc force-pushed the refactor/restructure-modules branch from f82c308 to 38c27b6 Compare May 8, 2026 20:48
Base automatically changed from refactor/module-restructure to main May 12, 2026 14:23
@GordonYuanyc
Copy link
Copy Markdown
Collaborator Author

Superseded by #46, which took a different direction (split portable/native + dropped the wrapper module and legacy msgpack shims entirely).

@GordonYuanyc GordonYuanyc deleted the refactor/restructure-modules branch May 12, 2026 20:57
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant