Allow trailing characters when deserializing #99

@MorganR

Apologies if I'm simply doing this wrong.

I'm trying to implement a sort of streaming deserializer, but I'm running into a limitation of the public Deserializer API that seems to make this impossible.

What I'm doing at a high level:

I want to deserialize several small structs out of a long data stream. Imagine 16kB of data containing several 1kB structs, most of which I can throw away, so that each deserialized struct is only 128 bytes. If I have the full 16kB in memory, I can use serde_json_core::from_slice to parse this perfectly fine. However, this requires a large, mostly unused buffer.

Instead, I want a 4kB buffer that I read into, parsing out one struct at a time and discarding data as I go. This would avoid allocating the full 16kB, and it seems fairly doable if I could do something like this:

// Fill the buffer.
let chunk: &[u8] = buffer.read_chunk().await?;
// Deserialize the next value from the buffer.
let (value, n): (T, usize) = serde_json_core::from_slice(chunk)?;
// Tell the buffer it can throw away `n` bytes.
buffer.consume(n);

However, this doesn't work because from_slice returns Err(Error::TrailingCharacters) if chunk contains more data than a single value.

I can call serde::de::Deserialize::deserialize directly, as below, but I can't see any way to get the value of n (the number of bytes read while deserializing the value).

let mut deserializer = serde_json_core::de::Deserializer::new(chunk, None);
let value: T = serde::de::Deserialize::deserialize(&mut deserializer)?;
// ??? No public way to learn how many bytes the deserializer consumed.
let n = 0;
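
One possible stopgap is to find the end of the first value by hand and pass only that sub-slice to from_slice. The sketch below is a hypothetical helper (first_value_len is not part of serde_json_core) and assumes every value in the stream is a JSON object or array. It tracks bracket depth and string state, but does not otherwise validate the JSON.

```rust
/// Returns the length in bytes of the first top-level JSON object or array in
/// `chunk`, or `None` if the chunk does not contain a complete one.
/// Hypothetical helper for illustration; not part of serde_json_core.
pub fn first_value_len(chunk: &[u8]) -> Option<usize> {
    let mut depth: usize = 0;
    let mut in_string = false;
    let mut escaped = false;

    for (i, &byte) in chunk.iter().enumerate() {
        if in_string {
            // Inside a string, only an unescaped quote is significant.
            if escaped {
                escaped = false;
            } else if byte == b'\\' {
                escaped = true;
            } else if byte == b'"' {
                in_string = false;
            }
            continue;
        }
        match byte {
            b'"' => in_string = true,
            b'{' | b'[' => depth += 1,
            b'}' | b']' => {
                // A closer with nothing open means malformed input.
                depth = depth.checked_sub(1)?;
                if depth == 0 {
                    return Some(i + 1);
                }
            }
            _ => {}
        }
    }
    None
}
```

The returned length n could then bound the slice, as in serde_json_core::from_slice(&chunk[..n]), and also be passed to buffer.consume(n). The cost is a second pass over every value, which is why getting the index from the deserializer itself would be preferable.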

What I need:

Either:

  1. Allow trailing characters in serde_json_core::from_slice, making use of the existing usize in the return value to indicate the number of bytes read.
  2. Expose a public index() method on the Deserializer to get the value of self.index.

Would either of those be acceptable? Or have I missed some existing way to do this?
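
For context, here is a minimal sketch of the buffer side of this pattern. ChunkBuffer and its methods (new, fill, data, consume) are hypothetical names used only for illustration; nothing here comes from serde_json_core, and the synchronous fill stands in for the async read_chunk in the snippet above.

```rust
/// Fixed-capacity byte buffer that keeps unparsed bytes at the front.
/// Hypothetical type for illustration; `N` would be 4096 in the scenario above.
pub struct ChunkBuffer<const N: usize> {
    bytes: [u8; N],
    len: usize,
}

impl<const N: usize> ChunkBuffer<N> {
    pub fn new() -> Self {
        Self { bytes: [0; N], len: 0 }
    }

    /// Copies as much of `input` as fits and returns how many bytes were taken.
    pub fn fill(&mut self, input: &[u8]) -> usize {
        let take = input.len().min(N - self.len);
        self.bytes[self.len..self.len + take].copy_from_slice(&input[..take]);
        self.len += take;
        take
    }

    /// The bytes that have been read but not yet consumed.
    pub fn data(&self) -> &[u8] {
        &self.bytes[..self.len]
    }

    /// Discards the first `n` bytes and moves the remainder to the front.
    pub fn consume(&mut self, n: usize) {
        let n = n.min(self.len);
        self.bytes.copy_within(n..self.len, 0);
        self.len -= n;
    }
}
```

With either option above, the parse loop would call data(), deserialize one value, and pass the reported byte count to consume(), so the buffer never needs to hold more than one partially read value plus whatever fits after it.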
