Add binary-encoded generative process #180

Open
loren-ac wants to merge 1 commit into main from feature/binary-encoded-process

@loren-ac
Collaborator

Summary

  • Adds BinaryEncodedProcess, a wrapper that encodes any generative process's vocabulary into fixed-width big-endian binary (MSB first)
  • Emits individual bits (vocab_size=2) and defers the underlying state transition until a complete binary word is emitted
  • Supports incomplete sequences in probability/log_probability (sequences may end mid-token)
  • Unused binary codes (when vocab size is not a power of 2) get probability 0
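The encoding described in the summary can be illustrated with a minimal sketch. This is not the code from this PR; the helper names `bit_width`, `encode_token`, and `decode_word` are hypothetical and chosen only for illustration. It assumes the word width is the smallest number of bits that can represent every token index, with a minimum of one bit.

```python
def bit_width(vocab_size: int) -> int:
    """Smallest fixed width (in bits) that can index every token; at least 1."""
    return max(1, (vocab_size - 1).bit_length())


def encode_token(token: int, width: int) -> list[int]:
    """Big-endian encoding: the most significant bit comes first."""
    return [(token >> shift) & 1 for shift in range(width - 1, -1, -1)]


def decode_word(bits: list[int]) -> int:
    """Inverse of encode_token for a complete word."""
    value = 0
    for bit in bits:
        value = (value << 1) | bit
    return value


# Hypothetical base vocabulary of 3 tokens: 2 bits per token, one unused code.
vocab_size = 3
width = bit_width(vocab_size)
codes = [encode_token(t, width) for t in range(vocab_size)]
unused = [
    encode_token(c, width)
    for c in range(2 ** width)
    if c >= vocab_size
]
```

With a base vocabulary of 3 tokens, the word `[1, 1]` decodes to index 3, which lies outside the vocabulary; under the behavior described above, such a code would receive probability 0. With a base vocabulary of 2, the width is 1 and each token maps to a single bit, consistent with the identity behavior listed in the test plan.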

Test plan

  • 35 tests covering properties, observation distributions, state transitions, probability, generation, and wrapping different process types (HMM, GHMM)
  • Verified complete binary sequences yield identical probabilities to base process
  • Verified decoded generated sequences match base process stationary distribution
  • Verified identity behavior when base vocab_size is already 2 (coin process)

Introduce BinaryEncodedProcess, which wraps any GenerativeProcess by
encoding its vocabulary into fixed-width big-endian binary. The wrapper
emits individual bits (vocab_size=2) and defers the underlying state
transition until a complete binary word has been emitted. Incomplete
sequences are supported in probability/log_probability computations.
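
One way to picture the deferred state transition and the handling of incomplete sequences is the sketch below. It is an illustration under stated assumptions, not the implementation in this PR: the function `bit_sequence_probability` is hypothetical, and the base process is assumed to be an HMM given as `transitions[x][i][j] = P(emit token x, move to state j | state i)` with initial state distribution `initial`. Complete words advance the forward vector once per word; a trailing partial word is handled by summing over every token whose code starts with that prefix.

```python
def bit_sequence_probability(bits, initial, transitions):
    """Probability of a bit sequence that may end mid-token (illustrative sketch)."""
    vocab_size = len(transitions)
    width = max(1, (vocab_size - 1).bit_length())
    n_states = len(initial)
    alpha = list(initial)  # unnormalized forward vector over base states

    def step(vec, token):
        # One base-process transition, applied only after a full word.
        return [
            sum(vec[i] * transitions[token][i][j] for i in range(n_states))
            for j in range(n_states)
        ]

    def to_int(word):
        value = 0
        for bit in word:
            value = (value << 1) | bit
        return value

    n_complete = len(bits) // width
    for w in range(n_complete):
        token = to_int(bits[w * width:(w + 1) * width])
        if token >= vocab_size:
            return 0.0  # unused binary code
        alpha = step(alpha, token)

    prefix = bits[n_complete * width:]
    if not prefix:
        return sum(alpha)

    # Incomplete final word: marginalize over tokens whose code starts with it.
    prefix_value = to_int(prefix)
    shift = width - len(prefix)
    return sum(
        sum(step(alpha, token))
        for token in range(vocab_size)
        if token >> shift == prefix_value
    )


# Hypothetical one-state base process with token probabilities 0.5, 0.3, 0.2.
initial = [1.0]
transitions = [[[0.5]], [[0.3]], [[0.2]]]
p_first_bit_zero = bit_sequence_probability([0], initial, transitions)
p_unused_code = bit_sequence_probability([1, 1], initial, transitions)
```

In this toy case tokens 0 and 1 share the leading bit 0, so the single bit `[0]` has probability 0.5 + 0.3, and the unused word `[1, 1]` has probability 0. A complete word such as `[0, 1]` gives the same probability as token 1 in the base process.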