Add operation normalizer#92

Open
swalkinshaw wants to merge 1 commit into main from add-operation-normalizer

@swalkinshaw swalkinshaw commented Mar 2, 2026

Summary

Adds the bluejay-operation-normalizer crate, which produces canonical operation signatures following Apollo's enhanced operation signature algorithm.

Operation normalizer

Public API is two functions:

  • normalize(doc, op_name) -> Result<String, SignatureError> — canonical normalized string
  • signature(doc, op_name) -> Result<String, SignatureError> — BLAKE3 hash of the normalized string

Normalization rules (Apollo enhanced mode):

  • Values: ints/floats → 0, strings → "", lists → [], object keys preserved with values normalized recursively, booleans/enums/null/variables preserved as-is
  • Sorting: fields by response name, then fragment spreads, then inline fragments — all alphabetical. Arguments, directives, variable definitions also sorted alphabetically. Equal sort keys are tie-broken by compact rendering for full canonicalization
  • Aliases: preserved (enhanced mode behavior)
  • Fragments: unused fragment definitions stripped, used fragments sorted alphabetically and emitted before the operation
  • Whitespace: all redundant whitespace removed
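To make the value rules above concrete, here is a minimal sketch of the value normalization step. It is not the crate's implementation: the `Value` enum and the `write_normalized` / `normalize_value` functions are hypothetical stand-ins for bluejay's AST traits, and the comma separator between object fields is an assumption about the compact rendering. Object keys are emitted in the order given, since the rules above only say they are preserved.

```rust
/// Hypothetical, simplified GraphQL value type for illustration only.
/// The real crate operates on bluejay's AST traits, not on this enum.
#[derive(Debug, Clone, PartialEq)]
enum Value {
    Int(i64),
    Float(f64),
    String(String),
    Boolean(bool),
    Null,
    Enum(String),
    Variable(String),
    List(Vec<Value>),
    Object(Vec<(String, Value)>),
}

/// Appends the normalized, whitespace-free form of `value` to `out`.
fn write_normalized(out: &mut String, value: &Value) {
    match value {
        // Numeric literals collapse to 0.
        Value::Int(_) | Value::Float(_) => out.push('0'),
        // String literals collapse to the empty string.
        Value::String(_) => out.push_str("\"\""),
        // Lists collapse to an empty list regardless of contents.
        Value::List(_) => out.push_str("[]"),
        // Booleans, null, enums, and variables are preserved as-is.
        Value::Boolean(b) => out.push_str(if *b { "true" } else { "false" }),
        Value::Null => out.push_str("null"),
        Value::Enum(name) => out.push_str(name),
        Value::Variable(name) => {
            out.push('$');
            out.push_str(name);
        }
        // Object keys are kept; their values are normalized recursively.
        Value::Object(fields) => {
            out.push('{');
            for (i, (key, inner)) in fields.iter().enumerate() {
                if i > 0 {
                    out.push(','); // assumed separator
                }
                out.push_str(key);
                out.push(':');
                write_normalized(out, inner);
            }
            out.push('}');
        }
    }
}

/// Convenience wrapper returning the normalized value as a new `String`.
fn normalize_value(value: &Value) -> String {
    let mut out = String::new();
    write_normalized(&mut out, value);
    out
}
```

Under these assumptions, an input object such as `{id: 42, name: "x", tags: [1], active: true}` and one with different literals in the same positions render identically, which is what lets operations that differ only in inline literal values share a signature.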

Benchmarks:

     Running benches/normalize.rs (/Users/scottwalkinshaw/dev/bluejay/target/release/deps/normalize-cae969b0aa85a489)
normalize_small         time:   [101.20 ns 102.38 ns 104.00 ns]
signature_small         time:   [211.35 ns 211.96 ns 212.61 ns]
normalize_medium        time:   [479.34 ns 479.94 ns 480.59 ns]
signature_medium        time:   [755.67 ns 759.39 ns 763.26 ns]
normalize_complex       time:   [1.5625 µs 1.5668 µs 1.5718 µs]
signature_complex       time:   [2.2152 µs 2.2160 µs 2.2169 µs]
normalize_nested_inputs time:   [412.22 ns 412.48 ns 412.79 ns]
signature_nested_inputs time:   [742.05 ns 742.49 ns 742.93 ns]

@swalkinshaw swalkinshaw force-pushed the add-operation-normalizer branch 3 times, most recently from b76e355 to ce88b46 Compare March 2, 2026 20:07
@swalkinshaw swalkinshaw requested a review from adampetro March 2, 2026 20:09
@swalkinshaw swalkinshaw marked this pull request as ready for review March 2, 2026 20:09
@swalkinshaw swalkinshaw force-pushed the add-operation-normalizer branch from ce88b46 to e713a29 Compare March 2, 2026 20:14
@swalkinshaw swalkinshaw force-pushed the add-operation-normalizer branch from e713a29 to 381b7d6 Compare March 2, 2026 20:18
Comment on lines +7 to +26
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum SignatureError {
    OperationNotFound(String),
    AmbiguousOperation,
    NoOperations,
}

impl std::fmt::Display for SignatureError {
    fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
        match self {
            Self::OperationNotFound(name) => write!(f, "operation not found: {name}"),
            Self::AmbiguousOperation => {
                write!(f, "multiple operations found; specify operation name")
            }
            Self::NoOperations => write!(f, "no operations in document"),
        }
    }
}

impl std::error::Error for SignatureError {}

We could use thiserror to do this all in the enum definition, I think.

Comment on lines +68 to +70
OperationType::Query => "query",
OperationType::Mutation => "mutation",
OperationType::Subscription => "subscription",

We might be able to add strum::IntoStaticStr to the derive on OperationType, and then this would just become .into().

Comment on lines +242 to +262
fn write_variable_type<E: ExecutableDocument>(
    out: &mut String,
    vt: &VariableTypeReference<'_, E::VariableType>,
) {
    match vt {
        VariableTypeReference::Named(name, required) => {
            out.push_str(name);
            if *required {
                out.push('!');
            }
        }
        VariableTypeReference::List(inner, required) => {
            out.push('[');
            write_variable_type::<E>(out, &inner.as_ref());
            out.push(']');
            if *required {
                out.push('!');
            }
        }
    }
}

We do have VariableTypeReference::display_name, which you could use I think, unless it is less performant due to allocating more strings.

@swalkinshaw

I'll chat more with @ravangen next week or so about this but there are a few more decisions we can make:

  1. Aliases: preserve or strip?
  2. Enum values: preserve as-is (current) or replace with placeholder?
  3. Booleans: preserve both true/false (current) or normalize to a single value?
  4. Object values: preserve structure with normalized values (current) or replace with empty {}?
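As a rough illustration of what decision 1 changes, here is a sketch of sorting a flat selection set by response name with aliases either preserved or stripped. Everything in it is hypothetical: the `Field` struct, the `AliasMode` switch, and the space-separated rendering are assumptions for the example and not part of this PR. Ties on the sort key fall back to the rendered text, mirroring the tie-break rule in the summary.

```rust
/// Hypothetical flattened field, for illustration only.
#[derive(Debug, Clone)]
struct Field {
    alias: Option<String>,
    name: String,
}

/// Hypothetical switch for the "preserve or strip aliases" decision.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum AliasMode {
    Preserve,
    Strip,
}

/// Compact rendering of a single field under the chosen mode.
fn render(field: &Field, mode: AliasMode) -> String {
    match (mode, &field.alias) {
        (AliasMode::Preserve, Some(alias)) => format!("{}:{}", alias, field.name),
        _ => field.name.clone(),
    }
}

/// Sort key: the response name (alias if kept, otherwise the field name).
fn sort_key(field: &Field, mode: AliasMode) -> &str {
    match (mode, field.alias.as_deref()) {
        (AliasMode::Preserve, Some(alias)) => alias,
        _ => field.name.as_str(),
    }
}

/// Renders a flat selection set with fields in canonical order.
fn normalize_fields(fields: &[Field], mode: AliasMode) -> String {
    let mut rendered: Vec<(&str, String)> = fields
        .iter()
        .map(|f| (sort_key(f, mode), render(f, mode)))
        .collect();
    // Tuples compare lexicographically: sort key first, then the
    // rendered text as the tie-break.
    rendered.sort();
    let parts: Vec<String> = rendered.into_iter().map(|(_, text)| text).collect();
    format!("{{{}}}", parts.join(" "))
}
```

With aliases preserved, two operations that differ only in alias names produce different output; with aliases stripped they collapse to the same string. Stripping also means the same field selected under two aliases would appear twice, which this sketch does not attempt to deduplicate.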

There's also a semantic and functional difference between normalizing for identity (like persisted operations) and normalizing for an execution view (does this operation fetch the same data at runtime?). The latter would require the schema but could include features like @skip/@include evaluation.

We could offer both in the future but I prefer to keep this simpler and static for now, operating on just the AST.
