Improve raw string parsing clarity per JEP-12 #62

joshrotenberg · 2026-01-21T01:19:55Z

This PR refactors the raw string literal parsing to use explicit character-by-character processing instead of the current string replacement approach. This makes the escape handling logic clearer and aligns the code more directly with the JEP-12 specification.

Changes

Refactor consume_raw_string in lexer to process characters explicitly
Add documentation for raw string literals to rustdoc
Add comprehensive tests for escape handling edge cases

Behavior

No behavioral changes - this is a refactor for clarity:

\' produces a literal single quote (to avoid delimiter collision)
All other backslashes are preserved literally

Testing

All 861 compliance tests pass
Added comprehensive unit tests covering 25+ escape sequence patterns
Doc tests verify examples work correctly
Verified identical output to previous implementation across ~20,000 generated test cases

Refactors raw string parsing to use explicit character-by-character processing instead of string replacement. This makes the escape handling logic clearer and matches the JEP-12 specification more directly. Behavior is unchanged: - \' produces a literal single quote - All other backslashes are preserved literally Adds comprehensive tests and documentation for raw string literals. See: https://github.com/jmespath/jmespath.jep/blob/main/proposals/0012-raw-string-literals.md

cetra3 · 2026-01-21T19:52:14Z

jmespath/src/lexer.rs

+        // Read until closing quote, then process escapes
        self.consume_inside(pos, '\'', |s| {
-            Ok(Literal(Rcvar::new(Variable::String(s.replace("\\'", "'")))))
+            // Only \' is an escape sequence - replace with literal quote


How does this differ from existing behaviour?

No functional difference, just a little bit more explicit. The PR is mostly for docs/tests/clarity.

OK what about perf difference, I am going to assume this is slower than the standard replace

You're right, it was a bit slower. Just checked in an optimization that fixes that (and seems to make it a touch faster than the original).

- Add fast path check for strings without backslashes (common case) - Remove peekable() overhead in favor of direct pattern matching - Benchmark shows 8% improvement over original replace() approach - 35% faster than initial char-by-char implementation

cetra3 reviewed Jan 21, 2026

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Improve raw string parsing clarity per JEP-12 #62

Improve raw string parsing clarity per JEP-12 #62

Uh oh!

joshrotenberg commented Jan 21, 2026

Uh oh!

cetra3 Jan 21, 2026

Uh oh!

joshrotenberg Jan 21, 2026

Uh oh!

cetra3 Jan 22, 2026

Uh oh!

joshrotenberg Jan 23, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Improve raw string parsing clarity per JEP-12 #62

Are you sure you want to change the base?

Improve raw string parsing clarity per JEP-12 #62

Uh oh!

Conversation

joshrotenberg commented Jan 21, 2026

Changes

Behavior

Testing

Uh oh!

cetra3 Jan 21, 2026

Choose a reason for hiding this comment

Uh oh!

joshrotenberg Jan 21, 2026

Choose a reason for hiding this comment

Uh oh!

cetra3 Jan 22, 2026

Choose a reason for hiding this comment

Uh oh!

joshrotenberg Jan 23, 2026

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants