The JSON parser already has one assembly lookup optimization:
whitespace_lookup.asm- Fast whitespace classification used inskip_whitespace()
After analyzing the codebase, parse_hex4() function (src/json.c:255-275) is the optimal target for assembly lookup optimization:
Why it's the best target:
- High Frequency: Called for every Unicode escape sequence (
\uXXXX) in JSON strings - Multiple Range Checks: Contains 3 different character comparisons:
isxdigit()checkc >= '0' && c <= '9'c >= 'a' && c <= 'f'
- Value Computation: Performs arithmetic based on character classification
- Predictable Pattern: Always processes exactly 4 hex digits
- 256-byte lookup table
- Maps characters to hex values (0-15) or -1 for invalid chars
- Handles '0'-'9', 'a'-'f', 'A'-'F'
- Replace multiple range checks with single lookup
- Eliminate
isxdigit()calls - Direct value computation from table
- Add compilation rule for
hex_lookup.asm - Add linker rule for
hex_lookup.o.gprof - Update coverage script dependencies
- Target: Main parsing dispatch (character classification for
{ } [ ] " : ,) - Benefit: Faster syntax character recognition
- Priority: Medium (already simple switch statements)
- Target: String escape processing (lines 300-312)
- Benefit: Faster escape character handling
- Priority: Low (small switch statement, not called as frequently)
- Target: Number parsing digit validation
- Benefit: Faster digit range checks
- Priority: Low (simple range checks already efficient)
Hex Lookup Optimization:
- Eliminates 12 character comparisons per Unicode escape
- Removes 3 conditional branches per hex digit
- Direct table lookup vs multiple range checks
- Highest impact for JSON with Unicode strings
- ✅ Create TODO.md plan
- ✅ Create hex_lookup.asm file
- ✅ Update parse_hex4() function
- ✅ Update build system (ninja build)
- ✅ Update coverage script
- ✅ Test with perf.sh
- ✅ Measure performance improvements
- Average execution time: 0.303 seconds
- Minimum execution time: 0.279 seconds
- Total runs: 100
- Build status: ✅ Successful
- Line coverage: 99.4% (7776 of 7824 lines)
- Function coverage: 100.0% (443 of 443 functions)
- Build status: ✅ Successful
The hex lookup optimization successfully:
- Eliminated 12 character comparisons per Unicode escape sequence
- Removed 3 conditional branches per hex digit
- Replaced multiple range checks with single table lookup
- Maintained full code coverage and functionality
- Compare with baseline performance (without optimization)
- Consider additional lookup optimizations for other hot paths
- Profile with real-world JSON data containing Unicode escapes