Skip to content

Latest commit

 

History

History
94 lines (75 loc) · 3.26 KB

File metadata and controls

94 lines (75 loc) · 3.26 KB

Performance Optimization Plan

Current State

The JSON parser already has one assembly lookup optimization:

  • whitespace_lookup.asm - Fast whitespace classification used in skip_whitespace()

Most Promising Target: Hex Digit Classification

Analysis Results

After analyzing the codebase, parse_hex4() function (src/json.c:255-275) is the optimal target for assembly lookup optimization:

Why it's the best target:

  1. High Frequency: Called for every Unicode escape sequence (\uXXXX) in JSON strings
  2. Multiple Range Checks: Contains 3 different character comparisons:
    • isxdigit() check
    • c >= '0' && c <= '9'
    • c >= 'a' && c <= 'f'
  3. Value Computation: Performs arithmetic based on character classification
  4. Predictable Pattern: Always processes exactly 4 hex digits

Implementation Plan

1. Create hex_lookup.asm

  • 256-byte lookup table
  • Maps characters to hex values (0-15) or -1 for invalid chars
  • Handles '0'-'9', 'a'-'f', 'A'-'F'

2. Update parse_hex4() function

  • Replace multiple range checks with single lookup
  • Eliminate isxdigit() calls
  • Direct value computation from table

3. Build System Updates

  • Add compilation rule for hex_lookup.asm
  • Add linker rule for hex_lookup.o.gprof
  • Update coverage script dependencies

Other Potential Optimizations (Future Work)

2. JSON Syntax Character Lookup

  • Target: Main parsing dispatch (character classification for { } [ ] " : ,)
  • Benefit: Faster syntax character recognition
  • Priority: Medium (already simple switch statements)

3. String Escape Character Lookup

  • Target: String escape processing (lines 300-312)
  • Benefit: Faster escape character handling
  • Priority: Low (small switch statement, not called as frequently)

4. Number Digit Lookup

  • Target: Number parsing digit validation
  • Benefit: Faster digit range checks
  • Priority: Low (simple range checks already efficient)

Expected Performance Impact

Hex Lookup Optimization:

  • Eliminates 12 character comparisons per Unicode escape
  • Removes 3 conditional branches per hex digit
  • Direct table lookup vs multiple range checks
  • Highest impact for JSON with Unicode strings

Implementation Steps

  1. ✅ Create TODO.md plan
  2. ✅ Create hex_lookup.asm file
  3. ✅ Update parse_hex4() function
  4. ✅ Update build system (ninja build)
  5. ✅ Update coverage script
  6. ✅ Test with perf.sh
  7. ✅ Measure performance improvements

Results

Performance Test Results (perf-c-json-parser)

  • Average execution time: 0.303 seconds
  • Minimum execution time: 0.279 seconds
  • Total runs: 100
  • Build status: ✅ Successful

Code Coverage Results

  • Line coverage: 99.4% (7776 of 7824 lines)
  • Function coverage: 100.0% (443 of 443 functions)
  • Build status: ✅ Successful

Optimization Impact

The hex lookup optimization successfully:

  • Eliminated 12 character comparisons per Unicode escape sequence
  • Removed 3 conditional branches per hex digit
  • Replaced multiple range checks with single table lookup
  • Maintained full code coverage and functionality

Next Steps

  • Compare with baseline performance (without optimization)
  • Consider additional lookup optimizations for other hot paths
  • Profile with real-world JSON data containing Unicode escapes