Faster ruby-units #389

Open
smathieu wants to merge 14 commits into olbrich:master from tangibleMaterials:master

smathieu commented Mar 6, 2026

Why

Our application makes heavy use of this excellent gem, but the gem's performance had become a bottleneck for us.

This version is what we now run in production without known issues. It is a fairly major rewrite of the library and might not be in line with the direction you want to take this gem. If so, feel free to close this PR. I still wanted to take the time to put it up, as it might be useful to others.

Summary

Major performance overhaul of ruby-units, replacing regex-based unit parsing with a hash-based architecture and adding an optional C extension for hot-path acceleration. All 1173 tests pass in both C and pure Ruby (RUBY_UNITS_PURE=1) modes. The gem remains compatible with JRuby via pure Ruby mode.

Ruby-level optimizations

  • Hash-based tokenizer (resolve_unit_token) replaces the 375-entry regex alternation for unit resolution
  • compute_base_scalar_fast / compute_signature_fast avoid creating intermediate Unit objects during initialization
  • Lazy to_base caching on the instance — computed once, then memoized
  • batch_define defers regex cache invalidation during definition loading (cold start)
  • eliminate_terms uses a count_units helper to avoid dup/flatten allocations
  • convert_to uses unit_array_scalar for direct scalar computation without intermediate arrays
  • Same-unit fast path for +/- skips base conversion when numerator/denominator match exactly
  • Cache lookup uses Hash#key? (O(1)) instead of Array#include? (O(n))
  • units() fast path returns cached @unit_name for default arguments
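
As an illustration of the first bullet, here is a minimal sketch of hash-based longest-match token resolution. The tables `PREFIXES` and `UNITS` and the method `resolve_token` are hypothetical stand-ins invented for this example; they are not the actual `resolve_unit_token` implementation from this branch.

```ruby
# Hypothetical lookup tables; the real gem builds these from its unit definitions.
PREFIXES = { "k" => "<kilo>", "m" => "<milli>", "M" => "<mega>" }.freeze
UNITS = { "m" => "<meter>", "g" => "<gram>", "s" => "<second>", "min" => "<minute>" }.freeze

# Resolve a token such as "km" into [prefix, unit] using only hash lookups.
# A whole-token unit match is tried first, so "min" resolves to <minute>
# rather than to a prefix followed by some other unit.
def resolve_token(token)
  return [nil, UNITS[token]] if UNITS.key?(token)

  # Try the longest candidate prefix first, then progressively shorter ones.
  (token.length - 1).downto(1) do |len|
    prefix = token[0, len]
    unit = token[len..]
    return [PREFIXES[prefix], UNITS[unit]] if PREFIXES.key?(prefix) && UNITS.key?(unit)
  end
  nil # unknown token
end
```

Each candidate split costs two O(1) hash probes, so the work per token is bounded by the token's length rather than by the number of defined units.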

C extension (~870 lines, optional)

Accelerates finalize_initialization, eliminate_terms, and convert_to scalar math. Falls back to pure Ruby for temperature units and when the extension is unavailable (JRuby, RUBY_UNITS_PURE=1, etc.).
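
The fallback behaviour can be sketched roughly as follows. This is an assumption about the general shape, not the code in this branch: `load_backend` and the feature name `ruby_units_ext_sketch` are made up for the example, and only the `RUBY_UNITS_PURE=1` switch and the JRuby fallback come from the description above.

```ruby
# Try to load an optional native extension; fall back to pure Ruby otherwise.
# "ruby_units_ext_sketch" is a made-up feature name, so the require fails
# here and the pure-Ruby branch is what actually runs in this sketch.
def load_backend(env = ENV, engine = RUBY_ENGINE)
  return :pure_ruby if env["RUBY_UNITS_PURE"] == "1" # explicit opt-out
  return :pure_ruby if engine == "jruby"             # no C extensions on JRuby

  require "ruby_units_ext_sketch"
  :c_extension
rescue LoadError
  :pure_ruby # extension not compiled or not installed
end
```

Rescuing `LoadError` keeps the extension strictly optional: a missing or uncompiled binary degrades to the slower path instead of breaking `require`.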

Key C-level techniques:

  • Direct ivar access (rb_ivar_get) instead of rb_funcall for Definition properties (~300-700ns savings per call)
  • Hash lookup (rb_hash_aref) instead of Ruby method dispatch for definitions
  • Symbol pointer comparison for kind matching (symbols are singletons)
  • Temperature detection via strncmp in C, returning false to signal Ruby fallback
  • Pre-cached rb_intern IDs for all method/ivar lookups
  • Single-pass hash fetching — definitions/prefix_values/unit_values fetched once and passed to all helpers

Performance vs master

Benchmarked on Ruby 4.0.1, aarch64-linux. Numbers are iterations/second (higher is better).

Unit creation (uncached — cache cleared each iteration)

| Benchmark | master | this branch | Speedup |
| --- | ---: | ---: | ---: |
| `Unit.new("1 m")` | 799 | 33,096 | 41x |
| `Unit.new("1 km")` | 343 | 30,056 | 88x |
| `Unit.new("1 kg*m/s^2")` | 811 | 23,614 | 29x |
| `Unit.new("1.5e-3 mm")` | 241 | 16,355 | 68x |
| `Unit.new("1/2 cup")` | 304 | 8,932 | 29x |
| `Unit.new("6'4"")` | 131 | 2,793 | 21x |
| `Unit.new("8 lbs 8 oz")` | 131 | 3,475 | 27x |
| `Unit.new("37 degC")` | 339 | 18,790 | 55x |

Unit creation (cached / constructor variants)

| Benchmark | master | this branch | Speedup |
| --- | ---: | ---: | ---: |
| cached `"1 m"` | 38,890 | 45,674 | 1.2x |
| cached `"5 kg*m/s^2"` | 14,334 | 24,008 | 1.7x |
| `Unit.new(1)` | 133,897 | 978,401 | 7.3x |
| `Unit.new(scalar:, numerator:, ...)` | 77,574 | 426,421 | 5.5x |

Unit conversions

| Benchmark | master | this branch | Speedup |
| --- | ---: | ---: | ---: |
| m → km | 3,380 | 17,008 | 5.0x |
| km → m | 10,434 | 14,380 | 1.4x |
| mph → m/s | 11,071 | 13,294 | 1.2x |
| degC → degF | 3,446 | 13,933 | 4.0x |
| `to_base` (km) | 19,658 | 13,038,463 | 663x |

Arithmetic

| Benchmark | master | this branch | Speedup |
| --- | ---: | ---: | ---: |
| `5m + 3m` | 17,380 | 42,839 | 2.5x |
| `5m - 3m` | 21,567 | 44,034 | 2.0x |
| `5m * 2kg` | 20,362 | 39,689 | 1.9x |
| `5m / 10s` | 15,601 | 32,742 | 2.1x |
| `(5m) ** 2` | 17,065 | 27,977 | 1.6x |
| `5m * 3` | 22,008 | 32,808 | 1.5x |

Complexity scaling (uncached, batch of units per iteration)

| Benchmark | master | this branch | Speedup |
| --- | ---: | ---: | ---: |
| simple (m, kg, s) | 122 | 4,000 | 33x |
| medium (km, kPa, MHz) | 54 | 3,939 | 73x |
| complex (kg*m/s^2) | 130 | 3,904 | 30x |
| very complex | 108 | 2,735 | 25x |

🤖 Generated with Claude Code

smathieu and others added 14 commits March 3, 2026 11:33
The core change replaces the 375-entry regex alternation used for unit
resolution with hash-based longest-match lookups, and eliminates
intermediate Unit object creation across all hot paths. All changes are
pure Ruby.

Key optimizations:
- Hash-based tokenizer (resolve_unit_token) replaces regex scanning
- compute_base_scalar_fast/compute_signature_fast avoid intermediate
  Unit creation during initialization
- Lazy to_base caching on the instance (computed only when needed)
- batch_define defers regex cache invalidation during definition loading
- eliminate_terms uses count_units helper to avoid dup/flatten allocations
- convert_to uses unit_array_scalar for direct scalar computation
- Same-unit fast path for addition/subtraction skips base conversion
- Cache uses O(1) hash lookup instead of O(n) array scan
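
For readers unfamiliar with the lazy-caching item, the pattern looks roughly like the sketch below. `Length`, `FACTORS`, and the `conversions` counter are hypothetical and exist only to make the memoization observable; the gem's `Unit` class is more involved.

```ruby
# Hypothetical stand-in for a unit value; not the gem's Unit class.
class Length
  FACTORS = { "m" => 1r, "km" => 1000r }.freeze # hypothetical factors to meters

  attr_reader :scalar, :unit, :conversions

  def initialize(scalar, unit)
    @scalar = scalar
    @unit = unit
    @conversions = 0 # counts how often the base form is actually computed
  end

  # Computed on the first call only; later calls return the memoized object.
  def to_base
    @to_base ||= begin
      @conversions += 1
      unit == "m" ? self : Length.new(scalar * FACTORS.fetch(unit), "m")
    end
  end
end
```

Calling `to_base` repeatedly on the same instance does the conversion once and then only reads an instance variable, which is the effect the `to_base` figure below attributes to lazy caching.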

Performance improvements (1160 tests still pass):
- Cold start: 3.8x faster (358ms -> 95ms)
- Uncached parsing: 17-19x faster
- to_base: 886x faster (lazy caching)
- Conversions: 1.6-3.4x faster
- Addition/subtraction: 1.6-1.7x faster

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Implement C extension (~550 lines) that accelerates finalize_initialization,
eliminate_terms, and convert_to scalar math. Temperature units fall back to
pure Ruby. All 1165 tests pass in both C and pure Ruby (RUBY_UNITS_PURE=1)
modes. Benchmark results appended to plan_v2.md.

Key speedups vs pure Ruby: cold start 2.4x, uncached parse 1.4-2.2x,
hash/numeric constructor 2.6-3.3x, conversions 1.0-1.8x, arithmetic 1.2-1.5x.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
… to scalar 1

In v4.1.0, Unit.new(nil, "m") silently succeeded by string-interpolating
nil to "" and parsing " m" as a unit with implicit scalar 1. The pattern
matching rewrite rejected nil via `if first` guards. Add explicit
`[nil, String => second]` match that delegates to parse_single_arg,
restoring the v4.1.0 behavior.
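
A minimal sketch of the kind of pattern match described here, under the assumption that arguments are dispatched with `case`/`in`. `classify_args` is a hypothetical helper that only tags its input; it does not call the real `parse_single_arg` or build a `Unit`.

```ruby
# Hypothetical dispatcher sketching the argument patterns described above.
# It returns a tag plus normalized values rather than building a real Unit.
def classify_args(*args)
  case args
  in [nil, String => second]
    # nil scalar: treat the string alone as the input, implying scalar 1
    [:single_string, second]
  in [Numeric => scalar, String => unit]
    [:scalar_and_string, scalar, unit]
  in [String => only]
    [:single_string, only]
  else
    raise ArgumentError, "Invalid Unit Format"
  end
end
```

Placing the `[nil, String => second]` branch explicitly means the nil case no longer depends on string interpolation turning `nil` into `""`.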

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Bug 1: parse_array now matches [Numeric, Unit] so that
Unit.new(9.29, Unit.new("1 m^2")) works instead of raising ArgumentError.

Bug 2: resolve_expression_tokens uses greedy multi-token lookahead so
space-containing aliases like "square meter" resolve as a single unit
instead of being split into unrecognized individual tokens.
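
The greedy lookahead for Bug 2 can be sketched as below. `ALIASES`, `MAX_ALIAS_WORDS`, and `resolve_words` are hypothetical names for illustration; the actual `resolve_expression_tokens` is not reproduced here.

```ruby
# Hypothetical alias table; the real gem derives aliases from its definitions.
ALIASES = {
  "square meter" => "<meter>^2",
  "meter" => "<meter>",
  "second" => "<second>"
}.freeze
MAX_ALIAS_WORDS = ALIASES.keys.map { |name| name.split.size }.max

# Resolve whitespace-separated words, preferring the longest run of words
# that forms a known alias before falling back to shorter runs.
def resolve_words(words)
  resolved = []
  i = 0
  while i < words.size
    span = [MAX_ALIAS_WORDS, words.size - i].min
    span -= 1 until span.zero? || ALIASES.key?(words[i, span].join(" "))
    raise ArgumentError, "'#{words[i]}' Unit not recognized" if span.zero?

    resolved << ALIASES.fetch(words[i, span].join(" "))
    i += span
  end
  resolved
end
```

With this approach `%w[square meter]` is consumed as one alias, while `%w[meter second]` still resolves word by word.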

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The C extension was making ~22-25 rb_funcall calls per finalize_initialization,
each costing 200-700ns of Ruby method dispatch overhead. This commit eliminates
most of them:

- Access Definition properties (kind, display_name, scalar, numerator,
  denominator) via rb_ivar_get instead of rb_funcall (~300-700ns savings each)
- Look up definitions via rb_hash_aref on the definitions hash instead of
  calling the definition() class method
- Inline base? and unity? checks in C using ivar access and memcmp
- Fetch class-level hashes (definitions, prefix_values, unit_values) once
  and pass to all helper functions
- Use symbol pointer comparison instead of rb_equal for kind matching
- Move temperature_tokens? check into C (strncmp vs Ruby array+regex)
- Use rb_obj_freeze instead of rb_funcall(obj, :freeze)

Results (hash constructor, most isolated benchmark):
- Simple base unit: 5.01μs → 2.80μs (1.8x faster)
- Compound unit: 13.30μs → 4.64μs (2.9x faster)
- Arithmetic: 1.25-1.33x faster across add/sub/mul/div

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Security:
- Add string type guards in defn_is_base() before RSTRING_PTR to prevent
  segfault on nil/non-string ivars
- Add nil/type check on numerator/denominator in rb_unit_finalize(),
  falling back to Ruby path if ivars aren't arrays

Correctness:
- Add missing power-range validation in compute_signature_fast (Ruby fast
  path) matching the C code and unit_signature_vector behavior

Performance:
- Cache rb_intern() calls for keys/concat/==/to_r as static IDs instead
  of resolving on every invocation
- Add fast path in units() returning cached @unit_name for default args

Cleanup:
- Remove dead C methods _c_units_string and _c_base_check (defined but
  never called from Ruby)
- Remove unused static IDs id_iv_base_unit and id_iv_output
- Exclude plan*.md and .claude/ from gem package
- Remove extra blank line

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The Rakefile requires rake/extensiontask but the gem was missing
from the Gemfile, causing bundle exec rake to fail in CI.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
rake-compiler tries to compile the C extension on JRuby which doesn't
support native extensions. Guard the ExtensionTask and the spec:compile
dependency behind a RUBY_ENGINE check.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
When a temperature unit like "tempF" was cached, parsing "0 tempF"
would copy the cached unit and multiply base_scalar by 0. This is
incorrect because temperature conversions involve an offset
(e.g. 0°F = 255.37K, not 0K). Reset base_scalar to nil for
temperature units so update_base_scalar recomputes it correctly.
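
A small worked example of why scaling breaks here, using hypothetical helper names rather than the gem's code: Fahrenheit to kelvin is an affine map (scale plus offset), so multiplying a cached base scalar by the new scalar gives the wrong answer.

```ruby
# Convert degrees Fahrenheit to kelvin (an affine map: scale plus offset).
def temp_f_to_kelvin(deg_f)
  (deg_f + 459.67r) * 5 / 9
end

# The shortcut that breaks: take the base scalar cached for 1 tempF and
# multiply it by the new scalar, as if the conversion were purely linear.
def scaled_from_cache(deg_f)
  temp_f_to_kelvin(1) * deg_f
end

correct = temp_f_to_kelvin(0)  # about 255.37 K
wrong = scaled_from_cache(0)   # 0 K, i.e. absolute zero
```

Recomputing the base scalar from the actual scalar, as this commit does for temperature units, avoids the error.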

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>