AbacusKit is an iOS / macOS SDK that recognizes a soroban (Japanese abacus) in real time through the camera and reads off the value shown. It uses a lightweight, guided-capture design: image processing with Core Image and inference with Apple Core ML (including the Neural Engine). It has no external binary dependencies.
Features • Installation • Usage • How it works
- 🎯 Guided capture: your app passes only a frame and the digit count `N`; the library rectifies the frame, divides it into `N` equal columns, and classifies each digit.
- 🔢 Integer output: returns the digits (0–9) concatenated left to right (no unit point or decimal point).
- ⚡ Real time: Core Image preprocessing plus parallel Core ML inference, stabilized with temporal aggregation over consecutive frames.
- 🌈 Color-independent: converts to grayscale before classification, matching the training pipeline (color carries no bead signal).
- 🧵 Swift 6 ready: `AbacusRecognizer` is an `actor` and safe to call from `async`/`await`.
- 📦 Zero dependencies: no OpenCV or ExecuTorch binaries; the model is bundled with the SDK.
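The "divide into N equal columns" step from the feature list can be pictured with a minimal sketch. The `Rect` type and `splitIntoColumns` function below are hypothetical stand-ins written for illustration (a plain struct instead of `CGRect`, so the snippet runs anywhere); they are not part of AbacusKit's API, and the library's actual extraction runs in Core Image.

```swift
/// A plain rectangle in image pixel coordinates (hypothetical stand-in for CGRect).
struct Rect: Equatable {
    var x: Double
    var y: Double
    var width: Double
    var height: Double
}

/// Splits `frame` into `count` equal-width columns, ordered left to right.
/// Returns an empty array when `count` is not positive.
func splitIntoColumns(_ frame: Rect, count: Int) -> [Rect] {
    guard count > 0 else { return [] }
    let columnWidth = frame.width / Double(count)
    return (0..<count).map { index in
        Rect(x: frame.x + Double(index) * columnWidth,
             y: frame.y,
             width: columnWidth,
             height: frame.height)
    }
}

// A 920-pixel-wide guide frame split for a 23-digit soroban.
let columns = splitIntoColumns(Rect(x: 100, y: 40, width: 920, height: 160), count: 23)
print(columns.count)     // 23
print(columns[0].width)  // 40.0
print(columns[22].x)     // 980.0
```

Because the split is purely geometric, the guide frame has to match the soroban's rods closely; a frame that is too wide or too narrow shifts every column boundary.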
dependencies: [
    .package(url: "https://github.com/TaiyoYamada/AbacusKit.git", from: "1.0.0")
]

The Core ML model (`AbacusDigitClassifier.mlpackage`) is bundled with the SDK and is compiled and loaded on first use.
import AbacusKit
let recognizer = AbacusRecognizer()
// Pass the frame your app showed on the preview, in the source image's
// (CVPixelBuffer) pixel coordinates.
let region = AbacusRegion.rect(guideRectInImageCoords)
let result = try await recognizer.recognize(
    pixelBuffer: cameraFrame,
    region: region,
    digitCount: 23 // The digit count N specified by the app.
)
print(result.text) // "09452037697982622718818" (the digit string)
print(result.digits) // [0, 9, 4, 5, 2, ...]
print(result.intValue ?? 0) // The value if it fits in an Int (nil for 23 digits, etc.)
print(result.confidence) // The lowest column's confidence

import AbacusKit
import AVFoundation
final class CameraController: NSObject, AVCaptureVideoDataOutputSampleBufferDelegate {
    private let recognizer = AbacusRecognizer()
    private let region = AbacusRegion.rect(/* the preview frame in image coordinates */)
    private let digitCount = 23

    func captureOutput(_ output: AVCaptureOutput,
                       didOutput sampleBuffer: CMSampleBuffer,
                       from connection: AVCaptureConnection) {
        guard let pixelBuffer = CMSampleBufferGetImageBuffer(sampleBuffer) else { return }
        Task {
            do {
                // Average probabilities over consecutive frames; return once stable.
                if let stable = try await recognizer.recognizeStabilized(
                    pixelBuffer: pixelBuffer, region: region, digitCount: digitCount
                ) {
                    await MainActor.run { self.commit(stable) }
                }
            } catch let error as AbacusError where error.isRetryable {
                // Wait for the next frame.
            } catch {
                print("error:", error)
            }
        }
    }
}
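To make the idea behind the stabilized call concrete, here is a minimal sketch of averaging per-column class probabilities over consecutive frames. `ProbabilityAverager` is a hypothetical type written for this illustration; it assumes each frame yields one probability array per column, and it treats "seen at least `minFrames` frames" as the only stability criterion, which may be simpler than what the library actually does.

```swift
/// Accumulates per-column class probabilities across frames (illustrative sketch,
/// not AbacusKit's implementation).
struct ProbabilityAverager {
    private var sums: [[Double]] = []  // [column][class] running sums
    private var frameCount = 0

    /// Discards everything accumulated so far.
    mutating func reset() {
        sums = []
        frameCount = 0
    }

    /// Adds one frame of probabilities. Returns the most likely digit per column
    /// once at least `minFrames` frames have been accumulated, otherwise nil.
    mutating func add(_ frame: [[Double]], minFrames: Int = 3) -> [Int]? {
        // Start over if the number of columns changes between frames.
        if sums.count != frame.count {
            sums = frame.map { Array(repeating: 0.0, count: $0.count) }
            frameCount = 0
        }
        for (column, probabilities) in frame.enumerated() {
            for (digit, p) in probabilities.enumerated() where digit < sums[column].count {
                sums[column][digit] += p
            }
        }
        frameCount += 1
        guard frameCount >= minFrames else { return nil }
        // The argmax of the sums equals the argmax of the averages.
        return sums.map { column in
            column.indices.max(by: { column[$0] < column[$1] }) ?? 0
        }
    }
}

// Two columns, two classes each, to keep the example small.
var averager = ProbabilityAverager()
let first = averager.add([[0.6, 0.4], [0.2, 0.8]])
let second = averager.add([[0.3, 0.7], [0.1, 0.9]])
let third = averager.add([[0.7, 0.3], [0.4, 0.6]])
if let digits = third {
    print(digits)  // [0, 1]
}
```

Averaging lets a single noisy frame (the second one above, where column 0 briefly favors class 1) be outvoted by its neighbors.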
When the shot is tilted and distortion remains, pass the frame as four corners (rectified with a perspective correction):
let region = AbacusRegion.quad([topLeft, topRight, bottomRight, bottomLeft])

When the soroban is rotated 90° and placed vertically, pass `orientation: .vertical` (the digits are rotated back upright during extraction):
let result = try await recognizer.recognize(
    pixelBuffer: frame, region: region, digitCount: 23, orientation: .vertical
)

CVPixelBuffer + region + digitCount (N)
        │
        ▼   AbacusRecognizer (actor)
ColumnExtractor (Core Image)
  ├─ rectify the frame (rect = crop / quad = perspective correction)
  ├─ divide the width into N
  └─ convert each column to 32×128 grayscale (R = G = B, Rec. 601)
        │
        ▼
DigitInferenceEngine (Core ML, parallel)
  └─ AbacusDigitClassifier → digit 0–9 (softmax probabilities used directly)
        │
        ▼
BoardAggregator (optional temporal aggregation: averaging over frames)
        │
        ▼
SorobanResult { digits, text, columns, confidence, timing }
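The grayscale step in the pipeline above (R = G = B, Rec. 601) can be illustrated per pixel. This sketch uses the standard Rec. 601 luma weights (0.299, 0.587, 0.114) on 8-bit values; the `RGB` type and `rec601Gray` function are hypothetical, and whether the SDK rounds or handles gamma exactly this way is an assumption, since its conversion runs inside Core Image.

```swift
/// One 8-bit RGB pixel (hypothetical helper type for this sketch).
struct RGB: Equatable {
    var r: UInt8, g: UInt8, b: UInt8
}

/// Converts a pixel to gray using Rec. 601 luma weights and writes the same
/// value into all three channels, so the result is still an RGB pixel.
func rec601Gray(_ p: RGB) -> RGB {
    let luma = 0.299 * Double(p.r) + 0.587 * Double(p.g) + 0.114 * Double(p.b)
    let y = UInt8(min(255.0, max(0.0, luma.rounded())))
    return RGB(r: y, g: y, b: y)
}

print(rec601Gray(RGB(r: 255, g: 0, b: 0)).r)  // 76
print(rec601Gray(RGB(r: 0, g: 255, b: 0)).r)  // 150
print(rec601Gray(RGB(r: 0, g: 0, b: 255)).r)  // 29
```

Keeping three identical channels means the classifier can still take an RGB input while seeing no color information.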
The model (`AbacusDigitClassifier`) is the `.mlpackage` produced by the training and conversion repo abacus-recognition. Its input is an RGB 32×128 image (with the 1/255 scale baked in), and its outputs are `classLabel` and `classLabel_probs` (softmax already applied).
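Since the model's probabilities are already softmax outputs, turning them into a digit and a confidence needs no further normalization. The sketch below assumes the probabilities arrive as an array indexed by digit (the real `classLabel_probs` output may be keyed differently) and assumes a column's confidence is its top probability; `ColumnReading`, `readColumn`, and `boardConfidence` are hypothetical names. The board-level rule, taking the lowest column confidence, follows the description of `confidence` in this README.

```swift
/// The digit chosen for one column and the probability assigned to it.
struct ColumnReading: Equatable {
    let digit: Int
    let confidence: Double
}

/// Picks the most probable digit from softmax probabilities indexed by digit.
/// Returns nil for an empty array.
func readColumn(_ probabilities: [Double]) -> ColumnReading? {
    guard let best = probabilities.indices.max(by: { probabilities[$0] < probabilities[$1] }) else {
        return nil
    }
    return ColumnReading(digit: best, confidence: probabilities[best])
}

/// The board is only as trustworthy as its weakest column.
func boardConfidence(_ columns: [ColumnReading]) -> Double {
    columns.map(\.confidence).min() ?? 0
}

let columnA = readColumn([0.01, 0.02, 0.01, 0.01, 0.05, 0.80, 0.04, 0.03, 0.02, 0.01])
let columnB = readColumn([0.95, 0.01, 0.01, 0.01, 0.00, 0.00, 0.01, 0.00, 0.01, 0.00])
let readings = [columnA, columnB].compactMap { $0 }
print(readings.map(\.digit))        // [5, 0]
print(boardConfidence(readings))    // 0.8
```

Using the minimum rather than the mean means one uncertain column is enough to flag the whole reading, which suits a multi-digit number where a single wrong digit changes the value.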
public actor AbacusRecognizer {
    public init(configuration: AbacusConfiguration = .default)
    public func configure(_ config: AbacusConfiguration) async throws

    public func recognize(
        pixelBuffer: sending CVPixelBuffer,
        region: AbacusRegion,
        digitCount: Int,
        orientation: SorobanOrientation = .horizontal
    ) async throws -> SorobanResult

    public func recognizeStabilized(
        pixelBuffer: sending CVPixelBuffer,
        region: AbacusRegion,
        digitCount: Int,
        orientation: SorobanOrientation = .horizontal,
        minFrames: Int = 3
    ) async throws -> SorobanResult?

    public func resetAggregation()
}
public enum AbacusRegion: Sendable, Equatable {
    case rect(CGRect)     // An axis-aligned frame (the common case).
    case quad([CGPoint])  // Four corners [TL, TR, BR, BL] (perspective correction).
}

public enum SorobanOrientation: Sendable, Codable {
    case horizontal  // Standard (digits left to right, heaven beads on top).
    case vertical    // Rotated 90° (digits top to bottom); rotated upright during extraction.
}
public struct SorobanResult: Sendable, Equatable {
    public let digits: [Int]             // Left to right, length N.
    public let columns: [SorobanColumn]  // Per-digit detail.
    public let confidence: Float         // The lowest column's confidence.
    public let timing: TimingBreakdown
    public var text: String { get }      // The digits joined (leading zeros kept).
    public var intValue: Int? { get }    // Only when it fits in an Int.
}
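The relationship between `digits`, `text`, and `intValue` can be sketched with two free functions. These are hypothetical re-implementations for illustration, not the library's source: `joinedText` keeps leading zeros, and `intValueIfRepresentable` accumulates the digits with overflow checks so that long readings yield nil instead of trapping.

```swift
/// Joins digits into a string, keeping leading zeros.
func joinedText(_ digits: [Int]) -> String {
    digits.map { String($0) }.joined()
}

/// Builds an Int from the digits, or returns nil if the value overflows Int.
func intValueIfRepresentable(_ digits: [Int]) -> Int? {
    var value = 0
    for digit in digits {
        let (scaled, mulOverflow) = value.multipliedReportingOverflow(by: 10)
        let (sum, addOverflow) = scaled.addingReportingOverflow(digit)
        if mulOverflow || addOverflow { return nil }
        value = sum
    }
    return value
}

// The 23-digit reading from the usage example does not fit in an Int.
let longReading = [0, 9, 4, 5, 2, 0, 3, 7, 6, 9, 7, 9, 8, 2, 6, 2, 2, 7, 1, 8, 8, 1, 8]
print(joinedText(longReading))                       // 09452037697982622718818
print(intValueIfRepresentable(longReading) == nil)   // true

// A short reading keeps its leading zeros as text but not as a number.
print(joinedText([0, 0, 4, 2]))                      // 0042
print(intValueIfRepresentable([0, 0, 4, 2]) ?? -1)   // 42
```

For sorobans with many rods, `text` is the safer field to store or display, and `intValue` is a convenience for short readings.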
Configure with AbacusConfiguration (default / fast / highAccuracy).
| Metric | iPhone 15 Pro (approx.) |
|---|---|
| Column extraction (Core Image) | 1–3 ms |
| Inference (Core ML, 23 digits in parallel) | 3–8 ms |
| Total | 5–12 ms |
| FPS | 40–60 |
- iOS 17.0+ / macOS 14.0+
- Xcode 16.0+
- Swift 6.0+
MIT License. See LICENSE.
swift build # Build
swift test # Test
swift package generate-documentation # DocC

Code style is enforced with SwiftFormat (`.swiftformat`) and SwiftLint (`.swiftlint.yml`). Run them locally before opening a PR.
Made with ❤️ for iOS developers