AbacusKit

Soroban Recognition SDK for iOS

AbacusKit is an iOS / macOS SDK that recognizes a soroban (Japanese abacus) in real time through the camera and reads off the value shown. The design is lightweight and based on guided capture: the app supplies the guide frame and the digit count, Core Image handles image processing, and Apple Core ML (including the Neural Engine) handles inference. The SDK has no external binary dependencies.

Features • Installation • Usage • How it works

🚀 Features

🎯 Guided capture — your app passes only a frame and the digit count N; the library rectifies the frame, divides it into N equal columns, and classifies each digit.
🔢 Integer output — returns the digits (0–9) concatenated left to right (no unit point or decimal point).
⚡ Real time — Core Image preprocessing plus parallel Core ML inference, stabilized with temporal aggregation over consecutive frames.
🌑 Color-independent — converts to grayscale before classification, matching the training pipeline (color carries no bead signal).
🧵 Swift 6 ready — AbacusRecognizer is an actor and safe to call from async/await.
📦 Zero dependencies — no OpenCV or ExecuTorch binaries; the model is bundled with the SDK.
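The "divide into N equal columns" step of guided capture can be sketched with plain geometry. This is only an illustration under assumptions: `equalColumns(of:count:)` is a hypothetical helper, not part of AbacusKit's API, and the library's own extraction runs on the rectified image via Core Image.

```swift
import Foundation

/// Splits `rect` into `count` equal-width columns, ordered left to right.
/// Hypothetical helper for illustration; not an AbacusKit API.
func equalColumns(of rect: CGRect, count: Int) -> [CGRect] {
    guard count > 0 else { return [] }
    let columnWidth = rect.width / CGFloat(count)
    return (0..<count).map { index in
        CGRect(
            x: rect.minX + CGFloat(index) * columnWidth,
            y: rect.minY,
            width: columnWidth,
            height: rect.height
        )
    }
}

// A 230-pixel-wide guide frame holding a 23-digit soroban.
let columns = equalColumns(of: CGRect(x: 0, y: 0, width: 230, height: 80), count: 23)
print(columns.count)    // 23
print(columns[1].minX)  // 10.0
```

Because the split is purely positional, the guide frame has to hug the soroban's rods closely; any margin on the left or right shifts every column boundary.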

📦 Installation

Swift Package Manager

dependencies: [
    .package(url: "https://github.com/TaiyoYamada/AbacusKit.git", from: "1.0.0")
]
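For context, a complete manifest might look like the sketch below. The package and target name `MyApp` are placeholders, and the product name `AbacusKit` is an assumption based on the repository name; check the package's own Package.swift for the actual product name.

```swift
// swift-tools-version: 6.0
import PackageDescription

let package = Package(
    name: "MyApp", // Placeholder name for your own package.
    platforms: [.iOS(.v17), .macOS(.v14)],
    dependencies: [
        .package(url: "https://github.com/TaiyoYamada/AbacusKit.git", from: "1.0.0")
    ],
    targets: [
        .target(
            name: "MyApp",
            dependencies: [
                // Assumption: the library product is named "AbacusKit".
                .product(name: "AbacusKit", package: "AbacusKit")
            ]
        )
    ]
)
```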

The Core ML model (AbacusDigitClassifier.mlpackage) is bundled with the SDK and is compiled and loaded on first use.

🏃 Usage

Basics

import AbacusKit

let recognizer = AbacusRecognizer()

// Pass the guide frame your app drew over the preview, expressed in the
// pixel coordinates of the source image (the CVPixelBuffer).
let region = AbacusRegion.rect(guideRectInImageCoords)

let result = try await recognizer.recognize(
    pixelBuffer: cameraFrame,
    region: region,
    digitCount: 23          // The digit count N specified by the app.
)

print(result.text)          // "09452037697982622718818" (the digit string)
print(result.digits)        // [0, 9, 4, 5, 2, ...]
print(result.intValue ?? 0) // The value when it fits in an Int; nil (printed as 0 here) when it overflows, as this 23-digit value does.
print(result.confidence)    // The lowest confidence among all columns.
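The relationship between the digit array, the digit string, and the optional integer can be sketched in plain Swift. This is not the SDK's implementation: `digitString(from:)` is a hypothetical helper, and the overflow behavior shown assumes a 64-bit `Int`.

```swift
/// Joins digits left to right, keeping leading zeros.
/// Hypothetical helper for illustration; not an AbacusKit API.
func digitString(from digits: [Int]) -> String {
    digits.map { String($0) }.joined()
}

let shortText = digitString(from: [0, 9, 4, 5])
print(shortText)              // 0945
print(Int(shortText) ?? -1)   // 945

// Int(_:) returns nil when the value does not fit, which is why a
// 23-digit reading has no integer value on a 64-bit platform.
let longText = "09452037697982622718818"
print(Int(longText) == nil)   // true
```

For long boards, treat the digit string as the primary result and use the integer only as a convenience.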

Camera integration (stabilized with temporal aggregation)

import AbacusKit
import AVFoundation

final class CameraController: NSObject, AVCaptureVideoDataOutputSampleBufferDelegate {
    private let recognizer = AbacusRecognizer()
    private let region = AbacusRegion.rect(/* the preview frame in image coordinates */)
    private let digitCount = 23

    func captureOutput(_ output: AVCaptureOutput,
                       didOutput sampleBuffer: CMSampleBuffer,
                       from connection: AVCaptureConnection) {
        guard let pixelBuffer = CMSampleBufferGetImageBuffer(sampleBuffer) else { return }
        Task {
            do {
                // Averages probabilities over consecutive frames; yields nil until the result is stable.
                if let stable = try await recognizer.recognizeStabilized(
                    pixelBuffer: pixelBuffer, region: region, digitCount: digitCount
                ) {
                    await MainActor.run { self.commit(stable) }
                }
            } catch let error as AbacusError where error.isRetryable {
                // Wait for the next frame.
            } catch {
                print("error:", error)
            }
        }
    }
}
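The idea behind temporal aggregation can be sketched as a running average of per-column probability vectors. This is a simplified illustration, not the SDK's BoardAggregator: `ProbabilityAverager` is a hypothetical type, and "stable" is reduced here to "at least `minFrames` frames seen".

```swift
/// Accumulates per-column class probabilities across frames.
/// Hypothetical type for illustration; AbacusKit's aggregator may differ.
struct ProbabilityAverager {
    let minFrames: Int
    private var sums: [[Double]] = []   // Running sums, indexed [column][class].
    private var frameCount = 0

    init(minFrames: Int = 3) {
        self.minFrames = minFrames
    }

    /// Adds one frame; returns the per-column digits once enough frames are in.
    mutating func add(_ frame: [[Double]]) -> [Int]? {
        // Start over if the number of columns or classes changes.
        let shapeChanged = sums.count != frame.count
            || zip(sums, frame).contains { $0.count != $1.count }
        if shapeChanged {
            sums = frame.map { [Double](repeating: 0, count: $0.count) }
            frameCount = 0
        }
        for (column, probabilities) in frame.enumerated() {
            for (digit, probability) in probabilities.enumerated() {
                sums[column][digit] += probability
            }
        }
        frameCount += 1
        guard frameCount >= minFrames else { return nil }
        // The argmax of the sums equals the argmax of the averages.
        return sums.map { column in
            column.indices.max(by: { column[$0] < column[$1] }) ?? 0
        }
    }
}

// One column, three classes, three frames. The second frame alone favors
// class 1, but the average over all three favors class 0.
var averager = ProbabilityAverager(minFrames: 3)
print(averager.add([[0.6, 0.4, 0.0]]) == nil)  // true
print(averager.add([[0.2, 0.7, 0.1]]) == nil)  // true
print(averager.add([[0.7, 0.3, 0.0]]) ?? [])   // [0]
```

Averaging probabilities rather than voting on per-frame digits lets a confident frame outweigh a marginal one, which helps when a hand briefly passes over a column.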

If the camera is tilted and the guide frame appears distorted, pass the region as four corners instead; it is rectified with perspective correction:

let region = AbacusRegion.quad([topLeft, topRight, bottomRight, bottomLeft])
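If the corners come from a detector or from user taps in arbitrary order, they need to be sorted into [TL, TR, BR, BL] first. The sketch below uses a common sum/difference heuristic. `orderedCorners(_:)` is a hypothetical helper, not an AbacusKit API, and it assumes a top-left image origin (y grows downward) and a tilt well under 45°.

```swift
import Foundation

/// Orders four points as [topLeft, topRight, bottomRight, bottomLeft].
/// Assumes a top-left origin and a moderately tilted quadrilateral.
/// Hypothetical helper for illustration; not an AbacusKit API.
func orderedCorners(_ points: [CGPoint]) -> [CGPoint]? {
    guard points.count == 4 else { return nil }
    guard
        // x + y is smallest at the top-left and largest at the bottom-right.
        let topLeft = points.min(by: { $0.x + $0.y < $1.x + $1.y }),
        let bottomRight = points.max(by: { $0.x + $0.y < $1.x + $1.y }),
        // y - x is smallest at the top-right and largest at the bottom-left.
        let topRight = points.min(by: { $0.y - $0.x < $1.y - $1.x }),
        let bottomLeft = points.max(by: { $0.y - $0.x < $1.y - $1.x })
    else { return nil }
    return [topLeft, topRight, bottomRight, bottomLeft]
}

let shuffled = [
    CGPoint(x: 98, y: 50),   // bottom-right
    CGPoint(x: 10, y: 10),   // top-left
    CGPoint(x: 8, y: 48),    // bottom-left
    CGPoint(x: 100, y: 12)   // top-right
]
let corners = orderedCorners(shuffled)
```

The heuristic breaks down for strongly rotated shapes, so validate the result (for example, that all four points are distinct) before passing it on.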

If the soroban is rotated 90° so that it runs vertically, pass orientation: .vertical; each digit column is rotated back upright during extraction:

let result = try await recognizer.recognize(
    pixelBuffer: frame, region: region, digitCount: 23, orientation: .vertical
)

🧠 How it works

CVPixelBuffer + region + digitCount (N)
        │
        ▼  AbacusRecognizer (actor)
   ColumnExtractor (Core Image)
     ├─ rectify the frame (rect = crop / quad = perspective correction)
     ├─ divide the width into N
     └─ convert each column to 32×128 grayscale (R = G = B, Rec. 601)
        │
        ▼
   DigitInferenceEngine (Core ML, parallel)
     └─ AbacusDigitClassifier → digit 0–9 (softmax probabilities used directly)
        │
        ▼
   BoardAggregator (optional temporal aggregation: averaging over frames)
        │
        ▼
   SorobanResult { digits, text, columns, confidence, timing }
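The grayscale step in the diagram refers to the Rec. 601 luma weights. As a per-pixel sketch (the SDK does this with Core Image, and whether the weights are applied to gamma-encoded or linear values is not stated here), the conversion looks like this; `rec601Luma` is a hypothetical helper:

```swift
/// Rec. 601 luma from RGB components in the range 0...1.
/// Hypothetical helper for illustration; not an AbacusKit API.
func rec601Luma(red: Double, green: Double, blue: Double) -> Double {
    0.299 * red + 0.587 * green + 0.114 * blue
}

// The gray value is written to all three channels (R = G = B), so the
// model still receives a three-channel image.
let gray = rec601Luma(red: 0.0, green: 1.0, blue: 0.0)
print(gray)  // 0.587
```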

The model (AbacusDigitClassifier) is the .mlpackage produced by the training and conversion repo abacus-recognition. Its input is an RGB 32×128 image (with the 1/255 scale baked in), and its output is classLabel and classLabel_probs (softmax already applied).
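Since softmax is already applied, a per-column result is just the most probable class, and the board-level confidence is the minimum across columns. The sketch below assumes the probabilities arrive as a dictionary keyed by `Int` digit labels; the actual key type of classLabel_probs depends on how the model was exported, and `bestDigit(in:)` is a hypothetical helper.

```swift
/// Picks the most probable digit from a label-to-probability dictionary.
/// Hypothetical helper; the Int key type is an assumption.
func bestDigit(in probabilities: [Int: Double]) -> (digit: Int, confidence: Double)? {
    probabilities
        .max(by: { $0.value < $1.value })
        .map { (digit: $0.key, confidence: $0.value) }
}

let columnProbabilities: [[Int: Double]] = [
    [3: 0.10, 7: 0.85, 1: 0.05],
    [0: 0.60, 8: 0.40]
]
let best = columnProbabilities.compactMap { bestDigit(in: $0) }
let boardDigits = best.map { $0.digit }                     // [7, 0]
let boardConfidence = best.map { $0.confidence }.min() ?? 0 // 0.6
```

Taking the minimum means one uncertain column lowers the confidence of the whole reading, which is the conservative choice when every digit has to be right.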

📚 API Reference

public actor AbacusRecognizer {
    public init(configuration: AbacusConfiguration = .default)
    public func configure(_ config: AbacusConfiguration) async throws

    public func recognize(
        pixelBuffer: sending CVPixelBuffer,
        region: AbacusRegion,
        digitCount: Int,
        orientation: SorobanOrientation = .horizontal
    ) async throws -> SorobanResult

    public func recognizeStabilized(
        pixelBuffer: sending CVPixelBuffer,
        region: AbacusRegion,
        digitCount: Int,
        orientation: SorobanOrientation = .horizontal,
        minFrames: Int = 3
    ) async throws -> SorobanResult?

    public func resetAggregation()
}

public enum AbacusRegion: Sendable, Equatable {
    case rect(CGRect)       // An axis-aligned frame (the common case).
    case quad([CGPoint])    // Four corners [TL, TR, BR, BL] (perspective correction).
}

public enum SorobanOrientation: Sendable, Codable {
    case horizontal         // Standard (digits left to right, heaven beads on top).
    case vertical           // Rotated 90° (digits top to bottom); rotated back upright during extraction.
}

public struct SorobanResult: Sendable, Equatable {
    public let digits: [Int]            // Left to right, length N.
    public let columns: [SorobanColumn] // Per-digit detail.
    public let confidence: Float        // The lowest confidence among all columns.
    public let timing: TimingBreakdown
    public var text: String { get }     // The digits joined (leading zeros kept).
    public var intValue: Int? { get }   // Only when it fits in an Int.
}

Configure with AbacusConfiguration (default / fast / highAccuracy).

⚡ Performance

| Metric | iPhone 15 Pro (approx.) |
| --- | --- |
| Column extraction (Core Image) | 1–3 ms |
| Inference (Core ML, 23 digits in parallel) | 3–8 ms |
| Total | 5–12 ms |
| FPS | 40–60 |

🔧 Requirements

iOS 17.0+ / macOS 14.0+
Xcode 16.0+
Swift 6.0+

📄 License

MIT License. See LICENSE.

🛠 Development

swift build          # Build
swift test           # Test
swift package generate-documentation   # DocC

Code style is enforced with SwiftFormat (.swiftformat) and SwiftLint (.swiftlint.yml). Run them locally before opening a PR.

Made with ❤️ for iOS developers
