ADFA-3813 | Fix OCR image metadata parsing and refactor CV domain by jatezzz · Pull Request #1300 · appdevforall/CodeOnTheGo

jatezzz · 2026-05-13T16:46:10Z

Description

Fixes an OCR issue where arbitrary metadata values for image files were parsed incorrectly (e.g., reading "image" as "im" or "im_age"), causing broken android:src references in the generated XML. Alongside this fix, the Computer Vision architecture was heavily refactored: complex domain logic was extracted from ComputerVisionViewModel and the monolithic repository into focused, single-responsibility UseCases (RunVisionUC, GenerateXmlUC, PrepareImageUC, etc.) and helper objects (DetectionScaler, LayoutTreeBuilder, TextAssociator).

Details

Added regex replacements in DrawableCleaner (ValueCleanersImpl.kt) to correct common OCR misinterpretations for the word "image".
Replaced ComputerVisionRepository with VisionRepository to purely handle ML operations.
Split monolithic ViewModel logic into isolated UseCases (GenerateXmlUC, ImportPlaceholderImageUC, PrepareImageUC, RemovePlaceholderImageUC, RunVisionUC).
Extracted UI mapping and XML tree building logic into standalone components (LayoutTreeBuilder, TextAssociator, and DetectionScaler).

Screen.Recording.2026-05-13.at.11.22.18.AM.mov

Ticket

ADFA-3813

Observation

The domain refactoring drastically reduces the bloat in the ComputerVisionViewModel, effectively decoupling the UI state management from ML detection and XML generation logic.

coderabbitai · 2026-05-13T16:50:25Z

Warning

Rate limit exceeded

@jatezzz has exceeded the limit for the number of commits that can be reviewed per hour. Please wait 20 minutes and 23 seconds before requesting another review.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: eefd1f11-ab17-4244-b333-cf1af0c7ea1a

📥 Commits

Reviewing files that changed from the base of the PR and between b667da3 and 9815c28.

📒 Files selected for processing (4)

cv-image-to-xml/src/main/java/org/appdevforall/codeonthego/computervision/domain/DetectionScaler.kt
cv-image-to-xml/src/main/java/org/appdevforall/codeonthego/computervision/domain/parser/ValueCleanersImpl.kt
cv-image-to-xml/src/main/java/org/appdevforall/codeonthego/computervision/domain/usecase/PrepareImageUC.kt
cv-image-to-xml/src/main/java/org/appdevforall/codeonthego/computervision/ui/viewmodel/ComputerVisionViewModel.kt

📝 Walkthrough

The PR refactors the computer-vision architecture by decomposing a monolithic repository and geometry processor into a simplified repository interface, domain use cases, and focused geometry utilities. It removes ComputerVisionRepository/ComputerVisionRepositoryImpl and replaces them with a leaner VisionRepository that delegates to model sources. It extracts layout-geometry responsibilities from LayoutGeometryProcessor into DetectionScaler, TextAssociator, and LayoutTreeBuilder. It introduces five new domain use cases (PrepareImageUC, RunVisionUC, GenerateXmlUC, ImportPlaceholderImageUC, RemovePlaceholderImageUC) that orchestrate the vision and XML-generation pipelines. The ViewModel now consumes these use cases instead of directly managing operations.

Changes

Repository Layer Refactoring

Layer / File(s)	Summary
Simplified vision repository contract and implementation `VisionRepository.kt`, `VisionRepositoryImpl.kt`	Replaces the removed `ComputerVisionRepository`/`ComputerVisionRepositoryImpl` with a cleaner interface that exposes only model initialization, widget detection, text recognition, and lifecycle methods (`initModel()`, `detectWidgets(bitmap)`, `recognizeText(bitmap)`, `isInitialized()`, `release()`). The new implementation delegates to `YoloModelSource` and `OcrSource`.
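
A rough sketch of what such a contract could look like, built only from the method names in the summary above. The parameter and return types, the use of `Result`, and whether the real methods are `suspend` are all assumptions; `ImageStub`, `DetectionResult`, `TextBlock`, and `FakeVisionRepository` are hypothetical stand-ins so the example runs without Android or ML models.

```kotlin
// Stand-in for android.graphics.Bitmap (assumption, keeps the sketch self-contained).
class ImageStub

// Hypothetical result models; the project's real types are not shown on this page.
data class DetectionResult(val label: String, val score: Float)
data class TextBlock(val text: String)

// Assumed shape of the repository contract: ML operations and lifecycle only.
interface VisionRepository {
    fun initModel(): Result<Unit>
    fun detectWidgets(bitmap: ImageStub): Result<List<DetectionResult>>
    fun recognizeText(bitmap: ImageStub): Result<List<TextBlock>>
    fun isInitialized(): Boolean
    fun release()
}

// In-memory fake that lets use cases be exercised without real models.
class FakeVisionRepository : VisionRepository {
    private var ready = false

    override fun initModel(): Result<Unit> {
        ready = true
        return Result.success(Unit)
    }

    override fun detectWidgets(bitmap: ImageStub): Result<List<DetectionResult>> =
        if (ready) Result.success(listOf(DetectionResult("button", 0.9f)))
        else Result.failure(IllegalStateException("model not initialized"))

    override fun recognizeText(bitmap: ImageStub): Result<List<TextBlock>> =
        if (ready) Result.success(listOf(TextBlock("Submit")))
        else Result.failure(IllegalStateException("model not initialized"))

    override fun isInitialized(): Boolean = ready

    override fun release() {
        ready = false
    }
}
```

Keeping the interface this narrow is what allows a fake like the one above to replace the YOLO and OCR sources in tests.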

Geometry and Layout Utilities Decomposition

Layer / File(s)	Summary
Detection scaling utility `DetectionScaler.kt`	Extracts the bounding-box scaling logic from the removed `LayoutGeometryProcessor` into a focused singleton that converts YOLO-normalized boxes into pixel coordinates for a target Android resolution, including clamping and minimum-size enforcement.
Text-to-widget association utilities `TextAssociator.kt`	Extracts text-association logic from the removed `LayoutGeometryProcessor` into a singleton with two public methods: `assignTextToParents(...)` for assigning OCR text to parent widgets via overlap threshold, and `assignNearbyTextToWidgets(...)` for assigning nearby text with vertical-alignment and proximity scoring. Includes widget labelability checks and widget-type-specific text cleaning.
Layout tree builder `LayoutTreeBuilder.kt`	Extracts layout-tree construction from the removed `LayoutGeometryProcessor` into a singleton that groups scaled boxes into rows (using vertical overlap and center-alignment heuristics) and builds higher-level `LayoutItem` structures (radio/checkbox groups, horizontal rows, simple views).
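
The scaling step described for `DetectionScaler.kt` can be sketched roughly as follows. This is an illustration under assumptions, not the project's code: `PixelRect`, `normalizedSize`, `scaledSize`, and the `minSize` default are hypothetical. It also shows why the normalization has to be done in floating point, since Kotlin's `Int / Int` truncates (for example, `50 / 200` evaluates to `0`).

```kotlin
import kotlin.math.roundToInt

// Hypothetical integer rectangle in source-image pixel coordinates.
data class PixelRect(val left: Int, val top: Int, val right: Int, val bottom: Int)

// Width and height as fractions of the source image.
// Converting to Float before dividing avoids truncating Int division.
fun normalizedSize(rect: PixelRect, sourceWidth: Int, sourceHeight: Int): Pair<Float, Float> {
    require(sourceWidth > 0 && sourceHeight > 0) { "source dimensions must be positive" }
    val normW = (rect.right - rect.left).toFloat() / sourceWidth
    val normH = (rect.bottom - rect.top).toFloat() / sourceHeight
    return normW to normH
}

// Scales the normalized size to a target resolution, clamps it to that resolution,
// and enforces a minimum size (the default of 1 px is an illustrative assumption).
fun scaledSize(
    rect: PixelRect,
    sourceWidth: Int,
    sourceHeight: Int,
    targetWidth: Int,
    targetHeight: Int,
    minSize: Int = 1,
): Pair<Int, Int> {
    require(minSize in 1..minOf(targetWidth, targetHeight)) { "minSize must fit inside the target" }
    val (normW, normH) = normalizedSize(rect, sourceWidth, sourceHeight)
    val width = (normW * targetWidth).roundToInt().coerceIn(minSize, targetWidth)
    val height = (normH * targetHeight).roundToInt().coerceIn(minSize, targetHeight)
    return width to height
}
```

For instance, a 50x100 box in a 200x400 source normalizes to 0.25 in both dimensions, which maps to 270x480 on a 1080x1920 target.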

Domain Use Cases Layer

Layer / File(s)	Summary
Image preparation use case `PrepareImageUC.kt`	New use case that decodes an image URI, applies EXIF-based rotation read from `TAG_ORIENTATION` (falling back to the normal orientation when the tag cannot be read), computes smart left/right boundary percentages via `SmartBoundaryDetector`, and returns a `Result<PreparedImage>` wrapping the bitmap and guide percentages. Preserves coroutine cancellation semantics.
Vision processing orchestration use case `RunVisionUC.kt`	New use case that orchestrates YOLO detection, detection resolution via `GenericBoxResolver`, region OCR with left/right guide percentages, detection merging via `DetectionMerger`, filtering by bounds, and margin-annotation parsing. Emits progress updates via callback and returns `Result<VisionResult>` with detections and annotations. Preserves coroutine cancellation semantics.
XML generation use case `GenerateXmlUC.kt`	New use case that wraps `YoloToXmlConverter.generateXmlLayout(...)` in `runCatching`, passing through detections, annotations, image selection map, dimensions, and forcing `wrapInScroll = true`. Returns `Result<Pair<String, String>>` for layout and strings XML.
Placeholder image import/removal use cases `ImportPlaceholderImageUC.kt`, `RemovePlaceholderImageUC.kt`	New use cases that delegate to `DrawableImportHelper`: `ImportPlaceholderImageUC` imports a user-selected gallery image with a fallback name derived from `placeholderId`; `RemovePlaceholderImageUC` removes a drawable by resource name. Both return `Result`-wrapped outcomes.
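
Several of these use cases return `Result` while preserving cancellation. Plain `runCatching` catches every `Throwable`, including `CancellationException`, so one common pattern is a small wrapper that rethrows cancellation. The sketch below is an assumption about how that could be done, not the PR's code: `runCatchingCancellable` and `SketchUseCase` are hypothetical names. On the JVM, the coroutine `CancellationException` is a typealias of `java.util.concurrent.CancellationException`, which keeps the example free of extra dependencies.

```kotlin
import java.util.concurrent.CancellationException

// Like runCatching, but lets cancellation propagate instead of wrapping it in Result.failure.
inline fun <T> runCatchingCancellable(block: () -> T): Result<T> =
    try {
        Result.success(block())
    } catch (e: CancellationException) {
        throw e // cancellation must reach the caller untouched
    } catch (e: Exception) {
        Result.failure(e)
    }

// Hypothetical use case: `work` stands in for a converter or helper call, and the
// Pair mirrors a (layout XML, strings XML) style result.
class SketchUseCase(private val work: () -> Pair<String, String>) {
    operator fun invoke(): Result<Pair<String, String>> = runCatchingCancellable(work)
}
```

A caller can then use `fold`, `onSuccess`, or `onFailure` on the returned `Result` without ever observing cancellation as an ordinary failure.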

Converter and XML Generator Updates

Layer / File(s)	Summary
YoloToXmlConverter refactoring `YoloToXmlConverter.kt`	Removes the `LayoutGeometryProcessor` dependency and delegates scaling and text-association steps to `DetectionScaler` and `TextAssociator`. The constructor now takes only `annotationMatcher` and `xmlGenerator`. Methods `scaleDetections`, `associateTextToWidgets`, and `extractCanvasTags` are updated to use the new utility objects instead of the geometry processor.
AndroidXmlGenerator refactoring `AndroidXmlGenerator.kt`	Removes the `LayoutGeometryProcessor` dependency and updates the constructor. The `buildXml` method now derives layout items via `LayoutTreeBuilder.buildLayoutTree(boxes)` instead of calling the injected geometry processor.
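
`LayoutTreeBuilder` is described as grouping scaled boxes into rows by vertical overlap. A minimal sketch of that kind of grouping is shown below, assuming a greedy top-to-bottom pass. `Box`, `verticalOverlapRatio`, `groupIntoRows`, and the 0.5 threshold are hypothetical, and the center-alignment heuristic mentioned in the summary is left out.

```kotlin
// Hypothetical pixel-space box; the project's scaled-box type is not shown on this page.
data class Box(val label: String, val left: Int, val top: Int, val right: Int, val bottom: Int) {
    val height: Int get() = bottom - top
}

// Vertical overlap as a fraction of the shorter box's height (0 when the boxes do not overlap).
fun verticalOverlapRatio(a: Box, b: Box): Float {
    val overlap = minOf(a.bottom, b.bottom) - maxOf(a.top, b.top)
    val shorter = minOf(a.height, b.height)
    return if (overlap <= 0 || shorter <= 0) 0f else overlap.toFloat() / shorter
}

// Greedy grouping: walk boxes from top to bottom and attach each one to the current row
// when it overlaps that row's first box enough; otherwise start a new row.
// The default threshold of 0.5 is an illustrative assumption.
fun groupIntoRows(boxes: List<Box>, minOverlap: Float = 0.5f): List<List<Box>> {
    val rows = mutableListOf<MutableList<Box>>()
    for (box in boxes.sortedBy { it.top }) {
        val current = rows.lastOrNull()
        if (current != null && verticalOverlapRatio(current.first(), box) >= minOverlap) {
            current.add(box)
        } else {
            rows.add(mutableListOf(box))
        }
    }
    // Within a row, order widgets left to right.
    return rows.map { row -> row.sortedBy { it.left } }
}
```

Rows with more than one box would then map naturally to horizontal containers, and single-box rows to simple views.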

ViewModel and DI Integration

Layer / File(s)	Summary
ComputerVisionViewModel refactoring `ComputerVisionViewModel.kt`	Updates the constructor to inject `VisionRepository` and five use cases (`PrepareImageUC`, `RunVisionUC`, `GenerateXmlUC`, `ImportPlaceholderImageUC`, `RemovePlaceholderImageUC`) instead of repository implementation details and UI helpers. Image loading now delegates to `PrepareImageUC`; detection runs through `RunVisionUC` with progress callbacks; XML generation and export use `GenerateXmlUC`; placeholder interactions use the import/remove use cases. Event routing, error handling, and resource cleanup are updated accordingly. Method names changed from `initializeModel()` to `initModel()` and from `releaseResources()` to `release()`.
DI module wiring `ComputerVisionModule.kt`	Updates Koin bindings to register `VisionRepository` (backed by `VisionRepositoryImpl` with `OcrSource`), explicitly registers `OcrSource`, `RegionOcrProcessor`, and `GenericBoxResolver`, and adds singleton registrations for all five use cases. The `ComputerVisionViewModel` factory is updated to inject the use cases instead of helpers.

Minor Logic Fix

Layer / File(s)	Summary
DrawableCleaner normalization `ValueCleanersImpl.kt`	Adds a post-cleanup normalization step to replace the substring `im_age` with `image` before producing the final `@drawable/...` value, improving OCR-based resource name resolution.
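
A hedged sketch of this kind of normalization, extended to the standalone `im` variant mentioned in the PR description. It is not the code in `ValueCleanersImpl.kt`: `cleanDrawableValue`, the preliminary cleanup, and the exact regex are assumptions. The lookarounds require `im` to be a whole underscore-separated token, so names such as `dim_light` or an already correct `image` are left alone.

```kotlin
// Matches "im" or "im_age" only when not glued to other letters or digits.
val BROKEN_IMAGE = Regex("(?<![a-z0-9])im(?:_age)?(?![a-z0-9])")

fun cleanDrawableValue(rawValue: String): String {
    // Assumed preliminary cleanup: lower-case, then collapse anything outside [a-z0-9_] into "_".
    val cleaned = rawValue.trim().lowercase()
        .replace(Regex("[^a-z0-9_]+"), "_")
        .trim('_')
    // Keep the raw value when nothing usable remains.
    if (cleaned.isEmpty()) return rawValue
    val finalCleaned = BROKEN_IMAGE.replace(cleaned, "image")
    return "@drawable/$finalCleaned"
}
```

With these rules, `im`, `im_age`, and `Im age` all resolve to `@drawable/image`, while `im_1` becomes `@drawable/image_1`.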

Sequence Diagram(s)

sequenceDiagram
    participant ViewModel as ComputerVisionViewModel
    participant PrepareUC as PrepareImageUC
    participant RunVisionUC as RunVisionUC
    participant VisionRepo as VisionRepository
    participant YoloSource as YoloModelSource
    participant OCRSource as OcrSource
    participant GenerateUC as GenerateXmlUC
    participant Converter as YoloToXmlConverter

    ViewModel->>PrepareUC: invoke(uri)
    activate PrepareUC
    PrepareUC->>PrepareUC: decode & rotate image
    PrepareUC->>PrepareUC: compute left/right boundaries
    PrepareUC-->>ViewModel: Result<PreparedImage>
    deactivate PrepareUC

    ViewModel->>RunVisionUC: invoke(bitmap, leftPct, rightPct, onProgress)
    activate RunVisionUC
    RunVisionUC->>VisionRepo: detectWidgets(bitmap)
    activate VisionRepo
    VisionRepo->>YoloSource: runInference(bitmap)
    YoloSource-->>VisionRepo: List<DetectionResult>
    VisionRepo-->>RunVisionUC: Result<List<DetectionResult>>
    deactivate VisionRepo
    
    RunVisionUC->>VisionRepo: recognizeText(bitmap)
    activate VisionRepo
    VisionRepo->>OCRSource: recognize(bitmap)
    OCRSource-->>VisionRepo: List<TextBlock>
    VisionRepo-->>RunVisionUC: Result<List<TextBlock>>
    deactivate VisionRepo
    
    RunVisionUC->>RunVisionUC: merge detections & parse annotations
    RunVisionUC-->>ViewModel: Result<VisionResult>
    deactivate RunVisionUC

    ViewModel->>GenerateUC: invoke(detections, annotations, ...)
    activate GenerateUC
    GenerateUC->>Converter: generateXmlLayout(...)
    activate Converter
    Converter->>Converter: scale, associate text, build layout
    Converter-->>GenerateUC: Pair<layoutXml, stringsXml>
    deactivate Converter
    GenerateUC-->>ViewModel: Result<Pair<String, String>>
    deactivate GenerateUC
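
If the three stages in the diagram were chained in a single call, the `Result` values could be composed as in the sketch below. Everything here is an assumption for illustration: `Prepared`, `Vision`, and `runPipeline` are hypothetical, each stage is passed in as a plain function, and the real ViewModel may well trigger the stages from separate UI events instead.

```kotlin
// Placeholder payloads standing in for PreparedImage and VisionResult.
data class Prepared(val leftPct: Float, val rightPct: Float)
data class Vision(val detectionCount: Int)

// Runs prepare -> vision -> XML generation and stops at the first failing stage.
// Note that mapCatching also captures cancellation, so a coroutine-based version
// would need to rethrow CancellationException explicitly.
fun runPipeline(
    prepare: (uri: String) -> Result<Prepared>,
    runVision: (Prepared) -> Result<Vision>,
    generateXml: (Vision) -> Result<Pair<String, String>>,
    uri: String,
): Result<Pair<String, String>> =
    prepare(uri)
        .mapCatching { prepared -> runVision(prepared).getOrThrow() }
        .mapCatching { vision -> generateXml(vision).getOrThrow() }
```

A failure in any stage short-circuits the rest, so XML generation never runs on missing detections.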

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~50 minutes

Possibly related PRs

appdevforall/CodeOnTheGo#1185: Directly overlaps with refactoring of YoloToXmlConverter's text/annotation matching and layout generation flow.
appdevforall/CodeOnTheGo#1171: This PR builds on SmartBoundaryDetector and the guide-percentage logic for left/right boundary detection in PrepareImageUC.
appdevforall/CodeOnTheGo#887: This PR's handling of left/right guide percentages in the vision pipeline connects directly to the UI changes that emit UpdateGuides events.

Suggested reviewers

Daniel-ADFA
avestaadfa
hal-eisen-adfa

Poem

🐰 A processor once grand, now split into pieces small,
Detections scale, text finds home, builders heed the call,
Use cases guide the flow, the ViewModel stays light,
Separated concerns dance—refactored just right! ✨

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (4 passed)

Check name	Status	Explanation
Title check	✅ Passed	The title clearly summarizes the main changes: an OCR metadata parsing fix and a comprehensive refactoring of the Computer Vision domain architecture.
Description check	✅ Passed	The description comprehensively covers the OCR fix and architectural refactoring, explaining both the problem and the solution with specific component names and improvements.
Linked Issues check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check	✅ Passed	Check skipped because no linked issues were found for this pull request.

coderabbitai

Actionable comments posted: 4

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In
`@cv-image-to-xml/src/main/java/org/appdevforall/codeonthego/computervision/domain/DetectionScaler.kt`:
- Around line 26-27: Integer division in DetectionScaler.kt causes normW and
normH to lose precision; cast operands to floating-point before dividing (e.g.,
convert (rect.right - rect.left) and sourceWidth/sourceHeight to Float/Double)
so normalization uses floating-point math, and ensure normW/normH types match
(Float/Double) where they are used; update the normalization lines referencing
rect, normW, normH, sourceWidth and sourceHeight accordingly.

In
`@cv-image-to-xml/src/main/java/org/appdevforall/codeonthego/computervision/domain/parser/ValueCleanersImpl.kt`:
- Around line 167-168: The current cleanup replaces "im_age" but leaves the OCR
variant "im" producing "@drawable/im"; in ValueCleanersImpl (the cleaned ->
finalCleaned flow) update the replacement logic to also normalize standalone
"im" to "image" (use a word-boundary or equivalent check so you don't
accidentally change substrings) before returning "@drawable/$finalCleaned";
ensure you apply this to the same cleaned/finalCleaned variable used in the
return path so empty-value fallback still returns rawValue.

In
`@cv-image-to-xml/src/main/java/org/appdevforall/codeonthego/computervision/domain/usecase/PrepareImageUC.kt`:
- Around line 45-51: The code in PrepareImageUC.kt that computes orientation
currently catches all Exceptions; narrow this to specific expected exceptions
(e.g., catch (ioe: IOException)) around the
contentResolver.openInputStream(uri)?.use { ... } / ExifInterface(...) call so
only IO problems are swallowed and other unexpected errors still surface;
optionally add an additional catch for SecurityException if permission issues
are possible. Ensure the fallback to ExifInterface.ORIENTATION_NORMAL remains in
the catch block.

In
`@cv-image-to-xml/src/main/java/org/appdevforall/codeonthego/computervision/ui/viewmodel/ComputerVisionViewModel.kt`:
- Line 70: When handling ComputerVisionEvent.UpdateGuides in the ViewModel,
sanitize the incoming event.leftPct and event.rightPct before storing: clamp
both values to the [0f, 1f] range, then order them so leftGuidePct <=
rightGuidePct, and finally call _uiState.update with the normalized values
(leftGuidePct and rightGuidePct) to prevent crossed or out-of-range guides from
affecting downstream filtering and annotation parsing.
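
Two of the findings above, the narrowed EXIF catch and the guide sanitization, can be sketched without Android dependencies. This is an illustration under assumptions rather than the actual fix: `readOrientation`, `normalizeGuides`, and the `open`/`parse` parameters are hypothetical stand-ins for `contentResolver.openInputStream(uri)` and the `ExifInterface` read, and the local `ORIENTATION_NORMAL` constant mirrors `ExifInterface.ORIENTATION_NORMAL` (value 1).

```kotlin
import java.io.IOException
import java.io.InputStream

// Mirrors ExifInterface.ORIENTATION_NORMAL so the sketch needs no Android classes.
const val ORIENTATION_NORMAL = 1

// Only I/O and permission problems fall back to the normal orientation;
// any other exception still surfaces to the caller.
fun readOrientation(open: () -> InputStream?, parse: (InputStream) -> Int): Int =
    try {
        open()?.use(parse) ?: ORIENTATION_NORMAL
    } catch (e: IOException) {
        ORIENTATION_NORMAL // unreadable stream or malformed data
    } catch (e: SecurityException) {
        ORIENTATION_NORMAL // permission to read the URI is missing
    }

// Returns (leftGuidePct, rightGuidePct) clamped to [0, 1] and ordered so left <= right.
// NaN inputs are not handled in this sketch.
fun normalizeGuides(leftPct: Float, rightPct: Float): Pair<Float, Float> {
    val left = leftPct.coerceIn(0f, 1f)
    val right = rightPct.coerceIn(0f, 1f)
    return minOf(left, right) to maxOf(left, right)
}
```

The normalized pair could then be written into the UI state in one update, so downstream filtering never sees crossed or out-of-range guides.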

ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: ce97d103-0463-426a-8c1d-8d278e025120

📥 Commits

Reviewing files that changed from the base of the PR and between b85bbc2 and b667da3.

📒 Files selected for processing (18)

cv-image-to-xml/src/main/java/org/appdevforall/codeonthego/computervision/data/repository/ComputerVisionRepository.kt
cv-image-to-xml/src/main/java/org/appdevforall/codeonthego/computervision/data/repository/ComputerVisionRepositoryImpl.kt
cv-image-to-xml/src/main/java/org/appdevforall/codeonthego/computervision/data/repository/VisionRepository.kt
cv-image-to-xml/src/main/java/org/appdevforall/codeonthego/computervision/data/repository/VisionRepositoryImpl.kt
cv-image-to-xml/src/main/java/org/appdevforall/codeonthego/computervision/di/ComputerVisionModule.kt
cv-image-to-xml/src/main/java/org/appdevforall/codeonthego/computervision/domain/DetectionScaler.kt
cv-image-to-xml/src/main/java/org/appdevforall/codeonthego/computervision/domain/LayoutGeometryProcessor.kt
cv-image-to-xml/src/main/java/org/appdevforall/codeonthego/computervision/domain/LayoutTreeBuilder.kt
cv-image-to-xml/src/main/java/org/appdevforall/codeonthego/computervision/domain/TextAssociator.kt
cv-image-to-xml/src/main/java/org/appdevforall/codeonthego/computervision/domain/YoloToXmlConverter.kt
cv-image-to-xml/src/main/java/org/appdevforall/codeonthego/computervision/domain/parser/ValueCleanersImpl.kt
cv-image-to-xml/src/main/java/org/appdevforall/codeonthego/computervision/domain/usecase/GenerateXmlUC.kt
cv-image-to-xml/src/main/java/org/appdevforall/codeonthego/computervision/domain/usecase/ImportPlaceholderImageUC.kt
cv-image-to-xml/src/main/java/org/appdevforall/codeonthego/computervision/domain/usecase/PrepareImageUC.kt
cv-image-to-xml/src/main/java/org/appdevforall/codeonthego/computervision/domain/usecase/RemovePlaceholderImageUC.kt
cv-image-to-xml/src/main/java/org/appdevforall/codeonthego/computervision/domain/usecase/RunVisionUC.kt
cv-image-to-xml/src/main/java/org/appdevforall/codeonthego/computervision/domain/xml/AndroidXmlGenerator.kt
cv-image-to-xml/src/main/java/org/appdevforall/codeonthego/computervision/ui/viewmodel/ComputerVisionViewModel.kt

💤 Files with no reviewable changes (3)

cv-image-to-xml/src/main/java/org/appdevforall/codeonthego/computervision/data/repository/ComputerVisionRepository.kt
cv-image-to-xml/src/main/java/org/appdevforall/codeonthego/computervision/data/repository/ComputerVisionRepositoryImpl.kt
cv-image-to-xml/src/main/java/org/appdevforall/codeonthego/computervision/domain/LayoutGeometryProcessor.kt

jatezzz added 2 commits May 13, 2026 11:13

fix: correct OCR image name parsing

e22caaa

refactor: Resolves incorrect drawable resource mapping and extracts d…

b667da3

…omain logic into use cases.

jatezzz requested review from a team, Daniel-ADFA and avestaadfa May 13, 2026 16:46

coderabbitai Bot reviewed May 13, 2026

refactor: improve bounds safety, OCR parsing, and exception handling

9815c28

Encapsulate guide limits, narrow EXIF exceptions, and fix 'im' drawable regex.

jatezzz wants to merge 3 commits into stage from fix/ADFA-3813-ocr-metadata-parsing-experimental
