parisbs/blindo-vision-android

Blindo Vision for Android

License: AGPL v3

Blindo Vision is an Android accessibility tool that describes whatever is currently on screen. When the user moves the accessibility focus to a UI element and takes a screenshot, Blindo Vision automatically crops the focused region, sends it to Microsoft Azure AI Vision, and surfaces a natural-language caption, the extracted OCR text and contextual tags. The result screen is laid out so that the system screen reader (TalkBack or any compatible reader) can speak it aloud.

The app ships as an Android AccessibilityService rather than a normal launcher app — there is no app icon to tap after installation. Setup is done from Settings → Accessibility → Blindo Vision.

End user guide

Requirements

  • An Android device running Android 7.0 Nougat (API 24) or newer.
  • A working screen reader (TalkBack on most devices, or any AccessibilityService that uses the accessibility focus).
  • An active internet connection — every analysis is a network call to Azure.
  • A Microsoft Azure AI Vision resource, free tier F0 or paid tier S1. You will need its endpoint URL and one of its 32-character API keys.
  • No screen-dimmer or always-on overlay. If a dimming overlay sits above the focused element Blindo Vision cannot identify the focused region and the analysis fails.

Install

Two options:

  • Prebuilt APK. Download the latest release from the GitHub Releases page and install it on your device. From a desktop with adb:
    adb install -r blindo-vision-<version>.apk
  • Build it yourself. See Building from source below.

After installation no launcher icon appears — that is expected. The app is reached through Settings → Accessibility → Installed services → Blindo Vision.

Configure Azure credentials

  1. Open Settings → Accessibility → Installed services → Blindo Vision.
  2. Tap the cog or "Settings" entry next to the service to open Blindo Vision Preferences.
  3. Under Azure AI Vision, set:
    • Endpoint: for example https://my-resource.cognitiveservices.azure.com/. The trailing slash is kept exactly as you typed it. The placeholders shown by default (https://example.cognitiveservices.azure.com/ and "32 characters api key") are not valid credentials and must be replaced.
    • API Key: the 32-character key from your Azure resource.
  4. Under General, pick a Language for tags and OCR. 27 languages are available, plus Auto-detect: Arabic, Chinese (Simplified), Czech, Danish, Dutch, English, Finnish, French, German, Greek, Hebrew, Hindi, Hungarian, Indonesian, Italian, Japanese, Korean, Norwegian, Polish, Portuguese, Romanian, Russian, Spanish, Swedish, Thai, Turkish and Vietnamese. Captions (the natural-language description) are always generated in English; this is an Azure AI Vision 4.0 limitation, not a project choice.

Enable the accessibility service

  1. Still in Settings → Accessibility, find Blindo Vision in the installed services list and toggle it On.
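  2. Android shows a confirmation dialog warning that the service can observe screen content and take screenshots. Confirm. (BIND_ACCESSIBILITY_SERVICE is declared by the service in its manifest so that only the system can bind to it; it is not a permission you grant by hand.)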
  3. Make sure your screen reader (e.g. TalkBack) is also enabled in the same screen — Blindo Vision relies on the accessibility focus that the screen reader manages.

Using the app

  1. Open any app or screen you want to inspect.
  2. Use your screen reader to move the accessibility focus to the element of interest.
  3. Take a screenshot the usual way — on stock Android that is Volume Down + Power, and TalkBack also offers a dedicated screenshot gesture.
  4. Blindo Vision detects the new screenshot, crops the focused region, sends a single Azure AI Vision request that returns the caption, OCR text and tags together, and opens the Analysis Results screen.
  5. The Results screen reads as Description: …, Text: …, Tags: …. Your screen reader speaks each field in order.

Provisioning Azure AI Vision

You can do this entirely from the Azure Portal. The CLI flow below is the scripted equivalent. If you don't have an Azure account, see how to create one for free.

1. Install the Azure CLI

On Debian or Ubuntu, including Ubuntu running under Windows Subsystem for Linux:

curl -sL https://aka.ms/InstallAzureCLIDeb | sudo bash
az login

A browser opens for sign-in. If no browser is available, run az login --use-device-code and enter the displayed code at https://aka.ms/devicelogin. For other platforms see the Azure CLI install guide.

2. Pick a subscription and create a resource group

az account list
az account set --subscription "Your subscription name"
az group create --name yourResourceGroupName --location eastus

List available locations with az account list-locations.

3. Create the Azure AI Vision resource

Choose the free tier (F0, 5,000 transactions / month) or the paid tier (S1). The ARM resource kind is still named ComputerVision even though Microsoft now markets the service as "Azure AI Vision":

az cognitiveservices account create \
    --kind ComputerVision \
    --name your-resource-name \
    --resource-group yourResourceGroupName \
    --sku F0 \
    --location eastus \
    --custom-domain your-resource-name \
    --yes

The command output contains the endpoint you need. To list it again:

az cognitiveservices account list

4. Get the API keys

az cognitiveservices account keys list \
    --name your-resource-name \
    --resource-group yourResourceGroupName

Use either key1 or key2 (each is 32 hexadecimal characters).
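If you only want the two values that Blindo Vision asks for, the CLI's --query and --output flags can trim the JSON down to a single line each. The sketch below wraps that in two helper functions; the function names and the RESOURCE_NAME and RESOURCE_GROUP variables are illustrative and simply reuse the placeholder names from the steps above.

```shell
#!/bin/sh
# Placeholder names from the steps above; replace with your own.
RESOURCE_NAME="your-resource-name"
RESOURCE_GROUP="yourResourceGroupName"

# Print only the endpoint URL of the resource.
show_endpoint() {
    az cognitiveservices account show \
        --name "$RESOURCE_NAME" \
        --resource-group "$RESOURCE_GROUP" \
        --query properties.endpoint \
        --output tsv
}

# Print only key1 (key2 works the same way with --query key2).
show_key1() {
    az cognitiveservices account keys list \
        --name "$RESOURCE_NAME" \
        --resource-group "$RESOURCE_GROUP" \
        --query key1 \
        --output tsv
}

# Usage, after az login:
#   show_endpoint
#   show_key1
```

Keep in mind that show_key1 prints a secret to the terminal, so avoid running it in a shared session or a logged CI job.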

5. Paste them into Blindo Vision

Follow Configure Azure credentials.


Languages and screen reader compatibility

  • Captions: always generated in English. Azure AI Vision 4.0 only supports English for the caption feature; the language preference does not change this.
  • Tag and OCR languages selectable in the app: Arabic, Chinese (Simplified), Czech, Danish, Dutch, English, Finnish, French, German, Greek, Hebrew, Hindi, Hungarian, Indonesian, Italian, Japanese, Korean, Norwegian, Polish, Portuguese, Romanian, Russian, Spanish, Swedish, Thai, Turkish, Vietnamese, plus Auto-detect (sends no language hint to Azure). See Azure AI Vision language support for the authoritative matrix per feature.
  • Screen readers: Blindo Vision uses the standard Android accessibility focus, so it works with any screen reader that drives that focus. TalkBack is the baseline test target. Make sure the screen reader is enabled in Settings → Accessibility alongside Blindo Vision.

Pricing and quota

Blindo Vision itself is free. Cost is whatever Azure charges for the AI Vision resource you point it at — see the official Azure AI Vision pricing page.

Every Blindo Vision analysis is one HTTP call to the Image Analysis 4.0 endpoint, requesting caption, tags and read together. Azure bills Image Analysis 4.0 per feature, per image, so each successful analysis consumes roughly 3 transactions (one per requested feature). On the free F0 tier (5,000 transactions / month) that works out to roughly 1,600 analyses / month. Failed calls can still count toward quota; check your Azure portal for the exact remaining budget and re-read the pricing page before relying on this estimate — Microsoft changes the per-feature rules from time to time.
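The estimate above is plain integer division. As a quick sanity check, assuming three billed transactions per analysis and the 5,000-transaction F0 allowance quoted here (both figures come from this README, not from a live pricing lookup):

```shell
#!/bin/sh
# Rough monthly budget on the F0 tier.
monthly_quota=5000          # F0 transactions per month
features_per_analysis=3     # caption + tags + read
analyses_per_month=$(( monthly_quota / features_per_analysis ))
echo "$analyses_per_month"  # prints 1666
```

The README's "roughly 1,600" is this upper bound rounded down, which leaves some headroom for failed calls.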


Building from source

Toolchain prerequisites

Tool                     Version
JDK                      17 (source/target compatibility, Kotlin JVM target 17)
Android SDK platform     API 35
Android SDK build-tools  35.0.0
minSdk                   24 (Android 7.0)
Gradle                   8.7 (pinned via the wrapper, no manual install needed)
Android Gradle Plugin    8.5.1
Kotlin                   1.9.24

Android Studio Ladybug or newer covers all of the above through its SDK Manager. From the command line you can use sdkmanager "platforms;android-35" "build-tools;35.0.0".

Clone, build, install

git clone https://github.com/parisbs/blindo-vision-android.git
cd blindo-vision-android

# Debug APK
./gradlew assembleDebug
# → app/build/outputs/apk/debug/app-debug.apk

# Install on a connected device/emulator (API 24+)
adb install -r app/build/outputs/apk/debug/app-debug.apk

A release build (./gradlew assembleRelease) enables R8 minification but the repository does not ship a signing config — the produced APK is unsigned and must be signed with your own keystore before distribution.

From Android Studio: File → Open the repo root, let Gradle sync, then run the app configuration on a device or emulator with API 24+.

Running tests

# JVM unit tests (JUnit + Robolectric + MockK + MockWebServer)
./gradlew test

# Instrumented tests on a connected device/emulator (Espresso)
./gradlew connectedAndroidTest

Project structure

blindo-vision-android/
├── app/
│   ├── build.gradle              # :app module config (SDK, deps, build types)
│   └── src/main/
│       ├── AndroidManifest.xml   # Permissions + VisionService declaration
│       ├── java/com/blindo/vision/
│       │   ├── Vision.kt         # Application; wires the Koin modules
│       │   ├── services/
│       │   │   ├── VisionService.kt          # The AccessibilityService entry point
│       │   │   └── detector/                 # Screenshot detection per API level
│       │   │       ├── ScreenshotDetectorFactory.kt  # API >=29 vs 24-28 branching
│       │   │       ├── MediaStoreDetector.kt
│       │   │       ├── LegacyMediaStoreDetector.kt
│       │   │       ├── ScreenCaptureCallbackDetector.kt
│       │   │       └── UiElementCropper.kt
│       │   ├── usecases/         # AnalyzeImage (single unified Azure call)
│       │   ├── data/vision/azure/# Retrofit client + AzureAuthInterceptor
│       │   ├── models/vision/    # DTOs + VisionLanguage enum
│       │   ├── preferences/      # PreferencesManager + settings UI
│       │   ├── permissions/      # PermissionsActivity (runtime READ_MEDIA_IMAGES)
│       │   ├── results/          # ResultsActivity + ResultsViewModel
│       │   └── utils/            # Timber tree, helpers
│       └── res/xml/
│           ├── vision_service_config.xml     # Accessibility service capabilities
│           └── vision_preferences.xml        # Settings UI definition
├── libraries.gradle              # Central version catalog (Groovy)
├── build.gradle                  # Root build script (AGP 8.5.1, Kotlin 1.9.24)
├── settings.gradle
└── gradle/wrapper/               # Gradle 8.7 wrapper

Networking detail worth knowing: there is no compile-time base URL. Retrofit is built with a placeholder host, and AzureAuthInterceptor rewrites every request's URL using the endpoint stored in PreferencesManager and adds the Ocp-Apim-Subscription-Key header — that's why changing the endpoint in Preferences takes effect without an app restart.
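To see what such a request looks like outside the app, the sketch below builds an Image Analysis 4.0 URL from a user-supplied endpoint, much as the interceptor has to do at runtime. It is not taken from the app's source: the build_analyze_url name, the 2023-10-01 api-version and the exact query string are assumptions based on the public REST API, and the app may send different parameters.

```shell
#!/bin/sh
# Build the Image Analysis 4.0 URL from a user-supplied endpoint.
# Assumption: api-version 2023-10-01 with caption, tags and read requested together.
build_analyze_url() {
    # ${1%/} drops one trailing slash so the path is not doubled.
    printf '%s/computervision/imageanalysis:analyze?api-version=2023-10-01&features=caption,tags,read\n' "${1%/}"
}

# Example call (needs a real endpoint, a real key, and an image on disk):
# curl -sS -X POST "$(build_analyze_url "$AZURE_ENDPOINT")" \
#     -H "Ocp-Apim-Subscription-Key: $AZURE_KEY" \
#     -H "Content-Type: application/octet-stream" \
#     --data-binary @crop.png
```

Running the commented curl command against your own resource is also a convenient way to confirm that an endpoint and key pair works before typing them into Preferences.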


End-to-end verification

After installing your own build, run through this checklist to confirm the full pipeline works:

  1. Install the debug APK on a device or emulator running API 24+.
  2. Open Settings → Accessibility → Installed services → Blindo Vision → Settings. Verify the Preferences screen opens — this is the only Blindo Vision UI reachable without an accessibility event, and there is no app icon in the launcher (expected).
  3. Enter a valid Azure endpoint and 32-character key; pick a language for tags and OCR.
  4. Back in Settings → Accessibility, toggle Blindo Vision on and accept the confirmation dialog.
  5. Enable TalkBack (or another screen reader) in the same Accessibility screen.
  6. Open any other app, focus a UI element with TalkBack, take a screenshot (Volume Down + Power, or TalkBack's screenshot gesture).
  7. The Analysis Results activity should appear within a couple of seconds with non-empty Description, Text and Tags fields, all spoken by the screen reader.
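Whether the service is switched on can also be checked from a desktop with adb. This is a sketch rather than part of the project: the service_enabled name is made up here, and it assumes the installed build uses the com.blindo.vision application id without a debug suffix.

```shell
#!/bin/sh
# Succeeds (exit status 0) when an accessibility service from the
# com.blindo.vision package is listed as enabled on the connected device.
service_enabled() {
    # The setting is a colon-separated list of package/class components.
    adb shell settings get secure enabled_accessibility_services \
        | tr ':' '\n' \
        | grep -q '^com\.blindo\.vision/'
}

# Usage:
#   if service_enabled; then echo "enabled"; else echo "not enabled"; fi
```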

If any step fails, see Troubleshooting.


Troubleshooting

The Results screen never opens after a screenshot.

  • The accessibility service is off — recheck Settings → Accessibility → Blindo Vision.
  • No screen reader is enabled, so there is no accessibility focus to crop.
  • A screen-dimmer or overlay is active and obscures the focus.
  • The screenshot didn't reach MediaStore (some custom ROMs or screenshot tools bypass it). Use the system screenshot shortcut.
  • On API 33+ the app needs READ_MEDIA_IMAGES. If you denied the prompt in PermissionsActivity, grant it from App info → Permissions.

The Results screen opens but every field is empty or shows "Unable to analize element".

  • Endpoint URL is malformed. It must look like https://<your-resource>.cognitiveservices.azure.com/.
  • API key is not exactly 32 characters, or belongs to a different Azure resource than the endpoint.
  • No internet, or a captive-portal Wi-Fi blocks the request.
  • Azure free-tier quota for the month is exhausted. Check the Azure portal; failed calls can still count toward the quota.
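The first two causes can be ruled out before touching the device. The helpers below are a sketch that checks only the formats this README describes (a cognitiveservices.azure.com endpoint and a 32-character hexadecimal key); the function names are illustrative, and a value that passes may still be rejected by Azure.

```shell
#!/bin/sh
# Succeeds when the endpoint looks like https://<your-resource>.cognitiveservices.azure.com/
looks_like_endpoint() {
    printf '%s\n' "$1" | grep -Eq '^https://[a-z0-9-]+\.cognitiveservices\.azure\.com/?$'
}

# Succeeds when the key is exactly 32 hexadecimal characters.
looks_like_key() {
    printf '%s\n' "$1" | grep -Eq '^[0-9a-fA-F]{32}$'
}

# Usage:
#   looks_like_endpoint "$AZURE_ENDPOINT" || echo "endpoint looks malformed"
#   looks_like_key "$AZURE_KEY" || echo "key is not 32 hex characters"
```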

Build fails with "Failed to find Build Tools revision 35.0.0". Install it through Android Studio's SDK Manager or run sdkmanager "build-tools;35.0.0" "platforms;android-35".

./gradlew fails with a Java version error. The project requires JDK 17. Point JAVA_HOME at a JDK 17 install, or set the Gradle JDK in Android Studio → Settings → Build Tools → Gradle → Gradle JDK.


Security note

The Azure endpoint and API key are stored in plain-text Android SharedPreferences. Anyone with physical access to an unlocked device or a backup extract can read them. Do not load a personal Azure key onto a shared or unattended device. To report a vulnerability privately, follow SECURITY.md.


Important

Neither the owner nor the collaborators of Blindo Vision are responsible for any costs arising from the use of Microsoft Azure AI Vision services. We have no direct link to Microsoft Azure and receive no compensation of any kind for the use that users may make of those services. The privacy of images sent to Microsoft Azure AI Vision servers is the sole responsibility of Microsoft Azure. Screenshots stored locally are managed by the operating system, so their storage and privacy are outside the scope of our responsibility.


License, branding, distribution, contributing

License. Blindo Vision is distributed under the GNU Affero General Public License, version 3 only (AGPL-3.0-only). Anyone can use, study, modify and share the code, but any distributed or network-deployed derivative must also be released as source under AGPL-3.0-only. See LICENSE for the full text and NOTICE for the copyright statement.

Branding and trademark. The AGPL grants broad freedoms over the source code but does not grant rights over the "Blindo Vision" name, logo, or visual identity, which remain unregistered trademarks of the project owner. If you publish a modified version you must rebrand it: change the application name, change the Android applicationId from com.blindo.vision to a namespace you own, replace the launcher icon, and make clear that your fork is a derivative work and is not endorsed by the upstream project. Republishing this app on Google Play, the Amazon Appstore, F-Droid or any other channel under the "Blindo Vision" name or branding is not permitted. See NOTICE for the full policy.

Distribution. The only official channels are this GitHub repository (source code) and its GitHub Releases (signed APKs, when published), plus any future store listings published from the project owner's own developer account and announced here. Any other listing — particularly any paid listing on an app store — is not an official Blindo Vision release. Please open an issue to report unofficial listings.

Contributing. Contributions are welcome. There is no CLA: by submitting a contribution you license it to the project under AGPL-3.0-only on an inbound = outbound basis. See CONTRIBUTING.md for the full guide and CODE_OF_CONDUCT.md for community expectations.
