Mosaic is a desktop application designed to assist people with memory loss. It identifies visitors via webcam and displays an AI-generated briefing about who they are and what you last talked about — so users always have context before a conversation begins. Originally built for HackMerced XI (Devpost).
- To install, open the Releases section and download the build for your OS from the latest release.
When a known face appears on webcam, Mosaic shows a personalized card with the visitor's name and a summary of past conversations. Audio is transcribed in real time during the session. When recording stops, the transcript is summarized and stored, updating the briefing for the next visit.
Unknown visitors can be enrolled by name during their first appearance. Over time, the system builds a longitudinal memory of each relationship.
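The enroll, record, and brief cycle described above can be sketched in a few lines of Go. This is a minimal illustration only, not Mosaic's implementation: the `Visitor` and `Memory` types, the face ID strings, and the in-memory map are all hypothetical stand-ins for the real services and database.

```go
package main

import (
	"fmt"
	"strings"
)

// Visitor is a hypothetical record for one enrolled person.
type Visitor struct {
	Name      string
	Summaries []string // one summary per past conversation, oldest first
}

// Memory is a hypothetical in-memory stand-in for the real datastore.
type Memory struct {
	visitors map[string]*Visitor // keyed by an assumed face ID
}

// NewMemory returns an empty store.
func NewMemory() *Memory {
	return &Memory{visitors: make(map[string]*Visitor)}
}

// Enroll registers an unknown face under a name on first appearance.
// Enrolling an already known face is a no-op.
func (m *Memory) Enroll(faceID, name string) {
	if _, ok := m.visitors[faceID]; !ok {
		m.visitors[faceID] = &Visitor{Name: name}
	}
}

// EndSession stores the summary produced when recording stops.
func (m *Memory) EndSession(faceID, summary string) error {
	v, ok := m.visitors[faceID]
	if !ok {
		return fmt.Errorf("unknown face %q", faceID)
	}
	v.Summaries = append(v.Summaries, summary)
	return nil
}

// Briefing returns the card text for a known face.
// The second result is false when the face has not been enrolled.
func (m *Memory) Briefing(faceID string) (string, bool) {
	v, ok := m.visitors[faceID]
	if !ok {
		return "", false
	}
	if len(v.Summaries) == 0 {
		return v.Name + ": no previous conversations recorded.", true
	}
	return v.Name + ": " + strings.Join(v.Summaries, " "), true
}
```

In the real application the summary would come from the transcription and briefing services rather than being passed in as a plain string.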
flowchart LR
classDef client fill:none,stroke:#4A90D9,stroke-width:2px,color:#4A90D9
classDef service fill:none,stroke:#5BA85A,stroke-width:2px,color:#5BA85A
classDef datastore fill:none,stroke:#8B6BB1,stroke-width:2px,color:#8B6BB1
classDef legend fill:none,stroke:none,color:#888,font-size:12px
Frontend["Frontend<br/>React + Tauri"]:::client
Client["Client<br/>Go"]:::client
FD["Face Detection Service<br/>Go — dlib, face_recognition"]:::service
AT["Audio Transcription Service<br/>Python — Whisper"]:::service
CB["Conversation Briefing Service<br/>Go — Ollama + RAG"]:::service
DB[("PostgreSQL")]:::datastore
Valkey[("Valkey")]:::datastore
Frontend --> Client
Client --> FD
Client --> AT
Client --> CB
FD <--> Valkey
FD <--> DB
AT <--> DB
CB <--> DB
linkStyle 0 stroke:#E8A838,stroke-width:2px
linkStyle 1,2,3 stroke:#00897B,stroke-width:2px
Green arrows — gRPC | Orange arrows — WebSocket
| Service | Language | Description |
|---|---|---|
| frontend | React / Tauri | Captures webcam frames and audio, displays face overlays and briefing cards, streams to client via WebSocket |
| client | Go | Receives webcam and audio streams from the frontend, forwards to backend services via gRPC |
| face_detection | Go | Identifies faces using dlib embeddings, registers new visitors |
| audio_transcription | Python | Transcribes audio in real time using Whisper, persists transcripts |
| conversation_briefing | Go | Generates visitor briefings using a self-hosted Qwen model and RAG over conversation history |
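To make "identifies faces using dlib embeddings" concrete, here is a minimal sketch of nearest-neighbour matching over embedding vectors. It assumes the common dlib convention of comparing face descriptors by Euclidean distance against a threshold (0.6 is the value usually quoted for dlib's model). The `Known` type and `Identify` function are hypothetical names, and the actual face_detection service may work differently.

```go
package main

import "math"

// Known pairs an enrolled visitor's name with a stored embedding.
// Hypothetical type for illustration only.
type Known struct {
	Name      string
	Embedding []float64
}

// distance returns the Euclidean distance between two equal-length vectors.
func distance(a, b []float64) float64 {
	var sum float64
	for i := range a {
		d := a[i] - b[i]
		sum += d * d
	}
	return math.Sqrt(sum)
}

// Identify returns the name of the closest known embedding, provided its
// distance to the probe is within threshold. Entries whose length differs
// from the probe are skipped. The second result is false when nobody matches.
func Identify(probe []float64, known []Known, threshold float64) (string, bool) {
	best := math.Inf(1)
	name := ""
	for _, k := range known {
		if len(k.Embedding) != len(probe) {
			continue
		}
		if d := distance(probe, k.Embedding); d < best {
			best = d
			name = k.Name
		}
	}
	if best > threshold {
		return "", false
	}
	return name, true
}
```

A probe that matches nobody is the case where the unknown visitor would be offered enrollment.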
- Transport: gRPC with Protocol Buffers, WebSockets
- Backend services: Go 1.24
- ML services: Python 3.12 (Whisper), Go (dlib / face_recognition)
- LLM: Qwen via Ollama (self-hosted)
- Vector search: pgvector with similarity search
- Databases: PostgreSQL, Valkey
- Desktop shell: Tauri
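As an illustration of what the vector search step contributes to RAG, the sketch below ranks stored conversation snippets by cosine similarity to a query embedding and keeps the top `k`. In Mosaic this ranking is done by pgvector inside PostgreSQL. The in-memory version here, including the `Snippet` type and `TopK` function, is a hypothetical stand-in meant only to show the idea.

```go
package main

import (
	"math"
	"sort"
)

// Snippet is a hypothetical piece of conversation history with its embedding.
type Snippet struct {
	Text      string
	Embedding []float64
}

// cosine returns the cosine similarity of a and b.
// It returns 0 for mismatched lengths or zero-length vectors.
func cosine(a, b []float64) float64 {
	if len(a) != len(b) {
		return 0
	}
	var dot, na, nb float64
	for i := range a {
		dot += a[i] * b[i]
		na += a[i] * a[i]
		nb += b[i] * b[i]
	}
	if na == 0 || nb == 0 {
		return 0
	}
	return dot / (math.Sqrt(na) * math.Sqrt(nb))
}

// TopK returns the k snippets most similar to query, best first.
// The input slice is left unmodified.
func TopK(query []float64, snippets []Snippet, k int) []Snippet {
	ranked := make([]Snippet, len(snippets))
	copy(ranked, snippets)
	sort.SliceStable(ranked, func(i, j int) bool {
		return cosine(query, ranked[i].Embedding) > cosine(query, ranked[j].Embedding)
	})
	if k < 0 {
		k = 0
	}
	if k > len(ranked) {
		k = len(ranked)
	}
	return ranked[:k]
}
```

The selected snippets would then be placed in the LLM prompt as context for the briefing.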
Each service has its own build tooling. From the repo root:
Audio Transcription (Python)
cd backend/audio_transcription
uv sync --locked --all-extras --dev
make test_all
Client / Conversation Briefing / Face Detection (Go)
cd backend/<service>
go mod download
make test_all
Frontend
cd frontend
npm install
npm run dev
Proto definitions live in backend/proto/. Re-generate bindings with:
cd backend/proto
make generate