🔍 OmniSense — Temporal Video Search

Find exactly when something was said in a video. No more scrubbing.

Paste a YouTube URL or upload a video → it gets transcribed → search in plain English → click a result → video jumps directly to that moment.

✨ What it does

You remember a speaker mentioned something specific in a long video, but can't find where. OmniSense lets you search the video like a document and jumps you straight to the right moment.

Paste YouTube URL  (or upload a video file)
    ↓
Search: "transformer architecture"
    ↓
#1  [12:34 → 13:02]  94% match
    "...the transformer architecture introduced in Attention is All You Need..."
#2  [38:17 → 38:51]  71% match
    "...unlike RNNs, transformers process all tokens in parallel..."
    ↓
Click #1 → video seeks to 12:34 and plays ▶

🆕 What's new

YouTube URL support — paste any YouTube link directly; no download needed before use
Auto video seek — clicking a result jumps the embedded player to the exact timestamp automatically
Dual input modes — YouTube URL takes priority; file upload is the fallback

🏗 Architecture

YouTube URL  ──► yt-dlp download ──┐
                                   ├──► extract_audio()       ffmpeg → 16kHz mono WAV
File upload  ──────────────────────┘         │
                                             ▼
                                       transcribe()            faster-whisper (int8, CPU)
                                             │                 → List[TranscriptChunk(text, start, end)]
                                             ▼
                                   TranscriptSearchIndex       MiniLM encode → FAISS IndexFlatIP
                                             │
                                             ▼
                                       .search(query)          cosine similarity → ranked SearchHits
                                             │
                                             ▼
                                         Gradio UI             result cards + auto seek-to-timestamp

🚀 Quickstart (local)

git clone https://github.com/cksajil/omnisense.git
cd omnisense
git checkout google-colab

python -m venv .venv
source .venv/bin/activate          # Windows: .venv\Scripts\activate

pip install -r requirements.txt

# ffmpeg must be installed on your system:
# Ubuntu/Debian: sudo apt install ffmpeg
# macOS:         brew install ffmpeg
# Windows:       https://ffmpeg.org/download.html

python -m omnisense.app
# Open http://localhost:7860

☁️ Run on Google Colab (recommended for testing)

No local GPU or heavy setup needed. Run the full app on a free T4 GPU:

# Cell 1 — system dependencies
!apt-get install -y ffmpeg

# Cell 2 — clone and install
!git clone https://github.com/cksajil/omnisense.git
%cd omnisense
!git checkout google-colab
!pip install -r requirements.txt

# Cell 3 — launch (generates a public Gradio URL valid for 72 hrs)
import subprocess
subprocess.Popen(["python", "-m", "omnisense.app"])

Gradio will print a public URL like https://xxxx.gradio.live — open it in your browser.

📁 Project structure

omnisense/
├── omnisense/
│   ├── pipelines/
│   │   ├── audio.py          # faster-whisper transcription → TranscriptChunk
│   │   └── search.py         # MiniLM + FAISS → SearchHit
│   ├── utils/
│   │   ├── download.py       # yt-dlp YouTube downloader
│   │   └── logger.py         # loguru setup
│   └── app.py                # Gradio UI entry point
├── tests/
│   └── test_search.py
├── requirements.txt           # CPU-only, HF Spaces compatible
└── README.md

⚙️ Model guide

Model	Speed on CPU (per hr of audio)	Accuracy
tiny	~2–4 min	Rough — good for quick demos
base	~5–8 min	Good — recommended default ✓
small	~10–15 min	Better
medium	~25–35 min	Best CPU-feasible accuracy

On Google Colab with a T4 GPU, all models run significantly faster than the CPU estimates above.

🤗 Deploy to HuggingFace Spaces

# Add your HF Space as a git remote
git remote add space https://huggingface.co/spaces/YOUR_HF_USERNAME/omnisense

# Push google-colab branch to the Space's main
git push space google-colab:main

Or: HuggingFace → New Space → Gradio → link GitHub repo → select the google-colab branch.

Note: YouTube URL downloading may not work on HF Spaces free tier due to outbound network restrictions. File upload works without any issues.

🧪 Running tests

pytest tests/

Name		Name	Last commit message	Last commit date
Latest commit History 32 Commits
.gradio		.gradio
.vscode		.vscode
assets		assets
logs		logs
notebooks		notebooks
omnisense		omnisense
tests		tests
.gitattributes		.gitattributes
.gitignore		.gitignore
.pre-commit-config.yaml		.pre-commit-config.yaml
.python-version		.python-version
LICENSE		LICENSE
README.md		README.md
pyproject.toml		pyproject.toml
requirements.txt		requirements.txt

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

🔍 OmniSense — Temporal Video Search

✨ What it does

🆕 What's new

🏗 Architecture

🚀 Quickstart (local)

☁️ Run on Google Colab (recommended for testing)

📁 Project structure

⚙️ Model guide

🤗 Deploy to HuggingFace Spaces

🧪 Running tests

📄 License

About

Uh oh!

Releases

Packages

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

🔍 OmniSense — Temporal Video Search

✨ What it does

🆕 What's new

🏗 Architecture

🚀 Quickstart (local)

☁️ Run on Google Colab (recommended for testing)

📁 Project structure

⚙️ Model guide

🤗 Deploy to HuggingFace Spaces

🧪 Running tests

📄 License

About

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages