🎬 VAI0: Video Auto Intelligence Operator

🎧 Audio • 💬 Captions • 📝 SEO • 🌐 Translations • 🧠 Knowledge Base
End-to-end AI automation for video processing with contextual intelligence.

VAI0 (Video Auto Intelligence Operator) is an end-to-end CLI workflow that converts raw videos into multilingual, SEO-optimized YouTube assets: captions, titles, and descriptions. An optional knowledge base supplies domain-specific context to the generation stages.


✨ Features

| Stage | Description |
|---|---|
| 🎧 Audio Extraction | Extracts .mp3 from your video using FFmpeg |
| 💬 Caption Generation | Transcribes or translates audio to .srt via Whisper |
| 📝 TD Generation | Builds an SEO-optimized Title + Description (TD) using Ollama, with template support |
| 🌐 TD Translation | Localizes TDs into multiple target languages with cultural adaptation |
| 💬 Caption Translation | Produces synchronized .srt subtitles in all supported languages |
| 🧠 Knowledge Base | Enhances generation with domain-specific context (PDFs, docs, guides) |
| ⚙️ Auto Resume | Tracks progress in .vaio.json, enabling vaio continue |

🏗️ Architecture

VAI0 uses a modular operator model where each stage can run independently or in sequence:

VAI0/
├── config.yml
├── vaio/                # Core framework
│   ├── cli.py           # CLI controller
│   ├── core/            # Base utilities & stage implementations
│   └── kb/              # Knowledge Base integration
├── knowledge/           # Domain knowledge sources
│   └── default/         # Default reference materials
└── data/                # Persistent data
    └── kb/              # Vector store (ChromaDB)
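
As a rough illustration of this operator model, the sketch below runs a list of stages in order and skips any that are already recorded as complete, which is the general idea behind vaio continue. The stage names, the layout of the state dictionary, and both function names are assumptions made for this example; they are not taken from the VAI0 source.

```python
from typing import Callable, Dict, List

# Hypothetical stage names, listed in the pipeline order this README describes.
STAGES: List[str] = ["audio", "captions", "desc", "translate", "captions_translate"]


def remaining_stages(completed: List[str]) -> List[str]:
    """Return the stages that still need to run, in pipeline order."""
    done = set(completed)
    return [name for name in STAGES if name not in done]


def run_pipeline(
    handlers: Dict[str, Callable[[], None]],
    state: Dict[str, List[str]],
) -> Dict[str, List[str]]:
    """Run every unfinished stage in order, recording each one once it succeeds."""
    for name in remaining_stages(state.get("completed", [])):
        handlers[name]()  # a failing stage raises here and stops the run
        state.setdefault("completed", []).append(name)
    return state
```

Because a stage is only recorded after its handler returns, a crash partway through leaves the state pointing at the last stage that actually finished, so a later run picks up from there.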

⚑ Quick Start

# Clone and setup
git clone https://github.com/number16busshelter/vaio.git
cd vaio
python3 -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt

# Run full automation
vaio ./MyVideo.mp4

VAI0 automatically runs these stages in order:

🎧 Audio extraction → 💬 Captioning → 📝 TD generation → 🌐 Translation → 💬 Caption translation

All outputs are stored beside the video.


🧠 Knowledge Base Integration

VAI0 can enhance content generation with domain-specific knowledge:

Default Setup

# Knowledge sources go here
knowledge/default/
├── product-guides.pdf
├── brand-guidelines.md
├── technical-specs.txt
└── marketing-materials/

# Vector storage (auto-created)
data/kb/default/

Configuration

Set in your video's .vaio.json:

{
  "knowledge": "/path/to/your/knowledge",
  "language": "en",
  "title": "...",
  "description": "..."
}
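
A minimal sketch of reading and updating this file from Python, assuming the meta file sits next to the video and is named after it (MyVideo.mp4 gives MyVideo.vaio.json, as in the output structure shown in this README). The helper names are hypothetical, and VAI0's own loader may behave differently.

```python
import json
from pathlib import Path
from typing import Any, Dict


def meta_path(video: Path) -> Path:
    """Map MyVideo.mp4 to MyVideo.vaio.json in the same directory."""
    return video.with_name(video.stem + ".vaio.json")


def load_meta(video: Path) -> Dict[str, Any]:
    """Return the stored metadata, or an empty dict if no file exists yet."""
    path = meta_path(video)
    if not path.exists():
        return {}
    return json.loads(path.read_text(encoding="utf-8"))


def set_knowledge(video: Path, knowledge: str) -> Dict[str, Any]:
    """Update only the 'knowledge' key and leave every other key untouched."""
    meta = load_meta(video)
    meta["knowledge"] = knowledge
    meta_path(video).write_text(
        json.dumps(meta, indent=2, ensure_ascii=False), encoding="utf-8"
    )
    return meta
```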

KB Management

# Build knowledge base from documents
vaio kb build ./video.mp4

# Set custom knowledge directory
vaio kb set ./video.mp4 --knowledge ./my-docs

# Disable KB for a project
vaio kb set ./video.mp4 --knowledge none

# View KB statistics
vaio kb stats ./video.mp4

# List indexed documents
vaio kb list ./video.mp4

📝 Template-Driven Content Generation

Create a tdtmp.txt template file to structure the generated content:

<!-- <Instructions> -->
- Generate high-quality, SEO-optimized content
- Use professional tone
- Preserve all formatting outside semantic blocks
<!-- </Instructions> -->

<!-- <Context> -->
Your brand context and guidelines here
<!-- </Context> -->

<!-- <Video Name> -->
Suggested title inspiration
<!-- </Video Name> -->

<!-- <Video Description> -->
Style and tone guidelines for description
<!-- </Video Description> -->

———————————————————

🔗 Your permanent links
🏷️ Product specifications
✈️ Global delivery info

———————————————————

<!-- <Hash tags> -->
#Your #Hashtag #Inspiration
<!-- </Hash tags> -->

VAI0 will:

  • Interpret semantic blocks as guidelines
  • Generate fresh, optimized content
  • Preserve all verbatim formatting exactly
  • Optimize hashtags based on content
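
The split between semantic blocks and verbatim text can be sketched with a small parser. This is one possible reading of the template format above, not VAI0's actual parser: it treats every `<!-- <Name> --> ... <!-- </Name> -->` pair as a named block and everything else as text to keep as-is. BLOCK_RE, parse_template, and verbatim_parts are names chosen for this example.

```python
import re
from typing import Dict

BLOCK_RE = re.compile(
    r"<!--\s*<(?P<name>[^/<>][^<>]*)>\s*-->"  # opening marker, e.g. <!-- <Context> -->
    r"(?P<body>.*?)"                           # block content, matched non-greedily
    r"<!--\s*</(?P=name)>\s*-->",              # closing marker with the same name
    re.DOTALL,
)


def parse_template(text: str) -> Dict[str, str]:
    """Map each semantic block name to its trimmed content."""
    return {
        m.group("name").strip(): m.group("body").strip()
        for m in BLOCK_RE.finditer(text)
    }


def verbatim_parts(text: str) -> str:
    """Return what is left once all semantic blocks are removed."""
    return BLOCK_RE.sub("", text).strip()
```

On the template above, parse_template would return the five sections (Instructions, Context, Video Name, Video Description, Hash tags), while verbatim_parts would return the separator lines and the permanent links between them.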

🧰 System Requirements

| Dependency | Purpose | Installation |
|---|---|---|
| FFmpeg | Audio extraction | brew install ffmpeg or download |
| Whisper | Speech-to-text | pip install openai-whisper |
| Ollama | Local LLM runtime | Install Ollama |
| Python 3.12+ | Runtime | Python downloads |

Verify Installation

vaio check

Expected output:

FFmpeg: ✅ OK
Whisper: ✅ OK
Ollama: ✅ OK
Meta file access: ✅ OK
Knowledge Base: ✅ OK
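
If vaio check is not available yet (for example, before the package is installed), a similar check can be approximated by hand. The sketch below only covers the three external dependencies and assumes that the ffmpeg and ollama executables are on PATH and that openai-whisper is importable as whisper. The function names and the "missing" wording are hypothetical, not VAI0's output.

```python
import importlib.util
import shutil
from typing import Dict


def check_dependencies() -> Dict[str, bool]:
    """Report whether each external dependency can be found on this machine."""
    return {
        "FFmpeg": shutil.which("ffmpeg") is not None,
        "Whisper": importlib.util.find_spec("whisper") is not None,
        "Ollama": shutil.which("ollama") is not None,
    }


def format_report(status: Dict[str, bool]) -> str:
    """Render one line per dependency, similar in shape to the output above."""
    return "\n".join(
        f"{name}: {'✅ OK' if ok else '❌ missing'}" for name, ok in status.items()
    )
```

This only confirms that the tools can be located. It does not verify that the Ollama server is running or that a model has been pulled.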

🧭 Command Reference

Core Operations

| Command | Purpose |
|---|---|
| vaio <video> | Full automation pipeline |
| vaio audio <video> | Extract audio & generate captions |
| vaio desc <video> | Create SEO title + description |
| vaio translate <video> | Translate TDs into multiple languages |
| vaio captions <video> | Translate .srt subtitles |
| vaio continue <video> | Resume from last completed stage |
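
For the caption translation step, subtitles stay synchronized as long as only the text of each cue changes. The sketch below shows that idea under simple assumptions: cues are separated by blank lines, line 1 is the index, and line 2 is the timing. The translate argument is a stand-in for whatever model call does the real work, and translate_srt is a name made up for this example.

```python
import re
from typing import Callable, List


def translate_srt(srt_text: str, translate: Callable[[str], str]) -> str:
    """Translate cue text only; indices and timestamps pass through unchanged."""
    out_blocks: List[str] = []
    for block in re.split(r"\n\s*\n", srt_text.strip()):
        lines = block.splitlines()
        if len(lines) < 3 or "-->" not in lines[1]:
            out_blocks.append(block)  # leave malformed cues untouched
            continue
        index, timing = lines[0], lines[1]
        text = "\n".join(lines[2:])
        out_blocks.append("\n".join([index, timing, translate(text)]))
    return "\n\n".join(out_blocks) + "\n"
```

Passing str.upper as the translate function is a quick way to confirm that timings survive a round trip before wiring in a real translator.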

Knowledge Base Management

| Command | Purpose |
|---|---|
| vaio kb build <video> | Build/re-build KB index |
| vaio kb list <video> | List indexed documents |
| vaio kb stats <video> | Show KB statistics |
| vaio kb clear <video> | Clear KB index (keep files) |
| vaio kb set <video> --knowledge <path> | Set custom KB path |

📁 Output Structure

MyVideo.mp4
├── MyVideo.mp3
├── captions/
│   ├── MyVideo.en.srt
│   ├── MyVideo.es.srt
│   └── ...
├── description/
│   ├── td.en.txt
│   ├── td.es.txt
│   └── ...
├── knowledge/           # (if project-specific KB)
│   ├── product-info.pdf
│   └── brand-guidelines.md
└── MyVideo.vaio.json   # Progress tracking & config
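
When scripting around VAI0, it can help to compute where each file is expected to land. The sketch below derives the paths from the layout above, assuming that everything is written to the directory containing the video. expected_outputs and its dictionary keys are hypothetical names used only here.

```python
from pathlib import Path
from typing import Dict, Iterable


def expected_outputs(video: Path, languages: Iterable[str]) -> Dict[str, Path]:
    """Build the output paths implied by the layout above for one video."""
    base = video.parent  # assumption: outputs are written beside the video
    stem = video.stem
    outputs = {
        "audio": base / f"{stem}.mp3",
        "meta": base / f"{stem}.vaio.json",
    }
    for code in languages:
        outputs[f"captions.{code}"] = base / "captions" / f"{stem}.{code}.srt"
        outputs[f"td.{code}"] = base / "description" / f"td.{code}.txt"
    return outputs
```

A wrapper script could, for instance, check which of these paths exist to see how far a run got without parsing the meta file.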

⚙️ Configuration

Core Constants (vaio/core/constants.py)

SOURCE_LANGUAGE = "English"
SOURCE_LANGUAGE_CODE = "en"
TARGET_LANGUAGES = {
    "en": "English",
    "es": "Spanish",
    "fr": "French",
    "de": "German",
    "ja": "Japanese",
    "zh": "Chinese",
}
WHISPER_MODEL = "large-v3-turbo"
OLLAMA_MODEL = "llama3.1:8b"
DEFAULT_EMBED_MODEL = "sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2"

Supported Knowledge Formats

  • 📄 PDF, TXT, MD, JSON, YAML, CSV
  • 🚫 Auto-ignores: .DS_Store, .git, lock files, system files
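
A sketch of how such a filter might look, useful for previewing which files in a knowledge directory would be picked up. The exact rules are assumptions: the README does not define "lock files" or "system files", so this version treats names ending in .lock or containing "-lock." as lock files, accepts .yml alongside .yaml, and ignores anything under .git. None of the names below come from the VAI0 source.

```python
from pathlib import Path
from typing import List

# Assumed mapping from the formats listed above to file suffixes.
SUPPORTED_SUFFIXES = {".pdf", ".txt", ".md", ".json", ".yaml", ".yml", ".csv"}
IGNORED_NAMES = {".DS_Store"}
IGNORED_DIRS = {".git"}


def is_indexable(path: Path) -> bool:
    """Decide whether a path relative to the knowledge root should be indexed."""
    name = path.name
    if name in IGNORED_NAMES or name.endswith(".lock") or "-lock." in name:
        return False
    if any(part in IGNORED_DIRS for part in path.parts):
        return False
    return path.suffix.lower() in SUPPORTED_SUFFIXES


def collect_documents(root: Path) -> List[Path]:
    """Walk the knowledge directory and return indexable files in sorted order."""
    return sorted(
        p for p in root.rglob("*")
        if p.is_file() and is_indexable(p.relative_to(root))
    )
```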

🧩 Example Workflow

# 1. Setup knowledge base
cp -r my-product-docs/. knowledge/default/

# 2. Build KB index
vaio kb build ./product-video.mp4

# 3. Create template
cp tdtmp.example.txt product-video-tdtmp.txt
# Edit template with your brand guidelines...

# 4. Run enhanced generation
vaio desc ./product-video.mp4 --template-file product-video-tdtmp.txt

Output:

🧠 KB active: vaio_kb_default (15 documents)
📋 Using template: product-video-tdtmp.txt
🧱 Parsed template sections: Instructions, Context, Video Name, Video Description, Hash tags
🧠 Generating FRESH description content...
🧠 Optimizing hashtags...
✅ TD generated → description/td.en.txt

🐳 Docker Support (Optional)

FROM python:3.12-slim

# Install system dependencies
RUN apt-get update && apt-get install -y ffmpeg && rm -rf /var/lib/apt/lists/*

# Install VAI0
COPY . /app
WORKDIR /app
RUN pip install -r requirements.txt

ENTRYPOINT ["python", "vaio/cli.py"]

Build and run:

docker build -t vaio .
docker run -v "$(pwd)":/workspace vaio /workspace/MyVideo.mp4

🧑‍💻 Development

Project Structure

vaio/
├── core/
│   ├── audio.py          # Audio extraction
│   ├── description.py    # TD generation with templates
│   ├── translate.py      # Multilingual translation
│   ├── captions.py       # Subtitle processing
│   └── constants.py      # Configuration
├── kb/
│   ├── loader.py         # Document loading
│   ├── store.py          # Vector storage (Chroma)
│   ├── query.py          # Context retrieval
│   └── cli.py            # KB management commands
└── cli.py                # Main entry point

Running Tests

# Test individual stages
vaio audio ./test.mp4
vaio desc ./test.mp4 --template-file tdtmp.example.txt
vaio kb build ./test.mp4
vaio kb stats ./test.mp4

VS Code Integration

Create .vscode/launch.json:

{
  "version": "0.2.0",
  "configurations": [
    {
      "name": "Run VAI0",
      "type": "python",
      "request": "launch",
      "program": "vaio/cli.py",
      "args": ["./test.mp4"],
      "console": "integratedTerminal"
    }
  ]
}

📄 License

MIT License © 2025 AXID.ONE


🤝 Contributing

We welcome contributions! Please see our Contributing Guidelines and check the issue tracker before submitting pull requests.

