
SceneIntruderMCP

SceneIntruderMCP Logo

AI-native storytelling workspace for scenes, comics, scripts, and video


English | 简体中文

What this project is now

SceneIntruderMCP is no longer just a lightweight scene-analysis demo. The current product is a unified creative workspace built around one Go backend and one React SPA, with four connected modules:

  1. Interactive scenes — analyze text into scene data, characters, items, context, and branching story flows.
  2. Comics Studio — a 5-step workflow for analysis, prompts, key elements, references, image generation, and export.
  3. Video Studio — build timeline data from comics results, generate clip assets asynchronously, inspect recovery state, and export bundles.
  4. New Script — create writing projects, generate initial drafts, revise chapters, and export manuscripts.

The system also centralizes LLM / Vision / Video provider configuration, long-running task tracking via SSE, and file-based persistence under data/.

Current capability map

Backend and runtime

  • Go + Gin server
  • SPA hosting from the same binary
  • Unified config in data/config.json
  • Encrypted API key storage via data/.encryption_key or CONFIG_ENCRYPTION_KEY
  • SSE progress endpoint: GET /api/progress/:taskID
  • Plain WebSocket endpoints for scene/user realtime channels
  • File-based storage for scenes, stories, comics, scripts, exports, and users

Frontend workspaces

  • / — scenes home
  • /settings — LLM, Vision, Video settings
  • /scenes/:id — scene detail
  • /scenes/:id/story — story mode
  • /scenes/:id/comic — Comics Studio
  • /scenes/:id/comic/video — Video Studio
  • /scripts / /scripts/:id — script workspace

LLM providers officially wired in backend

The backend currently registers these providers:

  • openai
  • anthropic
  • google
  • deepseek
  • qwen
  • mistral
  • grok
  • glm
  • githubmodels
  • openrouter
  • nvidia

Notes:

  • Reasoning / thinking mode is disabled by default across the LLM layer so that structured analysis output stays reliable.
  • Where a provider enables reasoning by default, that default is suppressed; this currently covers Google, Qwen, and NVIDIA.
  • The Settings UI still offers an ollama option, but it is not part of the backend-supported provider matrix and is therefore not documented here as an officially available path.

Vision providers

  • placeholder
  • sdwebui
  • dashscope
  • gemini
  • ark
  • openai
  • glm

Default model and model catalog are delivered through GET /api/settings via vision_default_model, vision_models, and vision_model_providers.

Video providers

  • dashscope
  • kling
  • google
  • vertex
  • ark
  • mock

Default video model routing is also exposed through GET /api/settings via video_default_model, video_models, and video_model_providers.

Architecture overview

frontend (React + Vite + MUI)
    │
    ├─ REST (/api/*)
    ├─ SSE  (/api/progress/:taskID)
    └─ WS   (/ws/*)
    │
backend (Go + Gin)
    │
    ├─ config / auth / rate limit / API handlers
    ├─ LLM / Vision / Video / Story / Script / Comic services
    └─ file storage under data/

Core source directories:

  • cmd/server — server entry
  • internal/api — router, handlers, middleware
  • internal/app — application bootstrapping and provider registration
  • internal/config — runtime config and encryption handling
  • internal/llm — provider abstraction and reasoning control
  • internal/services — business logic
  • internal/vision — vision providers
  • frontend — SPA client

Quick start

Prerequisites

  • Go 1.21+
  • Node.js 18+
  • npm 9+
  • At least one usable provider credential

1. Install dependencies

go mod download
cd frontend
npm install
cd ..

2. Build frontend assets

cd frontend
npm run build
cd ..

3. Start the server

go run ./cmd/server

Default address: http://localhost:8080

4. Open the app

  • Home: http://localhost:8080/
  • Settings: http://localhost:8080/settings

Configuration model

The runtime config is persisted to data/config.json.

Important top-level fields:

  • llm_provider, llm_config
  • vision_provider, vision_default_model, vision_config
  • video_provider, video_default_model, video_config
  • vision_models, vision_model_providers
  • video_models, video_model_providers

Minimal example:

{
    "port": "8080",
    "data_dir": "data",
    "static_dir": "frontend/dist/assets",
    "templates_dir": "frontend/dist",
    "log_dir": "logs",
    "debug_mode": true,
    "llm_provider": "nvidia",
    "llm_config": {
        "api_key": "",
        "base_url": "https://integrate.api.nvidia.com/v1",
        "default_model": "moonshotai/kimi-k2.5"
    },
    "vision_provider": "glm",
    "vision_default_model": "glm-image",
    "vision_config": {
        "endpoint": "https://open.bigmodel.cn/api/paas/v4",
        "api_key": ""
    },
    "video_provider": "dashscope",
    "video_default_model": "wan2.6-i2v-flash",
    "video_config": {
        "endpoint": "https://dashscope.aliyuncs.com/api/v1",
        "api_key": "",
        "public_base_url": "https://your-domain.example"
    }
}

Encryption notes

  • If CONFIG_ENCRYPTION_KEY is absent in development mode, the app creates data/.encryption_key automatically.
  • Keep data/.encryption_key together with data/config.json.
  • Deleting the key file invalidates previously encrypted credentials.

Recommended first-run path

  1. Build frontend assets.
  2. Start backend.
  3. Open Settings and configure one working LLM provider.
  4. Optionally configure Vision and Video providers.
  5. Create a scene or a standalone comic workspace.

Key operational behaviors

Long-running jobs

Analysis, prompt generation, image generation, script generation, and video generation are asynchronous. The usual pattern is:

  1. Start a job and receive task_id
  2. Subscribe to GET /api/progress/:taskID
  3. Fetch final result from the corresponding GET endpoint
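The pattern above can be sketched as a minimal Go client. Only GET /api/progress/:taskID comes from this README; the SSE payload shape and the surrounding flow are illustrative assumptions:

```go
package main

import (
	"bufio"
	"fmt"
	"net/http"
	"strings"
)

// parseSSEData extracts the payload from a single SSE "data:" line,
// returning false for comments, event names, and blank keep-alive lines.
func parseSSEData(line string) (string, bool) {
	if strings.HasPrefix(line, "data:") {
		return strings.TrimSpace(strings.TrimPrefix(line, "data:")), true
	}
	return "", false
}

// followProgress streams progress events for a task until the server
// closes the stream. The endpoint shape is the one documented above.
func followProgress(baseURL, taskID string) error {
	resp, err := http.Get(baseURL + "/api/progress/" + taskID)
	if err != nil {
		return err
	}
	defer resp.Body.Close()

	sc := bufio.NewScanner(resp.Body)
	for sc.Scan() {
		if payload, ok := parseSSEData(sc.Text()); ok {
			fmt.Println("progress event:", payload)
		}
	}
	return sc.Err()
}

func main() {
	// Step 1 (not shown): start a job and read task_id from its response.
	// Step 2: subscribe; a connection error is expected if no server runs.
	if err := followProgress("http://localhost:8080", "example-task"); err != nil {
		fmt.Println("progress stream unavailable:", err)
	}
	// Step 3 (not shown): fetch the final result from the matching GET endpoint.
}
```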

Guest vs authenticated usage

  • Many scene-oriented routes degrade to console_user when auth is missing or invalid.
  • User-scoped routes under /api/users/:user_id/... require authenticated ownership.
  • Scripts routes require authenticated access.
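As a sketch of calling a user-scoped route, the snippet below attaches a token when building the request. The Authorization: Bearer scheme and the /profile sub-path are assumptions for illustration; consult the auth middleware in internal/api for the actual mechanism:

```go
package main

import (
	"fmt"
	"net/http"
)

// newUserRequest builds a request for a user-scoped route.
// Bearer auth and the "/profile" sub-path are illustrative assumptions.
func newUserRequest(baseURL, userID, token string) (*http.Request, error) {
	req, err := http.NewRequest(http.MethodGet,
		baseURL+"/api/users/"+userID+"/profile", nil)
	if err != nil {
		return nil, err
	}
	if token != "" {
		req.Header.Set("Authorization", "Bearer "+token)
	}
	return req, nil
}

func main() {
	req, err := newUserRequest("http://localhost:8080", "u123", "example-token")
	if err != nil {
		panic(err)
	}
	// Without a running server this fails with a connection error,
	// which is handled rather than fatal.
	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		fmt.Println("request failed:", err)
		return
	}
	defer resp.Body.Close()
	fmt.Println("status:", resp.Status)
}
```

Omitting the token reproduces the guest path: scene-oriented routes fall back to console_user, while user-scoped and scripts routes reject the request.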

Video provider note

Some video providers need a publicly reachable reference image URL. In practice, that means video_config.public_base_url should usually be configured for deployed environments.

Development commands

Backend:

go test ./...
go run ./cmd/server

Frontend:

cd frontend
npm run dev
npm test
npm run lint
npm run build

Documentation index

Current scope boundary

This repository already contains significantly more than an initial prototype. The maintained documentation now treats it as:

  • a multi-workspace creative platform,
  • provider-configurable AI services,
  • job-based asynchronous generation,
  • and documented operational deployment requirements.

That is the baseline future changes should preserve.