SceneIntruderMCP is no longer just a lightweight scene-analysis demo. The current product is a unified creative workspace built around one Go backend and one React SPA, with four connected modules:
- Interactive scenes — analyze text into scene data, characters, items, context, and branching story flows.
- Comics Studio — a 5-step workflow for analysis, prompts, key elements, references, image generation, and export.
- Video Studio — build timeline data from comics results, generate clip assets asynchronously, inspect recovery state, and export bundles.
- New Script — create writing projects, generate initial drafts, revise chapters, and export manuscripts.
The system also centralizes LLM / Vision / Video provider configuration, long-running task tracking via SSE, and file-based persistence under data/.
- Go + Gin server
- SPA hosting from the same binary
- Unified config in `data/config.json`
- Encrypted API key storage via `data/.encryption_key` or `CONFIG_ENCRYPTION_KEY`
- SSE progress endpoint: `GET /api/progress/:taskID`
- Plain WebSocket endpoints for scene/user realtime channels
- File-based storage for scenes, stories, comics, scripts, exports, and users
- `/` — scenes home
- `/settings` — LLM, Vision, Video settings
- `/scenes/:id` — scene detail
- `/scenes/:id/story` — story mode
- `/scenes/:id/comic` — Comics Studio
- `/scenes/:id/comic/video` — Video Studio
- `/scripts/`
- `/scripts/:id` — script workspace
The backend currently registers these providers:
`openai`, `anthropic`, `google`, `deepseek`, `qwen`, `mistral`, `grok`, `glm`, `githubmodels`, `openrouter`, `nvidia`
Notes:
- Reasoning / thinking mode is now default-off across the LLM layer for structured analysis safety.
- Provider-specific default suppression is applied where supported, including Google, Qwen, and NVIDIA.
- The frontend Settings UI still contains an `ollama` option, but it is not part of the current backend-supported provider matrix and is therefore not documented here as an officially available path.
`placeholder`, `sdwebui`, `dashscope`, `gemini`, `ark`, `openai`, `glm`
Default model and model catalog are delivered through GET /api/settings via vision_default_model, vision_models, and vision_model_providers.
`dashscope`, `kling`, `google`, `vertex`, `ark`, `mock`
Default video model routing is also exposed through GET /api/settings via video_default_model, video_models, and video_model_providers.
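As a sketch of how a client might consume the model-routing fields from `GET /api/settings` (the struct shape and the model → provider direction of the `*_model_providers` maps are assumptions for illustration, not the documented schema):

```go
package main

import (
	"encoding/json"
	"fmt"
)

// settings mirrors only the model-routing fields listed above; the real
// GET /api/settings payload contains more.
type settings struct {
	VisionDefaultModel   string            `json:"vision_default_model"`
	VisionModels         []string          `json:"vision_models"`
	VisionModelProviders map[string]string `json:"vision_model_providers"`
	VideoDefaultModel    string            `json:"video_default_model"`
	VideoModels          []string          `json:"video_models"`
	VideoModelProviders  map[string]string `json:"video_model_providers"`
}

func parseSettings(raw []byte) (settings, error) {
	var s settings
	err := json.Unmarshal(raw, &s)
	return s, err
}

func main() {
	// In a real client this body comes from GET /api/settings; a literal
	// keeps the sketch runnable without a live server.
	raw := []byte(`{
		"vision_default_model": "glm-image",
		"vision_model_providers": {"glm-image": "glm"},
		"video_default_model": "wan2.6-i2v-flash",
		"video_model_providers": {"wan2.6-i2v-flash": "dashscope"}
	}`)
	s, err := parseSettings(raw)
	if err != nil {
		panic(err)
	}
	fmt.Println(s.VisionDefaultModel, "->", s.VisionModelProviders[s.VisionDefaultModel])
	fmt.Println(s.VideoDefaultModel, "->", s.VideoModelProviders[s.VideoDefaultModel])
}
```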
```
frontend (React + Vite + MUI)
  │
  ├─ REST (/api/*)
  ├─ SSE (/api/progress/:taskID)
  └─ WS (/ws/*)
  │
backend (Go + Gin)
  │
  ├─ config / auth / rate limit / API handlers
  ├─ LLM / Vision / Video / Story / Script / Comic services
  └─ file storage under data/
```
Core source directories:
- `cmd/server` — server entry
- `internal/api` — router, handlers, middleware
- `internal/app` — application bootstrapping and provider registration
- `internal/config` — runtime config and encryption handling
- `internal/llm` — provider abstraction and reasoning control
- `internal/services` — business logic
- `internal/vision` — vision providers
- `frontend` — SPA client
- Go 1.21+
- Node.js 18+
- npm 9+
- At least one usable provider credential
```sh
go mod download
cd frontend
npm install
cd ..
```

```sh
cd frontend
npm run build
cd ..
```

```sh
go run ./cmd/server
```

Default address: `http://localhost:8080`
- Home: `http://localhost:8080/`
- Settings: `http://localhost:8080/settings`
The runtime config is persisted to data/config.json.
Important top-level fields:
- `llm_provider`, `llm_config`
- `vision_provider`, `vision_default_model`, `vision_config`
- `video_provider`, `video_default_model`, `video_config`
- `vision_models`, `vision_model_providers`
- `video_models`, `video_model_providers`
Minimal example:
```json
{
  "port": "8080",
  "data_dir": "data",
  "static_dir": "frontend/dist/assets",
  "templates_dir": "frontend/dist",
  "log_dir": "logs",
  "debug_mode": true,
  "llm_provider": "nvidia",
  "llm_config": {
    "api_key": "",
    "base_url": "https://integrate.api.nvidia.com/v1",
    "default_model": "moonshotai/kimi-k2.5"
  },
  "vision_provider": "glm",
  "vision_default_model": "glm-image",
  "vision_config": {
    "endpoint": "https://open.bigmodel.cn/api/paas/v4",
    "api_key": ""
  },
  "video_provider": "dashscope",
  "video_default_model": "wan2.6-i2v-flash",
  "video_config": {
    "endpoint": "https://dashscope.aliyuncs.com/api/v1",
    "api_key": "",
    "public_base_url": "https://your-domain.example"
  }
}
```

- If `CONFIG_ENCRYPTION_KEY` is absent in development mode, the app creates `data/.encryption_key` automatically.
- Keep `data/.encryption_key` together with `data/config.json`.
- Deleting the key file invalidates previously encrypted credentials.
- Build frontend assets.
- Start backend.
- Open Settings and configure one working LLM provider.
- Optionally configure Vision and Video providers.
- Create a scene or a standalone comic workspace.
Analysis, prompt generation, image generation, script generation, and video generation are asynchronous. The usual pattern is:
- Start a job and receive a `task_id`
- Subscribe to `GET /api/progress/:taskID`
- Fetch the final result from the corresponding `GET` endpoint
- Many scene-oriented routes degrade to `console_user` when auth is missing or invalid.
- User-scoped routes under `/api/users/:user_id/...` require authenticated ownership.
- Scripts routes require authenticated access.
Some video providers need a publicly reachable reference image URL. In practice, that means video_config.public_base_url should usually be configured for deployed environments.
Backend:

```sh
go test ./...
go run ./cmd/server
```

Frontend:

```sh
cd frontend
npm run dev
npm test
npm run lint
npm run build
```

This repository already contains significantly more than an initial prototype. The maintained documentation now treats it as:
- a multi-workspace creative platform,
- with provider-configurable AI services,
- job-based asynchronous generation,
- and documented operational deployment requirements.
That is the baseline future changes should preserve.
