Changes from all commits (206 commits)
ee0c008
Fix Cloud Run deployment: bypass DDP init, use /tmp for writes
claude Feb 7, 2026
25aabad
Fix model loading: use renamed .pth weight file and wav2vec2 config.json
claude Feb 7, 2026
9286246
Add gourmet-sp frontend with lip sync fixes
claude Feb 8, 2026
14f48e5
Add concierge.zip avatar (copy of p2-1.zip sample for testing)
claude Feb 8, 2026
c8af922
Fix avatar path: use concierge.zip instead of p2-1.zip
claude Feb 8, 2026
5772bae
Fix lip sync artifacts and reduce TTS-to-speech latency
claude Feb 8, 2026
448d8c4
Fix TTS playback blocked by expression API timeout
claude Feb 8, 2026
db3e205
Fix: make expression generation fully non-blocking (fire-and-forget)
claude Feb 8, 2026
8276013
Integrate audio2exp into backend TTS endpoint, remove frontend direct…
claude Feb 8, 2026
0194d02
Proxy architecture: TTS returns audio only, expression via fire-and-f…
claude Feb 8, 2026
678b30a
Fix lip sync: clear buffer per segment + pre-fetch expression for rem…
claude Feb 8, 2026
7ea0364
Expression in TTS response: zero-delay lip sync from frame 0
claude Feb 8, 2026
7fe6a06
Async expression: background thread + polling (TTS returns instantly)
claude Feb 8, 2026
96b0a4b
Sync expression + ack/LLM parallel: zero-delay lip sync with ~700ms f…
claude Feb 8, 2026
4ea78d1
Fix: skip unlockAudioParams when ack is playing (prevents ttsPlayer i…
claude Feb 8, 2026
fb7d439
Fix 2nd interaction TTS: resolve pendingAckPromise on pause + stopCur…
claude Feb 8, 2026
d9a11ed
Enlarge avatar display area for 512x512 avatar optimization
claude Feb 8, 2026
eb36e5b
Adjust camera to fill avatar head in canvas (reduce empty space above)
claude Feb 8, 2026
85bd7b9
Fix camera: lower lookAt target to show full face (head to chin)
claude Feb 8, 2026
8752cb8
Camera tweak: lower avatar 10% in frame + natural slight top-down angle
claude Feb 8, 2026
5272100
Add files via upload
mirai-gpro Feb 11, 2026
e3dc7c4
Add concierge_modal.py: Modal + Gradio PoC for concierge.zip generation
claude Feb 11, 2026
958d19a
feat: add custom motion video support with VHAP tracking
claude Feb 11, 2026
99d8345
fix: download FLAME models during container build instead of local copy
claude Feb 12, 2026
225c86b
refactor: use local model files instead of HuggingFace download
claude Feb 12, 2026
2c4588f
fix: move path setup to runtime to fix Modal add_local_dir ordering
claude Feb 12, 2026
7f10e1a
fix: symlink entire model_zoo -> assets for unified layout
claude Feb 12, 2026
94115cc
fix: download LAM-20K weights and GLB templates during image build
claude Feb 12, 2026
e6838bf
fix: assets overwritten by mount + gradio schema TypeError
claude Feb 12, 2026
7fc5dd0
fix: use official LAM-assets repo + patch gradio schema bug
claude Feb 12, 2026
6ed1ca9
fix: download sample_oac separately from Alibaba OSS
claude Feb 12, 2026
929d77c
fix: allow concurrent inputs to keep Gradio on single container
claude Feb 12, 2026
769a254
fix: suppress clang narrowing error when building nvdiffrast
claude Feb 12, 2026
8f77df0
fix: patch nvdiffrast source for JIT narrowing error
claude Feb 12, 2026
0c74277
fix: use sed to patch nvdiffrast ops.py for JIT narrowing flag
claude Feb 12, 2026
7cb005a
fix: runtime monkey-patch torch cpp_extension for nvdiffrast build
claude Feb 12, 2026
7301880
fix: suppress torch._dynamo errors to fall back to eager mode
claude Feb 12, 2026
3f2cb1a
fix: install FBX SDK Python bindings in Modal image
claude Feb 12, 2026
4bebb91
fix: use @modal.concurrent and fix FBX wheel install
claude Feb 12, 2026
91d6b23
feat: add /download-zip direct download endpoint
claude Feb 12, 2026
1b795f9
fix: pass motion_choice to pipeline, add tracked face preview
claude Feb 12, 2026
c9c6025
fix: robust Blender GLB pipeline with per-request temp files & valida…
claude Feb 12, 2026
fc9f9fa
fix: clean stale FLAME data + add diagnostics for bird monster debugging
claude Feb 12, 2026
7ac6602
fix: download missing FLAME tracking & parametric models (bird monste…
claude Feb 12, 2026
96d16ff
fix: use absolute script paths for Blender subprocess + capture stdou…
claude Feb 12, 2026
7a89328
fix: inline Blender subprocess calls to bypass Modal module resolution
claude Feb 12, 2026
857958c
fix: remove deprecated Blender 4.2 glTF export params (export_colors)
claude Feb 12, 2026
19b3a5b
fix: stable ZIP path for /download-zip + robust download fallback
claude Feb 13, 2026
825f0dc
fix: Gradio file serving — allowed_paths + stable output dir
claude Feb 13, 2026
d7a43ce
fix: Gradio file serving 404 in Modal ASGI — add FastAPI fallback
claude Feb 13, 2026
923c9cb
debug: add diagnostic info to /download-zip error response
claude Feb 13, 2026
ce12f72
chore: switch GPU from A10G to T4 to reduce costs
claude Feb 13, 2026
60e5026
chore: switch GPU from T4 to L4 for better cost-performance
claude Feb 13, 2026
a3416b3
fix: add allowed_paths to mount_gradio_app to fix ZIP download
claude Feb 13, 2026
f7cc25f
fix: re-encode preview video for browser playback compatibility
claude Feb 13, 2026
6edfe6a
fix: revert GPU to A10G — L4 caused generation pipeline failure
claude Feb 13, 2026
3e91bbc
fix: restore temp paths for gr.File/gr.Video to fix download & preview
claude Feb 13, 2026
7e9df5d
feat: persist ZIP to Modal Volume + lightweight CPU download server
claude Feb 13, 2026
a4e2433
refactor: split GPU generation from CPU UI — GPU L4, idle auto-stop
claude Feb 13, 2026
af8e521
fix: Modal 1.0 compat + download server missing fastapi
claude Feb 13, 2026
c5136e7
fix: add heartbeat to prevent SSE timeout during long VHAP tracking
claude Feb 14, 2026
9c7ac70
fix: replace streaming with fire-and-forget + Volume polling
claude Feb 14, 2026
f38a184
fix: CUDA arch 8.6→8.9 for L4 GPU + add mesh/GPU diagnostics
claude Feb 14, 2026
5808fcb
fix: reorder offset.ply vertices to match GLB vertex order (bird mons…
claude Feb 14, 2026
2885a1f
revert: restore TORCH_CUDA_ARCH_LIST to 8.6 to reuse cached image
claude Feb 14, 2026
3386303
improve error reporting: show full traceback in Gradio UI
claude Feb 14, 2026
789318b
fix: add explicit directory entry to concierge.zip
claude Feb 14, 2026
1f05bd7
Add files via upload
mirai-gpro Feb 14, 2026
912bfe4
fix: increase timeouts to 2 hours for 300-frame VHAP tracking
claude Feb 14, 2026
66c692b
fix: disable vertex reorder, add ZIP/PLY diagnostics
claude Feb 14, 2026
4acb053
Add files via upload
mirai-gpro Feb 14, 2026
9af2be8
fix: strip materials/textures from GLB to match working 3.6MB size
claude Feb 14, 2026
27c3f42
Add files via upload
mirai-gpro Feb 15, 2026
a5535bc
Add files via upload
mirai-gpro Feb 15, 2026
eb67cf1
fix: disable normals/texcoords/morph-normals in GLB export to match ~…
claude Feb 15, 2026
6cf64b5
Delete concierge_now.zip
mirai-gpro Feb 15, 2026
280367d
Add files via upload
mirai-gpro Feb 15, 2026
f3d9460
fix: inline convertFBX2GLB.py script to bypass Modal image cache
claude Feb 15, 2026
b7a24b6
fix: use lightweight image for Gradio UI to stop credit drain
claude Feb 15, 2026
f52d4b5
feat: add CLI mode (modal run) with auto-download and auto-stop
claude Feb 15, 2026
a248686
fix: generate vertex_order.json from same Blender mesh as GLB export
claude Feb 15, 2026
b8352fe
fix: move ui_image/dl_image definitions before web() to fix NameError
claude Feb 15, 2026
ed94bb0
fix: move dl_image/ui_image to top-level (L292) to guarantee definiti…
claude Feb 15, 2026
813b086
fix: resolve absolute paths and improve file-not-found messages in CLI
claude Feb 15, 2026
5a6168b
fix: modal run without args launches Gradio UI instead of requiring l…
claude Feb 15, 2026
5db89eb
fix: remove transform_apply() that destroys skinning on armature-boun…
claude Feb 15, 2026
35abf87
Delete concierge_now.zip
mirai-gpro Feb 15, 2026
6190abf
Add files via upload
mirai-gpro Feb 15, 2026
bd8ca68
fix: disable torch.compile (dynamo) + add Gaussian quality diagnostics
claude Feb 15, 2026
d5a62e2
fix: format string error in Gaussian quality diagnostics
claude Feb 15, 2026
bd54b56
Replace manual weight loading with load_state_dict + missing-key dete…
claude Feb 15, 2026
90828ed
Add encoder feature sanity check (NaN/Inf/stats) before inference
claude Feb 15, 2026
c38edbc
Add checkpoint file size verification against HuggingFace reference
claude Feb 15, 2026
7c39ea0
Separate FLAME buffer keys from critical missing keys in diagnostics
claude Feb 15, 2026
13453e8
Add --smoke-test: minimal GPU inference test bypassing our pipeline
claude Feb 15, 2026
b8bf7b1
Fix bird monster: add xformers + upgrade PyTorch to match official LA…
claude Feb 15, 2026
9c14d1f
Add PyTorch/xformers version to DIAGNOSTICS output
claude Feb 15, 2026
5a1e8bd
Add HF Spaces Docker deployment (Modal-free alternative)
claude Feb 15, 2026
17afde2
Update concierge_modal.py
mirai-gpro Feb 16, 2026
ff0cd19
Finalize fixes in concierge_modal.py for GLB export
mirai-gpro Feb 16, 2026
3d0e991
Replace inline Blender script with official generate_glb pipeline
claude Feb 16, 2026
73aa5ae
Fix stale output: mount all local source dirs + cache-bust + volume c…
claude Feb 16, 2026
33718c5
Redesign Gradio UI: professional theme, progress tracking, error hand…
claude Feb 16, 2026
8becf54
Finalize concierge_modal.py with fixes and optimizations
mirai-gpro Feb 16, 2026
385a4e5
Add files via upload
mirai-gpro Feb 26, 2026
e190f5a
Add batch ZIP model generator and fix Modal build environment
claude Feb 26, 2026
4d936c7
Add comprehensive technical document for LAM ZIP model generation
claude Feb 26, 2026
e0475ca
Add troubleshooting section for modal deploy errors (2026-02-26)
claude Feb 26, 2026
a1847df
Fix lam_avatar_batch.py: add result download, stale data cleanup
claude Feb 26, 2026
3a9b1cf
Fix bird-monster artifact: neutralize @torch.compile on DINOv2 encoder
claude Feb 26, 2026
fcb9ed2
Fix bird-monster: stop overwriting vertex_order.json with sequential …
claude Feb 26, 2026
e6cf9ea
Fix bird-monster v3: remove @torch.compile at build time + comprehens…
claude Feb 26, 2026
3478768
Add files via upload
mirai-gpro Feb 26, 2026
3f720aa
Fix bird-monster v4: remove @torch.compile from local lam/ source files
claude Feb 26, 2026
7bf52e0
Migrate CUDA 11.8 → 12.1: update all GPU dependencies
claude Feb 26, 2026
90e3387
Migrate concierge_modal.py Modal Image to CUDA 12.1
claude Feb 27, 2026
dc9b2d8
Fix 3 CUDA 12.1 migration bugs found in compatibility audit
claude Feb 27, 2026
1b7a1ce
Fix diff-gaussian-rasterization build: add --recursive to git clone
claude Feb 27, 2026
d1748aa
Fix chumpy NumPy 1.24+ compatibility: patch removed numpy.bool/int/fl…
claude Feb 27, 2026
8936fc8
Fix chumpy patch: use find_spec to locate file without triggering bro…
claude Feb 27, 2026
5d2eeb9
fix: delete chumpy __pycache__ after sed patch to prevent stale bytecode
claude Feb 27, 2026
14203b3
fix: patch cpu_nms.pyx to replace deprecated np.int with np.intp
claude Feb 27, 2026
9755e46
test: switch to CUDA 11.8 + official LAM versions to verify CUDA comp…
claude Feb 27, 2026
54b4e3a
fix: add explicit float32 dtype conversion to LAM model before inference
claude Feb 27, 2026
f931d69
revert: restore CUDA 12.1 environment (undo unauthorized 11.8 downgrade)
claude Feb 27, 2026
5525533
fix: correct Usage to input.jpg and remove redundant _setup_model_pat…
claude Feb 27, 2026
af0d0ea
Add files via upload
mirai-gpro Feb 27, 2026
0f3b587
Align Modal pipeline with official ModelScope app.py to fix bird-monster
claude Feb 28, 2026
4f7bda0
Align lam/ model code with working ModelScope demo
claude Feb 28, 2026
8d99994
Add ModelScope migration handoff document
claude Feb 28, 2026
8b1d853
fix: auto-detect image extension (png/jpg/jpeg) in --image-path
claude Feb 28, 2026
27a590c
fix: add --recursive to git clone for diff-gaussian-rasterization
claude Feb 28, 2026
0d48b76
fix: add symlink for pretrained_models/human_model_files path
claude Feb 28, 2026
91f73aa
fix: patch HuggingFace config.json human_model_path before model load
claude Feb 28, 2026
0a06821
fix: align 2 remaining differences with official ModelScope app.py
claude Feb 28, 2026
391e477
revert: restore concierge_modal.py and lam_avatar_batch.py to last wo…
claude Feb 28, 2026
da39749
refactor: align lam_avatar_batch.py with official ModelScope app.py p…
claude Feb 28, 2026
fb0a314
Change default motion to GEM and add --motion-name CLI option
claude Feb 28, 2026
ee2e428
Fix diff-gaussian-rasterization build: add --recursive to git clone
claude Feb 28, 2026
fb784b7
revert: restore concierge_modal.py and lam_avatar_batch.py to pre-ses…
claude Feb 28, 2026
b46b443
docs: add modification plan for aligning Modal files with official Mo…
claude Feb 28, 2026
11a1c7e
docs: update modification plan v2 with app_concierge.py comparison
claude Feb 28, 2026
bd1ed9d
fix: align Modal pipeline with official app.py (Choice A - Gemini dir…
claude Feb 28, 2026
872bbc3
fix: correct OAC generation order comment in lam_avatar_batch.py
claude Feb 28, 2026
c3b3330
docs: add session handoff document for next Claude session
claude Feb 28, 2026
d355db9
docs: update handoff - root cause was ignoring official app.py reference
claude Feb 28, 2026
c4f8fab
feat: use lam-storage volume for model weights instead of image bake-in
claude Mar 1, 2026
99300df
feat: remove app_lam.py dependency, use official ModelScope app.py ap…
claude Mar 1, 2026
762303b
fix: add --recursive to diff-gaussian-rasterization clone for GLM sub…
claude Mar 1, 2026
28d288f
fix: explicitly clone GLM for diff-gaussian-rasterization build
claude Mar 1, 2026
54a12f1
fix: bridge pretrained_models/human_model_files for from_pretrained c…
claude Mar 1, 2026
6b9a602
fix: add pretrained_models to volume symlinks + diagnostic for flame2…
claude Mar 1, 2026
e79febf
add diagnostic script to inspect volume directory structure
claude Mar 1, 2026
042b1b9
add setup_volume.py: download all LAM model files to Modal volume
claude Mar 1, 2026
a171a63
add sample_oac.tar download from Alibaba OSS for template_file.fbx
claude Mar 1, 2026
5ca1dd4
fix: use urllib instead of wget for OAC download
claude Mar 1, 2026
ae7afa7
fix: add robust pretrained_models/human_model_files symlink bridge
claude Mar 1, 2026
9a57592
fix: remap human_model_path in hf_hub.py for Modal volume layout
claude Mar 1, 2026
cb18c9f
fix: remove unsupported max_squen_length arg from prepare_motion_seqs…
claude Mar 1, 2026
a5cc433
fix: use shutil.make_archive instead of zip CLI for OAC ZIP creation
claude Mar 1, 2026
1cf4c5f
revert lam/ to upstream and switch to official _build_model weight lo…
claude Mar 1, 2026
da308d5
Add Gemini instruction doc for faithful ModelScope app.py -> Modal mi…
claude Mar 1, 2026
d5d2fa2
Rename output file from concierge_modal.py to app_modal.py in instruc…
claude Mar 1, 2026
a9bffb4
Add files via upload
mirai-gpro Mar 1, 2026
d50f7b3
Add local_entrypoint for batch processing via modal run
claude Mar 1, 2026
63f5d74
Remove broken Gradio web endpoint (batch processing only for now)
claude Mar 1, 2026
c751059
Fix bird-monster: never restore torch.compile after model loading
claude Mar 1, 2026
f2913c1
Switch lam_avatar_batch.py from concierge_modal to app_modal
claude Mar 1, 2026
a447188
Revert app_modal.py to original uploaded version
claude Mar 1, 2026
968c660
Force full Modal image rebuild by adding .force_build() to imported i…
claude Mar 1, 2026
0a61938
Fix ImportError: define STORAGE_VOL_PATH locally instead of importing…
claude Mar 1, 2026
c972197
Remove .force_build() - use CLI flag --force-build instead
claude Mar 1, 2026
7fb3cb4
Fix FBX SDK install: use Volume whl instead of unreachable Alibaba CDN
claude Mar 1, 2026
583fdfd
Fix typo: pixlwise.py → pixelwise.py in sed command
claude Mar 1, 2026
d149197
Fix bird-monster: add lam.eval() and match original model loading
claude Mar 1, 2026
c058322
Revert app_modal.py to original uploaded version (undo d149197)
claude Mar 1, 2026
e2fc53f
Force full rebuild: add .force_build() to skip Modal cache
claude Mar 1, 2026
b287b8e
Remove .force_build() - restore app_modal.py to exact original
claude Mar 1, 2026
a594322
Add .force_build() in lam_avatar_batch.py to skip Modal cache
claude Mar 1, 2026
4494b3d
Revert .force_build() - use env var instead
claude Mar 1, 2026
c40c8ee
fix: correct typo pixlwise.py → pixelwise.py in torch.compile disable…
claude Mar 1, 2026
600c8cc
revert: restore app_modal.py to previous state (do not touch)
claude Mar 1, 2026
5c90bf7
feat: add Colab notebook for LAM inference without Modal
claude Mar 2, 2026
d10fbcd
fix: correct enlarge_ratio typo [1.0, 1,0] → [1.0, 1.0] in Colab note…
claude Mar 2, 2026
c77553e
fix: match enlarge_ratio with LAM_Large_Avatar_Model/app.py
claude Mar 2, 2026
6c4f522
feat: add Google Drive wheel cache for CUDA builds
claude Mar 2, 2026
9d1deb0
fix: replace GitHub clone with ModelScope source in Colab notebook
claude Mar 2, 2026
d6ebf72
fix: use mirai-gpro/LAM_gpro lam-large-upload branch instead of Model…
claude Mar 2, 2026
1fce68d
fix: remove numpy==1.26.4 downgrade to avoid binary incompatibility o…
claude Mar 2, 2026
04fc745
fix: resolve numpy binary incompatibility (dtype size changed) on Colab
claude Mar 2, 2026
5eca87f
fix: wrap torch._dynamo import in try/except for Colab PyTorch versio…
claude Mar 2, 2026
cc50064
fix: make CUDA wheel cache torch-version-aware to prevent ABI mismatch
claude Mar 2, 2026
f6cf5f7
fix: force-reinstall scipy alongside numpy to prevent RecursionError
claude Mar 2, 2026
f7da580
fix: harden numpy/scipy version pinning across all install cells
claude Mar 2, 2026
b6af04e
fix: force-reinstall numpy/scipy AFTER onnxruntime-gpu install
claude Mar 2, 2026
64cbae4
fix: subprocess-based numpy verification + xformers CUDA auto-detect
claude Mar 2, 2026
7b2656a
fix: defer numpy import until after all pip installs to avoid restart
claude Mar 2, 2026
33ea4c3
fix: add sys.path guard to cells importing from lam package
claude Mar 2, 2026
a4e335b
style: add [section.number] labels to all code cells for Colab naviga…
claude Mar 2, 2026
c3378d3
fix: add dependency check in [4.4] for missing cfg variable
claude Mar 2, 2026
87fe3d7
fix: patch chumpy inspect.getargspec for Python 3.12 compatibility
claude Mar 2, 2026
334aafb
Add files via upload
mirai-gpro Mar 2, 2026
0b1e348
refactor: consolidate setup cells [0.1]-[2.3] into single [Setup] cell
claude Mar 2, 2026
998ce21
fix: auto-restart kernel after [Setup] to resolve numpy version mismatch
claude Mar 2, 2026
c48f89b
fix: patch sys.argv before FlameTrackingSingleImage to avoid argparse…
claude Mar 2, 2026
2d14692
fix: patch torch.load for PyTorch 2.6 compatibility (weights_only def…
claude Mar 2, 2026
199 changes: 199 additions & 0 deletions ANALYSIS_REQUEST.md
@@ -0,0 +1,199 @@
# LAM_Audio2Expression Analysis & Implementation Request

## Background

After more than 48 hours and 40+ attempts to deploy the Audio2Expression service to Google Cloud Run, the model still stays in "mock" mode and never initializes correctly. Repeated symptomatic fixes have not resolved it, so the approach needs a fundamental rethink.

## Lessons from the Previous AI

**Important**: the previous AI (Claude) suffered from the following problems:

1. **Relying on inference from a stale knowledge base**
   - It applied generic "Cloud Run deployment" patterns
   - It did not understand the design assumptions specific to LAM_Audio2Expression

2. **Surface-level code comprehension**
   - It read the code but not why it was designed that way
   - It never considered which environment and use case the code was originally written for

3. **Repeated symptomatic fixes**
   - An endless loop of: find an error in the logs, patch it, deploy, hit the next error
   - It kept fixing visible symptoms without identifying the root cause

4. **Unexamined assumptions**
   - It had decided that "model loading or initialization is failing"
   - The real problem may not be there at all, but a more fundamental error of approach

**When performing this analysis, be careful not to fall into the traps above.**

## Code Under Analysis

### Key Files

**1. audio2exp-service/app.py** (current service implementation)
- Web service built on FastAPI
- Endpoints: `/health`, `/debug`, `/api/audio2expression`, `/ws/{session_id}`
- Model lifecycle managed by the `Audio2ExpressionEngine` class

**2. LAM_Audio2Expression/engines/infer.py**
- `InferBase` class: base class for model construction
- `Audio2ExpressionInfer` class: audio-to-expression inference
- `infer_streaming_audio()`: real-time streaming inference

**3. LAM_Audio2Expression/models/network.py**
- `Audio2Expression` class: the PyTorch neural network
- Architecture: wav2vec2 encoder + identity encoder + decoder

**4. LAM_Audio2Expression/engines/defaults.py**
- `default_config_parser()`: config file loading
- `default_setup()`: derives batch size and related settings
- `create_ddp_model()`: distributed data parallel wrapper

## Specific Questions

### Q1: Root cause of the incomplete model initialization

```python
# Initialization in app.py
self.infer = INFER.build(dict(type=cfg.infer.type, cfg=cfg))
self.infer.model.eval()
```

Identify why this step never completes in the Cloud Run environment.

Candidate causes:
- [ ] Out of memory (is 8 GiB not enough?)
- [ ] Limitations of CPU-only execution
- [ ] Distributed-processing setup misbehaving on a single instance
- [ ] Filesystem write permissions
- [ ] Timeout (cold-start duration)
- [ ] Something else
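
Before choosing among these candidates, it may help to instrument the startup path so the failure mode becomes observable. The sketch below is a hypothetical helper (`timed_init` is not part of the repo) that wraps any initialization step, for example `INFER.build(...)`, and reports wall time and peak RSS; an OOM kill, a cold-start timeout, and a plain exception each leave a different signature.

```python
import resource
import time
import traceback

def timed_init(name, fn, *args, **kwargs):
    """Run one init step; report duration and peak memory so failures are attributable."""
    t0 = time.monotonic()
    result = None
    try:
        result = fn(*args, **kwargs)
    except Exception:
        traceback.print_exc()
    elapsed = time.monotonic() - t0
    # ru_maxrss is reported in KiB on Linux; convert to MiB
    peak_mb = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss / 1024
    print(f"[init] {name}: {elapsed:.1f}s, peak RSS ~{peak_mb:.0f} MiB, ok={result is not None}")
    return result, elapsed, peak_mb
```

Logged from the FastAPI startup hook, these numbers would show whether the build step approaches the 8 GiB limit or simply outlives the startup probe window.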

### Q2: Problems in default_setup()

```python
# defaults.py
def default_setup(cfg):
    world_size = comm.get_world_size()  # 1 on Cloud Run
    cfg.num_worker = cfg.num_worker if cfg.num_worker is not None else mp.cpu_count()
    cfg.num_worker_per_gpu = cfg.num_worker // world_size
    assert cfg.batch_size % world_size == 0  # could this fail?
```

Check whether this setup causes problems at inference time.
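
To check that arithmetic in isolation, here is a minimal re-implementation of the excerpt (a sketch mirroring the names above, not the project's code). With `world_size == 1` the divisibility assert cannot fail, which suggests looking instead at whether `comm.get_world_size()` itself misbehaves when no process group was ever initialized.

```python
import multiprocessing as mp

def default_setup_sketch(batch_size, num_worker=None, world_size=1):
    """Mirror of default_setup() under Cloud Run's single-process world."""
    num_worker = num_worker if num_worker is not None else mp.cpu_count()
    # With world_size == 1 this assert is always true, so it cannot be the blocker.
    assert batch_size % world_size == 0
    return {
        "num_worker_per_gpu": num_worker // world_size,
        "batch_size_per_gpu": batch_size // world_size,
    }
```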

### Q3: Logger setup

```python
# infer.py
self.logger = get_root_logger(
    log_file=os.path.join(cfg.save_path, "infer.log"),
    file_mode="a" if cfg.resume else "w",
)
```

Check whether creating this log file can fail on Cloud Run's filesystem.
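
On Cloud Run Gen 2 only `/tmp` is reliably writable, so a `cfg.save_path` pointing anywhere else makes this call raise before the model ever loads. A minimal sketch of a guard (the helper name `writable_save_path` is invented here) that could run before the logger is constructed:

```python
import os
import tempfile

def writable_save_path(preferred):
    """Return preferred if it is writable, else fall back to a fresh /tmp directory."""
    try:
        os.makedirs(preferred, exist_ok=True)
        probe = os.path.join(preferred, ".write_probe")
        with open(probe, "w"):
            pass
        os.remove(probe)
        return preferred
    except OSError:
        return tempfile.mkdtemp(prefix="infer_logs_")
```

Usage would be a one-liner before logger construction, e.g. `cfg.save_path = writable_save_path(cfg.save_path)`.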

### Q4: wav2vec2 model loading

```python
# network.py
if os.path.exists(pretrained_encoder_path):
    self.audio_encoder = Wav2Vec2Model.from_pretrained(pretrained_encoder_path)
else:
    config = Wav2Vec2Config.from_pretrained(wav2vec2_config_path)
    self.audio_encoder = Wav2Vec2Model(config)  # random weights!
```

- Is the wav2vec2-base-960h folder laid out correctly?
- Are any files missing that would require a download from HuggingFace?
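
Both questions reduce to whether the folder actually contains weights, because the `else` branch above silently builds a randomly initialized encoder, which would look exactly like a stuck "mock" mode. A startup guard along these lines could fail fast instead (hypothetical helper; the filenames are the standard HuggingFace ones and are an assumption about this checkpoint):

```python
import os

# Standard HuggingFace filenames (assumed): at least one must hold the weights.
WEIGHT_FILES = ("pytorch_model.bin", "model.safetensors")

def wav2vec2_dir_ok(path):
    """True only if the folder has a config and at least one weight file."""
    if not os.path.isfile(os.path.join(path, "config.json")):
        return False
    return any(os.path.isfile(os.path.join(path, w)) for w in WEIGHT_FILES)
```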

### Q5: Appropriate deployment target

If Cloud Run is unsuitable, consider these alternatives:
- Google Compute Engine (GPU instance)
- Cloud Run Jobs (batch processing)
- Vertex AI Endpoints
- Kubernetes Engine

## Expected Deliverables

### 1. Analysis
- Identification of the root cause
- An explanation of why 40+ attempts failed to resolve it

### 2. Corrected code
```
audio2exp-service/
├── app.py           # fixed version
├── Dockerfile       # fix if needed
└── cloudbuild.yaml  # fix if needed
```

### 3. Verification steps
```bash
# Health check
curl https://<service-url>/health
# Expected response: {"model_initialized": true, "mode": "inference", ...}

# Inference test
curl -X POST https://<service-url>/api/audio2expression \
  -H "Content-Type: application/json" \
  -d '{"audio_base64": "...", "session_id": "test"}'
```
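
The same health check can be scripted for CI or a deploy gate. A sketch that assumes only the response shape shown above:

```python
import json

def is_ready(health_body: str) -> bool:
    """True once /health reports a real (non-mock) initialized model."""
    try:
        data = json.loads(health_body)
    except ValueError:
        return False
    return data.get("model_initialized") is True and data.get("mode") == "inference"
```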

## Technical Specs

### Model
| Item | Value |
|------|-------|
| Input sample rate | 24 kHz (API) / 16 kHz (internal) |
| Output frame rate | 30 fps |
| Output dimensions | 52 (ARKit blendshapes) |
| Model file size | ~500 MB (LAM) + ~400 MB (wav2vec2) |
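
These numbers imply a simple shape contract that actual outputs can be checked against. A sketch (the function is illustrative; the frame count depends only on duration, so the internal 24 kHz to 16 kHz resampling does not change it):

```python
def expected_output_shape(num_samples, sample_rate=24000, fps=30, dims=52):
    """Expected (frames, blendshapes) shape for a clip of API-rate audio."""
    duration_s = num_samples / sample_rate
    return (int(duration_s * fps), dims)
```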

### Deployment environment
| Item | Value |
|------|-------|
| Platform | Cloud Run Gen 2 |
| Region | asia-northeast1 |
| Memory | 8 GiB |
| CPU | 4 |
| max-instances | 4 |

### Dependencies (requirements.txt)
```
torch==2.0.1
torchaudio==2.0.2
transformers==4.30.2
librosa==0.10.0
fastapi==0.100.0
uvicorn==0.23.0
numpy==1.24.3
scipy==1.11.1
pydantic==2.0.3
```

## File Locations

```bash
# Project root
cd /home/user/LAM_gpro

# Main service
cat audio2exp-service/app.py

# Inference engine
cat audio2exp-service/LAM_Audio2Expression/engines/infer.py

# Neural network
cat audio2exp-service/LAM_Audio2Expression/models/network.py

# Config
cat audio2exp-service/LAM_Audio2Expression/engines/defaults.py
cat audio2exp-service/LAM_Audio2Expression/configs/lam_audio2exp_config_streaming.py
```

---

Thank you in advance for taking this on.
172 changes: 172 additions & 0 deletions Dockerfile
@@ -0,0 +1,172 @@
# ============================================================
# Dockerfile for HF Spaces Docker SDK (GPU)
# ============================================================
# Reproduces the exact environment from concierge_modal.py's
# Modal Image definition, but as a standard Dockerfile.
#
# Build: docker build -t lam-concierge .
# Run: docker run --gpus all -p 7860:7860 lam-concierge
# HF: Push to a HF Space with SDK=Docker, Hardware=GPU
# ============================================================

FROM nvidia/cuda:12.1.0-devel-ubuntu22.04

ENV DEBIAN_FRONTEND=noninteractive
ENV PYTHONUNBUFFERED=1

# System packages
RUN apt-get update && apt-get install -y --no-install-recommends \
    python3.10 python3.10-dev python3.10-venv python3-pip \
    git wget curl ffmpeg tree \
    libgl1-mesa-glx libglib2.0-0 libusb-1.0-0 \
    build-essential ninja-build clang llvm libclang-dev \
    xz-utils libxi6 libxxf86vm1 libxfixes3 \
    libxrender1 libxkbcommon0 libsm6 \
    && rm -rf /var/lib/apt/lists/*

# Make python3.10 the default
RUN update-alternatives --install /usr/bin/python python /usr/bin/python3.10 1 && \
    update-alternatives --install /usr/bin/python3 python3 /usr/bin/python3.10 1

# Upgrade pip
RUN python -m pip install --upgrade pip setuptools wheel

# numpy first (pinned for compatibility — must stay <2.0 for PyTorch 2.4 + mediapipe)
RUN pip install 'numpy==1.26.4'

# ============================================================
# PyTorch 2.4.0 + CUDA 12.1
# ============================================================
RUN pip install torch==2.4.0 torchvision==0.19.0 torchaudio==2.4.0 \
    --index-url https://download.pytorch.org/whl/cu121

# ============================================================
# xformers — CRITICAL for DINOv2 MemEffAttention
# Without it, model produces garbage output ("bird monster").
# ============================================================
RUN pip install xformers==0.0.27.post2 \
    --index-url https://download.pytorch.org/whl/cu121

# CUDA build environment
ENV FORCE_CUDA=1
ENV CUDA_HOME=/usr/local/cuda
ENV MAX_JOBS=4
ENV TORCH_CUDA_ARCH_LIST="7.0;7.5;8.0;8.6;8.9;9.0"
ENV CC=clang
ENV CXX=clang++

# CUDA extensions (require no-build-isolation)
RUN pip install chumpy==0.70 --no-build-isolation

# pytorch3d — build from source (C++17 required for CUDA 12.1)
ENV CXXFLAGS="-std=c++17"
RUN pip install git+https://github.com/facebookresearch/pytorch3d.git --no-build-isolation

# diff-gaussian-rasterization — patch CUDA 12.1 header issues then build
RUN git clone --recursive https://github.com/ashawkey/diff-gaussian-rasterization.git /tmp/dgr && \
    find /tmp/dgr -name '*.cu' -exec sed -i '1i #include <cfloat>' {} + && \
    find /tmp/dgr -name '*.h' -path '*/cuda_rasterizer/*' -exec sed -i '1i #include <cstdint>' {} + && \
    pip install /tmp/dgr --no-build-isolation && \
    rm -rf /tmp/dgr

# simple-knn — patch cfloat for CUDA 12.1 then build
RUN git clone https://github.com/camenduru/simple-knn.git /tmp/simple-knn && \
    sed -i '1i #include <cfloat>' /tmp/simple-knn/simple_knn.cu && \
    pip install /tmp/simple-knn --no-build-isolation && \
    rm -rf /tmp/simple-knn

# nvdiffrast — JIT compilation at runtime (requires -devel image)
RUN pip install git+https://github.com/ShenhanQian/nvdiffrast.git@backface-culling --no-build-isolation

# ============================================================
# Python dependencies
# ============================================================
RUN pip install \
    "gradio==4.44.0" \
    "gradio_client==1.3.0" \
    "fastapi" \
    "uvicorn" \
    "omegaconf==2.3.0" \
    "pandas" \
    "scipy<1.14.0" \
    "opencv-python-headless==4.9.0.80" \
    "imageio[ffmpeg]" \
    "moviepy==1.0.3" \
    "rembg" \
    "scikit-image" \
    "pillow" \
    "huggingface_hub>=0.24.0" \
    "filelock" \
    "typeguard" \
    "transformers==4.44.2" \
    "diffusers==0.30.3" \
    "accelerate==0.34.2" \
    "tyro==0.8.0" \
    "mediapipe==0.10.21" \
    "tensorboard" \
    "rich" \
    "loguru" \
    "Cython" \
    "PyMCubes" \
    "trimesh" \
    "einops" \
    "plyfile" \
    "jaxtyping" \
    "ninja" \
    "patool" \
    "safetensors" \
    "decord" \
    "numpy==1.26.4"

# onnxruntime-gpu for CUDA 12 — MUST be installed AFTER rembg to prevent
# rembg from pulling in the PyPI default (CUDA 11) build
RUN pip install onnxruntime-gpu==1.18.1 \
    --extra-index-url https://aiinfra.pkgs.visualstudio.com/PublicPackages/_packaging/onnxruntime-cuda-12/pypi/simple/

# FBX SDK Python bindings (for OBJ -> FBX -> GLB avatar export)
RUN pip install https://virutalbuy-public.oss-cn-hangzhou.aliyuncs.com/share/aigc3d/data/LAM/fbx-2020.3.4-cp310-cp310-manylinux1_x86_64.whl

# ============================================================
# Blender 4.2 LTS (for GLB generation)
# ============================================================
RUN wget -q https://download.blender.org/release/Blender4.2/blender-4.2.0-linux-x64.tar.xz -O /tmp/blender.tar.xz && \
    mkdir -p /opt/blender && \
    tar xf /tmp/blender.tar.xz -C /opt/blender --strip-components=1 && \
    ln -sf /opt/blender/blender /usr/local/bin/blender && \
    rm /tmp/blender.tar.xz

# ============================================================
# Clone LAM repo and build cpu_nms
# ============================================================
RUN git clone https://github.com/aigc3d/LAM.git /app/LAM

# Build cpu_nms for FaceBoxesV2
RUN cd /app/LAM/external/landmark_detection/FaceBoxesV2/utils/nms && \
python -c "\
from setuptools import setup, Extension; \
from Cython.Build import cythonize; \
import numpy; \
setup(ext_modules=cythonize([Extension('cpu_nms', ['cpu_nms.pyx'])]), \
include_dirs=[numpy.get_include()])" \
build_ext --inplace

# ============================================================
# Download model weights (cached in Docker layer)
# ============================================================
COPY download_models.py /app/download_models.py
RUN python /app/download_models.py

# ============================================================
# Copy application code (after model download for cache)
# ============================================================
WORKDIR /app/LAM

# Copy our app into the container
COPY app_concierge.py /app/LAM/app_concierge.py

# HF Spaces expects port 7860
EXPOSE 7860
ENV GRADIO_SERVER_NAME=0.0.0.0
ENV GRADIO_SERVER_PORT=7860

CMD ["python", "app_concierge.py"]