OpenPI VLA train_step always fails server-side with "Operation failed" (SDK + HTTP demos)

  Hello, I've run into a bug: the OpenPI VLA train_step always returns a server-side "Operation failed" error. Is the VLA server pipeline operating normally?

  Endpoint: https://mint-cn.macaron.xin/
  Base model: openpi/pi0-fast-libero-low-mem-finetune (mintx.OPENPI_FAST_MODEL)
  Reproduces 100%. Both SDK and HTTP demos hit the exact same error.

  Failed request_ids (forward_backward / train_step):
    - 5db5da02571649c3adeea9340900f23f  (SDK demo, ~10:50 Beijing time, 2026-05-08)
    - f21f7c3c684547bea5eeccdca3e58c4e  (SDK demo, ~11:30 Beijing time, 2026-05-08)

  Server response payload contains:
    {"error": "Operation failed. Contact administrator if issue persists."}
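  For triage, a small check like the following could tell this generic payload apart from more specific errors. This is only a sketch: the helper name `is_generic_failure` is hypothetical, and it assumes the response body arrives as a JSON string shaped like the payload above.

```python
import json

# The exact message seen in the failing responses above.
GENERIC_ERROR = "Operation failed. Contact administrator if issue persists."


def is_generic_failure(body: str) -> bool:
    """Return True if `body` is the generic server-side error payload.

    Hypothetical helper: assumes the server replies with a JSON object
    that has an "error" key, as in the payload quoted in this report.
    """
    try:
        payload = json.loads(body)
    except json.JSONDecodeError:
        # Not JSON at all, so it cannot be the generic error payload.
        return False
    return isinstance(payload, dict) and payload.get("error") == GENERIC_ERROR
```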

  Control: demos/rl/adapters/verifiable_math.py runs end-to-end on the same
  account + endpoint (forward_backward + optim_step + save_weights all
  succeed). So the failure is specific to the OpenPI VLA pipeline server-side,
  not to this account, region, or SDK wrapping.

  Earlier successful steps (both demos): create_session, create_model,
  get_info. Failure happens at the actual forward/backward GPU execution.
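
  The step ordering above could be confirmed with a harness along these lines. It is a minimal sketch, not the demo code: the callables are stand-ins rather than real SDK calls, and it assumes a failing step surfaces as a Python exception.

```python
from typing import Callable, List, Optional, Tuple

Step = Tuple[str, Callable[[], object]]


def first_failing_step(steps: List[Step]) -> Optional[Tuple[str, str]]:
    """Run steps in order; return (name, error message) of the first failure.

    Returns None if every step completes. Assumes a failed step raises.
    """
    for name, call in steps:
        try:
            call()
        except Exception as exc:
            return name, str(exc)
    return None


def _stub_forward_backward() -> None:
    # Hypothetical stand-in that mimics the server-side error in this report.
    raise RuntimeError("Operation failed. Contact administrator if issue persists.")


# Stand-in steps mirroring the order described above (not real SDK calls).
demo_steps: List[Step] = [
    ("create_session", lambda: None),
    ("create_model", lambda: None),
    ("get_info", lambda: None),
    ("forward_backward", _stub_forward_backward),
]

result = first_failing_step(demo_steps)
```

  With the stand-ins above, `result` names `forward_backward` as the first failing step, matching what both demos show.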

  Repo: mint-quickstart @ HEAD
  mindlab-toolkit @ commit f0d3b21fe34a9419fe2c840036b44618e211a596


