Eval bug: std::runtime_error Invalid diff: #13876

@stargate426

Name and Version

ggml_cuda_init: GGML_CUDA_FORCE_MMQ: no
ggml_cuda_init: GGML_CUDA_FORCE_CUBLAS: no
ggml_cuda_init: found 1 CUDA devices:
Device 0: NVIDIA GeForce RTX 5090, compute capability 12.0, VMM: yes
version: 5519 (a682474)
built with cc (Debian 12.2.0-14+deb12u1) 12.2.0 for x86_64-linux-gnu

Operating systems

Linux

GGML backends

CUDA

Hardware

Ryzen 5 3600 + RTX 5090

Models

Qwen3 32B (Q5_K_S)

Problem description & steps to reproduce

./llama-server -m ~/llm/models/Qwen3-32B-Q5_K_S.gguf -c 16384 -ngl 999 --host 0.0.0.0 --port 5000 --jinja --api-key

This is how I run the server. The crash happens intermittently, and in the limited attempts I made I could not reproduce it with llama-cli.

First Bad Commit

No response

Relevant log output

terminate called after throwing an instance of 'std::runtime_error'
  what():  Invalid diff: '<think>Okay, the user mentioned that Docker is taking up a lot of space and they want to delete unused volumes. Now they're saying that something else might be using all the storage and they don't know if it's Docker. I need to help them figure out what's consuming their disk space.
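
Note for triage: the wording of the exception suggests that the server compares successive snapshots of the generated text and expects each new snapshot to start with the previous one. That is an assumption based on the message alone, since the log above is cut off. The sketch below is not llama.cpp's code; `stream_delta` and `delta_throws` are hypothetical names used only to illustrate how a prefix check of this kind could end in an uncaught `std::runtime_error`.

```cpp
#include <stdexcept>
#include <string>

// Hypothetical helper (an assumption, not llama.cpp's implementation):
// return the text that `current` adds on top of `last`, assuming the
// streamed output only ever grows by appending.
std::string stream_delta(const std::string & last, const std::string & current) {
    // compare() over the first last.size() characters is non-zero unless
    // `last` is a prefix of `current` (including when `current` is shorter).
    if (current.compare(0, last.size(), last) != 0) {
        throw std::runtime_error(
            "Invalid diff: '" + last + "' not found at start of '" + current + "'");
    }
    return current.substr(last.size());
}

// Hypothetical wrapper: reports whether the delta computation throws.
bool delta_throws(const std::string & last, const std::string & current) {
    try {
        stream_delta(last, current);
        return false;
    } catch (const std::runtime_error &) {
        return true;
    }
}
```

If something like this is happening, a later snapshot that drops or rewrites the leading `<think>` text would no longer start with the earlier snapshot, and an exception that nothing catches would terminate the process as shown in the log.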
