Gemma4 initial npu support #179

Open
cavusmustafa wants to merge 5 commits into ravi9:dev_backend_openvino from cavusmustafa:gemma4_initial_npu_support

cavusmustafa commented May 21, 2026

  • Static shape fix for NPU: some n_tokens dimensions were not being detected.
  • The NPU fold was producing dynamic shapes because of slice ops. In certain cases, such as the gemma4 architecture, the slice ops are now replaced by split ops.
  • Temporary fix for an fp16 overflow issue on NPU: apply CLAMP before GELU.
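
To illustrate the last point, here is a minimal standalone sketch of why clamping the input before GELU can keep intermediates inside the fp16 range. It is not the code from this PR. It assumes the tanh approximation of GELU and assumes the cubic term is the intermediate that overflows; the clamp bound `kClampLimit` and all function names are hypothetical, since the PR does not state the bound it uses.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

// Largest finite value representable in IEEE 754 half precision (fp16).
constexpr float kFp16Max = 65504.0f;

// Hypothetical clamp bound, picked only for illustration (not taken from the PR).
constexpr float kClampLimit = 10.0f;

// Clamp x to [-kClampLimit, kClampLimit].
inline float clamp_input(float x) {
    return std::max(-kClampLimit, std::min(x, kClampLimit));
}

// Cubic term of the tanh-based GELU approximation. Assumed here to be the
// intermediate that leaves the fp16 range for large |x|.
inline float gelu_cubic_term(float x) {
    return x * x * x;
}

// True if the value would still be finite when stored as fp16.
inline bool fits_in_fp16(float v) {
    return std::fabs(v) <= kFp16Max;
}

// tanh approximation of GELU, evaluated in float for reference.
inline float gelu_tanh(float x) {
    const float k = 0.7978845608f;  // sqrt(2 / pi)
    return 0.5f * x * (1.0f + std::tanh(k * (x + 0.044715f * gelu_cubic_term(x))));
}

// CLAMP followed by GELU.
inline float clamped_gelu(float x) {
    return gelu_tanh(clamp_input(x));
}

// Small self-check: without the clamp the cubic term exceeds the fp16 range,
// with the clamp it does not.
inline void check_clamp_keeps_cubic_in_range() {
    assert(!fits_in_fp16(gelu_cubic_term(100.0f)));
    assert(fits_in_fp16(gelu_cubic_term(clamp_input(100.0f))));
}
```

Inside the clamp range the result matches plain GELU. Outside it the output saturates near the bound, so a clamp like this trades accuracy at extreme inputs for finite intermediates.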

wine99 force-pushed the gemma4_initial_npu_support branch from 2df014c to 0fa0534 on May 22, 2026 03:04
int64_t dim_size = static_cast<int64_t>(src_ov_shape[slice_dim]);

size_t stride_at_dim = (slice_dim < static_cast<int>(ndims) - 1) ?
src_stride[slice_dim + 1] : src_stride[slice_dim];

CI openvino_gpu_low_perf failed

/home/llamauser/actions-runner-ravi9/_work/llama.cpp/llama.cpp/ggml/src/ggml-openvino/openvino/op/view.cpp: In function ‘ov::OutputVector ov::frontend::ggml::op::translate_view(const ov::frontend::ggml::NodeContext&)’:
/home/llamauser/actions-runner-ravi9/_work/llama.cpp/llama.cpp/ggml/src/ggml-openvino/openvino/op/view.cpp:80:12: error: unused variable ‘stride_at_dim’ [-Werror=unused-variable]
   80 |     size_t stride_at_dim = (slice_dim < static_cast<int>(ndims) - 1) ?
      |            ^~~~~~~~~~~~~
cc1plus: all warnings being treated as errors
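
For reference, a minimal sketch of one way to silence this diagnostic while the variable is still being kept around. The function `dim_size_at` is a hypothetical stand-in that only mirrors the names from the quoted snippet; it is not the actual `translate_view` code. If `stride_at_dim` is truly no longer needed, simply deleting the declaration is the cleaner fix.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for the logic around view.cpp:80. Only the variable
// names are taken from the snippet above; the function itself is illustrative.
inline int64_t dim_size_at(const std::vector<size_t>& src_ov_shape,
                           const std::vector<size_t>& src_stride,
                           int slice_dim) {
    const size_t ndims = src_ov_shape.size();
    int64_t dim_size = static_cast<int64_t>(src_ov_shape[slice_dim]);

    size_t stride_at_dim = (slice_dim < static_cast<int>(ndims) - 1) ?
        src_stride[slice_dim + 1] : src_stride[slice_dim];

    // Explicitly mark the variable as intentionally unused so that
    // -Werror=unused-variable does not fire. In C++17 the declaration could
    // carry [[maybe_unused]] instead.
    (void) stride_at_dim;

    return dim_size;
}
```

Called as `dim_size_at({2, 3, 4}, {12, 4, 1}, 1)`, this returns the size of dimension 1, and it compiles cleanly under `-Wall -Werror`.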
