
Translate gemma 4B recipes#404

Open
tanzeel-amd wants to merge 7 commits into microsoft:main from tanzeel-amd:translate-gemma-4b

Conversation

@tanzeel-amd

Translate gemma 4B recipes

VishalX and others added 7 commits April 21, 2026 08:27
- Introduced benchmark_wmt24pp.py for evaluating translation quality on the WMT24++ dataset using COMET.
- Added inference.py for text and image translation capabilities.
- Created README.md with setup instructions, usage examples, and model architecture details.
- Implemented optimization pipeline in optimize.py for exporting sub-models (text decoder, vision encoder, embedding) with INT4 quantization (a sketch of driving such an export appears after this list).
- Developed user script in user_script.py for model integration.
- Configured export settings in JSON files for different model variants (INT4, AWQ, FP32).
- Included test images for demonstration purposes.
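For orientation, here is a minimal sketch of how a three-config export like this can be driven through Olive's documented Python entry point. The config file names match the recipe layout above; the directory constant and the loop itself are assumptions for illustration, not code from this PR (the actual optimize.py also patches GenAI/processor/tokenizer configs afterwards, per the review below).

```python
from pathlib import Path

# Olive's programmatic entry point (equivalent to `olive run --config ...`).
from olive.workflows import run as olive_run

# Assumed layout: the INT4 configs live next to this script; swap in
# cpu_and_mobile_fp32 for the FP32 variants.
CONFIG_DIR = Path(__file__).parent / "cpu_and_mobile"

# Export each ONNX sub-model in turn: text decoder, embedding/scatter, vision encoder.
for config_name in ("text.json", "embedding.json", "vision.json"):
    olive_run(str(CONFIG_DIR / config_name))
```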
Copilot AI review requested due to automatic review settings May 8, 2026 10:32

Copilot AI left a comment

Pull request overview

Adds a new Olive recipe for exporting and running the gated google/translategemma-4b-it multimodal translation model (text and image translation) via ONNX Runtime GenAI, including an optional script that runs WMT24++ translations and scores them with COMET.

Changes:

  • Added end-to-end export pipeline (builtin/optimize.py) plus Olive JSON configs to export text/vision/embedding ONNX sub-models.
  • Added runtime scripts for inference (inference.py) and WMT24++ benchmarking (benchmark_wmt24pp.py); a rough sketch of the GenAI inference loop follows this list.
  • Added recipe documentation and included the Gemma Terms of Use.
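
To make the runtime side concrete, here is a minimal sketch of image translation with onnxruntime-genai, loosely following that library's public multimodal examples. The model directory, image path, prompt template, and image placeholder token are all illustrative assumptions, not code taken from inference.py.

```python
import onnxruntime_genai as og

model = og.Model("models/translategemma-4b-it")  # exported ONNX model dir (assumed path)
processor = model.create_multimodal_processor()
stream = processor.create_stream()

images = og.Images.open("data/sample.jpg")  # hypothetical test image

# Gemma-style chat template with an image placeholder; the exact template
# the recipe uses may differ.
prompt = (
    "<start_of_turn>user\n"
    "Translate the text in this image to German.\n<start_of_image><end_of_turn>\n"
    "<start_of_turn>model\n"
)

inputs = processor(prompt, images=images)
params = og.GeneratorParams(model)
params.set_inputs(inputs)
params.set_search_options(max_length=512)

# Stream tokens as they are generated.
generator = og.Generator(model, params)
while not generator.is_done():
    generator.generate_next_token()
    print(stream.decode(generator.get_next_tokens()[0]), end="", flush=True)
print()
```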

Reviewed changes

Copilot reviewed 12 out of 12 changed files in this pull request and generated 7 comments.

Summary per file:

| File | Description |
| --- | --- |
| google-translategemma-4b-it/README.md | Documents export/inference/benchmark usage and model architecture. |
| google-translategemma-4b-it/LICENSE | Adds the Gemma Terms of Use text to the recipe directory. |
| google-translategemma-4b-it/inference.py | CLI for text/image translation using the ORT GenAI multimodal pipeline. |
| google-translategemma-4b-it/benchmark_wmt24pp.py | CLI to run WMT24++ translations and score them with COMET, with resume support. |
| google-translategemma-4b-it/builtin/optimize.py | Orchestrates Olive exports and patches GenAI/processor/tokenizer configs. |
| google-translategemma-4b-it/builtin/user_script.py | Provides PyTorch wrapper modules plus IO/dummy-input helpers for Olive export. |
| google-translategemma-4b-it/builtin/cpu_and_mobile/text.json | Olive config for INT4 RTN text decoder export. |
| google-translategemma-4b-it/builtin/cpu_and_mobile/embedding.json | Olive config for embedding/scatter ONNX export. |
| google-translategemma-4b-it/builtin/cpu_and_mobile/vision.json | Olive config for vision encoder ONNX export. |
| google-translategemma-4b-it/builtin/cpu_and_mobile_fp32/text.json | Olive config for FP32 text decoder export. |
| google-translategemma-4b-it/builtin/cpu_and_mobile_fp32/embedding.json | Olive config for FP32 embedding/scatter ONNX export. |
| google-translategemma-4b-it/builtin/cpu_and_mobile_fp32/vision.json | Olive config for FP32 vision encoder ONNX export. |
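
Since benchmark_wmt24pp.py scores translations with COMET, here is a minimal sketch of the scoring half using Unbabel's comet package. The checkpoint name, record fields, and batch settings follow standard COMET usage and are assumptions; the script may pin a different model or configuration.

```python
# Minimal COMET scoring sketch (standard Unbabel COMET API; the checkpoint
# below is an assumption, not necessarily what benchmark_wmt24pp.py uses).
from comet import download_model, load_from_checkpoint

ckpt_path = download_model("Unbabel/wmt22-comet-da")  # downloads from the HF Hub
scorer = load_from_checkpoint(ckpt_path)

# One record per segment: source text, model hypothesis, human reference.
data = [
    {"src": "Der Hund bellt.", "mt": "The dog barks.", "ref": "The dog is barking."},
]

result = scorer.predict(data, batch_size=8, gpus=0)  # gpus=0 runs on CPU
print(result.scores)        # per-segment scores
print(result.system_score)  # corpus-level average
```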


Comment on lines +33 to +35

```bash
# AWQ INT4 text decoder + FP32 vision/embedding (~6.4 GB total)
python optimize.py --config-dir cpu_and_mobile_awq
```

Comment on lines +135 to +136

```text
translategemma-4b-it/
    data/        # Test images
```

Comment on lines +173 to +174

```python
parser.add_argument("--hf-model-dir", default=str(SCRIPT_DIR / "model"),
                    help="HF model path for tokenizer/chat template")
```

Comment on lines +117 to +120

```python
def translate_batch(model, processor, stream, hf_tok, sources: list[str], target_lang: str, max_length: int) -> list[str]:
    """Translate a list of source texts one at a time."""
    translations = []
    for src in sources:
```

```python
all_hypotheses.extend(hypotheses)
all_references.extend(references)

new_pairs = len(all_sources) // (args.max_segments or 1) if all_sources else 0
```

```python
import io
from pathlib import Path

import numpy as np
```
Comment on lines +7 to +10

```python
import math
import os
import glob
```

@tanzeel-amd
Author

@microsoft-github-policy-service agree company="AMD"

@VishalX

VishalX commented May 11, 2026

@xieofxie / @devang-ml pls review
