From 5824e8797712529dca9db9ad9e15ebf36a87f391 Mon Sep 17 00:00:00 2001
From: "copilot-swe-agent[bot]" <198982749+Copilot@users.noreply.github.com>
Date: Fri, 17 Apr 2026 22:57:46 +0000
Subject: [PATCH 1/4] Initial plan
From 1d2377c77ce09944b9555fa43ddeb35960ecce04 Mon Sep 17 00:00:00 2001
From: "copilot-swe-agent[bot]" <198982749+Copilot@users.noreply.github.com>
Date: Fri, 17 Apr 2026 23:05:43 +0000
Subject: [PATCH 2/4] Document contracts for all custom operators
Agent-Logs-Url: https://github.com/microsoft/onnxruntime-extensions/sessions/cdb9185e-44ff-4191-8d4c-b00889e20918
Co-authored-by: justinchuby <11205048+justinchuby@users.noreply.github.com>
---
docs/custom_ops.md | 2099 +++++++++++++++++++++++++++++++++++---------
1 file changed, 1701 insertions(+), 398 deletions(-)
diff --git a/docs/custom_ops.md b/docs/custom_ops.md
index cacbaba47..ebe12966b 100644
--- a/docs/custom_ops.md
+++ b/docs/custom_ops.md
@@ -531,734 +531,2037 @@ expect(node, inputs=[inputs],
-## String operators
-
-### StringEqual
+### CLIPTokenizer
-StringEqual details
+CLIPTokenizer details
-Compares two strings and returns true if they are equal and false if not.
+Byte-pair-encoding (BPE) tokenizer matching the CLIP text encoder from HuggingFace/OpenAI. Converts input strings into token id sequences.
-#### Inputs
+#### Attributes
-***x: tensor(string)***
+***vocab: string***
-The first string input
+JSON vocabulary mapping tokens to ids (contents of `vocab.json`).
-***x: tensor(string)***
+***merges: string***
-The second string input
+Merge rules (contents of `merges.txt`).
-#### Outputs
+***padding_length: int64_t*** (default is -1)
-***z: tensor(boolean)***
+If positive, the output is right-padded (or truncated) to this length. When -1 no padding is performed and outputs stay ragged.
-String with replacements.
+#### Inputs
-
+***input: tensor(string)***
+1D string tensor containing the input texts.
-### StringHash
+#### Outputs
-
-StringHash details
+***input_ids: tensor(int64)***
+Tensor of token ids.
-Hashes the input string based on the number of buckets
+***attention_mask: tensor(int64)***
-#### Inputs
+Mask with the same shape as `input_ids` (1 for real tokens, 0 for padding).
-***input: tensor(string)***
+***offset_mapping: tensor(int64)*** (optional)
-The string to hash
+If requested, per-token `(begin, end)` byte offsets into the corresponding input string.
-***num_buckets: tensor(int64)***
+
-The number of buckets (must be equal to 1?)
-#### Outputs
+### RobertaTokenizer
-***name: tensor(int64)***
+
+RobertaTokenizer details
-The hash value of the string
+BPE tokenizer compatible with HuggingFace's RoBERTa tokenizer. Uses the same attributes and I/O contract as `CLIPTokenizer`.
-
+#### Attributes
+***vocab: string***
-### StringHashFast
+JSON vocabulary (contents of `vocab.json`).
-
-StringHashFast details
+***merges: string***
+BPE merge rules (contents of `merges.txt`).
-A faster implementation of StringHash.
+***padding_length: int64_t*** (default is -1)
-
+Optional fixed output length. See `CLIPTokenizer`.
+#### Inputs
-### StringJoin
+***input: tensor(string)***
-
-StringJoin details
+1D string tensor of input texts.
+#### Outputs
-Join an array of strings
+***input_ids: tensor(int64)***
-#### Inputs
+Token ids.
-***input_X: tensor(string)***
+***attention_mask: tensor(int64)***
-The input array of strings
+Attention mask, same shape as `input_ids`.
-***input_sep: tensor(string)***
+***offset_mapping: tensor(int64)*** (optional)
-The string separator for the resulting joing
+Per-token byte offsets into each input string.
-***input_axis: tensor(int64)***
+
-The axis along which to joing
-#### Outputs
+### SpmTokenizer
-***out: tensor(string)***
+
+SpmTokenizer details
-The resulting joined string
+SentencePiece-compatible tokenizer built on top of the shared BPE kernel. Produces tokens equivalent to HuggingFace's "fast" SentencePiece tokenizers (e.g. Llama, T5, XLM-RoBERTa).
-#### Examples
+#### Attributes
+***vocab: string***
-```bash
+JSON vocabulary produced from a SentencePiece model.
-input_X = [["a", "b", "c"], ["aa", "bb", ""]]
-input_sep=";"
-input_axis = 1
+***merges: string***
-out = ["a;b;c", "aa;bb;"]
+SentencePiece merge rules.
-input_axis = 0
+***padding_length: int64_t*** (default is -1)
-out = ['a;aa', 'b;bb', 'c;']
+Optional fixed output length.
+#### Inputs
-
+***input: tensor(string)***
+1D string tensor of inputs.
-### StringRegexReplace
+#### Outputs
-
-StringRegexReplace details
+***input_ids: tensor(int64)***
+Tensor of token ids.
-String replacement based on [Re2-format](https://github.com/google/re2/wiki/Syntax) regular expressions.
+***attention_mask: tensor(int64)***
-#### Inputs
+Attention mask with the same shape as `input_ids`.
-***text: tensor(string)***
+***offset_mapping: tensor(int64)*** (optional)
-String tensor to extract slices from.
+Per-token byte offsets.
-***pattern: tensor(string)***
+
-Pattern of the regular expression.
-***rewrite: tensor(string)***
+### HfBertTokenizer
-Replacement.
+
+HfBertTokenizer details
+
+HuggingFace-compatible BERT WordPiece tokenizer. Behaves like `BertTokenizer`'s `__call__` method but with a smaller attribute surface. Produces ids, attention masks and token type ids in a single op.
#### Attributes
-***global_replace: int64*** (default is 1)
+***vocab_file: string***
-Replace all strings matching the pattern or the first one.
+Contents of `vocab.txt`.
-#### Outputs
+***do_lower_case: int64_t*** (default is 1)
-***output: tensor(string)***
+Lowercase inputs before tokenization.
-String with replacements.
+***strip_accents: int64_t*** (default is 1)
-#### Examples
+Strip accents as part of normalization.
-```python
+#### Inputs
-node = onnx.helper.make_node(
- 'StringRegexReplace',
- inputs=['text', 'pattern', 'rewrite'],
- outputs=['y'],
-)
+***input: tensor(string)***
-text = np.array([['def myfunc():'], ['def dummy():']])
-pattern = np.array([r'def\s+([a-zA-Z_][a-zA-Z_0-9]*)\s*\(\s*\):'])
-rewrite = np.array([r'static PyObject* py_\1(void) {'])
-y = [['static PyObject* py_myfunc(void) {'],
- ['static PyObject* py_dummy(void) {']]
+1D string tensor containing the texts to tokenize.
-expect(node, inputs=[text, pattern, rewrite], outputs=[y],
- name='test_string_regex_replace')
-```
+#### Outputs
-
+***input_ids: tensor(int64)***
-### StringECMARegexReplace
+Token ids.
-
-StringECMARegexReplace details
+***attention_mask: tensor(int64)***
-String replacement based on [ECMA-format](https://en.cppreference.com/w/cpp/regex/ecmascript) regular expressions.
+Attention mask, same shape as `input_ids`.
-#### Inputs
+***token_type_ids: tensor(int64)*** (optional)
-***text: tensor(string)***
+Segment ids. All zero for single-sentence input.
-String tensor to extract slices from.
+
-***pattern: tensor(string)***
-Pattern of the regular expression.
+### HfJsonTokenizer
-***rewrite: tensor(string)***
+
+HfJsonTokenizer details
-Replacement.
+Loads a HuggingFace `tokenizer.json` directly and dispatches to the appropriate kernel (BPE or Unigram). Matches HuggingFace fast tokenizers at inference time.
#### Attributes
-***global_replace: int64*** (default is 1)
+***tokenizer_config: string***
-Replace all strings matching the pattern or the first one.
+Contents of `tokenizer.json` (and optionally `tokenizer_config.json`).
+***tokenizer_vocab: string*** (optional)
-***ignore_case: int64*** (default is 0)
+Additional vocabulary data when the tokenizer uses an external vocab file.
-Replace
+#### Inputs
-#### Outputs
+***input: tensor(string)***
-***output: tensor(string)***
+1D string tensor of inputs.
-String with replacements.
+#### Outputs
-#### Examples
+***input_ids: tensor(int64)***
+Token ids.
-```python
+***attention_mask: tensor(int64)***
-node = onnx.helper.make_node(
- 'StringRegexReplace',
- inputs=['text', 'pattern', 'rewrite'],
- outputs=['y'],
-)
+Attention mask matching `input_ids`.
-text = np.array([['def myfunc():'], ['def dummy():']])
-pattern = np.array([r'def\s+([a-zA-Z_][a-zA-Z_0-9]*)\s*\(\s*\):'])
-rewrite = np.array([r'static PyObject* py_$1(void) {'])
-y = [['static PyObject* py_myfunc(void) {'],
- ['static PyObject* py_dummy(void) {']]
+***offset_mapping: tensor(int64)*** (optional)
-expect(node, inputs=[text, pattern, rewrite], outputs=[y],
- name='test_string_regex_replace')
-```
+Per-token byte offsets.
+### SentencepieceDecoder
-### StringSplit
-
-TODO
-
-### StringUpper
+
+SentencepieceDecoder details
-TODO
+Decodes a sequence of SentencePiece ids back into a string.
-### StringLower
+#### Attributes
-TODO
+***model: string***
-### StringLength
+Serialized SentencePiece model (`*.model`).
-
-StringECMARegexReplace details
+#### Inputs
-Get the length of each string element in input tensor. Similar to the function `len("abcde"")` in python.
+***ids: tensor(int64)***
-#### Inputs
+1D or 2D tensor of ids. When 2D the leading dimension must be 1.
-***data: tensor(string)***
+***fairseq: tensor(bool)*** (optional)
-String tensor to get length of its each string element.
+Scalar flag. When true the `fairseq` vocab-id offset convention is applied.
#### Outputs
-***output: tensor(int64)***
+***output: tensor(string)***
-Data length tensor.
+Scalar string containing the decoded text.
-#### Examples
+
-```python
+### BpeDecoder
-node = onnx.helper.make_node(
- 'StringLength',
- inputs=['x'],
- outputs=['y']
-)
+
+BpeDecoder details
-x = ["abcdef", "hijkl"]
-y = np.array([len(x[0]), len(x[1])], dtype=np.int64)
+Decodes BPE token ids (GPT-2 / CLIP / RoBERTa style) back into text.
+#### Attributes
-expect(node, inputs=[x], outputs=[y],
- name='test_string_length')
-```
-
-
-### StringConcat
+***id_vocab: string***
-
-StringConcat details
+Newline-separated token strings indexed by id.
-Concat the corresponding string in the two string tensor. Two input tensors should have the same dimension.
+***byte_decoder: string***
-```python
- output = []
- shape = input1.shape
- input1 = input1.flatten()
- input2 = input2.flatten()
- for i in range(len(input1)):
- output.append(input1[i] + input2[i])
- output = np.array(output).reshape(shape)
-```
+Reverse byte-to-unicode mapping used by GPT-2 BPE encoders.
-#### Inputs
+***added_tokens: string*** (optional)
-***input_1: tensor(string)***
+Extra tokens appended to the base vocabulary.
-The first string tensor.
+***all_special_ids: string*** (optional)
-***input_2: tensor(string)***
+Comma-separated list of special token ids.
-The second string tensor.
+***skip_special_tokens: int64_t*** (default is 0)
+When 1, ids in `all_special_ids` are skipped during decoding.
-#### Outputs
+***en_normalization: int64_t*** (default is 0)
-***output: tensor(string)***
+Apply a minimal English-oriented post-processing step (e.g. undo leading-space markers).
-The result.
+***whitespace_token: string*** (optional)
+***bos_token: string*** (optional)
+***eos_token: string*** (optional)
+***unk_token: string*** (optional)
-#### Examples
+Optional overrides for well-known special tokens.
+#### Inputs
-```python
+***ids: tensor(int64)***
-node = onnx.helper.make_node(
- 'StringConcat',
- inputs=['x', 'y'],
- outputs=['result'],
-)
+1D or 2D tensor of token ids.
-x = np.array(["abcd", "efgh"])
-y = np.array(["wxyz", "stuv"])
-result = np.array([x[0] + y[0], x[1] + y[1]])
+#### Outputs
-expect(node, inputs=[x, y], outputs=[result],
- name='test_string_concat')
-```
+***output: tensor(string)***
+
+Decoded string tensor.
-### StringRegexSplitWithOffsets
-
-StringRegexSplitWithOffsets details
+### TrieTokenizer
-Splits string based on regular expressions.
+
+TrieTokenizer details
-#### Inputs
+Trie-based longest-match tokenizer used by RWKV-style models.
-***text: tensor(string)***
+#### Attributes
-String tensor to extract slices from.
+***vocab: string***
-***delim_regex_pattern: tensor(string)***
+Newline-separated vocab where each line has the form `index token length`. `token` is a Python-repr-encoded byte string.
-Splitting attern of the regular expression.
+#### Inputs
-***keep_delim_regex_pattern: tensor(string)***
+***input: tensor(string)***
-By default, delimiters are not included in the split string results. Delimiters may be included by specifying a regex pattern keep_delim_regex_pattern.
+1D string tensor of inputs.
#### Outputs
-***words: tensor(string)*** Tensor of words.
+***output: tensor(int64)***
-***offsets: tensor(int64)*** 2D tensor with 3 columns:
-sentence index, position of the first character, position of the last one (excluded)
+2D right-padded tensor of token ids; padding uses id `0`.
-***row_indices: tensor(int64)*** Indices of every first token of input sentences.
-`row_indices[i+1] - row_indices[i]` is the number of tokens in input `i`.
-These are updates row indices given as inputs or new ones if the second input is empty.
+
-#### Examples
+### TrieDetokenizer
+
+TrieDetokenizer details
-```python
+Inverse of `TrieTokenizer`. Converts 1D or 2D id tensors back to strings using the same trie vocabulary.
-node = onnx.helper.make_node(
- 'StringRegexSplit',
+#### Attributes
+
+***vocab: string***
+
+Same vocabulary format as `TrieTokenizer`.
+
+#### Inputs
+
+***ids: tensor(int64)***
+
+1D or 2D tensor of token ids.
+
+#### Outputs
+
+***output: tensor(string)***
+
+Decoded text, one string per row.
+
+
+
+
+### BlingFireSentenceBreaker
+
+
+BlingFireSentenceBreaker details
+
+Segments an input string into sentences using a compiled [BlingFire](https://github.com/microsoft/BlingFire) model.
+
+#### Attributes
+
+***model: string***
+
+Raw bytes of the compiled BlingFire sentence-breaking model (`*.bin`).
+
+***max_sentence: int64_t*** (default is -1)
+
+If positive, limits the number of returned sentences.
+
+#### Inputs
+
+***input: tensor(string)***
+
+Scalar input string.
+
+#### Outputs
+
+***output: tensor(string)***
+
+1D tensor of sentences.
+
+
+
+
+## String operators
+
+### StringEqual
+
+
+StringEqual details
+
+Compares two strings and returns true if they are equal and false if not.
+
+#### Inputs
+
+***x: tensor(string)***
+
+The first string input
+
+***y: tensor(string)***
+
+The second string input
+
+#### Outputs
+
+***z: tensor(boolean)***
+
+Boolean tensor, true where the corresponding strings are equal.
+
+
+
+
+### StringHash
+
+
+StringHash details
+
+
+Hashes the input string based on the number of buckets
+
+#### Inputs
+
+***input: tensor(string)***
+
+The string to hash
+
+***num_buckets: tensor(int64)***
+
+The number of hash buckets (scalar). Each output value is in the range `[0, num_buckets)`.
+
+#### Outputs
+
+***name: tensor(int64)***
+
+The hash value of the string
+
+
+
+
+### StringHashFast
+
+
+StringHashFast details
+
+
+A faster implementation of StringHash. Computes hash values for each input string modulo `num_buckets`.
+
+#### Inputs
+
+***input: tensor(string)***
+
+The strings to hash.
+
+***num_buckets: tensor(int64)***
+
+The number of hash buckets (scalar). Each output value will be in the range `[0, num_buckets)`.
+
+#### Outputs
+
+***output: tensor(int64)***
+
+The hashed values, with the same shape as `input`.
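The exact hash function is implementation-defined; the sketch below uses a stand-in MD5-based hash (not the kernel's actual hash) and only illustrates the contract: every output lands in `[0, num_buckets)` and equal strings hash to equal buckets.

```python
import hashlib

def string_hash_fast(strings, num_buckets):
    """Reference sketch of the StringHashFast contract. The real kernel
    uses a different hash, so exact bucket values will differ."""
    out = []
    for s in strings:
        # Stand-in hash: any stable 64-bit digest of the UTF-8 bytes works here.
        digest = hashlib.md5(s.encode("utf-8")).digest()
        h = int.from_bytes(digest[:8], "little")
        out.append(h % num_buckets)
    return out

buckets = string_hash_fast(["hello", "world", "hello"], 100)
```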
+
+
+
+
+### StringJoin
+
+
+StringJoin details
+
+
+Join an array of strings
+
+#### Inputs
+
+***input_X: tensor(string)***
+
+The input array of strings
+
+***input_sep: tensor(string)***
+
+The string separator used between joined elements.
+
+***input_axis: tensor(int64)***
+
+The axis along which to join.
+
+#### Outputs
+
+***out: tensor(string)***
+
+The resulting joined string
+
+#### Examples
+
+
+```text
+input_X = [["a", "b", "c"], ["aa", "bb", ""]]
+input_sep = ";"
+input_axis = 1
+
+out = ["a;b;c", "aa;bb;"]
+
+input_axis = 0
+
+out = ["a;aa", "b;bb", "c;"]
+```
+
+
+
+
+
+### StringRegexReplace
+
+
+StringRegexReplace details
+
+
+String replacement based on [Re2-format](https://github.com/google/re2/wiki/Syntax) regular expressions.
+
+#### Inputs
+
+***text: tensor(string)***
+
+String tensor to extract slices from.
+
+***pattern: tensor(string)***
+
+Pattern of the regular expression.
+
+***rewrite: tensor(string)***
+
+Replacement.
+
+#### Attributes
+
+***global_replace: int64*** (default is 1)
+
+Replace all strings matching the pattern or the first one.
+
+#### Outputs
+
+***output: tensor(string)***
+
+String with replacements.
+
+#### Examples
+
+```python
+
+node = onnx.helper.make_node(
+ 'StringRegexReplace',
inputs=['text', 'pattern', 'rewrite'],
- outputs=['y', 'begin_end', 'indices'],
+ outputs=['y'],
)
-text = np.array(["hello there"])
-pattern = np.array([r'\s'])
-rewrite = np.array([r'\s'])
-y = np.array(["hello", " ", "there"])
-z1 = np.array([[0, 0, 5],
- [0, 5, 6],
- [0, 6, 11]], dtype=np.int64)
-z2 = np.array([0, 2], dtype=np.int64)
+text = np.array([['def myfunc():'], ['def dummy():']])
+pattern = np.array([r'def\s+([a-zA-Z_][a-zA-Z_0-9]*)\s*\(\s*\):'])
+rewrite = np.array([r'static PyObject* py_\1(void) {'])
+y = [['static PyObject* py_myfunc(void) {'],
+ ['static PyObject* py_dummy(void) {']]
+
+expect(node, inputs=[text, pattern, rewrite], outputs=[y],
+ name='test_string_regex_replace')
+```
+
+
+
+### StringECMARegexReplace
+
+
+StringECMARegexReplace details
+
+String replacement based on [ECMA-format](https://en.cppreference.com/w/cpp/regex/ecmascript) regular expressions.
+
+#### Inputs
+
+***text: tensor(string)***
+
+String tensor to extract slices from.
+
+***pattern: tensor(string)***
+
+Pattern of the regular expression.
+
+***rewrite: tensor(string)***
+
+Replacement.
+
+#### Attributes
+
+***global_replace: int64*** (default is 1)
+
+Replace all strings matching the pattern or the first one.
+
+
+***ignore_case: int64*** (default is 0)
+
+When set to 1, the pattern is matched case-insensitively.
+
+#### Outputs
+
+***output: tensor(string)***
+
+String with replacements.
+
+#### Examples
+
+
+```python
+
+node = onnx.helper.make_node(
+ 'StringECMARegexReplace',
+ inputs=['text', 'pattern', 'rewrite'],
+ outputs=['y'],
+)
+
+text = np.array([['def myfunc():'], ['def dummy():']])
+pattern = np.array([r'def\s+([a-zA-Z_][a-zA-Z_0-9]*)\s*\(\s*\):'])
+rewrite = np.array([r'static PyObject* py_$1(void) {'])
+y = [['static PyObject* py_myfunc(void) {'],
+ ['static PyObject* py_dummy(void) {']]
+
+expect(node, inputs=[text, pattern, rewrite], outputs=[y],
+ name='test_string_ecma_regex_replace')
+```
+
+
+
+
+
+### StringSplit
+
+
+StringSplit details
+
+Splits each string in the input by a separator, producing a ragged (sparse) representation of the resulting tokens.
+
+#### Inputs
+
+***input: tensor(string)***
+
+1D string tensor to split.
+
+***sep: tensor(string)***
+
+Scalar string separator used to split each element of `input`. If empty, the string is split on whitespace.
+
+***skip_empty: tensor(bool)***
+
+Scalar boolean. When true, empty substrings are removed from the output.
+
+#### Outputs
+
+***indices: tensor(int64)***
+
+2D tensor of shape `[N, 2]` containing `(row, col)` coordinates of each output token in the ragged representation.
+
+***values: tensor(string)***
+
+1D tensor of `N` tokens produced by splitting, in row-major order.
+
+***shape: tensor(int64)***
+
+2-element tensor describing the dense shape `[num_rows, max_row_width]` of the ragged tensor.
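The ragged output layout above can be sketched in plain Python (a reference sketch of the contract, not the kernel):

```python
def string_split(inputs, sep, skip_empty):
    """Produce the (indices, values, shape) ragged representation
    that StringSplit emits for a 1-D string tensor."""
    indices, values = [], []
    max_width = 0
    for row, text in enumerate(inputs):
        # An empty separator means "split on whitespace", like str.split().
        tokens = text.split() if sep == "" else text.split(sep)
        if skip_empty:
            tokens = [t for t in tokens if t != ""]
        for col, tok in enumerate(tokens):
            indices.append([row, col])
            values.append(tok)
        max_width = max(max_width, len(tokens))
    shape = [len(inputs), max_width]
    return indices, values, shape

indices, values, shape = string_split(["a,b,c", "d,,e"], ",", skip_empty=True)
```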
+
+
+
+### StringUpper
+
+
+StringUpper details
+
+Converts every ASCII character in each string of the input tensor to uppercase. Non-ASCII bytes are left unchanged.
+
+#### Inputs
+
+***input: tensor(string)***
+
+String tensor of arbitrary shape.
+
+#### Outputs
+
+***output: tensor(string)***
+
+String tensor of the same shape as `input` with uppercased strings.
+
+
+
+### StringLower
+
+
+StringLower details
+
+Converts each string in the input tensor to lowercase using Unicode case folding.
+
+#### Inputs
+
+***input: tensor(string)***
+
+String tensor of arbitrary shape.
+
+#### Outputs
+
+***output: tensor(string)***
+
+String tensor of the same shape as `input` with lowercased strings.
+
+
+
+### StringStrip
+
+
+StringStrip details
+
+Removes leading and trailing whitespace characters from every string in the input tensor. Similar to `str.strip()` in Python.
+
+#### Inputs
+
+***input: tensor(string)***
+
+String tensor of arbitrary shape.
+
+#### Outputs
+
+***output: tensor(string)***
+
+String tensor of the same shape as `input` with whitespace stripped.
+
+
+
+### StringLength
+
+
+StringLength details
+
+Get the length of each string element in the input tensor. Similar to the function `len("abcde")` in Python.
+
+#### Inputs
+
+***data: tensor(string)***
+
+String tensor whose element lengths are computed.
+
+#### Outputs
+
+***output: tensor(int64)***
+
+Data length tensor.
+
+#### Examples
+
+
+```python
+
+node = onnx.helper.make_node(
+ 'StringLength',
+ inputs=['x'],
+ outputs=['y']
+)
+
+x = ["abcdef", "hijkl"]
+y = np.array([len(x[0]), len(x[1])], dtype=np.int64)
+
+
+expect(node, inputs=[x], outputs=[y],
+ name='test_string_length')
+```
+
+
+### StringConcat
+
+
+StringConcat details
+
+Concatenates the corresponding strings of two string tensors element-wise. Both input tensors must have the same shape.
+
+```python
+ output = []
+ shape = input1.shape
+ input1 = input1.flatten()
+ input2 = input2.flatten()
+ for i in range(len(input1)):
+ output.append(input1[i] + input2[i])
+ output = np.array(output).reshape(shape)
+```
+
+#### Inputs
+
+***input_1: tensor(string)***
+
+The first string tensor.
+
+***input_2: tensor(string)***
+
+The second string tensor.
+
+
+#### Outputs
+
+***output: tensor(string)***
+
+The result.
+
+#### Examples
+
+
+```python
+
+node = onnx.helper.make_node(
+ 'StringConcat',
+ inputs=['x', 'y'],
+ outputs=['result'],
+)
+
+x = np.array(["abcd", "efgh"])
+y = np.array(["wxyz", "stuv"])
+result = np.array([x[0] + y[0], x[1] + y[1]])
+
+expect(node, inputs=[x, y], outputs=[result],
+ name='test_string_concat')
+```
+
+
+
+### StringRegexSplitWithOffsets
+
+
+StringRegexSplitWithOffsets details
+
+Splits string based on regular expressions.
+
+#### Inputs
+
+***text: tensor(string)***
+
+String tensor to extract slices from.
+
+***delim_regex_pattern: tensor(string)***
+
+Splitting pattern of the regular expression.
+
+***keep_delim_regex_pattern: tensor(string)***
+
+By default, delimiters are not included in the split results. Delimiters may be included by specifying the regex pattern `keep_delim_regex_pattern`.
+
+#### Outputs
+
+***words: tensor(string)*** Tensor of words.
+
+***offsets: tensor(int64)*** 2D tensor with 3 columns:
+sentence index, position of the first character, position of the last one (excluded)
+
+***row_indices: tensor(int64)*** Indices of every first token of input sentences.
+`row_indices[i+1] - row_indices[i]` is the number of tokens in input `i`.
+These are updated row indices given as inputs, or new ones if the second input is empty.
+
+
+#### Examples
+
+
+```python
+
+node = onnx.helper.make_node(
+ 'StringRegexSplit',
+ inputs=['text', 'pattern', 'rewrite'],
+ outputs=['y', 'begin_end', 'indices'],
+)
+
+text = np.array(["hello there"])
+pattern = np.array([r'\s'])
+rewrite = np.array([r'\s'])
+y = np.array(["hello", " ", "there"])
+z1 = np.array([[0, 0, 5],
+ [0, 5, 6],
+ [0, 6, 11]], dtype=np.int64)
+z2 = np.array([0, 2], dtype=np.int64)
+
+expect(node, inputs=[text, pattern, rewrite], outputs=[y, z1, z2],
+ name='test_string_regex_split')
+```
+
+
+
+
+### StringECMARegexSplitWithOffsets
+
+
+StringECMARegexSplitWithOffsets details
+
+Splits strings using a regular expression in the ECMAScript dialect and reports the byte offsets of every produced token. Provides the same functionality as `StringRegexSplitWithOffsets` but uses `std::regex` instead of `re2`, allowing ECMAScript regex features.
+
+#### Inputs
+
+***input: tensor(string)***
+
+String tensor to split.
+
+***pattern: tensor(string)***
+
+Scalar string containing the ECMAScript regex splitting pattern.
+
+***keep_pattern: tensor(string)***
+
+Scalar string. Delimiter matches that also match this pattern are preserved as tokens in the output. Pass an empty string to drop all delimiters.
+
+#### Attributes
+
+***ignore_case: int64_t*** (default is 0)
+
+When set to 1 the regex is matched case-insensitively.
+
+#### Outputs
+
+***words: tensor(string)***
+
+1D tensor containing the split tokens.
+
+***offsets: tensor(int64)***
+
+2D tensor of shape `[num_tokens, 3]` where each row is `(sentence_index, begin_byte, end_byte)`.
+
+***row_indices: tensor(int64)***
+
+1D tensor of row offsets such that tokens of the i-th input string occupy `[row_indices[i], row_indices[i+1])` in `words`.
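A sketch of this contract using Python's `re` module (the kernel uses `std::regex`, so dialect-specific patterns may behave differently):

```python
import re

def regex_split_with_offsets(texts, pattern, keep_pattern):
    """Split each text on `pattern`, recording (sentence, begin, end)
    offsets and row offsets, mirroring the op's three outputs."""
    words, offsets, row_indices = [], [], [0]
    keep = re.compile(keep_pattern) if keep_pattern else None
    for sent_idx, text in enumerate(texts):
        pos = 0
        for m in re.finditer(pattern, text):
            if m.start() > pos:
                words.append(text[pos:m.start()])
                offsets.append((sent_idx, pos, m.start()))
            # Keep the delimiter itself when it also matches keep_pattern.
            if keep and keep.fullmatch(m.group()):
                words.append(m.group())
                offsets.append((sent_idx, m.start(), m.end()))
            pos = m.end()
        if pos < len(text):
            words.append(text[pos:])
            offsets.append((sent_idx, pos, len(text)))
        row_indices.append(len(words))
    return words, offsets, row_indices

words, offsets, rows = regex_split_with_offsets(["hello there"], r"\s", r"\s")
```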
+
+
+
+### VectorToString
+
+
+VectorToString details
+
+VectorToString is the inverse operation of `StringToVector`; both share the same mapping table format:
+
+    <string>\t<scalar_1>\s<scalar_2>\s...
+
+A vector that is not found in the table maps to the value of the attribute `unk`.
+
+Example:
+
+*Attributes:*
+
+- `map`:
+ ```
+ a 0 0 1 2
+ b 0 1 2 3
+ d 0 1 3 4
+ ```
+
+- `unk`: "unknown_word"
+
+*Inputs:*
+- data: [[0,0,1,2],[0,1,3,4],[0,0,0,0]]
+
+*Outputs:*
+- output: ["a", "d", "unknown_word" ]
+
+#### Attributes
+
+***map: string***
+
+The mapping table in the format described above.
+
+***unk: string***
+
+The string returned when a vector is not found in the map.
+
+#### Inputs
+
+***data: tensor(T)***
+
+Input tensor
+
+#### Outputs
+
+***output: tensor(string)***
+
+The mapping result of the input
+
+#### Type Constraints
+***T:tensor(uint8), tensor(uint16), tensor(uint32), tensor(uint64), tensor(int8), tensor(int16), tensor(int32), tensor(int64), tensor(bfloat16), tensor(float16), tensor(float), tensor(double), tensor(bool)***
+
+Constrain input and output types to numerical tensors.
+
+
+#### Examples
+
+
+```python
+mapping_table = \
+ """
+ a 0 0 1 2
+ b 0 1 2 3
+ d 0 1 3 4
+ """
+
+node = onnx.helper.make_node(
+ 'VectorToString',
+ inputs=['x'],
+ outputs=['y'],
+ map=mapping_table,
+ unk="unknown_word"
+)
+
+
+x = np.array([[0,0,1,2],[0,1,3,4],[0,0,0,0]], dtype=np.int64)
+y = ["a", "d", "unknown_word"]
+
+
+expect(node, inputs=[x], outputs=[y],
+ name='test_vector_to_string')
+```
+
+
+
+### StringToVector
+
+
+StringToVector details
+
+StringToVector maps each string element in the input to the corresponding vector according to the mapping file. The mapping file is a UTF-8 encoded text file in TSV format:
+
+    <string>\t<scalar_1>\s<scalar_2>\s...
+
+An unmapped string outputs the value of the attribute `unmapping_value`.
+
+Example:
+
+*Attributes:*
+
+- `mapping_file_name`: vocabulary.txt
+ ```
+ a 0 0 1 2
+ b 0 1 2 3
+ d 0 1 3 4
+ ```
+
+- `unmapping_value`: [0 0 0 0]
+
+*Inputs:*
+- data: ["a", "d", "e"]
+
+*Outputs:*
+- output: [[0,0,1,2],[0,1,3,4],[0,0,0,0]]
+
+#### Attributes
+
+***mapping_file_name:string***
+
+The name of your string to vector mapping file.
+
+***unmapping_value:list(int)***
+
+The mapping result for unmapped strings.
+
+#### Inputs
+
+***data: tensor(string)***
+
+Input tensor
+
+#### Outputs
+
+***output: tensor(T)***
+
+The mapping result of the input
+
+#### Type Constraints
+***T:tensor(uint8), tensor(uint16), tensor(uint32), tensor(uint64), tensor(int8), tensor(int16), tensor(int32), tensor(int64), tensor(bfloat16), tensor(float16), tensor(float), tensor(double), tensor(bool)***
+
+Constrain input and output types to numerical tensors.
+
+#### Examples
+
+
+```python
+# what's in vocabulary.txt
+
+mapping_table = \
+"""
+a 0 0 1 2
+b 0 1 2 3
+d 0 1 3 4
+"""
+
+node = onnx.helper.make_node(
+ 'StringToVector',
+ inputs=['x'],
+ outputs=['y'],
+ mapping_table=mapping_table,
+ unmapping_value=[0,0,0,0]
+)
+
+
+x = ["a", "d", "e"]
+y = np.array([[0,0,1,2],[0,1,3,4],[0,0,0,0]], dtype=np.int64)
+
+
+expect(node, inputs=[x], outputs=[y],
+ name='test_string_to_vector')
+```
+
+
+
+
+
+### StringSlice
+
+
+StringSlice details
+
+Performs a slice operation on each string element in the input tensor, similar to string slicing in Python:
+
+```python
+a = "abcdef"
+b = a[1:2]
+c = a[3:1:-1]
+```
+
+#### Inputs
+
+***data: tensor(string)***
+
+String tensor to extract slices from.
+
+***starts: tensor(int64/int32)***
+
+Tensor of starting indices for the corresponding strings in `data`; has the same shape as `data`.
+
+***ends: tensor(int64/int32)***
+
+Tensor of ending indices for the corresponding strings in `data`; has the same shape as `data`.
+
+***steps (optional): tensor(int64/int32)***
+
+Tensor of slice steps for the corresponding strings in `data`; has the same shape as `data`. If `steps` is an empty tensor, a default step of 1 is used for every string.
+
+#### Outputs
+
+***output: tensor(string)***
+
+Sliced data tensor.
+
+#### Examples
+
+
+```python
+
+node = onnx.helper.make_node(
+ 'StringSlice',
+ inputs=['x', 'starts', 'ends', 'steps'],
+ outputs=['y'],
+)
+
+x = np.array(["abcdef", "hijkl"])
+y = np.array([x[0][1:3:1], x[1][3:1:-1]])
+starts = np.array([1, 3], dtype=np.int64)
+ends = np.array([3, 1], dtype=np.int64)
+steps = np.array([1, -1], dtype=np.int64)
+
+expect(node, inputs=[x, starts, ends, steps], outputs=[y],
+ name='test_string_slice')
+```
+
+
+
+
+### MaskedFill
+
+
+MaskedFill details
+
+
+Fills elements of the input tensor with value where mask is True. The operator is similar to [`Tensor.masked_fill_`](https://pytorch.org/docs/stable/generated/torch.Tensor.masked_fill_.html#torch.Tensor.masked_fill_) in PyTorch.
+
+
+#### Inputs
+
+***value: tensor(string)***
+
+The values to fill in; currently only the string type is supported, with scalar or 1-D (vector) shapes.
+
+***mask: tensor(bool)***
+
+The boolean mask; its shape must match that of `value`.
+
+#### Outputs
+
+***output: tensor(string)***
+
+The filled output tensor.
+
+
+#### Examples
+
+
+```python
+
+node = onnx.helper.make_node(
+ 'MaskedFill',
+ inputs=['value', 'mask'],
+ outputs=['output']
+)
+
+
+value = np.array(["a", "b", "c", "d"])
+mask = np.array([True, False, True, False], dtype=bool)
+output = np.array(["a", "c"])
+
+
+expect(node, inputs=[value, mask], outputs=[output],
+ name='test_masked_fill')
+```
+
+
+
+### StringRaggedTensorToDense
+
+
+StringRaggedTensorToDense details
+
+Converts a ragged string tensor to a dense 2D string tensor, padding shorter rows with a fill value.
+
+#### Inputs
+
+***row_splits: tensor(int64)***
+
+1D tensor with the starting position of each row in `values`. Row `i` contains `values[row_splits[i]:row_splits[i+1]]`.
+
+***values: tensor(string)***
+
+1D flat string tensor holding the concatenated row values.
+
+***default_value_shape: tensor(int64)***
+
+1D tensor describing the target dense shape. Only used to determine the number of columns.
+
+***default_value: tensor(string)***
+
+Scalar string used to pad rows that are shorter than the longest row.
+
+#### Outputs
+
+***output: tensor(string)***
+
+2D dense string tensor with padding applied.
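The expansion can be sketched as follows (for simplicity this sketch infers the column count from the longest row and ignores `default_value_shape`):

```python
def ragged_to_dense(row_splits, values, default_value):
    """Expand (row_splits, values) into a dense 2-D list of strings,
    padding short rows with default_value."""
    rows = [values[row_splits[i]:row_splits[i + 1]]
            for i in range(len(row_splits) - 1)]
    width = max((len(r) for r in rows), default=0)
    return [r + [default_value] * (width - len(r)) for r in rows]

dense = ragged_to_dense([0, 2, 3], ["a", "b", "c"], "<pad>")
```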
+
+
+
+### StringMapping
+
+
+StringMapping details
+
+Maps each element of the input string tensor to another string using a user-supplied dictionary. Strings not found in the dictionary are passed through unchanged.
+
+#### Attributes
+
+***map: string***
+
+A string containing one mapping per line. Each line has the form `key\tvalue`, where key and value are separated by a tab character.
+
+#### Inputs
+
+***input: tensor(string)***
+
+Input string tensor of arbitrary shape.
+
+#### Outputs
+
+***output: tensor(string)***
+
+Output string tensor of the same shape as `input` after mapping.
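A minimal sketch of the mapping logic, assuming the tab-separated `map` format described above:

```python
def string_mapping(inputs, map_attr):
    """Parse one `key\tvalue` pair per line of the map attribute and
    map each element, passing unknown strings through unchanged."""
    table = {}
    for line in map_attr.splitlines():
        if line:
            key, value = line.split("\t", 1)
            table[key] = value
    return [table.get(s, s) for s in inputs]

out = string_mapping(["cat", "dog", "bird"], "cat\tfeline\ndog\tcanine")
```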
+
+
+
+## Math operators
+
+
+### Inverse
+
+
+Inverse details
+
+Computes the matrix inverse of a 2D floating-point tensor.
+
+#### Inputs
+
+***input: tensor(float)***
+
+A 2D square matrix of shape `[N, N]`.
+
+#### Outputs
+
+***output: tensor(float)***
+
+The inverse of the input matrix, of shape `[N, N]`.
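The defining property of the output is that it multiplies with the input to the identity; a quick illustration with NumPy (a reference for the contract, not the kernel):

```python
import numpy as np

# A 2x2 invertible matrix; inv @ x should equal the identity matrix.
x = np.array([[4.0, 7.0],
              [2.0, 6.0]], dtype=np.float32)
inv = np.linalg.inv(x)
```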
+
+
+
+### NegPos
+
+
+NegPos details
+
+Splits an input tensor into its negative and positive parts. Equivalent to `min(x, 0)` and `max(x, 0)` returned separately.
+
+#### Inputs
+
+***input: tensor(float)***
+
+Input tensor of arbitrary shape.
+
+#### Outputs
+
+***neg: tensor(float)***
+
+Tensor with the same shape as `input`; contains `x` where `x < 0`, else `0`.
+
+***pos: tensor(float)***
+
+Tensor with the same shape as `input`; contains `x` where `x >= 0`, else `0`.
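The two outputs can be sketched directly from their definitions, so that `neg + pos` reconstructs the input:

```python
import numpy as np

def neg_pos(x):
    """Split x into its negative part min(x, 0) and positive part max(x, 0)."""
    return np.minimum(x, 0.0), np.maximum(x, 0.0)

x = np.array([-1.5, 0.0, 2.5], dtype=np.float32)
neg, pos = neg_pos(x)
```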
+
+
+
+### SegmentExtraction
+
+
+SegmentExtraction details
+
+Extracts contiguous non-zero segments from a 1D integer input. For every maximal run of non-zero values, the start and end positions are returned.
+
+#### Inputs
+
+***input: tensor(int64)***
+
+1D input tensor.
+
+#### Outputs
+
+***position: tensor(int64)***
+
+2D tensor of shape `[num_segments, 2]` where each row is `(begin, end)` (end exclusive).
+
+***value: tensor(int64)***
+
+1D tensor of length `num_segments` with the value inside each segment.
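
#### Examples

A Python sketch of the assumed semantics (runs of a repeated non-zero value; illustrative only):

```python
def segment_extraction(x):
    # Return (begin, end) of each maximal run of a repeated non-zero
    # value, plus the value of each run. End indices are exclusive.
    positions, values = [], []
    i, n = 0, len(x)
    while i < n:
        if x[i] == 0:
            i += 1
            continue
        j = i
        while j < n and x[j] == x[i]:
            j += 1
        positions.append((i, j))
        values.append(x[i])
        i = j
    return positions, values

pos, val = segment_extraction([0, 2, 2, 0, 5, 0])
# → pos == [(1, 3), (4, 5)], val == [2, 5]
```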
+
+
+
+### SegmentSum
+
+
+SegmentSum details
+
+Computes sums along segments of the first axis of a tensor, similar to TensorFlow's `tf.math.segment_sum`.
+
+#### Inputs
+
+***data: tensor(float)***
+
+The values to reduce. The first dimension is the segment axis.
+
+***segment_ids: tensor(int64)***
+
+1D tensor with the same length as `data.shape[0]`. Must be non-decreasing.
+
+#### Outputs
+
+***output: tensor(float)***
+
+Tensor where `output[i]` is the sum of all rows of `data` whose corresponding `segment_ids` equal `i`.
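
#### Examples

A NumPy sketch of the reduction (reference only; the op runs natively):

```python
import numpy as np

def segment_sum(data, segment_ids):
    # Sum rows of `data` that share the same (non-decreasing) segment id.
    num_segments = int(segment_ids[-1]) + 1
    out = np.zeros((num_segments,) + data.shape[1:], dtype=data.dtype)
    np.add.at(out, segment_ids, data)
    return out

data = np.array([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]], dtype=np.float32)
out = segment_sum(data, np.array([0, 0, 1]))
# → [[4.0, 6.0], [5.0, 6.0]]
```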
+
+
+
+### StftNorm
+
+
+StftNorm details
+
+Computes a short-time Fourier transform (STFT) of a 1D signal and returns the magnitude spectrogram. Each frame is multiplied by the supplied window function before the transform.
+
+#### Attributes
+
+***onesided: int64_t*** (default is 1)
+
+If 1, only the non-redundant positive-frequency half of the spectrum is returned (length `n_fft / 2 + 1`). If 0, the full spectrum is returned.
+
+#### Inputs
+
+***pcm: tensor(float)***
+
+1D audio signal.
+
+***n_fft: tensor(int64)***
+
+Scalar FFT size.
+
+***hop_length: tensor(int64)***
+
+Scalar hop length between consecutive frames.
+
+***window: tensor(float)***
+
+1D window function of length `frame_length`.
+
+***frame_length: tensor(int64)***
+
+Scalar frame length (must equal `n_fft`).
+
+#### Outputs
+
+***output: tensor(float)***
+
+3D tensor of shape `[1, num_frames, num_freq_bins]` containing the magnitude spectrogram.
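
#### Examples

A NumPy sketch of the onesided magnitude spectrogram; framing and normalization details of the actual kernel may differ:

```python
import numpy as np

def stft_magnitude(pcm, n_fft, hop_length, window):
    # Window each frame, take the one-sided FFT, keep the magnitude.
    frames = []
    for start in range(0, len(pcm) - n_fft + 1, hop_length):
        frame = pcm[start:start + n_fft] * window
        frames.append(np.abs(np.fft.rfft(frame)))
    return np.stack(frames)[np.newaxis]  # [1, num_frames, n_fft // 2 + 1]

# A pure tone with a 32-sample period concentrates energy in bin 256 / 32 = 8.
pcm = np.sin(2 * np.pi * np.arange(1024) / 32).astype(np.float32)
spec = stft_magnitude(pcm, 256, 128, np.hanning(256))
```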
+
+
+
+### SplitSignalSegments
+
+
+SplitSignalSegments details
+
+Partitions an audio signal into segments of voiced/high-energy regions based on a simple short-time energy threshold.
+
+#### Inputs
+
+***input: tensor(float)***
+
+1D audio signal.
+
+***sr: tensor(int64)***
+
+Scalar sample rate in Hz.
+
+***frame_ms: tensor(int64)***
+
+Scalar analysis frame length in milliseconds.
+
+***hop_ms: tensor(int64)***
+
+Scalar hop length between analysis frames in milliseconds.
+
+***energy_threshold_db: tensor(float)***
+
+Scalar energy threshold in dBFS. Frames with average energy below this are treated as silence.
+
+#### Outputs
+
+***segments: tensor(int64)***
+
+2D tensor of shape `[num_segments, 2]` where each row contains the `(begin_sample, end_sample)` indices of a detected segment.
+
+
+
+### MergeSignalSegments
+
+
+MergeSignalSegments details
+
+Merges adjacent audio segments whose gap is shorter than a configurable threshold. Typically used as a post-processing step after `SplitSignalSegments`.
+
+#### Inputs
+
+***segments: tensor(int64)***
+
+2D tensor of shape `[N, 2]` with `(begin, end)` indices, as produced by `SplitSignalSegments`.
+
+***merge_gap_ms: tensor(int64)***
+
+Scalar gap threshold in milliseconds. Segments separated by less than this value are merged.
+
+#### Outputs
+
+***output: tensor(int64)***
+
+2D tensor of shape `[M, 2]` (M <= N) of the merged segment boundaries.
+
+
+
+## Tensor operators
+
+### RaggedTensorToSparse
+
+
+RaggedTensorToSparse details
+
+Converts a ragged tensor's row lengths to a COO-style sparse indexing representation.
+
+#### Inputs
+
+***n_element: tensor(int64)***
+
+1D tensor holding the number of elements in each row.
+
+#### Outputs
+
+***output_0: tensor(int64)***
+
+2D tensor of shape `[num_elements, 2]` holding the `(row, col)` index of every element.
+
+***output_1: tensor(int64)***
+
+1D tensor of length 2 containing the dense shape `[num_rows, max_row_width]`.
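
#### Examples

A Python sketch of the conversion (illustrative; `ragged_to_sparse` is a hypothetical helper):

```python
import numpy as np

def ragged_to_sparse(row_lengths):
    # COO (row, col) indices plus dense shape from per-row element counts.
    indices = [(r, c) for r, n in enumerate(row_lengths) for c in range(n)]
    dense_shape = [len(row_lengths), max(row_lengths) if row_lengths else 0]
    return np.array(indices, dtype=np.int64), np.array(dense_shape, dtype=np.int64)

idx, shape = ragged_to_sparse([2, 0, 3])
# idx rows: (0,0) (0,1) (2,0) (2,1) (2,2); shape: [3, 3]
```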
+
+
+
+### RaggedTensorToDense
+
+
+RaggedTensorToDense details
+
+Converts a ragged int64 tensor to a dense 2D tensor, padding shorter rows with a configurable value.
+
+#### Attributes
+
+***missing_value: int64_t*** (default is -1)
+
+Value used to pad short rows.
+
+#### Inputs
+
+***input0: tensor(int64)***
+
+1D row-splits tensor indicating the start index of each row within `input3`.
+
+***input1: tensor(int64)***
+
+1D tensor of flat indices (unused by some consumers; reserved).
+
+***input2: tensor(int64)***
+
+1D tensor of length 2 describing the target dense shape `[num_rows, max_row_width]`.
+
+***input3: tensor(int64)***
+
+1D flat values tensor.
+
+#### Outputs
+
+***output: tensor(int64)***
+
+2D dense tensor with missing elements filled by `missing_value`.
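
#### Examples

A NumPy sketch of the densification using row splits and flat values (reference only):

```python
import numpy as np

def ragged_to_dense(row_splits, values, missing_value=-1):
    # Row i occupies values[row_splits[i]:row_splits[i + 1]];
    # shorter rows are padded with missing_value.
    num_rows = len(row_splits) - 1
    widths = [row_splits[i + 1] - row_splits[i] for i in range(num_rows)]
    out = np.full((num_rows, max(widths)), missing_value, dtype=np.int64)
    for i in range(num_rows):
        row = values[row_splits[i]:row_splits[i + 1]]
        out[i, :len(row)] = row
    return out

dense = ragged_to_dense([0, 2, 5], np.array([1, 2, 3, 4, 5], dtype=np.int64))
# → [[1, 2, -1], [3, 4, 5]]
```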
+
+
+
+## Audio operators
+
+### AudioDecoder
+
+
+AudioDecoder details
+
+Decodes a byte stream containing an encoded audio file (WAV, MP3, or FLAC) into a float PCM tensor. Optionally resamples the audio to a target sample rate.
+
+#### Attributes
+
+***downsampling_rate: int64_t*** (default is 0)
+
+Target sample rate. When 0 the native sample rate of the decoded stream is used.
+
+***stereo_to_mono: int64_t*** (default is 1)
+
+If 1, multi-channel audio is mixed down to a single mono channel.
+
+***target_sample_rate: int64_t*** (default is 0)
+
+Alias for `downsampling_rate`; when non-zero the decoded audio is resampled to this rate.
+
+#### Inputs
+
+***input: tensor(uint8)***
+
+1D tensor of raw bytes representing the encoded audio file.
+
+***format: tensor(string)*** (optional)
+
+Scalar describing the container format. Accepted values: `"wav"`, `"mp3"`, `"flac"`. When absent the format is detected from the file header.
+
+#### Outputs
+
+***output: tensor(float)***
+
+2D tensor of shape `[1, num_samples]` with the decoded (and optionally resampled) PCM samples in the range `[-1, 1]`.
+
+
+
+## Vision operators
+
+### DecodeImage
+
+
+DecodeImage details
+
+Decodes an encoded image (PNG, JPEG, BMP, TIFF, …) into an `HxWx3` uint8 tensor.
+
+#### Attributes
+
+***color_space: string*** (default is "BGR")
+
+Color ordering of the output. Valid values are `"RGB"` and `"BGR"`.
+
+#### Inputs
+
+***input: tensor(uint8)***
+
+1D tensor containing the raw encoded image bytes.
+
+#### Outputs
+
+***output: tensor(uint8)***
+
+3D tensor of shape `[H, W, 3]`.
+
+
+
+### EncodeImage
+
+
+EncodeImage details
+
+Encodes a 3-channel `HxWx3` uint8 image tensor to PNG or JPEG bytes.
+
+#### Attributes
+
+***format: string*** (default is "png")
+
+Output image format. Valid values are `"png"` and `"jpg"` (or `"jpeg"`).
+
+#### Inputs
+
+***input: tensor(uint8)***
+
+3D tensor of shape `[H, W, 3]` in BGR order.
+
+#### Outputs
+
+***output: tensor(uint8)***
+
+1D tensor of encoded image bytes.
+
+
+
+### DrawBoundingBoxes
+
+
+DrawBoundingBoxes details
+
+Draws bounding boxes on a BGR image tensor.
+
+#### Attributes
+
+***thickness: int64_t*** (default is 4)
+
+Line thickness of the drawn rectangles, in pixels.
+
+***num_classes: int64_t*** (default is 10)
+
+Number of class colors to cycle through.
+
+***mode: string*** (default is "XYXY")
+
+Interpretation of the box coordinates. One of `"XYXY"`, `"XYWH"`, or `"CENTER_XYWH"`.
+
+***colour_by_classes: int64_t*** (default is 1)
+
+When 1, boxes of the same class share a colour. When 0, each box gets a unique colour from the palette.
+
+#### Inputs
+
+***image: tensor(uint8)***
-expect(node, inputs=[text, pattern, rewrite], outputs=[y, z1, z2],
- name='test_string_regex_replace')
-```
+3D tensor of shape `[H, W, 3]` in BGR order.
-
+***boxes: tensor(float)***
+2D tensor of shape `[N, 6]`. Each row is `(class_id, score, x0, y0, x1, y1)` (or equivalent depending on `mode`).
-### StringECMARegexSplitWithOffsets
+#### Outputs
-TODO
+***output: tensor(uint8)***
-### VectorToString
+Image tensor with boxes drawn, same shape as `image`.
+
+
+
+### GaussianBlur
-VectorToString details
+GaussianBlur details
-VectorToString is the contrary operation to the `StringToVector` , they share same format of mapping table:
+Applies a 2D Gaussian blur to an image tensor using OpenCV's `cv::GaussianBlur`.
- \t\s\s...
+#### Inputs
-Unmapped vector will output the value of the attribute `unk`.
+***input: tensor(float)***
-Example:
+4D image tensor of shape `[N, H, W, C]`.
-*Attributes:*
+***ksize: tensor(int64)***
-- `map`:
- ```
- a 0 0 1 2
- b 0 1 2 3
- d 0 1 3 4
- ```
+1D tensor of length 2 specifying the kernel size `[kx, ky]` (odd positive integers).
-- `unk`: "unknown_word"
+***sigma: tensor(double)***
-*Inputs:*
-- data: [[0,0,1,2],[0,1,3,4],[0,0,0,0]]
+1D tensor of length 2 specifying the Gaussian standard deviation along X and Y.
-*Ouputs:*
-- output: ["a", "d", "unknown_word" ]
+#### Outputs
-#### Attributes
+***output: tensor(float)***
-***mapping_file_name***
+Blurred tensor with the same shape as `input`.
-the formative mapping table
+
-***unmapping_value***
+### ImageDecoder
-the result returned when a vector aren't found in the map
+
+ImageDecoder details
+
+Decodes raw encoded image bytes using OpenCV's `cv::imdecode`. Similar to `DecodeImage` but always returns BGR and does not expose a color-space attribute.
#### Inputs
-***data: tensor(T)***
+***input: tensor(uint8)***
-Input tensor
+1D tensor of encoded image bytes.
#### Outputs
-***output: tensor(string)***
+***output: tensor(uint8)***
-The mapping result of the input
+3D tensor of shape `[H, W, C]` containing the decoded BGR image.
-#### Type Constraints
-***T:tensor(uint8), tensor(uint16), tensor(uint32), tensor(uint64), tensor(int8), tensor(int16), tensor(int32), tensor(int64), tensor(bfloat16), tensor(float16), tensor(float), tensor(double), tensor(bool)***
+
-Constrain input and output types to numerical tensors.
+### ImageReader
+
+ImageReader details
-#### Examples
+Reads an image from a file path using OpenCV's `cv::imread` and returns the decoded tensor.
+#### Inputs
-```python
-mapping_table = \
- """
- a 0 0 1 2
- b 0 1 2 3
- d 0 1 3 4
- """
+***input: tensor(string)***
-node = onnx.helper.make_node(
- 'VectorToString',
- inputs=['x'],
- outputs=['y'],
- map=mapping_table,
- unk="unknown_word"
-)
+Scalar string with the path of the image file to read.
+#### Outputs
-x = np.array([[0,0,1,2],[0,1,3,4],[0,0,0,0]], type=np.int64)
-y = ["a", "d", "unknown_word"]
+***output: tensor(uint8)***
+4D tensor of shape `[1, H, W, C]` containing the decoded BGR image.
-expect(node, inputs=[x], outputs=[y],
- name='test_vector_to_string')
-```
+## CUDA operators
-### StringToVector
+The following operators execute on CUDA devices only. They are only registered when the library is built with `USE_CUDA`. Unless otherwise noted each op supports `float`, `float16` (`MFloat16`), and in some cases `bfloat16` (`BFloat16`).
+
+### FastGelu
-StringToVector details
+FastGelu details
-StringToVector will map each string element in the input to the corresponding vector according to the mapping file. The mapping file is a utf-8 encoding text file in tsv format:
+Fused CUDA kernel computing `gelu(x + bias)` using the fast tanh-based approximation.
- \t\s\s...
+#### Inputs
-Unmapped string will output the value of the attribute `unmapping_value`.
+***input: tensor(T)***
-Example:
+Input tensor of any shape. `T` is one of `float`, `float16`, `bfloat16`.
-*Attributes:*
+***bias: tensor(T)*** (optional)
-- `mapping_file_name`: vocabulary.txt
- ```
- a 0 0 1 2
- b 0 1 2 3
- d 0 1 3 4
- ```
-
-- `unmapping_value`: [0 0 0 0]
+Bias added elementwise before applying Gelu. Broadcast to the shape of `input`.
-*Inputs:*
-- data: ["a", "d", "e"]
+#### Outputs
-*Ouputs:*
-- output: [[0,0,1,2],[0,1,3,4],[0,0,0,0]]
+***output: tensor(T)***
-#### Attributes
+Same shape as `input`.
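
#### Examples

A NumPy sketch of the tanh-based approximation (the op itself runs on CUDA; constants follow the usual fast-gelu formula):

```python
import numpy as np

def fast_gelu(x, bias=None):
    # gelu(x + bias) via the tanh approximation:
    # 0.5 * x * (1 + tanh(sqrt(2/pi) * (x + 0.044715 * x^3)))
    if bias is not None:
        x = x + bias
    return 0.5 * x * (1.0 + np.tanh(0.7978845608028654 * (x + 0.044715 * x ** 3)))

y = fast_gelu(np.array([-1.0, 0.0, 1.0], dtype=np.float32))
```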
-***mapping_file_name:string***
+
-The name of your string to vector mapping file.
+### MulSigmoid
-***unmapping_value:list(int)***
+
+MulSigmoid details
-Mapping result for unmapped string
+Computes `x * sigmoid(x)` (the SiLU / Swish activation) in a single fused CUDA kernel.
#### Inputs
-***data: tensor(string)***
+***input: tensor(T)***
-Input tensor
+Input tensor. `T` is one of `float`, `float16`, `bfloat16`.
#### Outputs
***output: tensor(T)***
-The mapping result of the input
-
-#### Type Constraints
-***T:tensor(uint8), tensor(uint16), tensor(uint32), tensor(uint64), tensor(int8), tensor(int16), tensor(int32), tensor(int64), tensor(bfloat16), tensor(float16), tensor(float), tensor(double), tensor(bool)***
+Same shape as `input`.
-Constrain input and output types to numerical tensors.
+
-#### Examples
+### MulMulSigmoid
+
+MulMulSigmoid details
-```python
-# what's in vocabulary.txt
+Computes `x * y * sigmoid(y)` in a single fused CUDA kernel. Tensors must have the same shape.
-mapping_table = \
-"""
-a 0 0 1 2
-b 0 1 2 3
-d 0 1 3 4
-"""
+#### Inputs
-node = onnx.helper.make_node(
- 'StringToVector',
- inputs=['x'],
- outputs=['y'],
- mapping_table=mapping_table,
- unmapping_value=[0,0,0,0]
-)
+***x: tensor(T)***, ***y: tensor(T)***
+`T` is one of `float`, `float16`, `bfloat16`.
-x = ["a", "d", "e"]
-y = np.array([[0,0,1,2],[0,1,3,4],[0,0,0,0]], type=np.int64)
+#### Outputs
+***output: tensor(T)***
-expect(node, inputs=[x], outputs=[y],
- name='test_string_to_vector')
-```
+Tensor with the same shape as the inputs.
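
#### Examples

A NumPy sketch of the fused activations (reference semantics only):

```python
import numpy as np

def sigmoid(v):
    return 1.0 / (1.0 + np.exp(-v))

x = np.array([0.5, -1.0, 2.0], dtype=np.float32)
y = np.array([1.0, 2.0, -0.5], dtype=np.float32)
mul_sigmoid = x * sigmoid(x)          # MulSigmoid (SiLU / Swish)
mul_mul_sigmoid = x * y * sigmoid(y)  # MulMulSigmoid
```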
+### NegXPlus1
+
+NegXPlus1 details
-### StringSlice
+Computes `1 - x` elementwise on CUDA.
+
+#### Inputs
+
+***input: tensor(T)***
+
+`T` is one of `float`, `float16`, `bfloat16`.
+
+#### Outputs
+
+***output: tensor(T)***
+
+Same shape as `input`.
+
+
+
+### ReplaceZero
-StringSlice details
+ReplaceZero details
-Do the slice operation to each string element in input tensor. Similar to string slice in python
+Replaces every zero element of the input with a scalar value.
-```python
-a = "abcdef"
-b = a[1:2]
-c = a[3:1:-1]
-```
+#### Attributes
+
+***by: float*** (default is 0.0)
+
+Replacement value for zero entries.
#### Inputs
-***data: tensor(string)***
+***input: tensor(T)***
-String tensor to extract slices from.
+`T` is one of `float`, `float16`, `bfloat16`.
-***starts: tensor(int64/int32)***
+#### Outputs
-The tensor of starting indices of corresponding string in data, which has same dimension of data.
+***output: tensor(T)***
-***ends: tensor(int64/int32)***
+Same shape as `input`.
-The tensor of ending indices of corresponding string in data, which has same dimension of data.
+
-***steps(optional): tensor(int64/int32)***
+### AddSharedInput
-The tensor of slice step of corresponding string in data, which has same dimension of data.If steps is empty tensor, we will use default value 1 for each string
+
+AddSharedInput details
+
+Computes `A + B` and `A + C` in one kernel launch, sharing the read of `A`.
+
+#### Inputs
+
+***A: tensor(T)***, ***B: tensor(T)***, ***C: tensor(T)***
+
+`T` is one of `float`, `float16`, `bfloat16`. `B` and `C` must have the same shape as `A`.
#### Outputs
-***output: tensor(string)***
+***AB: tensor(T)***, ***AC: tensor(T)***
-Sliced data tensor.
+Elementwise sums `A + B` and `A + C`.
-#### Examples
+
+### MulSharedInput
-```python
+
+MulSharedInput details
-node = onnx.helper.make_node(
- 'StringSlice',
- inputs=['x', 'starts', 'ends', 'steps'],
- outputs=['y'],
-)
+Computes `A * B` and `A * C` in one kernel launch, sharing the read of `A`.
-x = np.array(["abcdef", "hijkl"])
-y = np.array([x[0][1:3:1], x[1][3:1:-1]])
-starts = np.array([1, 3], dtype=np.int64)
-ends = np.array([3, 1], dtype=np.int64)
-axes = np.array([0, 1], dtype=np.int64)
-steps = np.array([1, 1], dtype=np.int64)
+#### Inputs
-expect(node, inputs=[x, starts, ends, axes, steps], outputs=[y],
- name='test_string_slice')
-```
+***A: tensor(T)***, ***B: tensor(T)***, ***C: tensor(T)***
-
+`T` is one of `float`, `float16`, `bfloat16`.
+
+#### Outputs
+***AB: tensor(T)***, ***AC: tensor(T)***
-### MaskedFill
+Elementwise products `A * B` and `A * C`.
+
+
+
+### ScatterNDOfShape
-MaskedFill details
+ScatterNDOfShape details
+Allocates a zero tensor of the given shape and applies a `ScatterND` reduction. Equivalent to `ScatterND(ConstantOfShape(shape, 0), indices, updates, reduction=...)` but fused.
-Fills elements of self tensor with value where mask is True. The operator is similar with [`Tensor.masked_fill_`](https://pytorch.org/docs/stable/generated/torch.Tensor.masked_fill_.html#torch.Tensor.masked_fill_) in pytorch.
+#### Attributes
+***reduction: string*** (default is "add")
+
+Reduction to apply to scattered updates. One of `"add"`, `"mul"`, `"min"`, `"max"`.
#### Inputs
-***value: tensor(string)***
+***shape: tensor(int64)***
-The value to fill in with, currently we only support string type and vector&scalar dimension.
+1D tensor describing the output shape. Must live on CPU.
-***mask: tensor(bool)***
+***indices: tensor(int64)***
-The boolean mask, the dimension of mask tensor should be same with value.
+Indices into the output, as in standard ScatterND.
+
+***updates: tensor(T)***
+
+Values to scatter. `T` is one of `float`, `float16`, `bfloat16`.
#### Outputs
-***output: tensor(string)***
+***output: tensor(T)***
-The filled output of input tensor.
+Tensor of the requested shape with updates applied.
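
#### Examples

A NumPy sketch of the `"add"` reduction path (illustrative; the op is a fused CUDA kernel):

```python
import numpy as np

def scatter_nd_of_shape(shape, indices, updates, reduction="add"):
    # Zero-initialized output followed by ScatterND with accumulation.
    out = np.zeros(shape, dtype=updates.dtype)
    if reduction == "add":
        np.add.at(out, tuple(indices.T), updates)
    return out

out = scatter_nd_of_shape(
    [4],
    np.array([[1], [1], [3]], dtype=np.int64),
    np.array([1.0, 2.0, 3.0], dtype=np.float32),
)
# Duplicate indices accumulate: → [0.0, 3.0, 0.0, 3.0]
```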
+
-#### Examples
+### MaskedScatterNDOfShape
+
+MaskedScatterNDOfShape details
-```python
+Variant of `ScatterNDOfShape` that ignores entries of `indices` equal to a configurable mask value.
-node = onnx.helper.make_node(
- 'MaskedFill',
- inputs=['value', 'mask'],
- outputs=['output']
-)
+#### Attributes
+***reduction: string*** (default is "add")
-value = np.array(["a", "b", "c", "d"])
-mask = np.array([True, False, True, False], dtype=bool)
-output = np.array(["a", "c"])
+Same as `ScatterNDOfShape`.
+***maskedValue: int64_t***
+
+Index value that causes the corresponding update to be skipped.
+
+#### Inputs
+
+Same as `ScatterNDOfShape`.
+
+#### Outputs
+
+Same as `ScatterNDOfShape`.
-expect(node, inputs=[value, mask], outputs=[output],
- name='test_masked_fill')
-```
+### Transpose2DCastFP16
-### StringRaggedTensorToDense
+
+Transpose2DCastFP16 details
-TODO
+Fused 2D transpose + cast from `float` to `float16`.
-### StringMapping
+#### Inputs
-TODO
+***input: tensor(float)***
-## Math operators
+2D tensor of shape `[M, N]`.
+#### Outputs
-### Inverse
+***output: tensor(float16)***
-TODO
+2D tensor of shape `[N, M]`.
-### NegPos
+
-TODO
+### Transpose2DCastFP32
-### SegmentExtraction
+
+Transpose2DCastFP32 details
-TODO
+Fused 2D transpose + cast from `float16` to `float`.
-### SegmentSum
+#### Inputs
-TODO
+***input: tensor(float16)***
-## Tensor operators
+2D tensor of shape `[M, N]`.
-### RaggedTensorToSparse
+#### Outputs
-TODO
+***output: tensor(float)***
-### RaggedTensorToDense
+2D tensor of shape `[N, M]`.
-TODO
+
### Template
From 290f917dbd66178a2b8200326c902cfc5bcd144b Mon Sep 17 00:00:00 2001
From: "copilot-swe-agent[bot]" <198982749+Copilot@users.noreply.github.com>
Date: Fri, 17 Apr 2026 23:06:51 +0000
Subject: [PATCH 3/4] Clarify ASCII vs Unicode casing for
StringUpper/StringLower
Agent-Logs-Url: https://github.com/microsoft/onnxruntime-extensions/sessions/cdb9185e-44ff-4191-8d4c-b00889e20918
Co-authored-by: justinchuby <11205048+justinchuby@users.noreply.github.com>
---
docs/custom_ops.md | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/docs/custom_ops.md b/docs/custom_ops.md
index ebe12966b..5a8f0b2d2 100644
--- a/docs/custom_ops.md
+++ b/docs/custom_ops.md
@@ -1212,7 +1212,7 @@ Scalar boolean. When true, empty substrings are removed from the output.
StringUpper details
-Converts every ASCII character in each string of the input tensor to uppercase. Non-ASCII bytes are left unchanged.
+Converts every ASCII character in each string of the input tensor to uppercase using `std::toupper`. This operator is ASCII-only; non-ASCII bytes are passed through unchanged. For full Unicode case folding, pre-process inputs accordingly or use `StringLower` as a reference for Unicode handling.
#### Inputs
@@ -1233,7 +1233,7 @@ String tensor of the same shape as `input` with uppercased strings.
StringLower details
-Converts each string in the input tensor to lowercase using Unicode case folding.
+Converts each string in the input tensor to lowercase. Unlike `StringUpper`, this operator decodes input bytes as UTF-8 and performs Unicode-aware case folding on each code point before re-encoding the result.
#### Inputs
From 56a19d6d07e185ca1ea293ea1969cc0aa6071266 Mon Sep 17 00:00:00 2001
From: "copilot-swe-agent[bot]" <198982749+Copilot@users.noreply.github.com>
Date: Sat, 18 Apr 2026 02:55:16 +0000
Subject: [PATCH 4/4] Address PR review feedback: align docs with C++
implementations
Agent-Logs-Url: https://github.com/microsoft/onnxruntime-extensions/sessions/3f620564-a099-495a-8067-d1d71deb349b
Co-authored-by: justinchuby <11205048+justinchuby@users.noreply.github.com>
---
docs/custom_ops.md | 170 ++++++++++++++++++++++++---------------------
1 file changed, 91 insertions(+), 79 deletions(-)
diff --git a/docs/custom_ops.md b/docs/custom_ops.md
index 5a8f0b2d2..7bf8b026e 100644
--- a/docs/custom_ops.md
+++ b/docs/custom_ops.md
@@ -550,7 +550,7 @@ Merge rules (contents of `merges.txt`).
***padding_length: int64_t*** (default is -1)
-If positive, the output is right-padded (or truncated) to this length. When -1 no padding is performed and outputs stay ragged.
+If positive, the output is right-padded (or truncated) to this length. When -1, the output is padded to the maximum sequence length in the batch; the operator still returns a dense tensor with a dynamic second dimension.
#### Inputs
@@ -564,7 +564,7 @@ If positive, the output is right-padded (or truncated) to this length. When -1 n
Tensor of token ids.
-***attention_mask: tensor(int64)***
+***attention_mask: tensor(int64)*** (optional)
Mask with the same shape as `input_ids` (1 for real tokens, 0 for padding).
@@ -594,7 +594,7 @@ BPE merge rules (contents of `merges.txt`).
***padding_length: int64_t*** (default is -1)
-Optional fixed output length. See `CLIPTokenizer`.
+See `CLIPTokenizer`.
#### Inputs
@@ -608,7 +608,7 @@ Optional fixed output length. See `CLIPTokenizer`.
Token ids.
-***attention_mask: tensor(int64)***
+***attention_mask: tensor(int64)*** (optional)
Attention mask, same shape as `input_ids`.
@@ -638,7 +638,7 @@ SentencePiece merge rules.
***padding_length: int64_t*** (default is -1)
-Optional fixed output length.
+See `CLIPTokenizer`.
#### Inputs
@@ -652,7 +652,7 @@ Optional fixed output length.
Tensor of token ids.
-***attention_mask: tensor(int64)***
+***attention_mask: tensor(int64)*** (optional)
Attention mask with the same shape as `input_ids`.
@@ -680,7 +680,7 @@ Contents of `vocab.txt`.
Lowercase inputs before tokenization.
-***strip_accents: int64_t*** (default is 1)
+***strip_accents: int64_t*** (default is 0)
Strip accents as part of normalization.
@@ -704,6 +704,10 @@ Attention mask, same shape as `input_ids`.
Segment ids. All zero for single-sentence input.
+***offset_mapping: tensor(int64)*** (optional)
+
+Per-token `(begin, end)` byte offsets into the corresponding input string.
+
@@ -736,7 +740,7 @@ Additional vocabulary data when the tokenizer uses an external vocab file.
Token ids.
-***attention_mask: tensor(int64)***
+***attention_mask: tensor(int64)*** (optional)
Attention mask matching `input_ids`.
@@ -774,7 +778,7 @@ Scalar flag. When true the `fairseq` vocab-id offset convention is applied.
***output: tensor(string)***
-Scalar string containing the decoded text.
+1D tensor with one string element containing the decoded text.
@@ -929,61 +933,59 @@ Scalar input string.
StringEqual details
-Compares two strings and returns true if they are equal and false if not.
+Compares two strings elementwise and returns true when they are equal.
#### Inputs
***x: tensor(string)***
-The first string input
+The first string input.
-***x: tensor(string)***
+***y: tensor(string)***
-The second string input
+The second string input. Must have the same shape as `x` (or be broadcastable).
#### Outputs
-***z: tensor(boolean)***
+***z: tensor(bool)***
-String with replacements.
+Boolean tensor with the same shape as the broadcasted inputs; `true` where the inputs are equal.
-### StringHash
+### StringToHashBucket
-StringHash details
+StringToHashBucket details
-
-Hashes the input string based on the number of buckets
+Hashes each input string into one of `num_buckets` buckets using the internal FarmHash-like 64-bit hash implementation.
#### Inputs
***input: tensor(string)***
-The string to hash
+The input string tensor to hash.
***num_buckets: tensor(int64)***
-The number of buckets (must be equal to 1?)
+Scalar number of hash buckets. Must be greater than 0.
#### Outputs
-***name: tensor(int64)***
+***output: tensor(int64)***
-The hash value of the string
+Tensor of the same shape as `input` containing the hash-bucket index for each input string. Each value lies in the range `[0, num_buckets)`.
-### StringHashFast
+### StringToHashBucketFast
-StringHashFast details
+StringToHashBucketFast details
-
-A faster implementation of StringHash. Computes hash values for each input string modulo `num_buckets`.
+A faster variant of `StringToHashBucket` that uses `std::hash` internally. Hash values are not stable across platforms or compilers, so the op is intended for stateless in-process hashing rather than persisted lookup tables.
#### Inputs
@@ -993,7 +995,7 @@ The strings to hash.
***num_buckets: tensor(int64)***
-The number of hash buckets (scalar). Each output value will be in the range `[0, num_buckets)`.
+Scalar number of hash buckets. Must be greater than 0.
#### Outputs
@@ -1020,11 +1022,11 @@ The input array of strings
***input_sep: tensor(string)***
-The string separator for the resulting joing
+The separator string inserted between joined elements.
***input_axis: tensor(int64)***
-The axis along which to joing
+The axis along which to join.
#### Outputs
@@ -1137,7 +1139,7 @@ Replace all strings matching the pattern or the first one.
***ignore_case: int64*** (default is 0)
-Replace
+Whether to perform case-insensitive ECMAScript regular expression matching.
#### Outputs
@@ -1151,7 +1153,7 @@ String with replacements.
```python
node = onnx.helper.make_node(
- 'StringRegexReplace',
+ 'StringECMARegexReplace',
inputs=['text', 'pattern', 'rewrite'],
outputs=['y'],
)
@@ -1273,15 +1275,15 @@ String tensor of the same shape as `input` with whitespace stripped.
### StringLength
-StringECMARegexReplace details
+StringLength details
-Get the length of each string element in input tensor. Similar to the function `len("abcde"")` in python.
+Get the length of each string element in the input tensor. Similar to the function `len("abcde")` in Python.
#### Inputs
-***data: tensor(string)***
+***input: tensor(string)***
-String tensor to get length of its each string element.
+String tensor whose per-element lengths are computed.
#### Outputs
@@ -1369,33 +1371,39 @@ expect(node, inputs=[x, y], outputs=[result],
StringRegexSplitWithOffsets details
-Splits string based on regular expressions.
+Splits strings based on regular expressions (RE2 dialect) and reports the byte offsets of each produced token.
#### Inputs
***text: tensor(string)***
-String tensor to extract slices from.
+String tensor to split.
***delim_regex_pattern: tensor(string)***
-Splitting attern of the regular expression.
+Splitting pattern of the regular expression.
***keep_delim_regex_pattern: tensor(string)***
-By default, delimiters are not included in the split string results. Delimiters may be included by specifying a regex pattern keep_delim_regex_pattern.
+By default, delimiters are not included in the split string results. Delimiters may be included by specifying a regex pattern via `keep_delim_regex_pattern`.
#### Outputs
-***words: tensor(string)*** Tensor of words.
+***tokens: tensor(string)***
-***offsets: tensor(int64)*** 2D tensor with 3 columns:
-sentence index, position of the first character, position of the last one (excluded)
+1D tensor of tokens produced by splitting, in row-major order.
-***row_indices: tensor(int64)*** Indices of every first token of input sentences.
-`row_indices[i+1] - row_indices[i]` is the number of tokens in input `i`.
-These are updates row indices given as inputs or new ones if the second input is empty.
+***begin_offsets: tensor(int64)***
+
+1D tensor with the begin byte offset of each token in the corresponding input string.
+***end_offsets: tensor(int64)***
+
+1D tensor with the end byte offset (exclusive) of each token in the corresponding input string.
+
+***row_offsets: tensor(int64)***
+
+1D tensor of row offsets such that tokens of the i-th input string occupy `[row_offsets[i], row_offsets[i+1])` in `tokens`.
#### Examples
@@ -1403,22 +1411,22 @@ These are updates row indices given as inputs or new ones if the second input is
```python
node = onnx.helper.make_node(
- 'StringRegexSplit',
- inputs=['text', 'pattern', 'rewrite'],
- outputs=['y', 'begin_end', 'indices'],
+ 'StringRegexSplitWithOffsets',
+ inputs=['text', 'pattern', 'keep_pattern'],
+ outputs=['tokens', 'begin_offsets', 'end_offsets', 'row_offsets'],
)
text = np.array(["hello there"])
pattern = np.array([r'\s'])
-rewrite = np.array([r'\s'])
-y = np.array(["hello", " ", "there"])
-z1 = np.array([[0, 0, 5],
- [0, 5, 6],
- [0, 6, 11]], dtype=np.int64)
-z2 = np.array([0, 2], dtype=np.int64)
-
-expect(node, inputs=[text, pattern, rewrite], outputs=[y, z1, z2],
- name='test_string_regex_replace')
+keep_pattern = np.array([""])
+tokens = np.array(["hello", "there"])
+begin_offsets = np.array([0, 6], dtype=np.int64)
+end_offsets = np.array([5, 11], dtype=np.int64)
+row_offsets = np.array([0, 2], dtype=np.int64)
+
+expect(node, inputs=[text, pattern, keep_pattern],
+ outputs=[tokens, begin_offsets, end_offsets, row_offsets],
+ name='test_string_regex_split_with_offsets')
```
@@ -1453,17 +1461,21 @@ When set to 1 the regex is matched case-insensitively.
#### Outputs
-***words: tensor(string)***
+***tokens: tensor(string)***
1D tensor containing the split tokens.
-***offsets: tensor(int64)***
+***begin_offsets: tensor(int64)***
+
+1D tensor with the begin byte offset of each token in the corresponding input string.
-2D tensor of shape `[num_tokens, 3]` where each row is `(sentence_index, begin_byte, end_byte)`.
+***end_offsets: tensor(int64)***
-***row_indices: tensor(int64)***
+1D tensor with the end byte offset (exclusive) of each token in the corresponding input string.
-1D tensor of row offsets such that tokens of the i-th input string occupy `[row_indices[i], row_indices[i+1])` in `words`.
+***row_offsets: tensor(int64)***
+
+1D tensor of row offsets such that tokens of the i-th input string occupy `[row_offsets[i], row_offsets[i+1])` in `tokens`.
@@ -2098,17 +2110,13 @@ Decodes a byte stream containing an encoded audio file (WAV, MP3, or FLAC) into
#### Attributes
-***downsampling_rate: int64_t*** (default is 0)
-
-Target sample rate. When 0 the native sample rate of the decoded stream is used.
+***downsampling_rate: int64_t*** (default is -1)
-***stereo_to_mono: int64_t*** (default is 1)
+Target sample rate to resample the decoded audio to. When -1, the native sample rate of the decoded stream is used.
-If 1, multi-channel audio is mixed down to a single mono channel.
+***stereo_to_mono: int64_t*** (default is 0)
-***target_sample_rate: int64_t*** (default is 0)
-
-Alias for `downsampling_rate`; when non-zero the decoded audio is resampled to this rate.
+If set to 1, multi-channel audio is mixed down to a single mono channel.
#### Inputs
@@ -2139,9 +2147,9 @@ Decodes an encoded image (PNG, JPEG, BMP, TIFF, …) into an `HxWx3` uint8 tenso
#### Attributes
-***color_space: string*** (default is "BGR")
+***color_space: string*** (default is "bgr")
-Color ordering of the output. Valid values are `"RGB"` and `"BGR"`.
+Color ordering of the output. Valid values are `"rgb"` and `"bgr"` (case-insensitive).
#### Inputs
@@ -2162,19 +2170,23 @@ Color ordering of the output. Valid values are `"RGB"` and `"BGR"`.
EncodeImage details
-Encodes a 3-channel `HxWx3` uint8 image tensor to PNG or JPEG bytes.
+Encodes a 3-channel `HxWx3` uint8 image tensor to image bytes.
#### Attributes
***format: string*** (default is "png")
-Output image format. Valid values are `"png"` and `"jpg"` (or `"jpeg"`).
+Output image format. Valid values are `"png"` and `"jpg"`.
+
+***color_space: string*** (default is "bgr")
+
+Color space / channel order of the input image. Supported values are `"bgr"` and `"rgb"`.
#### Inputs
***input: tensor(uint8)***
-3D tensor of shape `[H, W, 3]` in BGR order.
+3D tensor of shape `[H, W, 3]`. The expected channel order depends on `color_space`: BGR for `"bgr"` and RGB for `"rgb"`.
#### Outputs
@@ -2232,13 +2244,13 @@ Image tensor with boxes drawn, same shape as `image`.
GaussianBlur details
-Applies a 2D Gaussian blur to an image tensor using OpenCV's `cv::GaussianBlur`.
+Applies a 2D Gaussian blur to an image tensor using OpenCV's `cv::GaussianBlur`. The current kernel wraps the input buffer as a single `CV_32FC3` matrix, so inputs must have `N == 1` and `C == 3` channels.
#### Inputs
***input: tensor(float)***
-4D image tensor of shape `[N, H, W, C]`.
+4D image tensor of shape `[1, H, W, 3]`.
***ksize: tensor(int64)***
@@ -2288,7 +2300,7 @@ Reads an image from a file path using OpenCV's `cv::imread` and returns the deco
***input: tensor(string)***
-Scalar string with the path of the image file to read.
+1D string tensor of shape `[1]` containing the path of the image file to read.
#### Outputs