From 5824e8797712529dca9db9ad9e15ebf36a87f391 Mon Sep 17 00:00:00 2001 From: "copilot-swe-agent[bot]" <198982749+Copilot@users.noreply.github.com> Date: Fri, 17 Apr 2026 22:57:46 +0000 Subject: [PATCH 1/4] Initial plan From 1d2377c77ce09944b9555fa43ddeb35960ecce04 Mon Sep 17 00:00:00 2001 From: "copilot-swe-agent[bot]" <198982749+Copilot@users.noreply.github.com> Date: Fri, 17 Apr 2026 23:05:43 +0000 Subject: [PATCH 2/4] Document contracts for all custom operators Agent-Logs-Url: https://github.com/microsoft/onnxruntime-extensions/sessions/cdb9185e-44ff-4191-8d4c-b00889e20918 Co-authored-by: justinchuby <11205048+justinchuby@users.noreply.github.com> --- docs/custom_ops.md | 2099 +++++++++++++++++++++++++++++++++++--------- 1 file changed, 1701 insertions(+), 398 deletions(-) diff --git a/docs/custom_ops.md b/docs/custom_ops.md index cacbaba47..ebe12966b 100644 --- a/docs/custom_ops.md +++ b/docs/custom_ops.md @@ -531,734 +531,2037 @@ expect(node, inputs=[inputs], -## String operators - -### StringEqual +### CLIPTokenizer
-StringEqual details +CLIPTokenizer details -Compares two strings and returns true if they are equal and false if not. +Byte-pair-encoding (BPE) tokenizer matching the CLIP text encoder from HuggingFace/OpenAI. Converts input strings into token id sequences. -#### Inputs +#### Attributes -***x: tensor(string)*** +***vocab: string*** -The first string input +JSON vocabulary mapping tokens to ids (contents of `vocab.json`). -***x: tensor(string)*** +***merges: string*** -The second string input +Merge rules (contents of `merges.txt`). -#### Outputs +***padding_length: int64_t*** (default is -1) -***z: tensor(boolean)*** +If positive, the output is right-padded (or truncated) to this length. When -1 no padding is performed and outputs stay ragged. -String with replacements. +#### Inputs -
+***input: tensor(string)*** +1D string tensor containing the input texts. -### StringHash +#### Outputs -
-StringHash details +***input_ids: tensor(int64)*** +Tensor of token ids. -Hashes the input string based on the number of buckets +***attention_mask: tensor(int64)*** -#### Inputs +Mask with the same shape as `input_ids` (1 for real tokens, 0 for padding). -***input: tensor(string)*** +***offset_mapping: tensor(int64)*** (optional) -The string to hash +If requested, per-token `(begin, end)` byte offsets into the corresponding input string. -***num_buckets: tensor(int64)*** +
-The number of buckets (must be equal to 1?) -#### Outputs +### RobertaTokenizer -***name: tensor(int64)*** +
+RobertaTokenizer details -The hash value of the string +BPE tokenizer compatible with HuggingFace's RoBERTa tokenizer. Uses the same attributes and I/O contract as `CLIPTokenizer`. -
+#### Attributes +***vocab: string*** -### StringHashFast +JSON vocabulary (contents of `vocab.json`). -
-StringHashFast details +***merges: string*** +BPE merge rules (contents of `merges.txt`). -A faster implementation of StringHash. +***padding_length: int64_t*** (default is -1) -
+Optional fixed output length. See `CLIPTokenizer`. +#### Inputs -### StringJoin +***input: tensor(string)*** -
-StringJoin details +1D string tensor of input texts. +#### Outputs -Join an array of strings +***input_ids: tensor(int64)*** -#### Inputs +Token ids. -***input_X: tensor(string)*** +***attention_mask: tensor(int64)*** -The input array of strings +Attention mask, same shape as `input_ids`. -***input_sep: tensor(string)*** +***offset_mapping: tensor(int64)*** (optional) -The string separator for the resulting joing +Per-token byte offsets into each input string. -***input_axis: tensor(int64)*** +
-The axis along which to joing -#### Outputs +### SpmTokenizer -***out: tensor(string)*** +
+SpmTokenizer details -The resulting joined string +SentencePiece-compatible tokenizer built on top of the shared BPE kernel. Produces tokens equivalent to HuggingFace's "fast" SentencePiece tokenizers (e.g. Llama, T5, XLM-RoBERTa). -#### Examples +#### Attributes +***vocab: string*** -```bash +JSON vocabulary produced from a SentencePiece model. -input_X = [["a", "b", "c"], ["aa", "bb", ""]] -input_sep=";" -input_axis = 1 +***merges: string*** -out = ["a;b;c", "aa;bb;"] +SentencePiece merge rules. -input_axis = 0 +***padding_length: int64_t*** (default is -1) -out = ['a;aa', 'b;bb', 'c;'] +Optional fixed output length. +#### Inputs -
+***input: tensor(string)*** +1D string tensor of inputs. -### StringRegexReplace +#### Outputs -
-StringRegexReplace details +***input_ids: tensor(int64)*** +Tensor of token ids. -String replacement based on [Re2-format](https://github.com/google/re2/wiki/Syntax) regular expressions. +***attention_mask: tensor(int64)*** -#### Inputs +Attention mask with the same shape as `input_ids`. -***text: tensor(string)*** +***offset_mapping: tensor(int64)*** (optional) -String tensor to extract slices from. +Per-token byte offsets. -***pattern: tensor(string)*** +
-Pattern of the regular expression. -***rewrite: tensor(string)*** +### HfBertTokenizer -Replacement. +
+HfBertTokenizer details + +HuggingFace-compatible BERT WordPiece tokenizer. Behaves like `BertTokenizer`'s `__call__` method but with a smaller attribute surface. Produces ids, attention masks and token type ids in a single op. #### Attributes -***global_replace: int64*** (default is 1) +***vocab_file: string*** -Replace all strings matching the pattern or the first one. +Contents of `vocab.txt`. -#### Outputs +***do_lower_case: int64_t*** (default is 1) -***output: tensor(string)*** +Lowercase inputs before tokenization. -String with replacements. +***strip_accents: int64_t*** (default is 1) -#### Examples +Strip accents as part of normalization. -```python +#### Inputs -node = onnx.helper.make_node( - 'StringRegexReplace', - inputs=['text', 'pattern', 'rewrite'], - outputs=['y'], -) +***input: tensor(string)*** -text = np.array([['def myfunc():'], ['def dummy():']]) -pattern = np.array([r'def\s+([a-zA-Z_][a-zA-Z_0-9]*)\s*\(\s*\):']) -rewrite = np.array([r'static PyObject* py_\1(void) {']) -y = [['static PyObject* py_myfunc(void) {'], - ['static PyObject* py_dummy(void) {']] +1D string tensor containing the texts to tokenize. -expect(node, inputs=[text, pattern, rewrite], outputs=[y], - name='test_string_regex_replace') -``` +#### Outputs -
+***input_ids: tensor(int64)*** -### StringECMARegexReplace +Token ids. -
-StringECMARegexReplace details +***attention_mask: tensor(int64)*** -String replacement based on [ECMA-format](https://en.cppreference.com/w/cpp/regex/ecmascript) regular expressions. +Attention mask, same shape as `input_ids`. -#### Inputs +***token_type_ids: tensor(int64)*** (optional) -***text: tensor(string)*** +Segment ids. All zero for single-sentence input. -String tensor to extract slices from. +
-***pattern: tensor(string)*** -Pattern of the regular expression. +### HfJsonTokenizer -***rewrite: tensor(string)*** +
+HfJsonTokenizer details -Replacement. +Loads a HuggingFace `tokenizer.json` directly and dispatches to the appropriate kernel (BPE or Unigram). Matches HuggingFace fast tokenizers at inference time. #### Attributes -***global_replace: int64*** (default is 1) +***tokenizer_config: string*** -Replace all strings matching the pattern or the first one. +Contents of `tokenizer.json` (and optionally `tokenizer_config.json`). +***tokenizer_vocab: string*** (optional) -***ignore_case: int64*** (default is 0) +Additional vocabulary data when the tokenizer uses an external vocab file. -Replace +#### Inputs -#### Outputs +***input: tensor(string)*** -***output: tensor(string)*** +1D string tensor of inputs. -String with replacements. +#### Outputs -#### Examples +***input_ids: tensor(int64)*** +Token ids. -```python +***attention_mask: tensor(int64)*** -node = onnx.helper.make_node( - 'StringRegexReplace', - inputs=['text', 'pattern', 'rewrite'], - outputs=['y'], -) +Attention mask matching `input_ids`. -text = np.array([['def myfunc():'], ['def dummy():']]) -pattern = np.array([r'def\s+([a-zA-Z_][a-zA-Z_0-9]*)\s*\(\s*\):']) -rewrite = np.array([r'static PyObject* py_$1(void) {']) -y = [['static PyObject* py_myfunc(void) {'], - ['static PyObject* py_dummy(void) {']] +***offset_mapping: tensor(int64)*** (optional) -expect(node, inputs=[text, pattern, rewrite], outputs=[y], - name='test_string_regex_replace') -``` +Per-token byte offsets.
+### SentencepieceDecoder -### StringSplit - -TODO - -### StringUpper +
+SentencepieceDecoder details -TODO +Decodes a sequence of SentencePiece ids back into a string. -### StringLower +#### Attributes -TODO +***model: string*** -### StringLength +Serialized SentencePiece model (`*.model`). -
-StringECMARegexReplace details +#### Inputs -Get the length of each string element in input tensor. Similar to the function `len("abcde"")` in python. +***ids: tensor(int64)*** -#### Inputs +1D or 2D tensor of ids. When 2D the leading dimension must be 1. -***data: tensor(string)*** +***fairseq: tensor(bool)*** (optional) -String tensor to get length of its each string element. +Scalar flag. When true the `fairseq` vocab-id offset convention is applied. #### Outputs -***output: tensor(int64)*** +***output: tensor(string)*** -Data length tensor. +Scalar string containing the decoded text. -#### Examples +
-```python +### BpeDecoder -node = onnx.helper.make_node( - 'StringLength', - inputs=['x'], - outputs=['y'] -) +
+BpeDecoder details -x = ["abcdef", "hijkl"] -y = np.array([len(x[0]), len(x[1])], dtype=np.int64) +Decodes BPE token ids (GPT-2 / CLIP / RoBERTa style) back into text. +#### Attributes -expect(node, inputs=[x], outputs=[y], - name='test_string_length') -``` -
- -### StringConcat +***id_vocab: string*** -
-StringConcat details +Newline-separated token strings indexed by id. -Concat the corresponding string in the two string tensor. Two input tensors should have the same dimension. +***byte_decoder: string*** -```python - output = [] - shape = input1.shape - input1 = input1.flatten() - input2 = input2.flatten() - for i in range(len(input1)): - output.append(input1[i] + input2[i]) - output = np.array(output).reshape(shape) -``` +Reverse byte-to-unicode mapping used by GPT-2 BPE encoders. -#### Inputs +***added_tokens: string*** (optional) -***input_1: tensor(string)*** +Extra tokens appended to the base vocabulary. -The first string tensor. +***all_special_ids: string*** (optional) -***input_2: tensor(string)*** +Comma-separated list of special token ids. -The second string tensor. +***skip_special_tokens: int64_t*** (default is 0) +When 1, ids in `all_special_ids` are skipped during decoding. -#### Outputs +***en_normalization: int64_t*** (default is 0) -***output: tensor(string)*** +Apply a minimal English-oriented post-processing step (e.g. undo leading-space markers). -The result. +***whitespace_token: string*** (optional) +***bos_token: string*** (optional) +***eos_token: string*** (optional) +***unk_token: string*** (optional) -#### Examples +Optional overrides for well-known special tokens. +#### Inputs -```python +***ids: tensor(int64)*** -node = onnx.helper.make_node( - 'StringConcat', - inputs=['x', 'y'], - outputs=['result'], -) +1D or 2D tensor of token ids. -x = np.array(["abcd", "efgh"]) -y = np.array(["wxyz", "stuv"]) -result = np.array([x[0] + y[0], x[1] + y[1]]) +#### Outputs -expect(node, inputs=[x, y], outputs=[result], - name='test_string_concat') -``` +***output: tensor(string)*** + +Decoded string tensor.
-### StringRegexSplitWithOffsets -
-StringRegexSplitWithOffsets details +### TrieTokenizer -Splits string based on regular expressions. +
+TrieTokenizer details -#### Inputs +Trie-based longest-match tokenizer used by RWKV-style models. -***text: tensor(string)*** +#### Attributes -String tensor to extract slices from. +***vocab: string*** -***delim_regex_pattern: tensor(string)*** +Newline-separated vocab where each line has the form `index token length`. `token` is a Python-repr-encoded byte string. -Splitting attern of the regular expression. +#### Inputs -***keep_delim_regex_pattern: tensor(string)*** +***input: tensor(string)*** -By default, delimiters are not included in the split string results. Delimiters may be included by specifying a regex pattern keep_delim_regex_pattern. +1D string tensor of inputs. #### Outputs -***words: tensor(string)*** Tensor of words. +***output: tensor(int64)*** -***offsets: tensor(int64)*** 2D tensor with 3 columns: -sentence index, position of the first character, position of the last one (excluded) +2D right-padded tensor of token ids; padding uses id `0`. -***row_indices: tensor(int64)*** Indices of every first token of input sentences. -`row_indices[i+1] - row_indices[i]` is the number of tokens in input `i`. -These are updates row indices given as inputs or new ones if the second input is empty. +
-#### Examples +### TrieDetokenizer +
+TrieDetokenizer details -```python +Inverse of `TrieTokenizer`. Converts 1D or 2D id tensors back to strings using the same trie vocabulary. -node = onnx.helper.make_node( - 'StringRegexSplit', +#### Attributes + +***vocab: string*** + +Same vocabulary format as `TrieTokenizer`. + +#### Inputs + +***ids: tensor(int64)*** + +1D or 2D tensor of token ids. + +#### Outputs + +***output: tensor(string)*** + +Decoded text, one string per row. + +
+ + +### BlingFireSentenceBreaker + +
+BlingFireSentenceBreaker details + +Segments an input string into sentences using a compiled [BlingFire](https://github.com/microsoft/BlingFire) model. + +#### Attributes + +***model: string*** + +Raw bytes of the compiled BlingFire sentence-breaking model (`*.bin`). + +***max_sentence: int64_t*** (default is -1) + +If positive, limits the number of returned sentences. + +#### Inputs + +***input: tensor(string)*** + +Scalar input string. + +#### Outputs + +***output: tensor(string)*** + +1D tensor of sentences. + +
+ + +## String operators + +### StringEqual + +
StringEqual details

Compares two string tensors element-wise and returns true where the strings are equal, false otherwise.

#### Inputs

***x: tensor(string)***

The first string input

***y: tensor(string)***

The second string input

#### Outputs

***z: tensor(bool)***

Boolean tensor of element-wise equality results, with the same shape as the inputs.

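#### Examples

A minimal NumPy sketch of the assumed contract (illustrative only; the actual kernel is implemented in C++):

```python
import numpy as np

# Element-wise string comparison, emulating StringEqual's contract.
x = np.array(["abc", "def", ""])
y = np.array(["abc", "xyz", ""])
z = (x == y)  # -> [True, False, True]
```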
+ + +### StringHash + +
StringHash details


Hashes each input string into a bucket index modulo `num_buckets`.

#### Inputs

***input: tensor(string)***

The strings to hash.

***num_buckets: tensor(int64)***

The number of hash buckets (scalar). Each output value lies in the range `[0, num_buckets)`.

#### Outputs

***output: tensor(int64)***

The hash values, with the same shape as `input`.

+ + +### StringHashFast + +
+StringHashFast details + + +A faster implementation of StringHash. Computes hash values for each input string modulo `num_buckets`. + +#### Inputs + +***input: tensor(string)*** + +The strings to hash. + +***num_buckets: tensor(int64)*** + +The number of hash buckets (scalar). Each output value will be in the range `[0, num_buckets)`. + +#### Outputs + +***output: tensor(int64)*** + +The hashed values, with the same shape as `input`. + +
+ + +### StringJoin + +
StringJoin details


Joins an array of strings along an axis using a separator.

#### Inputs

***input_X: tensor(string)***

The input array of strings.

***input_sep: tensor(string)***

The string separator used for joining (scalar).

***input_axis: tensor(int64)***

The axis along which to join (scalar).

#### Outputs

***out: tensor(string)***

The resulting joined strings; the joined axis is removed.

#### Examples


```
input_X = [["a", "b", "c"], ["aa", "bb", ""]]
input_sep = ";"
input_axis = 1

out = ["a;b;c", "aa;bb;"]

input_axis = 0

out = ["a;aa", "b;bb", "c;"]
```

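The example above can be reproduced with a small NumPy sketch. The `string_join` helper below is illustrative (assumed semantics, not the actual kernel):

```python
import numpy as np

# Emulate StringJoin: join strings along `axis` with `sep`; the joined axis is removed.
def string_join(input_x, sep, axis):
    arr = np.asarray(input_x, dtype=object)
    # Move the join axis last, then join each row of the flattened view.
    moved = np.moveaxis(arr, axis, -1)
    flat = moved.reshape(-1, moved.shape[-1])
    joined = np.array([sep.join(row) for row in flat], dtype=object)
    return joined.reshape(moved.shape[:-1])

string_join([["a", "b", "c"], ["aa", "bb", ""]], ";", 1)  # -> ["a;b;c", "aa;bb;"]
```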
+ + +### StringRegexReplace + +
StringRegexReplace details


String replacement based on [Re2-format](https://github.com/google/re2/wiki/Syntax) regular expressions.

#### Inputs

***text: tensor(string)***

String tensor to apply replacements to.

***pattern: tensor(string)***

Pattern of the regular expression.

***rewrite: tensor(string)***

Replacement.

#### Attributes

***global_replace: int64*** (default is 1)

Replace all strings matching the pattern (1) or only the first occurrence (0).

#### Outputs

***output: tensor(string)***

String with replacements.

#### Examples

```python

node = onnx.helper.make_node(
    'StringRegexReplace',
    inputs=['text', 'pattern', 'rewrite'],
    outputs=['y'],
)

text = np.array([['def myfunc():'], ['def dummy():']])
pattern = np.array([r'def\s+([a-zA-Z_][a-zA-Z_0-9]*)\s*\(\s*\):'])
rewrite = np.array([r'static PyObject* py_\1(void) {'])
y = [['static PyObject* py_myfunc(void) {'],
     ['static PyObject* py_dummy(void) {']]

expect(node, inputs=[text, pattern, rewrite], outputs=[y],
       name='test_string_regex_replace')
```

+ +### StringECMARegexReplace + +
StringECMARegexReplace details

String replacement based on [ECMA-format](https://en.cppreference.com/w/cpp/regex/ecmascript) regular expressions.

#### Inputs

***text: tensor(string)***

String tensor to apply replacements to.

***pattern: tensor(string)***

Pattern of the regular expression.

***rewrite: tensor(string)***

Replacement.

#### Attributes

***global_replace: int64*** (default is 1)

Replace all strings matching the pattern (1) or only the first occurrence (0).


***ignore_case: int64*** (default is 0)

Match the pattern case-insensitively when set to 1.

#### Outputs

***output: tensor(string)***

String with replacements.

#### Examples


```python

node = onnx.helper.make_node(
    'StringECMARegexReplace',
    inputs=['text', 'pattern', 'rewrite'],
    outputs=['y'],
)

text = np.array([['def myfunc():'], ['def dummy():']])
pattern = np.array([r'def\s+([a-zA-Z_][a-zA-Z_0-9]*)\s*\(\s*\):'])
rewrite = np.array([r'static PyObject* py_$1(void) {'])
y = [['static PyObject* py_myfunc(void) {'],
     ['static PyObject* py_dummy(void) {']]

expect(node, inputs=[text, pattern, rewrite], outputs=[y],
       name='test_string_ecma_regex_replace')
```

+ + + +### StringSplit + +
+StringSplit details + +Splits each string in the input by a separator, producing a ragged (sparse) representation of the resulting tokens. + +#### Inputs + +***input: tensor(string)*** + +1D string tensor to split. + +***sep: tensor(string)*** + +Scalar string separator used to split each element of `input`. If empty, the string is split on whitespace. + +***skip_empty: tensor(bool)*** + +Scalar boolean. When true, empty substrings are removed from the output. + +#### Outputs + +***indices: tensor(int64)*** + +2D tensor of shape `[N, 2]` containing `(row, col)` coordinates of each output token in the ragged representation. + +***values: tensor(string)*** + +1D tensor of `N` tokens produced by splitting, in row-major order. + +***shape: tensor(int64)*** + +2-element tensor describing the dense shape `[num_rows, max_row_width]` of the ragged tensor. + +
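#### Examples

A minimal Python sketch of the assumed ragged-output contract (illustrative; the actual kernel is implemented in C++):

```python
# Emulate StringSplit: split each string, return (indices, values, shape)
# of the ragged result in row-major order.
def string_split(strings, sep, skip_empty):
    indices, values, max_width = [], [], 0
    for row, s in enumerate(strings):
        tokens = s.split(sep) if sep else s.split()
        if skip_empty:
            tokens = [t for t in tokens if t]
        for col, tok in enumerate(tokens):
            indices.append([row, col])
            values.append(tok)
        max_width = max(max_width, len(tokens))
    return indices, values, [len(strings), max_width]

string_split(["a,b", "c,,d"], ",", True)
# -> ([[0, 0], [0, 1], [1, 0], [1, 1]], ["a", "b", "c", "d"], [2, 2])
```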
+ +### StringUpper + +
+StringUpper details + +Converts every ASCII character in each string of the input tensor to uppercase. Non-ASCII bytes are left unchanged. + +#### Inputs + +***input: tensor(string)*** + +String tensor of arbitrary shape. + +#### Outputs + +***output: tensor(string)*** + +String tensor of the same shape as `input` with uppercased strings. + +
+ +### StringLower + +
+StringLower details + +Converts each string in the input tensor to lowercase using Unicode case folding. + +#### Inputs + +***input: tensor(string)*** + +String tensor of arbitrary shape. + +#### Outputs + +***output: tensor(string)*** + +String tensor of the same shape as `input` with lowercased strings. + +
+ +### StringStrip + +
+StringStrip details + +Removes leading and trailing whitespace characters from every string in the input tensor. Similar to `str.strip()` in Python. + +#### Inputs + +***input: tensor(string)*** + +String tensor of arbitrary shape. + +#### Outputs + +***output: tensor(string)*** + +String tensor of the same shape as `input` with whitespace stripped. + +
+ +### StringLength + +
StringLength details

Gets the length of each string element in the input tensor, similar to Python's `len("abcde")`.

#### Inputs

***data: tensor(string)***

String tensor whose element lengths are computed.

#### Outputs

***output: tensor(int64)***

Tensor of string lengths.

#### Examples


```python

node = onnx.helper.make_node(
    'StringLength',
    inputs=['x'],
    outputs=['y']
)

x = ["abcdef", "hijkl"]
y = np.array([len(x[0]), len(x[1])], dtype=np.int64)


expect(node, inputs=[x], outputs=[y],
       name='test_string_length')
```
+ +### StringConcat + +
StringConcat details

Concatenates the corresponding strings of two string tensors element-wise. The two input tensors must have the same shape.

```python
output = []
shape = input1.shape
input1 = input1.flatten()
input2 = input2.flatten()
for i in range(len(input1)):
    output.append(input1[i] + input2[i])
output = np.array(output).reshape(shape)
```

#### Inputs

***input_1: tensor(string)***

The first string tensor.

***input_2: tensor(string)***

The second string tensor.


#### Outputs

***output: tensor(string)***

The result.

#### Examples


```python

node = onnx.helper.make_node(
    'StringConcat',
    inputs=['x', 'y'],
    outputs=['result'],
)

x = np.array(["abcd", "efgh"])
y = np.array(["wxyz", "stuv"])
result = np.array([x[0] + y[0], x[1] + y[1]])

expect(node, inputs=[x, y], outputs=[result],
       name='test_string_concat')
```

+ +### StringRegexSplitWithOffsets + +
StringRegexSplitWithOffsets details

Splits strings based on regular expressions.

#### Inputs

***text: tensor(string)***

String tensor to split.

***delim_regex_pattern: tensor(string)***

Splitting pattern of the regular expression.

***keep_delim_regex_pattern: tensor(string)***

By default, delimiters are not included in the split results. Delimiters may be included by specifying a regex pattern `keep_delim_regex_pattern` that they also match.

#### Outputs

***words: tensor(string)*** Tensor of words.

***offsets: tensor(int64)*** 2D tensor with 3 columns:
sentence index, position of the first character, position of the last one (excluded)

***row_indices: tensor(int64)*** Indices of the first token of each input sentence.
`row_indices[i+1] - row_indices[i]` is the number of tokens in input `i`.
When row indices are given as inputs they are updated; otherwise new ones are computed.


#### Examples


```python

node = onnx.helper.make_node(
    'StringRegexSplitWithOffsets',
    inputs=['text', 'delim_regex_pattern', 'keep_delim_regex_pattern'],
    outputs=['words', 'offsets', 'row_indices'],
)

text = np.array(["hello there"])
delim_regex_pattern = np.array([r'\s'])
keep_delim_regex_pattern = np.array([r'\s'])
words = np.array(["hello", " ", "there"])
offsets = np.array([[0, 0, 5],
                    [0, 5, 6],
                    [0, 6, 11]], dtype=np.int64)
row_indices = np.array([0, 2], dtype=np.int64)

expect(node, inputs=[text, delim_regex_pattern, keep_delim_regex_pattern],
       outputs=[words, offsets, row_indices],
       name='test_string_regex_split_with_offsets')
```

+ + +### StringECMARegexSplitWithOffsets + +
+StringECMARegexSplitWithOffsets details + +Splits strings using a regular expression in the ECMAScript dialect and reports the byte offsets of every produced token. Provides the same functionality as `StringRegexSplitWithOffsets` but uses `std::regex` instead of `re2`, allowing ECMAScript regex features. + +#### Inputs + +***input: tensor(string)*** + +String tensor to split. + +***pattern: tensor(string)*** + +Scalar string containing the ECMAScript regex splitting pattern. + +***keep_pattern: tensor(string)*** + +Scalar string. Delimiter matches that also match this pattern are preserved as tokens in the output. Pass an empty string to drop all delimiters. + +#### Attributes + +***ignore_case: int64_t*** (default is 0) + +When set to 1 the regex is matched case-insensitively. + +#### Outputs + +***words: tensor(string)*** + +1D tensor containing the split tokens. + +***offsets: tensor(int64)*** + +2D tensor of shape `[num_tokens, 3]` where each row is `(sentence_index, begin_byte, end_byte)`. + +***row_indices: tensor(int64)*** + +1D tensor of row offsets such that tokens of the i-th input string occupy `[row_indices[i], row_indices[i+1])` in `words`. + +
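#### Examples

A minimal sketch of the split-with-offsets contract using Python's `re` module (the actual kernel uses `std::regex`; semantics here are assumed and delimiter handling is simplified):

```python
import re

# Emulate regex split with offsets: return tokens, (row, begin, end) offsets,
# and per-row token boundaries.
def regex_split_with_offsets(texts, pattern, keep_pattern):
    words, offsets, row_indices = [], [], [0]
    for row, text in enumerate(texts):
        pos = 0
        for m in re.finditer(pattern, text):
            if m.start() > pos:
                words.append(text[pos:m.start()])
                offsets.append((row, pos, m.start()))
            # Delimiters matching keep_pattern are preserved as tokens.
            if keep_pattern and re.fullmatch(keep_pattern, m.group()):
                words.append(m.group())
                offsets.append((row, m.start(), m.end()))
            pos = m.end()
        if pos < len(text):
            words.append(text[pos:])
            offsets.append((row, pos, len(text)))
        row_indices.append(len(words))
    return words, offsets, row_indices
```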
+ +### VectorToString + +
VectorToString details

VectorToString is the inverse operation of `StringToVector`; they share the same mapping-table format:

    <string>\t<scalar_1>\s<scalar_2>\s<scalar_3>...

An unmapped vector outputs the value of the attribute `unk`.

Example:

*Attributes:*

- `map`:
  ```
  a 0 0 1 2
  b 0 1 2 3
  d 0 1 3 4
  ```

- `unk`: "unknown_word"

*Inputs:*
- data: [[0,0,1,2],[0,1,3,4],[0,0,0,0]]

*Outputs:*
- output: ["a", "d", "unknown_word"]

#### Attributes

***map: string***

The mapping table, one `<string>\t<vector>` entry per line.

***unk: string***

The string returned when a vector is not found in the map.

#### Inputs

***data: tensor(T)***

Input tensor

#### Outputs

***output: tensor(string)***

The mapping result of the input

#### Type Constraints
***T:tensor(uint8), tensor(uint16), tensor(uint32), tensor(uint64), tensor(int8), tensor(int16), tensor(int32), tensor(int64), tensor(bfloat16), tensor(float16), tensor(float), tensor(double), tensor(bool)***

Constrain input and output types to numerical tensors.


#### Examples


```python
mapping_table = \
    """
    a 0 0 1 2
    b 0 1 2 3
    d 0 1 3 4
    """

node = onnx.helper.make_node(
    'VectorToString',
    inputs=['x'],
    outputs=['y'],
    map=mapping_table,
    unk="unknown_word"
)


x = np.array([[0,0,1,2],[0,1,3,4],[0,0,0,0]], dtype=np.int64)
y = ["a", "d", "unknown_word"]


expect(node, inputs=[x], outputs=[y],
       name='test_vector_to_string')
```
+ + +### StringToVector + +
StringToVector details

StringToVector maps each string element in the input to the corresponding vector according to the mapping file. The mapping file is a utf-8 encoded text file in tsv format:

    <string>\t<scalar_1>\s<scalar_2>\s<scalar_3>...

An unmapped string outputs the value of the attribute `unmapping_value`.

Example:

*Attributes:*

- `mapping_file_name`: vocabulary.txt
  ```
  a 0 0 1 2
  b 0 1 2 3
  d 0 1 3 4
  ```

- `unmapping_value`: [0 0 0 0]

*Inputs:*
- data: ["a", "d", "e"]

*Outputs:*
- output: [[0,0,1,2],[0,1,3,4],[0,0,0,0]]

#### Attributes

***mapping_file_name:string***

The name of the string-to-vector mapping file.

***unmapping_value:list(int)***

Mapping result for unmapped strings.

#### Inputs

***data: tensor(string)***

Input tensor

#### Outputs

***output: tensor(T)***

The mapping result of the input

#### Type Constraints
***T:tensor(uint8), tensor(uint16), tensor(uint32), tensor(uint64), tensor(int8), tensor(int16), tensor(int32), tensor(int64), tensor(bfloat16), tensor(float16), tensor(float), tensor(double), tensor(bool)***

Constrain input and output types to numerical tensors.

#### Examples


```python
# what's in vocabulary.txt

mapping_table = \
"""
a 0 0 1 2
b 0 1 2 3
d 0 1 3 4
"""

node = onnx.helper.make_node(
    'StringToVector',
    inputs=['x'],
    outputs=['y'],
    mapping_file_name=mapping_table,
    unmapping_value=[0,0,0,0]
)


x = ["a", "d", "e"]
y = np.array([[0,0,1,2],[0,1,3,4],[0,0,0,0]], dtype=np.int64)


expect(node, inputs=[x], outputs=[y],
       name='test_string_to_vector')
```

+ + + +### StringSlice + +
StringSlice details

Performs a slice operation on each string element in the input tensor, similar to string slicing in Python:

```python
a = "abcdef"
b = a[1:2]
c = a[3:1:-1]
```

#### Inputs

***data: tensor(string)***

String tensor to extract slices from.

***starts: tensor(int64/int32)***

The tensor of starting indices of the corresponding strings in data, with the same dimensions as data.

***ends: tensor(int64/int32)***

The tensor of ending indices of the corresponding strings in data, with the same dimensions as data.

***steps (optional): tensor(int64/int32)***

The tensor of slice steps of the corresponding strings in data, with the same dimensions as data. If steps is an empty tensor, the default step 1 is used for each string.

#### Outputs

***output: tensor(string)***

Sliced data tensor.

#### Examples


```python

node = onnx.helper.make_node(
    'StringSlice',
    inputs=['x', 'starts', 'ends', 'steps'],
    outputs=['y'],
)

x = np.array(["abcdef", "hijkl"])
starts = np.array([1, 3], dtype=np.int64)
ends = np.array([3, 1], dtype=np.int64)
steps = np.array([1, -1], dtype=np.int64)
y = np.array([x[0][1:3:1], x[1][3:1:-1]])

expect(node, inputs=[x, starts, ends, steps], outputs=[y],
       name='test_string_slice')
```

+ + +### MaskedFill + +
MaskedFill details


Fills elements of the input tensor with value where mask is True. The operator is similar to [`Tensor.masked_fill_`](https://pytorch.org/docs/stable/generated/torch.Tensor.masked_fill_.html#torch.Tensor.masked_fill_) in PyTorch.


#### Inputs

***value: tensor(string)***

The value to fill in with; currently only the string type is supported, and inputs must be scalars or vectors.

***mask: tensor(bool)***

The boolean mask; must have the same shape as value.

#### Outputs

***output: tensor(string)***

The filled output of the input tensor.


#### Examples


```python

node = onnx.helper.make_node(
    'MaskedFill',
    inputs=['value', 'mask'],
    outputs=['output']
)


value = np.array(["a", "b", "c", "d"])
mask = np.array([True, False, True, False], dtype=bool)
output = np.array(["a", "c"])


expect(node, inputs=[value, mask], outputs=[output],
       name='test_masked_fill')
```
+ + +### StringRaggedTensorToDense + +
+StringRaggedTensorToDense details + +Converts a ragged string tensor to a dense 2D string tensor, padding shorter rows with a fill value. + +#### Inputs + +***row_splits: tensor(int64)*** + +1D tensor with the starting position of each row in `values`. Row `i` contains `values[row_splits[i]:row_splits[i+1]]`. + +***values: tensor(string)*** + +1D flat string tensor holding the concatenated row values. + +***default_value_shape: tensor(int64)*** + +1D tensor describing the target dense shape. Only used to determine the number of columns. + +***default_value: tensor(string)*** + +Scalar string used to pad rows that are shorter than the longest row. + +#### Outputs + +***output: tensor(string)*** + +2D dense string tensor with padding applied. + +
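#### Examples

A minimal Python sketch of the assumed ragged-to-dense semantics (illustrative; the `ragged_to_dense` helper is not the actual kernel):

```python
# Emulate ragged-to-dense conversion: slice rows out of `values` via
# `row_splits`, then right-pad every row to the widest row.
def ragged_to_dense(row_splits, values, default_value):
    rows = [values[row_splits[i]:row_splits[i + 1]] for i in range(len(row_splits) - 1)]
    width = max((len(r) for r in rows), default=0)
    return [r + [default_value] * (width - len(r)) for r in rows]

ragged_to_dense([0, 2, 3], ["a", "b", "c"], "pad")
# -> [["a", "b"], ["c", "pad"]]
```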
+ +### StringMapping + +
+StringMapping details + +Maps each element of the input string tensor to another string using a user-supplied dictionary. Strings not found in the dictionary are passed through unchanged. + +#### Attributes + +***map: string*** + +A string containing one mapping per line. Each line has the form `key\tvalue`, where key and value are separated by a tab character. + +#### Inputs + +***input: tensor(string)*** + +Input string tensor of arbitrary shape. + +#### Outputs + +***output: tensor(string)*** + +Output string tensor of the same shape as `input` after mapping. + +
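#### Examples

A minimal Python sketch of the dictionary lookup described above (assumed semantics; the `string_mapping` helper is illustrative):

```python
# Emulate StringMapping: parse the tab-separated map, look up each input,
# and pass unknown strings through unchanged.
def string_mapping(inputs, map_str):
    table = dict(line.split("\t", 1) for line in map_str.splitlines() if line)
    return [table.get(s, s) for s in inputs]

string_mapping(["cat", "dog"], "cat\tfeline\nbird\tavian")
# -> ["feline", "dog"]
```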
+ +## Math operators + + +### Inverse + +
+Inverse details + +Computes the matrix inverse of a 2D floating-point tensor. + +#### Inputs + +***input: tensor(float)*** + +A 2D square matrix of shape `[N, N]`. + +#### Outputs + +***output: tensor(float)*** + +The inverse of the input matrix, of shape `[N, N]`. + +
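#### Examples

The operator's reference semantics match `np.linalg.inv` (a sketch, not the kernel itself):

```python
import numpy as np

# Inverse of a 2x2 matrix; the product with the input is the identity.
x = np.array([[4.0, 7.0], [2.0, 6.0]], dtype=np.float32)
inv = np.linalg.inv(x)
assert np.allclose(x @ inv, np.eye(2), atol=1e-5)
```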
+ +### NegPos + +
+NegPos details + +Splits an input tensor into its negative and positive parts. Equivalent to `min(x, 0)` and `max(x, 0)` returned separately. + +#### Inputs + +***input: tensor(float)*** + +Input tensor of arbitrary shape. + +#### Outputs + +***neg: tensor(float)*** + +Tensor with the same shape as `input`; contains `x` where `x < 0`, else `0`. + +***pos: tensor(float)*** + +Tensor with the same shape as `input`; contains `x` where `x >= 0`, else `0`. + +
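#### Examples

A NumPy sketch of the stated semantics (`min(x, 0)` and `max(x, 0)` returned separately):

```python
import numpy as np

# Split a tensor into its negative and positive parts.
x = np.array([-2.0, 0.0, 3.0], dtype=np.float32)
neg = np.minimum(x, 0)  # [-2., 0., 0.]
pos = np.maximum(x, 0)  # [0., 0., 3.]
```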
+ +### SegmentExtraction + +
+SegmentExtraction details + +Extracts contiguous non-zero segments from a 1D integer input. For every maximal run of non-zero values, the start and end positions are returned. + +#### Inputs + +***input: tensor(int64)*** + +1D input tensor. + +#### Outputs + +***position: tensor(int64)*** + +2D tensor of shape `[num_segments, 2]` where each row is `(begin, end)` (end exclusive). + +***value: tensor(int64)*** + +1D tensor of length `num_segments` with the value inside each segment. + +
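#### Examples

A minimal Python sketch, assuming each segment is a maximal run of identical non-zero values (the actual kernel may differ in edge cases):

```python
# Emulate SegmentExtraction: find maximal runs of identical non-zero values
# and report (begin, end) positions plus the value of each run.
def segment_extraction(x):
    positions, values = [], []
    i, n = 0, len(x)
    while i < n:
        if x[i] != 0:
            j = i
            while j < n and x[j] == x[i]:
                j += 1
            positions.append([i, j])  # end is exclusive
            values.append(x[i])
            i = j
        else:
            i += 1
    return positions, values

segment_extraction([0, 1, 1, 0, 2, 2, 2])
# -> ([[1, 3], [4, 7]], [1, 2])
```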
+ +### SegmentSum + +
+SegmentSum details + +Computes sums along segments of the first axis of a tensor, similar to TensorFlow's `tf.math.segment_sum`. + +#### Inputs + +***data: tensor(float)*** + +The values to reduce. The first dimension is the segment axis. + +***segment_ids: tensor(int64)*** + +1D tensor with the same length as `data.shape[0]`. Must be non-decreasing. + +#### Outputs + +***output: tensor(float)*** + +Tensor where `output[i]` is the sum of all rows of `data` whose corresponding `segment_ids` equal `i`. + +
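A NumPy reference computation (mirrors `tf.math.segment_sum` for non-decreasing ids):

```python
import numpy as np

data = np.array([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]], dtype=np.float32)
segment_ids = np.array([0, 0, 1])

num_segments = int(segment_ids[-1]) + 1  # ids are non-decreasing
out = np.zeros((num_segments,) + data.shape[1:], dtype=data.dtype)
np.add.at(out, segment_ids, data)  # accumulate each row into its segment
```

Here `out` is `[[4, 6], [5, 6]]`: rows 0 and 1 fall into segment 0, row 2 into segment 1.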
+ +### StftNorm + +
+StftNorm details + +Computes a short-time Fourier transform (STFT) of a 1D signal and returns the magnitude spectrogram. The implementation uses a Hann-style sliding window. + +#### Attributes + +***onesided: int64_t*** (default is 1) + +If 1, only the non-redundant positive-frequency half of the spectrum is returned (length `n_fft / 2 + 1`). If 0, the full spectrum is returned. + +#### Inputs + +***pcm: tensor(float)*** + +1D audio signal. + +***n_fft: tensor(int64)*** + +Scalar FFT size. + +***hop_length: tensor(int64)*** + +Scalar hop length between consecutive frames. + +***window: tensor(float)*** + +1D window function of length `frame_length`. + +***frame_length: tensor(int64)*** + +Scalar frame length (must equal `n_fft`). + +#### Outputs + +***output: tensor(float)*** + +3D tensor of shape `[1, num_frames, num_freq_bins]` containing the magnitude spectrogram. + +
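A NumPy sketch of the onesided magnitude spectrogram (illustrative; the actual kernel's framing and padding details may differ):

```python
import numpy as np

def stft_magnitude(pcm, n_fft, hop_length, window):
    # Slice the signal into overlapping frames, window each frame,
    # and keep the onesided rfft magnitude (n_fft // 2 + 1 bins).
    num_frames = 1 + (len(pcm) - n_fft) // hop_length
    frames = np.stack([pcm[i * hop_length:i * hop_length + n_fft] * window
                       for i in range(num_frames)])
    return np.abs(np.fft.rfft(frames, n=n_fft))[np.newaxis, :, :]

pcm = np.sin(np.linspace(0, 20 * np.pi, 400)).astype(np.float32)
spec = stft_magnitude(pcm, n_fft=64, hop_length=32, window=np.hanning(64))
assert spec.shape == (1, 11, 33)  # [1, num_frames, n_fft // 2 + 1]
```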
+ +### SplitSignalSegments + +
+SplitSignalSegments details + +Partitions an audio signal into segments of voiced/high-energy regions based on a simple short-time energy threshold. + +#### Inputs + +***input: tensor(float)*** + +1D audio signal. + +***sr: tensor(int64)*** + +Scalar sample rate in Hz. + +***frame_ms: tensor(int64)*** + +Scalar analysis frame length in milliseconds. + +***hop_ms: tensor(int64)*** + +Scalar hop length between analysis frames in milliseconds. + +***energy_threshold_db: tensor(float)*** + +Scalar energy threshold in dBFS. Frames with average energy below this are treated as silence. + +#### Outputs + +***segments: tensor(int64)*** + +2D tensor of shape `[num_segments, 2]` where each row contains the `(begin_sample, end_sample)` indices of a detected segment. + +
+ +### MergeSignalSegments + +
+MergeSignalSegments details + +Merges adjacent audio segments whose gap is shorter than a configurable threshold. Typically used as a post-processing step after `SplitSignalSegments`. + +#### Inputs + +***segments: tensor(int64)*** + +2D tensor of shape `[N, 2]` with `(begin, end)` indices, as produced by `SplitSignalSegments`. + +***merge_gap_ms: tensor(int64)*** + +Scalar gap threshold in milliseconds. Segments separated by less than this value are merged. + +#### Outputs + +***output: tensor(int64)*** + +2D tensor of shape `[M, 2]` (M <= N) of the merged segment boundaries. + +
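The merge rule can be sketched as follows (illustrative; the gap here is expressed in samples, whereas the op converts `merge_gap_ms` to samples using the sample rate):

```python
def merge_segments(segments, max_gap):
    merged = []
    for b, e in segments:
        if merged and b - merged[-1][1] < max_gap:
            merged[-1][1] = max(merged[-1][1], e)  # close a small gap
        else:
            merged.append([b, e])                  # start a new segment
    return merged
```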
+ +## Tensor operators + +### RaggedTensorToSparse + +
+RaggedTensorToSparse details + +Converts a ragged tensor's row lengths to a COO-style sparse indexing representation. + +#### Inputs + +***n_element: tensor(int64)*** + +1D tensor holding the number of elements in each row. + +#### Outputs + +***output_0: tensor(int64)*** + +2D tensor of `(row, col)` indices for every element. + +***output_1: tensor(int64)*** + +1D tensor of length 2 containing the dense shape `[num_rows, max_row_width]`. + +
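The index construction above can be sketched directly (illustrative names):

```python
def ragged_to_sparse(n_element):
    # One (row, col) pair per element, row-major, plus the dense shape.
    indices = [[row, col] for row, n in enumerate(n_element) for col in range(n)]
    dense_shape = [len(n_element), max(n_element, default=0)]
    return indices, dense_shape
```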
+ +### RaggedTensorToDense + +
+RaggedTensorToDense details + +Converts a ragged int64 tensor to a dense 2D tensor, padding shorter rows with a configurable value. + +#### Attributes + +***missing_value: int64_t*** (default is -1) + +Value used to pad short rows. + +#### Inputs + +***input0: tensor(int64)*** + +1D row-splits tensor indicating the start index of each row within `input3`. + +***input1: tensor(int64)*** + +1D tensor of flat indices (unused by some consumers; reserved). + +***input2: tensor(int64)*** + +1D tensor of length 2 describing the target dense shape `[num_rows, max_row_width]`. + +***input3: tensor(int64)*** + +1D flat values tensor. + +#### Outputs + +***output: tensor(int64)*** + +2D dense tensor with missing elements filled by `missing_value`. + +
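Using only the row splits (`input0`) and flat values (`input3`), the densification can be sketched as (illustrative; the dense shape is inferred from the splits here rather than read from `input2`):

```python
import numpy as np

def ragged_to_dense(row_splits, values, missing_value=-1):
    num_rows = len(row_splits) - 1
    width = max(e - b for b, e in zip(row_splits, row_splits[1:]))
    out = np.full((num_rows, width), missing_value, dtype=np.int64)
    for i, (b, e) in enumerate(zip(row_splits, row_splits[1:])):
        out[i, :e - b] = values[b:e]
    return out
```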
+ +## Audio operators + +### AudioDecoder + +
+AudioDecoder details + +Decodes a byte stream containing an encoded audio file (WAV, MP3, or FLAC) into a float PCM tensor. Optionally resamples the audio to a target sample rate. + +#### Attributes + +***downsampling_rate: int64_t*** (default is 0) + +Target sample rate. When 0 the native sample rate of the decoded stream is used. + +***stereo_to_mono: int64_t*** (default is 1) + +If 1, multi-channel audio is mixed down to a single mono channel. + +***target_sample_rate: int64_t*** (default is 0) + +Alias for `downsampling_rate`; when non-zero the decoded audio is resampled to this rate. + +#### Inputs + +***input: tensor(uint8)*** + +1D tensor of raw bytes representing the encoded audio file. + +***format: tensor(string)*** (optional) + +Scalar describing the container format. Accepted values: `"wav"`, `"mp3"`, `"flac"`. When absent the format is detected from the file header. + +#### Outputs + +***output: tensor(float)*** + +2D tensor of shape `[1, num_samples]` with the decoded (and optionally resampled) PCM samples in the range `[-1, 1]`. + +
+ +## Vision operators + +### DecodeImage + +
+DecodeImage details + +Decodes an encoded image (PNG, JPEG, BMP, TIFF, …) into an `HxWx3` uint8 tensor. + +#### Attributes + +***color_space: string*** (default is "BGR") + +Color ordering of the output. Valid values are `"RGB"` and `"BGR"`. + +#### Inputs + +***input: tensor(uint8)*** + +1D tensor containing the raw encoded image bytes. + +#### Outputs + +***output: tensor(uint8)*** + +3D tensor of shape `[H, W, 3]`. + +
+ +### EncodeImage + +
+EncodeImage details + +Encodes a 3-channel `HxWx3` uint8 image tensor to PNG or JPEG bytes. + +#### Attributes + +***format: string*** (default is "png") + +Output image format. Valid values are `"png"` and `"jpg"` (or `"jpeg"`). + +#### Inputs + +***input: tensor(uint8)*** + +3D tensor of shape `[H, W, 3]` in BGR order. + +#### Outputs + +***output: tensor(uint8)*** + +1D tensor of encoded image bytes. + +
+ +### DrawBoundingBoxes + +
+DrawBoundingBoxes details + +Draws bounding boxes on a BGR image tensor. + +#### Attributes + +***thickness: int64_t*** (default is 4) + +Line thickness of the drawn rectangles, in pixels. + +***num_classes: int64_t*** (default is 10) + +Number of class colors to cycle through. + +***mode: string*** (default is "XYXY") + +Interpretation of the box coordinates. One of `"XYXY"`, `"XYWH"`, or `"CENTER_XYWH"`. + +***colour_by_classes: int64_t*** (default is 1) + +When 1, boxes of the same class share a colour. When 0, each box gets a unique colour from the palette. + +#### Inputs + +***image: tensor(uint8)*** -expect(node, inputs=[text, pattern, rewrite], outputs=[y, z1, z2], - name='test_string_regex_replace') -``` +3D tensor of shape `[H, W, 3]` in BGR order. -
+***boxes: tensor(float)*** +2D tensor of shape `[N, 6]`. Each row is `(class_id, score, x0, y0, x1, y1)` (or equivalent depending on `mode`). -### StringECMARegexSplitWithOffsets +#### Outputs -TODO +***output: tensor(uint8)*** -### VectorToString +Image tensor with boxes drawn, same shape as `image`. + +
+ +### GaussianBlur
-VectorToString details +GaussianBlur details -VectorToString is the contrary operation to the `StringToVector` , they share same format of mapping table: +Applies a 2D Gaussian blur to an image tensor using OpenCV's `cv::GaussianBlur`. - \t\s\s... +#### Inputs -Unmapped vector will output the value of the attribute `unk`. +***input: tensor(float)*** -Example: +4D image tensor of shape `[N, H, W, C]`. -*Attributes:* +***ksize: tensor(int64)*** -- `map`: - ``` - a 0 0 1 2 - b 0 1 2 3 - d 0 1 3 4 - ``` +1D tensor of length 2 specifying the kernel size `[kx, ky]` (odd positive integers). -- `unk`: "unknown_word" +***sigma: tensor(double)*** -*Inputs:* -- data: [[0,0,1,2],[0,1,3,4],[0,0,0,0]] +1D tensor of length 2 specifying the Gaussian standard deviation along X and Y. -*Ouputs:* -- output: ["a", "d", "unknown_word" ] +#### Outputs -#### Attributes +***output: tensor(float)*** -***mapping_file_name*** +Blurred tensor with the same shape as `input`. -the formative mapping table +
-***unmapping_value*** +### ImageDecoder -the result returned when a vector aren't found in the map +
+ImageDecoder details + +Decodes raw encoded image bytes using OpenCV's `cv::imdecode`. Similar to `DecodeImage` but always returns BGR and does not expose a color-space attribute. #### Inputs -***data: tensor(T)*** +***input: tensor(uint8)*** -Input tensor +1D tensor of encoded image bytes. #### Outputs -***output: tensor(string)*** +***output: tensor(uint8)*** -The mapping result of the input +3D tensor of shape `[H, W, C]` containing the decoded BGR image. -#### Type Constraints -***T:tensor(uint8), tensor(uint16), tensor(uint32), tensor(uint64), tensor(int8), tensor(int16), tensor(int32), tensor(int64), tensor(bfloat16), tensor(float16), tensor(float), tensor(double), tensor(bool)*** +
-Constrain input and output types to numerical tensors. +### ImageReader +
+ImageReader details -#### Examples +Reads an image from a file path using OpenCV's `cv::imread` and returns the decoded tensor. +#### Inputs -```python -mapping_table = \ - """ - a 0 0 1 2 - b 0 1 2 3 - d 0 1 3 4 - """ +***input: tensor(string)*** -node = onnx.helper.make_node( - 'VectorToString', - inputs=['x'], - outputs=['y'], - map=mapping_table, - unk="unknown_word" -) +Scalar string with the path of the image file to read. +#### Outputs -x = np.array([[0,0,1,2],[0,1,3,4],[0,0,0,0]], type=np.int64) -y = ["a", "d", "unknown_word"] +***output: tensor(uint8)*** +4D tensor of shape `[1, H, W, C]` containing the decoded BGR image. -expect(node, inputs=[x], outputs=[y], - name='test_vector_to_string') -```
+## CUDA operators -### StringToVector +The following operators execute on CUDA devices only. They are only registered when the library is built with `USE_CUDA`. Unless otherwise noted each op supports `float`, `float16` (`MFloat16`), and in some cases `bfloat16` (`BFloat16`). + +### FastGelu
-StringToVector details +FastGelu details -StringToVector will map each string element in the input to the corresponding vector according to the mapping file. The mapping file is a utf-8 encoding text file in tsv format: +Fused CUDA kernel computing `gelu(x + bias)` using the fast tanh-based approximation. - \t\s\s... +#### Inputs -Unmapped string will output the value of the attribute `unmapping_value`. +***input: tensor(T)*** -Example: +Input tensor of any shape. `T` is one of `float`, `float16`, `bfloat16`. -*Attributes:* +***bias: tensor(T)*** (optional) -- `mapping_file_name`: vocabulary.txt - ``` - a 0 0 1 2 - b 0 1 2 3 - d 0 1 3 4 - ``` - -- `unmapping_value`: [0 0 0 0] +Bias added elementwise before applying Gelu. Broadcast to the shape of `input`. -*Inputs:* -- data: ["a", "d", "e"] +#### Outputs -*Ouputs:* -- output: [[0,0,1,2],[0,1,3,4],[0,0,0,0]] +***output: tensor(T)*** -#### Attributes +Same shape as `input`. -***mapping_file_name:string*** +
-The name of your string to vector mapping file. +### MulSigmoid -***unmapping_value:list(int)*** +
+MulSigmoid details -Mapping result for unmapped string +Computes `x * sigmoid(x)` (the SiLU / Swish activation) in a single fused CUDA kernel. #### Inputs -***data: tensor(string)*** +***input: tensor(T)*** -Input tensor +Input tensor. `T` is one of `float`, `float16`, `bfloat16`. #### Outputs ***output: tensor(T)*** -The mapping result of the input - -#### Type Constraints -***T:tensor(uint8), tensor(uint16), tensor(uint32), tensor(uint64), tensor(int8), tensor(int16), tensor(int32), tensor(int64), tensor(bfloat16), tensor(float16), tensor(float), tensor(double), tensor(bool)*** +Same shape as `input`. -Constrain input and output types to numerical tensors. +
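A NumPy reference for the fused computation:

```python
import numpy as np

x = np.array([-1.0, 0.0, 2.0], dtype=np.float32)
silu = x / (1.0 + np.exp(-x))  # x * sigmoid(x), written in one step
```

Note that `silu(0) == 0` and the function is bounded below by roughly `-0.28`.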
-#### Examples +### MulMulSigmoid +
+MulMulSigmoid details -```python -# what's in vocabulary.txt +Computes `x * y * sigmoid(y)` in a single fused CUDA kernel. Tensors must have the same shape. -mapping_table = \ -""" -a 0 0 1 2 -b 0 1 2 3 -d 0 1 3 4 -""" +#### Inputs -node = onnx.helper.make_node( - 'StringToVector', - inputs=['x'], - outputs=['y'], - mapping_table=mapping_table, - unmapping_value=[0,0,0,0] -) +***x: tensor(T)***, ***y: tensor(T)*** +`T` is one of `float`, `float16`, `bfloat16`. -x = ["a", "d", "e"] -y = np.array([[0,0,1,2],[0,1,3,4],[0,0,0,0]], type=np.int64) +#### Outputs +***output: tensor(T)*** -expect(node, inputs=[x], outputs=[y], - name='test_string_to_vector') -``` +Tensor with the same shape as the inputs.
+### NegXPlus1 +
+NegXPlus1 details -### StringSlice +Computes `1 - x` elementwise on CUDA. + +#### Inputs + +***input: tensor(T)*** + +`T` is one of `float`, `float16`, `bfloat16`. + +#### Outputs + +***output: tensor(T)*** + +Same shape as `input`. + +
+ +### ReplaceZero
-StringSlice details +ReplaceZero details -Do the slice operation to each string element in input tensor. Similar to string slice in python +Replaces every zero element of the input with a scalar value. -```python -a = "abcdef" -b = a[1:2] -c = a[3:1:-1] -``` +#### Attributes + +***by: float*** (default is 0.0) + +Replacement value for zero entries. #### Inputs -***data: tensor(string)*** +***input: tensor(T)*** -String tensor to extract slices from. +`T` is one of `float`, `float16`, `bfloat16`. -***starts: tensor(int64/int32)*** +#### Outputs -The tensor of starting indices of corresponding string in data, which has same dimension of data. +***output: tensor(T)*** -***ends: tensor(int64/int32)*** +Same shape as `input`. -The tensor of ending indices of corresponding string in data, which has same dimension of data. +
-***steps(optional): tensor(int64/int32)*** +### AddSharedInput -The tensor of slice step of corresponding string in data, which has same dimension of data.If steps is empty tensor, we will use default value 1 for each string +
+AddSharedInput details + +Computes `A + B` and `A + C` in one kernel launch, sharing the read of `A`. + +#### Inputs + +***A: tensor(T)***, ***B: tensor(T)***, ***C: tensor(T)*** + +`T` is one of `float`, `float16`, `bfloat16`. `B` and `C` must have the same shape as `A`. #### Outputs -***output: tensor(string)*** +***AB: tensor(T)***, ***AC: tensor(T)*** -Sliced data tensor. +Elementwise sums `A + B` and `A + C`. -#### Examples +
+### MulSharedInput -```python +
+MulSharedInput details -node = onnx.helper.make_node( - 'StringSlice', - inputs=['x', 'starts', 'ends', 'steps'], - outputs=['y'], -) +Computes `A * B` and `A * C` in one kernel launch, sharing the read of `A`. -x = np.array(["abcdef", "hijkl"]) -y = np.array([x[0][1:3:1], x[1][3:1:-1]]) -starts = np.array([1, 3], dtype=np.int64) -ends = np.array([3, 1], dtype=np.int64) -axes = np.array([0, 1], dtype=np.int64) -steps = np.array([1, 1], dtype=np.int64) +#### Inputs -expect(node, inputs=[x, starts, ends, axes, steps], outputs=[y], - name='test_string_slice') -``` +***A: tensor(T)***, ***B: tensor(T)***, ***C: tensor(T)*** -
+`T` is one of `float`, `float16`, `bfloat16`. + +#### Outputs +***AB: tensor(T)***, ***AC: tensor(T)*** -### MaskedFill +Elementwise products `A * B` and `A * C`. + +
+ +### ScatterNDOfShape
-MaskedFill details +ScatterNDOfShape details +Allocates a zero tensor of the given shape and applies a `ScatterND` reduction. Equivalent to `ScatterND(ConstantOfShape(shape, 0), indices, updates, reduction=...)` but fused. -Fills elements of self tensor with value where mask is True. The operator is similar with [`Tensor.masked_fill_`](https://pytorch.org/docs/stable/generated/torch.Tensor.masked_fill_.html#torch.Tensor.masked_fill_) in pytorch. +#### Attributes +***reduction: string*** (default is "add") + +Reduction to apply to scattered updates. One of `"add"`, `"mul"`, `"min"`, `"max"`. #### Inputs -***value: tensor(string)*** +***shape: tensor(int64)*** -The value to fill in with, currently we only support string type and vector&scalar dimension. +1D tensor describing the output shape. Must live on CPU. -***mask: tensor(bool)*** +***indices: tensor(int64)*** -The boolean mask, the dimension of mask tensor should be same with value. +Indices into the output, as in standard ScatterND. + +***updates: tensor(T)*** + +Values to scatter. `T` is one of `float`, `float16`, `bfloat16`. #### Outputs -***output: tensor(string)*** +***output: tensor(T)*** -The filled output of input tensor. +Tensor of the requested shape with updates applied. +
-#### Examples +### MaskedScatterNDOfShape +
+MaskedScatterNDOfShape details -```python +Variant of `ScatterNDOfShape` that ignores entries of `indices` equal to a configurable mask value. -node = onnx.helper.make_node( - 'MaskedFill', - inputs=['value', 'mask'], - outputs=['output'] -) +#### Attributes +***reduction: string*** (default is "add") -value = np.array(["a", "b", "c", "d"]) -mask = np.array([True, False, True, False], dtype=bool) -output = np.array(["a", "c"]) +Same as `ScatterNDOfShape`. +***maskedValue: int64_t*** + +Index value that causes the corresponding update to be skipped. + +#### Inputs + +Same as `ScatterNDOfShape`. + +#### Outputs + +Same as `ScatterNDOfShape`. -expect(node, inputs=[value, mask], outputs=[output], - name='test_masked_fill') -```
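A standalone NumPy sketch of the masking rule for `reduction="add"` (illustrative; assumes single-component indices):

```python
import numpy as np

def masked_scatter_of_shape(shape, indices, updates, masked_value):
    out = np.zeros(shape, dtype=updates.dtype)
    keep = indices[:, 0] != masked_value  # skip updates whose index is masked
    np.add.at(out, tuple(indices[keep].T), updates[keep])
    return out
```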
+### Transpose2DCastFP16 -### StringRaggedTensorToDense +
+Transpose2DCastFP16 details -TODO +Fused 2D transpose + cast from `float` to `float16`. -### StringMapping +#### Inputs -TODO +***input: tensor(float)*** -## Math operators +2D tensor of shape `[M, N]`. +#### Outputs -### Inverse +***output: tensor(float16)*** -TODO +2D tensor of shape `[N, M]`. -### NegPos +
-TODO +### Transpose2DCastFP32 -### SegmentExtraction +
+Transpose2DCastFP32 details -TODO +Fused 2D transpose + cast from `float16` to `float`. -### SegmentSum +#### Inputs -TODO +***input: tensor(float16)*** -## Tensor operators +2D tensor of shape `[M, N]`. -### RaggedTensorToSparse +#### Outputs -TODO +***output: tensor(float)*** -### RaggedTensorToDense +2D tensor of shape `[N, M]`. -TODO +
### Template From 290f917dbd66178a2b8200326c902cfc5bcd144b Mon Sep 17 00:00:00 2001 From: "copilot-swe-agent[bot]" <198982749+Copilot@users.noreply.github.com> Date: Fri, 17 Apr 2026 23:06:51 +0000 Subject: [PATCH 3/4] Clarify ASCII vs Unicode casing for StringUpper/StringLower Agent-Logs-Url: https://github.com/microsoft/onnxruntime-extensions/sessions/cdb9185e-44ff-4191-8d4c-b00889e20918 Co-authored-by: justinchuby <11205048+justinchuby@users.noreply.github.com> --- docs/custom_ops.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/docs/custom_ops.md b/docs/custom_ops.md index ebe12966b..5a8f0b2d2 100644 --- a/docs/custom_ops.md +++ b/docs/custom_ops.md @@ -1212,7 +1212,7 @@ Scalar boolean. When true, empty substrings are removed from the output.
StringUpper details -Converts every ASCII character in each string of the input tensor to uppercase. Non-ASCII bytes are left unchanged. +Converts every ASCII character in each string of the input tensor to uppercase using `std::toupper`. This operator is ASCII-only; non-ASCII bytes are passed through unchanged. For full Unicode case folding, pre-process inputs accordingly or use `StringLower` as a reference for Unicode handling. #### Inputs @@ -1233,7 +1233,7 @@ String tensor of the same shape as `input` with uppercased strings.
StringLower details -Converts each string in the input tensor to lowercase using Unicode case folding. +Converts each string in the input tensor to lowercase. Unlike `StringUpper`, this operator decodes input bytes as UTF-8 and performs Unicode-aware case folding on each code point before re-encoding the result. #### Inputs From 56a19d6d07e185ca1ea293ea1969cc0aa6071266 Mon Sep 17 00:00:00 2001 From: "copilot-swe-agent[bot]" <198982749+Copilot@users.noreply.github.com> Date: Sat, 18 Apr 2026 02:55:16 +0000 Subject: [PATCH 4/4] Address PR review feedback: align docs with C++ implementations Agent-Logs-Url: https://github.com/microsoft/onnxruntime-extensions/sessions/3f620564-a099-495a-8067-d1d71deb349b Co-authored-by: justinchuby <11205048+justinchuby@users.noreply.github.com> --- docs/custom_ops.md | 170 ++++++++++++++++++++++++--------------------- 1 file changed, 91 insertions(+), 79 deletions(-) diff --git a/docs/custom_ops.md b/docs/custom_ops.md index 5a8f0b2d2..7bf8b026e 100644 --- a/docs/custom_ops.md +++ b/docs/custom_ops.md @@ -550,7 +550,7 @@ Merge rules (contents of `merges.txt`). ***padding_length: int64_t*** (default is -1) -If positive, the output is right-padded (or truncated) to this length. When -1 no padding is performed and outputs stay ragged. +If positive, the output is right-padded (or truncated) to this length. When -1, the output is padded to the maximum sequence length in the batch; the operator still returns a dense tensor with a dynamic second dimension. #### Inputs @@ -564,7 +564,7 @@ If positive, the output is right-padded (or truncated) to this length. When -1 n Tensor of token ids. -***attention_mask: tensor(int64)*** +***attention_mask: tensor(int64)*** (optional) Mask with the same shape as `input_ids` (1 for real tokens, 0 for padding). @@ -594,7 +594,7 @@ BPE merge rules (contents of `merges.txt`). ***padding_length: int64_t*** (default is -1) -Optional fixed output length. See `CLIPTokenizer`. +See `CLIPTokenizer`. 
#### Inputs @@ -608,7 +608,7 @@ Optional fixed output length. See `CLIPTokenizer`. Token ids. -***attention_mask: tensor(int64)*** +***attention_mask: tensor(int64)*** (optional) Attention mask, same shape as `input_ids`. @@ -638,7 +638,7 @@ SentencePiece merge rules. ***padding_length: int64_t*** (default is -1) -Optional fixed output length. +See `CLIPTokenizer`. #### Inputs @@ -652,7 +652,7 @@ Optional fixed output length. Tensor of token ids. -***attention_mask: tensor(int64)*** +***attention_mask: tensor(int64)*** (optional) Attention mask with the same shape as `input_ids`. @@ -680,7 +680,7 @@ Contents of `vocab.txt`. Lowercase inputs before tokenization. -***strip_accents: int64_t*** (default is 1) +***strip_accents: int64_t*** (default is 0) Strip accents as part of normalization. @@ -704,6 +704,10 @@ Attention mask, same shape as `input_ids`. Segment ids. All zero for single-sentence input. +***offset_mapping: tensor(int64)*** (optional) + +Per-token `(begin, end)` byte offsets into the corresponding input string. +
@@ -736,7 +740,7 @@ Additional vocabulary data when the tokenizer uses an external vocab file. Token ids. -***attention_mask: tensor(int64)*** +***attention_mask: tensor(int64)*** (optional) Attention mask matching `input_ids`. @@ -774,7 +778,7 @@ Scalar flag. When true the `fairseq` vocab-id offset convention is applied. ***output: tensor(string)*** -Scalar string containing the decoded text. +1D tensor with one string element containing the decoded text.
@@ -929,61 +933,59 @@ Scalar input string.
StringEqual details -Compares two strings and returns true if they are equal and false if not. +Compares two strings elementwise and returns true when they are equal. #### Inputs ***x: tensor(string)*** -The first string input +The first string input. -***x: tensor(string)*** +***y: tensor(string)*** -The second string input +The second string input. Must have the same shape as `x` (or be broadcastable). #### Outputs -***z: tensor(boolean)*** +***z: tensor(bool)*** -String with replacements. +Boolean tensor with the same shape as the broadcasted inputs; `true` where the inputs are equal.
-### StringHash +### StringToHashBucket
-StringHash details +StringToHashBucket details - -Hashes the input string based on the number of buckets +Hashes each input string into one of `num_buckets` buckets using the internal FarmHash-like 64-bit hash implementation. #### Inputs ***input: tensor(string)*** -The string to hash +The input string tensor to hash. ***num_buckets: tensor(int64)*** -The number of buckets (must be equal to 1?) +Scalar number of hash buckets. Must be greater than 0. #### Outputs -***name: tensor(int64)*** +***output: tensor(int64)*** -The hash value of the string +Tensor of the same shape as `input` containing the hash-bucket index for each input string. Each value lies in the range `[0, num_buckets)`.
-### StringHashFast +### StringToHashBucketFast
-StringHashFast details +StringToHashBucketFast details - -A faster implementation of StringHash. Computes hash values for each input string modulo `num_buckets`. +A faster variant of `StringToHashBucket` that uses `std::hash` internally. Hash values are not stable across platforms or compilers, so the op is intended for stateless in-process hashing rather than persisted lookup tables. #### Inputs @@ -993,7 +995,7 @@ The strings to hash. ***num_buckets: tensor(int64)*** -The number of hash buckets (scalar). Each output value will be in the range `[0, num_buckets)`. +Scalar number of hash buckets. Must be greater than 0. #### Outputs @@ -1020,11 +1022,11 @@ The input array of strings ***input_sep: tensor(string)*** -The string separator for the resulting joing +The string separator for the resulting joining. ***input_axis: tensor(int64)*** -The axis along which to joing +The axis along which to join. #### Outputs @@ -1137,7 +1139,7 @@ Replace all strings matching the pattern or the first one. ***ignore_case: int64*** (default is 0) -Replace +Whether to perform case-insensitive ECMAScript regular expression matching. #### Outputs @@ -1151,7 +1153,7 @@ String with replacements. ```python node = onnx.helper.make_node( - 'StringRegexReplace', + 'StringECMARegexReplace', inputs=['text', 'pattern', 'rewrite'], outputs=['y'], ) @@ -1273,15 +1275,15 @@ String tensor of the same shape as `input` with whitespace stripped. ### StringLength
-StringECMARegexReplace details +StringLength details -Get the length of each string element in input tensor. Similar to the function `len("abcde"")` in python. +Get the length of each string element in the input tensor. Similar to the function `len("abcde")` in Python. #### Inputs -***data: tensor(string)*** +***input: tensor(string)*** -String tensor to get length of its each string element. +String tensor to get the length of each string element from. #### Outputs @@ -1369,33 +1371,39 @@ expect(node, inputs=[x, y], outputs=[result],
StringRegexSplitWithOffsets details -Splits string based on regular expressions. +Splits strings based on regular expressions (RE2 dialect) and reports the byte offsets of each produced token. #### Inputs ***text: tensor(string)*** -String tensor to extract slices from. +String tensor to split. ***delim_regex_pattern: tensor(string)*** -Splitting attern of the regular expression. +Splitting pattern of the regular expression. ***keep_delim_regex_pattern: tensor(string)*** -By default, delimiters are not included in the split string results. Delimiters may be included by specifying a regex pattern keep_delim_regex_pattern. +By default, delimiters are not included in the split string results. Delimiters may be included by specifying a regex pattern via `keep_delim_regex_pattern`. #### Outputs -***words: tensor(string)*** Tensor of words. +***tokens: tensor(string)*** -***offsets: tensor(int64)*** 2D tensor with 3 columns: -sentence index, position of the first character, position of the last one (excluded) +1D tensor of tokens produced by splitting, in row-major order. -***row_indices: tensor(int64)*** Indices of every first token of input sentences. -`row_indices[i+1] - row_indices[i]` is the number of tokens in input `i`. -These are updates row indices given as inputs or new ones if the second input is empty. +***begin_offsets: tensor(int64)*** + +1D tensor with the begin byte offset of each token in the corresponding input string. +***end_offsets: tensor(int64)*** + +1D tensor with the end byte offset (exclusive) of each token in the corresponding input string. + +***row_offsets: tensor(int64)*** + +1D tensor of row offsets such that tokens of the i-th input string occupy `[row_offsets[i], row_offsets[i+1])` in `tokens`. 
#### Examples @@ -1403,22 +1411,22 @@ These are updates row indices given as inputs or new ones if the second input is ```python node = onnx.helper.make_node( - 'StringRegexSplit', - inputs=['text', 'pattern', 'rewrite'], - outputs=['y', 'begin_end', 'indices'], + 'StringRegexSplitWithOffsets', + inputs=['text', 'pattern', 'keep_pattern'], + outputs=['tokens', 'begin_offsets', 'end_offsets', 'row_offsets'], ) text = np.array(["hello there"]) pattern = np.array([r'\s']) -rewrite = np.array([r'\s']) -y = np.array(["hello", " ", "there"]) -z1 = np.array([[0, 0, 5], - [0, 5, 6], - [0, 6, 11]], dtype=np.int64) -z2 = np.array([0, 2], dtype=np.int64) - -expect(node, inputs=[text, pattern, rewrite], outputs=[y, z1, z2], - name='test_string_regex_replace') +keep_pattern = np.array([""]) +tokens = np.array(["hello", "there"]) +begin_offsets = np.array([0, 6], dtype=np.int64) +end_offsets = np.array([5, 11], dtype=np.int64) +row_offsets = np.array([0, 2], dtype=np.int64) + +expect(node, inputs=[text, pattern, keep_pattern], + outputs=[tokens, begin_offsets, end_offsets, row_offsets], + name='test_string_regex_split_with_offsets') ```
@@ -1453,17 +1461,21 @@ When set to 1 the regex is matched case-insensitively. #### Outputs -***words: tensor(string)*** +***tokens: tensor(string)*** 1D tensor containing the split tokens. -***offsets: tensor(int64)*** +***begin_offsets: tensor(int64)*** + +1D tensor with the begin byte offset of each token in the corresponding input string. -2D tensor of shape `[num_tokens, 3]` where each row is `(sentence_index, begin_byte, end_byte)`. +***end_offsets: tensor(int64)*** -***row_indices: tensor(int64)*** +1D tensor with the end byte offset (exclusive) of each token in the corresponding input string. -1D tensor of row offsets such that tokens of the i-th input string occupy `[row_indices[i], row_indices[i+1])` in `words`. +***row_offsets: tensor(int64)*** + +1D tensor of row offsets such that tokens of the i-th input string occupy `[row_offsets[i], row_offsets[i+1])` in `tokens`.
@@ -2098,17 +2110,13 @@ Decodes a byte stream containing an encoded audio file (WAV, MP3, or FLAC) into #### Attributes -***downsampling_rate: int64_t*** (default is 0) - -Target sample rate. When 0 the native sample rate of the decoded stream is used. +***downsampling_rate: int64_t*** (default is -1) -***stereo_to_mono: int64_t*** (default is 1) +Target sample rate to resample the decoded audio to. When -1, the native sample rate of the decoded stream is used. -If 1, multi-channel audio is mixed down to a single mono channel. +***stereo_to_mono: int64_t*** (default is 0) -***target_sample_rate: int64_t*** (default is 0) - -Alias for `downsampling_rate`; when non-zero the decoded audio is resampled to this rate. +If set to 1, multi-channel audio is mixed down to a single mono channel. #### Inputs @@ -2139,9 +2147,9 @@ Decodes an encoded image (PNG, JPEG, BMP, TIFF, …) into an `HxWx3` uint8 tenso #### Attributes -***color_space: string*** (default is "BGR") +***color_space: string*** (default is "bgr") -Color ordering of the output. Valid values are `"RGB"` and `"BGR"`. +Color ordering of the output. Valid values are `"rgb"` and `"bgr"` (case-insensitive). #### Inputs @@ -2162,19 +2170,23 @@ Color ordering of the output. Valid values are `"RGB"` and `"BGR"`.
EncodeImage details -Encodes a 3-channel `HxWx3` uint8 image tensor to PNG or JPEG bytes. +Encodes a 3-channel `HxWx3` uint8 image tensor to image bytes. #### Attributes ***format: string*** (default is "png") -Output image format. Valid values are `"png"` and `"jpg"` (or `"jpeg"`). +Output image format. Valid values are `"png"` and `"jpg"`. + +***color_space: string*** (default is "bgr") + +Color space / channel order of the input image. Supported values are `"bgr"` and `"rgb"`. #### Inputs ***input: tensor(uint8)*** -3D tensor of shape `[H, W, 3]` in BGR order. +3D tensor of shape `[H, W, 3]`. The expected channel order depends on `color_space`: BGR for `"bgr"` and RGB for `"rgb"`. #### Outputs @@ -2232,13 +2244,13 @@ Image tensor with boxes drawn, same shape as `image`.
GaussianBlur details -Applies a 2D Gaussian blur to an image tensor using OpenCV's `cv::GaussianBlur`. +Applies a 2D Gaussian blur to an image tensor using OpenCV's `cv::GaussianBlur`. The current kernel wraps the input buffer as a single `CV_32FC3` matrix, so inputs must have `N == 1` and `C == 3` channels. #### Inputs ***input: tensor(float)*** -4D image tensor of shape `[N, H, W, C]`. +4D image tensor of shape `[1, H, W, 3]`. ***ksize: tensor(int64)*** @@ -2288,7 +2300,7 @@ Reads an image from a file path using OpenCV's `cv::imread` and returns the deco ***input: tensor(string)*** -Scalar string with the path of the image file to read. +1D string tensor of shape `[1]` containing the path of the image file to read. #### Outputs