From fd35409fbd090259b30908fc46c400413149bbd9 Mon Sep 17 00:00:00 2001
From: "mintlify[bot]" <109931778+mintlify[bot]@users.noreply.github.com>
Date: Sun, 12 Apr 2026 05:02:14 +0000
Subject: [PATCH] Document Arabizi script mirroring for reply endpoint and
dialect detection
Generated-By: mintlify-agent
---
api-reference/comments-reply.mdx | 70 +++++++++++++++++++++++++++++
api-reference/detect.mdx | 15 +++++++
changelog.mdx | 11 +++++
guides/arabic-dialect-detection.mdx | 27 +++++++++++
4 files changed, 123 insertions(+)
diff --git a/api-reference/comments-reply.mdx b/api-reference/comments-reply.mdx
index d6321d7..37b5f5a 100644
--- a/api-reference/comments-reply.mdx
+++ b/api-reference/comments-reply.mdx
@@ -7,6 +7,8 @@ api: "POST https://api.trynawa.com/v1/comments/{id}/reply"
Generate an AI-powered reply that matches the commenter's language and cultural context. For Arabic comments, replies match the detected dialect (Gulf, Egyptian, Levantine, MSA). For English comments, replies are natural and platform-appropriate. Language is auto-detected unless overridden.
+When a comment is written in **Arabizi** (Latin-script Arabic with number substitutions like `7abibi`, `3ashan`, `9ba7`), NAWA automatically detects the script and replies in the same Arabizi style. The reply mirrors the commenter's dialect register and uses matching Latin-letter and digit conventions. This applies to all supported dialects: Gulf, Egyptian, Levantine, and Iraqi.
+
Cost: **$0.008** per request (8 credits). Semantic cache hits are free (`X-NAWA-Cache: HIT`).
@@ -89,3 +91,71 @@ result = nawa.comments.reply(
| `tone` | string | The tone used for the reply |
| `original_intent` | string | Detected intent of the original comment |
| `original_dialect` | string | Detected dialect of the original comment |
+
+## Arabizi script mirroring
+
+When the original comment is written in Arabizi, the reply is generated in the same Latin-script format. NAWA detects the number-letter conventions (e.g. `7` for ح, `3` for ع, `5` for خ) and the dialect register, then instructs the model to reply conversationally in Arabizi.
+
+### Arabizi example request
+
+
+
+```bash cURL
+curl -X POST https://api.trynawa.com/v1/comments/cmt_xyz789/reply \
+ -H "Authorization: Bearer nawa_test_sk_xxx" \
+ -H "Content-Type: application/json" \
+ -d '{
+ "tone": "friendly",
+ "context": "Cooking channel focused on Gulf recipes"
+ }'
+```
+
+```typescript TypeScript
+const { data, error } = await nawa.comments.reply('cmt_xyz789', {
+ tone: 'friendly',
+ context: 'Cooking channel focused on Gulf recipes'
+})
+```
+
+```python Python
+result = nawa.comments.reply(
+ comment_id="cmt_xyz789",
+ tone="friendly",
+ context="Cooking channel focused on Gulf recipes"
+)
+```
+
+
+
+If the original comment was `"9ba7ooo ya 7abibi el video wayid 7elw"` (Gulf Arabizi), the response looks like:
+
+```json
+{
+ "success": true,
+ "result": {
+ "comment_id": "cmt_xyz789",
+ "reply_text": "teslam ya 7abibi! wayid yesaadni asma3 chithii, el jay a7san inshallah 🔥",
+ "reply_dialect": "gulf",
+ "tone": "friendly",
+ "original_intent": "praise",
+ "original_dialect": "gulf"
+ },
+ "errors": [],
+ "request_id": "req_rep456abc789"
+}
+```
+
+
+ Arabizi replies are routed to Claude, which handles Latin-script Arabic output. The API shape and pricing are identical to standard Arabic replies.
+
+
+### Supported Arabizi conventions
+
+| Number | Arabic letter | Example |
+|--------|--------------|---------|
+| `7` | ح (ha) | `7abibi` = حبيبي |
+| `3` | ع (ain) | `3ashan` = عشان |
+| `5` | خ (kha) | `5alas` = خلاص |
+| `6` | ط (ta) | `6ayeb` = طيب |
+| `9` | ص (sad) | `9ba7` = صباح |
+| `2` | ء (hamza) | `2ana` = أنا |
diff --git a/api-reference/detect.mdx b/api-reference/detect.mdx
index 2169b2c..26868c7 100644
--- a/api-reference/detect.mdx
+++ b/api-reference/detect.mdx
@@ -7,6 +7,8 @@ api: "POST https://api.trynawa.com/v1/detect"
Detect language, dialect, script, and text direction from any text input. Uses local NAGL modules only -- no external AI call, so responses return in under 100ms.
+Arabizi (Latin-script Arabic with number substitutions like `7abibi`, `3ashan`) is also detected. The dialect is identified from vocabulary markers even when the text is written entirely in Latin characters.
+
Cost: **$0.002** per request (2 credits). Semantic cache hits are free (`X-NAWA-Cache: HIT`).
@@ -139,6 +141,19 @@ curl -X POST https://api.trynawa.com/v1/detect \
-H "Authorization: Bearer nawa_test_sk_xxx" \
-H "Content-Type: application/json" \
-d '{"text": "awesome محتوى"}'
+```
+
+
+
+ **Input:** "9ba7ooo ya 7abibi el video wayid 7elw"
+ **Result:** language: `ar`, dialect: `gulf`, script: `latin`, direction: `ltr`
+
+ Arabizi text uses only Latin characters and digits, so `script` returns `latin` and `direction` returns `ltr`. The dialect is still detected from vocabulary markers: here, `wayid` and `7elw` indicate Gulf.
+```bash
+curl -X POST https://api.trynawa.com/v1/detect \
+ -H "Authorization: Bearer nawa_test_sk_xxx" \
+ -H "Content-Type: application/json" \
+ -d '{"text": "9ba7ooo ya 7abibi el video wayid 7elw"}'
```
diff --git a/changelog.mdx b/changelog.mdx
index 43e9d75..0f28d32 100644
--- a/changelog.mdx
+++ b/changelog.mdx
@@ -4,6 +4,17 @@ description: "NAWA API platform updates and releases"
rss: true
---
+
+## Improved Arabizi reply mirroring
+
+Replies to Arabizi (Latin-script Arabic) comments now more reliably mirror the commenter's script. Previously, replies could fall back to structured Arabic-script output. NAWA now prioritizes the Arabizi script-mirroring instruction, producing conversational Latin-script replies that match the commenter's dialect and number-letter conventions (`7`, `3`, `5`, etc.).
+
+- **Reply endpoint** (`/v1/comments/:id/reply`) -- Arabizi input consistently produces Arabizi output across Gulf, Egyptian, Levantine, and Iraqi dialects
+- **No API changes** -- request and response shapes are unchanged. This is a quality improvement to reply generation.
+- [Arabizi script mirroring docs](/api-reference/comments-reply#arabizi-script-mirroring)
+- [Arabizi dialect detection](/guides/arabic-dialect-detection#arabizi-latin-script-arabic)
+
+
## Intelligence Report API + English support
diff --git a/guides/arabic-dialect-detection.mdx b/guides/arabic-dialect-detection.mdx
index 3c63a24..9294dd8 100644
--- a/guides/arabic-dialect-detection.mdx
+++ b/guides/arabic-dialect-detection.mdx
@@ -129,3 +129,30 @@ curl -X POST https://api.trynawa.com/v1/feedback \
```
RLHF feedback is incorporated into model fine-tuning cycles, continuously improving accuracy across dialects.
+
+## Arabizi (Latin-script Arabic)
+
+NAWA also detects **Arabizi**, an informal Latin-script writing system, common on social media, in which Arabic speakers substitute digits for Arabic letters that have no Latin equivalent. Common conventions include `7` for ح, `3` for ع, `5` for خ, and `9` for ص.
+
+### How Arabizi detection works
+
+The NAGL pipeline identifies Arabizi by scanning for Latin-letter tokens that contain digit substitutions (`7abibi`, `3ashan`, `9ba7`) and matching them against a built-in Arabizi dictionary. When Arabizi is detected:
+
+1. **Classification endpoints** (`/v1/classify`, `/v1/rubric/classify`) transliterate the text internally before sending it to the LLM, so dialect and intent classification work correctly.
+2. **Reply endpoint** (`/v1/comments/:id/reply`) mirrors the Arabizi script in the response. If a commenter writes in Gulf Arabizi, the generated reply comes back in the same Latin-script format.
+3. **Detect endpoint** (`/v1/detect`) returns the detected language and dialect. Text written entirely in Arabizi returns `script: "latin"` and `direction: "ltr"`; Arabizi mixed with Arabic-script text returns `script: "mixed"`.
+
+### Supported Arabizi dialects
+
+Arabizi detection covers four dialect groups, each identified by characteristic vocabulary markers:
+
+| Dialect | Example Arabizi | Markers |
+|---------|----------------|---------|
+| **Gulf** | `shlonak, wayid 7elw, ya7lailak` | `shlon`, `wayid`, `7elw` |
+| **Egyptian** | `ezayak, gamed, 3ashan, ba2a` | `ezayak`, `gamed`, `3ashan` |
+| **Levantine** | `kifak, shu, wen, ktir` | `kifak`, `shu`, `wen` |
+| **Iraqi** | `shaku maku, shlon, aku` | `shaku`, `maku`, `shlon` |
+
+
+ Arabizi is most common in YouTube comments, Twitter replies, and chat-style platforms. If your audience uses these platforms, expect a mix of Arabic-script and Arabizi comments. NAWA handles both transparently with no changes to your API calls.
+