[Bug] sanitizeText silently strips emoji and CJK Extension B characters #30

### OpenClaw Version
N/A - code-trace bug, no runtime required.

### Plugin Version
0.3.4 (current main, package.json)

### Operating System
Any. The bug is in a JS regex and is OS/platform independent.

### Describe the bug
`UNSAFE_CHAR_RE` at `src/offload/storage.ts:165` includes the full surrogate range `[\uD800-\uDFFF]` but the regex has no `u` flag. Because JS strings are UTF-16, every non-BMP code point (emoji, CJK Extension B, math bold, etc.) is stored as a surrogate PAIR, and the regex strips each half independently. `sanitizeText` and `sanitizeJsonLine` therefore destroy any non-BMP character in tool params, tool results, and ref-md archives written by the offload pipeline.
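
To make the mechanism concrete, here is a minimal sketch that isolates only the surrogate range from the character class (the rest of `UNSAFE_CHAR_RE` is left out, and the variable names are illustrative). It shows that 𠮷 occupies two UTF-16 code units, that the class without `u` matches each half separately, and that the same class with `u` does not match the well-formed pair at all.

```javascript
// U+20BB7 (𠮷) is outside the BMP, so JS stores it as two UTF-16 code units.
const ch = "\u{20BB7}";

// Hex value of each code unit: high surrogate, then low surrogate.
const units = Array.from({ length: ch.length }, (_, i) =>
  ch.charCodeAt(i).toString(16)
);

// Without the u flag the class is tested against each code unit on its own.
const matchesWithoutU = ch.match(/[\uD800-\uDFFF]/g);

// With the u flag the pair is read as one code point (U+20BB7), which is
// outside the D800-DFFF range, so nothing matches and match() returns null.
const matchesWithU = ch.match(/[\uD800-\uDFFF]/gu);

console.log(ch.length);              // 2
console.log(units);                  // [ 'd842', 'dfb7' ]
console.log(matchesWithoutU.length); // 2
console.log(matchesWithU);           // null
```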

### To Reproduce
```
node -e '
// Reduced to the surrogate range of UNSAFE_CHAR_RE, which is the part that matters here.
const re = /[\uD800-\uDFFF]/g;
console.log(JSON.stringify("CJK ext-B \u{20BB7} here".replace(re, "")));'
// prints: "CJK ext-B  here"   (the 𠮷 character is gone)
```

Same problem for 🎉 (U+1F389), 𝐀 (U+1D400, math bold A), and every other supplementary character.

### Expected behavior
CJK Extension B 𠮷 and other non-BMP characters should pass through `sanitizeText` unchanged. The original intent of including `[\uD800-\uDFFF]` in the class is to strip LONE (malformed) surrogates, not to destroy well-formed supplementary characters.

### Error Logs / Screenshots
Silent data corruption - there is no log, the characters just disappear from the offloaded JSONL entries and ref-md files.

### Additional context
Affected callers (all in `src/offload/`):
- `sanitizeText`: `index.ts:429-430, 459-460` (every tool-call params/result)
- `sanitizeJsonLine`: `storage.ts:174` (every JSONL line written via `safeStringifyEntry`)
- `parseJsonlSafe`: `storage.ts:232` (second-pass strip on read)
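
The JSONL path can be illustrated with a minimal sketch. It assumes a line is produced with `JSON.stringify` and then run through a surrogate-range strip; `stripNoU` and the `entry` fields are hypothetical stand-ins, not the plugin's `sanitizeJsonLine` or its real entry shape. The stripped line still parses as valid JSON, which is why nothing downstream notices the loss.

```javascript
// Hypothetical stand-in for the strip step: surrogate range only, no u flag.
const stripNoU = (line) => line.replace(/[\uD800-\uDFFF]/g, "");

// Hypothetical entry shape; the real offload entries may look different.
const entry = { tool: "echo", result: "party \u{1F389} done" };

// JSON.stringify emits well-formed supplementary characters as-is,
// so the serialized line contains the raw surrogate pair for U+1F389.
const line = JSON.stringify(entry);
const stripped = stripNoU(line);

// The stripped line is still valid JSON, so parsing does not throw.
const roundTripped = JSON.parse(stripped);

console.log(roundTripped.result === entry.result); // false
console.log(JSON.stringify(roundTripped.result));  // "party  done"
```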

Introduced in commit db8f3e51 (v0.3.3 release).

Suggested fix: add the `u` flag. With `u`, `[\uD800-\uDFFF]` matches only lone surrogates because paired surrogates have already been combined into a single code point.
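
A minimal sketch of the suggested change, again limited to the surrogate range. `LONE_SURROGATE_RE` and `stripLoneSurrogates` are hypothetical names used for illustration; the real `UNSAFE_CHAR_RE` contains additional ranges that would need to be checked for validity under the `u` flag.

```javascript
// Surrogate range only, with the u flag: matches lone surrogates, not pairs.
const LONE_SURROGATE_RE = /[\uD800-\uDFFF]/gu;

// Hypothetical helper, not the plugin's sanitizeText.
const stripLoneSurrogates = (text) => text.replace(LONE_SURROGATE_RE, "");

// Well-formed supplementary characters pass through unchanged.
const cjk = stripLoneSurrogates("CJK ext-B \u{20BB7} here");
const emoji = stripLoneSurrogates("\u{1F389}");

// A lone high surrogate (malformed input) is still removed.
const lone = stripLoneSurrogates("a\uD83Cb");

console.log(cjk === "CJK ext-B \u{20BB7} here"); // true
console.log(emoji === "\u{1F389}");              // true
console.log(lone);                               // ab
```

A regression test along these lines, covering both a well-formed pair and a lone surrogate, would guard against the flag being dropped again.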