feat(emoji): generate catalog from Unicode emoji-test.txt #344
Open
dmnyc wants to merge 1 commit into
Replace the hand-maintained `EmojiData` catalog (a frozen port of the Android client) with one generated from the official Unicode 16.0 `emoji-test.txt` via `scripts/generate_emoji_data.py`. The old list silently lagged new Unicode releases, so common emoji — arrows, keycap digits, money, many flags, and newer additions like the splatter — were missing without anyone noticing. The catalog now covers ~1,869 emoji across 9 categories (adds a dedicated Flags category with all country/region/subdivision flags). Skin-tone variants are excluded (the app applies tones at render time via its own selector — separate change), as are clock-face variants and Japanese ideograph buttons. Only fully-qualified forms are emitted, so entries are in canonical reaction-matching form. Re-run the script to pull in future Unicode releases. `EmojiData`'s public API (`categories`, `allEmojis`, `searchEmojis`, `defaultQuickReactions`) is unchanged, so consumers need no changes.
Summary
Replace the hand-maintained `EmojiData` catalog (a frozen port of the Android client) with one generated from the official Unicode 16.0 `emoji-test.txt` via `scripts/generate_emoji_data.py`. The old list silently lagged new Unicode releases, so common emoji (arrows, keycap digits, money, many flags, and newer additions like splatter) were missing without anyone noticing.

The catalog now covers ~1,869 emoji across 9 categories (adds a dedicated Flags category with all country/region/subdivision flags). Skin-tone variants are excluded (the app applies tones at render time via its own selector; see #304 / PR C), as are clock-face variants and Japanese ideograph buttons. Only fully-qualified forms are emitted, so entries are in canonical reaction-matching form. Re-run the script to pull in future Unicode releases.
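The filtering described above can be illustrated with a minimal Python sketch. It is not the PR's `scripts/generate_emoji_data.py`, whose internals are not shown here. The names `parse_emoji_test`, `LINE_RE`, `SKIN_TONES`, and `SAMPLE` are hypothetical, and the sketch covers only the fully-qualified and skin-tone rules. The clock-face and ideograph exclusions are left out because the PR does not say how they are identified.

```python
import re

# Fitzpatrick skin-tone modifiers occupy U+1F3FB..U+1F3FF.
SKIN_TONES = set(range(0x1F3FB, 0x1F400))

# Data lines in emoji-test.txt look like:
#   1F600 ; fully-qualified # 😀 E1.0 grinning face
LINE_RE = re.compile(
    r"^(?P<cps>[0-9A-F ]+?)\s*;\s*(?P<status>[a-z-]+)\s*#\s*"
    r"(?P<emoji>\S+)\s+E\d+\.\d+\s+(?P<name>.+)$"
)

def parse_emoji_test(text):
    """Return {group name: [(emoji, name), ...]} for fully-qualified,
    tone-free entries. Hypothetical helper, not the PR's actual code."""
    catalog = {}
    group = None
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("# group:"):
            group = line[len("# group:"):].strip()
            continue
        if not line or line.startswith("#"):
            continue  # blank lines, subgroup markers, other comments
        m = LINE_RE.match(line)
        if m is None or m["status"] != "fully-qualified":
            continue  # drop unqualified / minimally-qualified / component rows
        code_points = [int(cp, 16) for cp in m["cps"].split()]
        if any(cp in SKIN_TONES for cp in code_points):
            continue  # skin-tone variants are applied at render time instead
        catalog.setdefault(group, []).append((m["emoji"], m["name"]))
    return catalog

# A few lines in the emoji-test.txt layout, used only for illustration.
SAMPLE = """\
# group: Smileys & Emotion
# subgroup: face-smiling
1F600 ; fully-qualified # 😀 E1.0 grinning face
# group: People & Body
1F44B ; fully-qualified # 👋 E0.6 waving hand
1F44B 1F3FB ; fully-qualified # 👋🏻 E1.0 waving hand: light skin tone
# group: Symbols
2764 FE0F ; fully-qualified # ❤️ E0.6 red heart
2764 ; unqualified # ❤ E0.6 red heart
"""

if __name__ == "__main__":
    for group_name, entries in parse_emoji_test(SAMPLE).items():
        print(group_name, [name for _, name in entries])
```

Keying on the status column rather than the character itself means the unqualified `2764` row is dropped while the `2764 FE0F` row is kept, which is what the canonical-form claim relies on.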
Category curation
`GROUP_OVERRIDE` map so re-runs keep this in Symbols.
Search restoration
The generator commit initially dropped `EmojiData.searchEmojis`'s consultation of `CldrEmojiKeywords`, so searches fell back to Unicode-name substring matching only and lost coverage like "vulcan" → 🖖 and "salute" → 🖖. Restored in the same commit by re-wiring `CldrEmojiKeywords.keywordsByEmoji` lookups and adding a small `keywordAliases` table for cultural shorthand that CLDR doesn't ship:
- `spock` / `llap` / `live long and prosper` / `trek` → 🖖
- `pepe` → 🐸
- `lol` / `lmao` → 😂
- `ded` / `dead` → 💀
- `fire` / `lit` → 🔥
- `hundred` / `perfect` / `based` → 💯
- `o7` → 🫡
- `bored` → 🥱
- `clown` → 🤡
- `eyes` / `looking` → 👀
- `zap` / `bolt` / `lightning` → ⚡
- `bitcoin` / `orange` → 🟠 / 🟧

`EmojiData`'s public API (`categories`, `allEmojis`, `searchEmojis`, `defaultQuickReactions`) is unchanged. Existing call sites (post-card reaction display, picker grids) need no changes.
Files
- `EmojiData.swift`: replaced; retains hand-curated `defaultQuickReactions` and `keywordAliases`, everything else is generator output
- `scripts/generate_emoji_data.py`: new generator script with `GROUP_MAP` (display labels + tab icons) and `GROUP_OVERRIDE` (per-emoji category corrections)
- `scripts/emoji-test-16.0.txt`: new, Unicode 16.0 source data (5,331 lines)
Test plan