Uses spaCy to build a Named Entity Recognition (NER) model for Chinese food terms in transcripts.
ai_xing_story.txt - non-food text (motorcycle injury)
aijian.txt - transcript of a recipe
chef_wang1.txt - restaurant review
shunde.txt - vlog review of the Shunde food scene
sixi_wanzi.txt - recipe generated by AI from a transcript
sixi_wanzi_chinese.txt - translation of sixi_wanzi.txt
chinese_food_entities.json - entity definitions; edit this file to change the entities
build_chinese_food_ner_model.py - builds the model from the JSON file
chinese_food_analysis.py - CLI tool for analysis with NER
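
The README does not show the layout of chinese_food_entities.json or say whether the build script uses a rule-based matcher or trains a statistical model. As a rough sketch only, assuming the file maps each label to a list of terms (a hypothetical schema), the entries could be flattened into label/pattern dicts like this. The dict shape is the one spaCy's EntityRuler accepts through add_patterns, but the sample data and the load_patterns helper are illustrative, not taken from the repo.

```python
import json

# Hypothetical contents of chinese_food_entities.json: label -> list of terms.
# The real file's schema may differ.
SAMPLE_JSON = """
{
  "PROTEIN": ["牛肉", "猪肉"],
  "VEGETABLE": ["白菜"],
  "SPICE": ["花椒"]
}
"""


def load_patterns(raw_json):
    """Flatten {label: [terms]} into a list of {"label", "pattern"} dicts."""
    entities = json.loads(raw_json)
    patterns = []
    for label, terms in entities.items():
        for term in terms:
            patterns.append({"label": label, "pattern": term})
    return patterns


patterns = load_patterns(SAMPLE_JSON)
for p in patterns:
    print(p["label"], p["pattern"])
```

Under this assumption, adding or removing a term in the JSON changes the pattern list without touching any code.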

usage: chinese_food_analysis.py [-h] (--text TEXT | --file FILE) [--model_path MODEL_PATH]

Analyzes Chinese food text using a custom spaCy NER model.

options:
  -h, --help            show this help message and exit
  --text TEXT           A single string of Chinese text to analyze.
  --file FILE           A text file to analyze (one sentence per line).
  --model_path MODEL_PATH
                        Path to the saved spaCy model.
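
The usage line shows that exactly one of --text or --file must be given. A minimal argparse sketch that would produce a matching interface is below; it is not the tool's actual source, and the default value for --model_path is a placeholder assumption.

```python
import argparse


def build_parser():
    parser = argparse.ArgumentParser(
        prog="chinese_food_analysis.py",
        description="Analyzes Chinese food text using a custom spaCy NER model.",
    )
    # Required mutually exclusive group: renders as (--text TEXT | --file FILE).
    source = parser.add_mutually_exclusive_group(required=True)
    source.add_argument("--text", help="A single string of Chinese text to analyze.")
    source.add_argument("--file", help="A text file to analyze (one sentence per line).")
    # The default path here is hypothetical; the real default is not documented.
    parser.add_argument(
        "--model_path",
        default="./chinese_food_ner_model",
        help="Path to the saved spaCy model.",
    )
    return parser


args = build_parser().parse_args(["--text", "中碗的话就四坨牛肉"])
print(args.text, args.file, args.model_path)
```

Passing both --text and --file, or neither, makes argparse print an error and exit.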

牛肉      PROTEIN     2      4

--- Line 234: 中碗的话就四坨牛肉 ---
Entity    Label       Start  End
------------------------------------------
牛肉      PROTEIN     7      9

--- Line 241: 不吃阆中的这一碗牛肉面 ---
Entity    Label       Start  End
------------------------------------------
牛肉      PROTEIN     8      10

==============================
Entity Category Summary
==============================
Category            Count
------------------------------
PROTEIN                22
VEGETABLE              16
SPICE                   9
TECHNIQUE               7
SAUCE                   5
INSTRUCTION             1
MEASUREMENT             1
------------------------------
Total                  61
==============================
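
The Start and End columns in the sample output are consistent with 0-based character offsets where End is exclusive, and the summary lists categories by descending count. The sketch below reproduces that reporting logic with a plain dictionary lookup standing in for the spaCy model. LEXICON and find_entities are hypothetical names used only for illustration.

```python
from collections import Counter

# Hypothetical term -> label lookup standing in for the trained model.
LEXICON = {"牛肉": "PROTEIN", "白菜": "VEGETABLE", "花椒": "SPICE"}


def find_entities(line):
    """Return (text, label, start, end) tuples; end is an exclusive character offset."""
    found = []
    for term, label in LEXICON.items():
        start = line.find(term)
        while start != -1:
            found.append((term, label, start, start + len(term)))
            start = line.find(term, start + len(term))
    # Report entities in the order they appear in the line.
    return sorted(found, key=lambda e: e[2])


lines = ["中碗的话就四坨牛肉", "不吃阆中的这一碗牛肉面"]
counts = Counter()
for line in lines:
    for text, label, start, end in find_entities(line):
        print(text, label, start, end)
        counts[label] += 1

# Per-category totals, most frequent first, followed by the grand total.
for label, n in counts.most_common():
    print(f"{label:<20}{n:>5}")
print(f"{'Total':<20}{sum(counts.values()):>5}")
```

For the two sample lines this yields the same offsets as the tool's output above (7 to 9 and 8 to 10).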