jon2allen/spacy_china_food

spacy_china_food

Uses spaCy to build a Named Entity Recognition (NER) model for Chinese food terms in transcripts.

Test data:

   ai_xing_story.txt  - non-food text about a motorcycle injury
   aijian.txt  -  transcript of a recipe
   chef_wang1.txt - restaurant review
   shunde.txt - vlog review of the Shunde food scene
   sixi_wanzi.txt - recipe generated by AI from a transcript
   sixi_wanzi_chinese.txt - translation of the recipe above

JSON for NER

chinese_food_entities.json - the entity definitions the model is built from; edit this file to change them
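The layout of chinese_food_entities.json is not shown here, so the following is only a sketch under an assumed schema: a JSON object mapping each label to a list of terms. It flattens that mapping into `{"label", "pattern"}` dictionaries, which is the shape spaCy's EntityRuler accepts for phrase patterns. The sample terms and the `to_patterns` helper are hypothetical, and the actual build script may work differently (for example, by training a statistical NER component).

```python
import json

# ASSUMPTION: label -> list of surface forms. The real
# chinese_food_entities.json may be organised differently.
SAMPLE_JSON = """
{
  "PROTEIN": ["牛肉", "猪肉"],
  "VEGETABLE": ["白菜"],
  "SPICE": ["花椒"]
}
"""


def to_patterns(entities):
    """Flatten {label: [terms]} into a list of {"label", "pattern"} dicts."""
    patterns = []
    for label, terms in entities.items():
        for term in terms:
            patterns.append({"label": label, "pattern": term})
    return patterns


entities = json.loads(SAMPLE_JSON)
patterns = to_patterns(entities)
for p in patterns:
    print(p["label"], p["pattern"])
```

Keeping terms grouped by label in the JSON file makes it easy to add or remove a food term without touching any code.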

Programs

build_chinese_food_ner_model.py - builds the model from the JSON file.

chinese_food_analysis.py - CLI tool for analysis with NER.

  usage: chinese_food_analysis.py [-h] (--text TEXT | --file FILE) [--model_path MODEL_PATH]
  Analyzes Chinese food text using a custom spaCy NER model.
  options:
     -h, --help            show this help message and exit
     --text TEXT           A single string of Chinese text to analyze.
     --file FILE           A text file to analyze (one sentence per line).
     --model_path MODEL_PATH
                           Path to the saved spaCy model.
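The usage text above implies a required, mutually exclusive choice between `--text` and `--file`, plus an optional `--model_path`. A minimal argparse sketch that would produce a similar interface is shown below. The default model path and the `iter_sentences` helper are assumptions for illustration, not taken from the actual script.

```python
import argparse


def build_parser():
    parser = argparse.ArgumentParser(
        prog="chinese_food_analysis.py",
        description="Analyzes Chinese food text using a custom spaCy NER model.",
    )
    # Exactly one of --text / --file must be supplied.
    source = parser.add_mutually_exclusive_group(required=True)
    source.add_argument("--text", help="A single string of Chinese text to analyze.")
    source.add_argument("--file", help="A text file to analyze (one sentence per line).")
    # ASSUMPTION: the real default path is not documented; "./model" is a placeholder.
    parser.add_argument("--model_path", default="./model",
                        help="Path to the saved spaCy model.")
    return parser


def iter_sentences(args):
    """Yield (line_number, sentence) pairs from --text or --file."""
    if args.text is not None:
        yield 1, args.text
    else:
        with open(args.file, encoding="utf-8") as fh:
            for number, line in enumerate(fh, 1):
                line = line.strip()
                if line:  # skip blank lines
                    yield number, line


args = build_parser().parse_args(["--text", "不吃阆中的这一碗牛肉面"])
for number, sentence in iter_sentences(args):
    print(number, sentence)
```

Passing both `--text` and `--file`, or neither, makes argparse exit with a usage error, which matches the `(--text TEXT | --file FILE)` notation in the help output.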

Sample output (excerpt):

牛肉             PROTEIN            2    4
--- Line 234: 中碗的话就四坨牛肉 ---
Entity         Label          Start  End
------------------------------------------
牛肉             PROTEIN            7    9
--- Line 241: 不吃阆中的这一碗牛肉面 ---
Entity         Label          Start  End
------------------------------------------
牛肉             PROTEIN            8   10
==============================
   Entity Category Summary
==============================
Category            Count
------------------------------
PROTEIN             22
VEGETABLE           16
SPICE               9
TECHNIQUE           7
SAUCE               5
INSTRUCTION         1
MEASUREMENT         1
------------------------------
Total               61
==============================
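In the sample, the Start and End columns look like character offsets into each line, with End exclusive: 牛肉 occupies characters 7 to 9 of 中碗的话就四坨牛肉. spaCy exposes such offsets as `ent.start_char` and `ent.end_char`, which is presumably where the tool gets them. The sketch below reproduces the offsets and the category tally in plain Python, using a simple substring search in place of the model. The `TERMS` table and `find_entities` helper are hypothetical stand-ins.

```python
from collections import Counter

# ASSUMPTION: a tiny stand-in for the trained model's vocabulary.
TERMS = {"牛肉": "PROTEIN"}


def find_entities(text, terms):
    """Return (term, label, start, end) tuples, end exclusive, sorted by start."""
    hits = []
    for term, label in terms.items():
        start = text.find(term)
        while start != -1:
            hits.append((term, label, start, start + len(term)))
            start = text.find(term, start + len(term))
    return sorted(hits, key=lambda hit: hit[2])


lines = ["中碗的话就四坨牛肉", "不吃阆中的这一碗牛肉面"]
counts = Counter()
for text in lines:
    for term, label, start, end in find_entities(text, TERMS):
        print(f"{term:<15}{label:<15}{start:>5}{end:>5}")
        counts[label] += 1

# Category summary, most frequent label first.
print(f"{'Category':<20}Count")
for label, count in counts.most_common():
    print(f"{label:<20}{count}")
print(f"{'Total':<20}{sum(counts.values())}")
```

A `Counter` keyed by label is enough to produce the summary, and `most_common()` gives the descending order seen in the sample.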
