Data Scientist | NLP & LLM Evaluation
Master of Information Technology, Whitireia New Zealand (2025). Research focus: readability assessment of content generated by large language models, with particular attention to age-appropriate narrative texts.
Multidisciplinary background: linguistics → marketing → big data → applied IT research.
Based in Moscow.
- Readability of LLM-generated narrative text
- Comparative evaluation of LLMs for children's content generation
- Applied statistical analysis (ANOVA, Tukey HSD) on NLP output
- Classical readability metrics: Flesch–Kincaid, Coleman–Liau, SMOG, Gunning Fog, ARI, Flesch Reading Ease
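As a sketch of the idea behind these metrics, the Flesch Reading Ease score can be computed directly from sentence, word, and syllable counts. The syllable counter below is a rough vowel-group heuristic (real libraries such as textstat use more careful rules), so treat this as illustrative only:

```python
import re

def count_syllables(word: str) -> int:
    # Crude heuristic: one syllable per contiguous vowel group, minimum one.
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))

def flesch_reading_ease(text: str) -> float:
    # FRE = 206.835 - 1.015 * (words/sentences) - 84.6 * (syllables/words)
    sentences = max(1, len(re.findall(r"[.!?]+", text)))
    words = re.findall(r"[A-Za-z']+", text)
    n_words = max(1, len(words))
    syllables = sum(count_syllables(w) for w in words)
    return 206.835 - 1.015 * (n_words / sentences) - 84.6 * (syllables / n_words)

# Higher scores mean easier text; short monosyllabic sentences score very high.
print(flesch_reading_ease("The cat sat on the mat."))
```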
Master's research (2025). Comparative analysis of narratives produced by 19 large language models against 16 classical fairy tales, evaluated with textstat and statsmodels.
- Story-generation-process — generation pipeline across 19 LLMs and 16 source tales
- LLM_readability_score_analysis — statistical evaluation (ANOVA, Tukey HSD) of readability metrics
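To show the core of the statistical step, a one-way ANOVA F-statistic comparing readability scores across groups (e.g. one group per model) can be written out in plain Python. This is only a sketch of the computation with made-up numbers; the analysis itself relies on statsmodels (ANOVA, Tukey HSD):

```python
def one_way_anova_f(groups: list[list[float]]) -> float:
    """F = (between-group SS / (k-1)) / (within-group SS / (N-k))."""
    all_vals = [v for g in groups for v in g]
    grand_mean = sum(all_vals) / len(all_vals)
    k, n = len(groups), len(all_vals)
    # Between-group sum of squares: group sizes times squared mean deviations.
    ssb = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups)
    # Within-group sum of squares: deviations from each group's own mean.
    ssw = sum((v - sum(g) / len(g)) ** 2 for g in groups for v in g)
    return (ssb / (k - 1)) / (ssw / (n - k))

# Two hypothetical score groups with clearly different means -> large F.
print(one_way_anova_f([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]))
```

A large F relative to the F-distribution's critical value indicates the group means differ; Tukey HSD then identifies which specific pairs differ.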
Key finding: no single model fully replicates the readability of authentic fairy tales, but individual LLMs excel in specific readability dimensions.
| Area | Tools |
|---|---|
| Python core | pandas, numpy, scikit-learn |
| LLM APIs | OpenAI, Anthropic, HuggingFace |
| Deep Learning | PyTorch, TensorFlow |
| Statistical analysis | statsmodels, scipy |
| Visualization | matplotlib, plotly, seaborn |
| Other | SQL, R, Jupyter, Google Colab, Git |
Languages: English · Russian