  • Seegene
  • Seoul

kmink3225/README.md

Kwangmin Kim (김광민)

AI Engineer / Data Scientist · 7+ years

I architect and build enterprise AI platforms end-to-end — RAG, LLM agents, and NLP — and I back them with statistically rigorous evaluation. My background is biostatistics (Columbia) applied to diagnostics, so I care as much about measuring a system as building it.

  • AI/LLM Engineer — technical lead / architect on a company-wide multi-agent RAG platform (Self-RAG / CRAG / Graph RAG), self-built orchestration harnesses, and LLM-as-judge evaluation pipelines. I build systems that are grounded and measured, not demos.
  • Data Scientist — experiment design, causal inference, survival analysis, and deep learning / NLP, with 1,900+ in-depth technical articles documenting the theory behind the practice.

🔧 What I work with

Python · R · SQL · RAG / Agentic RAG / Graph RAG · LangChain · LangGraph · Azure OpenAI · Azure AI Search · PyTorch · Hugging Face Transformers · KLUE-RoBERTa / KoBERT / ALBERT · scikit-learn · FastAPI · Airflow · Docker · Quarto


📈 Selected results

| Impact | Metric |
| --- | --- |
| Enterprise knowledge QnA chatbot (9 sub-agent Self-RAG/CRAG) | ~98% user satisfaction · 4.66s avg response · 96.9% citation rate |
| Self-built agent orchestration vs. general-purpose CLI (7-variant benchmark, ~400K-line codebase grounded into a code graph) | composite 0.977 (1st) at up to ~17× lower cost per query (paired t-test / McNemar / bootstrap CI) |
| NLP-based data standardization system | validation time 8h → 0.73s (99%↓) · metadata consistency 8.4% → 98.7% · completeness → 100% |
| Domain classifier (8-model benchmark, 14 classes) | KLUE-RoBERTa 96.88%; a 671K-param BiLSTM proved on par with a 110M model at 1.48ms inference |
| FDA-submission statistical V&V automation | validation 6 months → 3 weeks (87.5%↓) at 99.2% confidence |
| PCR signal baseline correction (data-driven redesign) | false-negative rate 0.47% → 0.04% (91.49%↓) |
| Diagnostic-equipment QC automation (LSTM, 61,248 signals) | QC time ~93%↓, ~13× annual operating-cost reduction |
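
The "%↓" figures above are relative reductions from a before value to an after value. As a minimal sketch, the snippet below reproduces two of them from the numbers quoted in the table. The helper name `percent_reduction` is hypothetical, and the 6-month figure assumes one month is counted as 4 weeks (24 weeks total), which is an assumption rather than something stated here.

```python
def percent_reduction(before: float, after: float) -> float:
    """Relative reduction from `before` to `after`, as a percentage."""
    if before == 0:
        raise ValueError("before must be non-zero")
    return (before - after) / before * 100.0


# PCR false-negative rate: 0.47% -> 0.04%
pcr = percent_reduction(0.47, 0.04)

# V&V validation time: 6 months -> 3 weeks.
# Assumption: 1 month = 4 weeks, so 6 months = 24 weeks.
vv = percent_reduction(6 * 4, 3)

print(f"PCR false-negative reduction: {pcr:.2f}%")
print(f"V&V validation-time reduction: {vv:.1f}%")
```

Under those assumptions the two values round to the 91.49% and 87.5% shown in the table.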

7 patents filed (first inventor on 4) · President's Award (R&D), Seegene · Chair's Award, Columbia Biostatistics


🎓 Background

  • M.S. Biostatistics, Columbia University — Chair's Award
  • B.A. Mathematics, Baruch College (CUNY)
  • B.S. Biochemistry, Kangwon National University — Valedictorian
  • Alzheimer's multi-omics biomarker research, Columbia / Taub Institute

🔎 Go deeper


📫 Contact

Email · LinkedIn · GitHub

Popular repositories

  1. website (Public)

    HTML

  2. quarto_error (Public archive)

    JavaScript

  3. claude-code (Public)

    Forked from ultraworkers/claw-code

    An independent Python feature port of Claude Code, rewritten entirely from scratch using oh-my-codex. For educational purposes only.

    Python

  4. ai-seminar (Public)

    AI knowledge-sharing seminar archive — from prompt engineering to agent implementation, with shared slash-command tooling across Claude Code, Gemini CLI, and GitHub Copilot.

    Python

  5. kmink3225 (Public)

    AI Engineer / Data Scientist — profile

  6. kmink3225.github.io (Public)

    Personal site — Kwangmin Kim, AI Engineer / Data Scientist (al-folio)

    HTML