abrangel/Kenryu

KENRYU

High-precision bioinformatics platform for microRNA research and biomarker discovery
Genomic Convergence · PubMed Evidence · Functional Enrichment · Automated Scientific Reporting

⚠️ Medical Disclaimer

KENRYU is for Research Use Only (RUO). It is not a substitute for professional medical judgment, individualized patient evaluation, local clinical guidelines, or review by a qualified healthcare professional. Evidence comes from public databases (NCBI, miRTarBase, TargetScan, STRING-DB) and must be rigorously verified before any therapeutic or research intervention. This platform does not store or process PII or PHI.


🔬 About the Project

MicroRNAs (miRNAs) are post-transcriptional regulators that control the expression of complex gene networks via targeted silencing. Identifying which genes are simultaneously regulated by multiple miRNAs is fundamental to discovering critical nodes in pathologies such as cancer, cardiovascular diseases, neurodegeneration, and metabolic disorders.

KENRYU automates the entire discovery workflow in one platform:

| Step | Action | Sources |
|---|---|---|
| 1 | 🔗 Integrate thermodynamic predictions + experimental validation | TargetScan · miRTarBase |
| 2 | 🎯 Identify dynamic Venn intersection of core regulated genes | Local DB + Harmonizome |
| 3 | 🧬 Enrich targets functionally with balancing algorithm | KEGG · Reactome · WikiPathways · GO |
| 4 | 📡 Investigate each core gene across 4 sources in parallel | PubMed · OMIM · ClinVar · ClinicalTrials |
| 5 | 📄 Generate editable academic report with Vancouver bibliography | PDF · Markdown ZIP · TXT |

Unlike traditional pipelines, KENRYU uses a search cascade with fallback (strict → expanded → minimal) and real-time Spanish-to-English (ES→EN) translation to work around PubMed's limitations with non-English queries. It is optimized for Hugging Face Spaces and handles NCBI eUtils rate limits with automatic retries.
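The cascade can be pictured as a loop over progressively looser year windows. The sketch below is only an illustration and not KENRYU's actual code: the function name `search_with_cascade`, the injected `search` callable, and the choice of doubling the window for the "expanded" level are all assumptions.

```python
from typing import Callable, List, Optional, Tuple

# search(term, window) returns a list of PMIDs; window is a number of years,
# or None for "no date filter". The callable is a stand-in for a real
# PubMed query and is an assumption of this sketch.
SearchFn = Callable[[str, Optional[int]], List[str]]


def search_with_cascade(search: SearchFn, term: str, years: int) -> Tuple[str, List[str]]:
    """Try strict, then expanded, then minimal; return the first level with hits."""
    levels = [
        ("strict", years),        # the window requested by the user
        ("expanded", years * 2),  # hypothetical widening rule
        ("minimal", None),        # no year restriction at all
    ]
    for name, window in levels:
        hits = search(term, window)
        if hits:
            return name, hits
    return "none", []


def fake_search(term: str, window: Optional[int]) -> List[str]:
    # Placeholder backend: pretends only an unrestricted search finds anything.
    return ["PMID-PLACEHOLDER"] if window is None else []


level, hits = search_with_cascade(fake_search, "ABCA1 cholesterol efflux", 10)
print(level, hits)
```

Returning the level name alongside the hits lets a report state which window actually produced the evidence.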


⚑ Analysis Flow


| 📥 01. INPUT | 📊 02. COLLECTION | 🎯 03. VENN CORE |
|---|---|---|
| miRNA selection | Multi-source retrieval | Genomic convergence |

🧬 STEP 01: Input & Parameters

The Start: User enters N miRNAs in standard nomenclature.

  • Dynamic Filtering: Define the year window for PubMed evidence.
  • Validation: Automatic name cleaning for cross-database compatibility.
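As a rough idea of what name cleaning can involve, the sketch below normalizes separators, case, and a missing species prefix. It is an assumption-laden illustration: `clean_mirna_name`, `parse_panel`, the regular expression, and the default `hsa` prefix are hypothetical and may differ from the rules KENRYU applies.

```python
import re
from typing import List

# Optional three-letter species prefix, then "mir" or "let", then the
# numeric identifier with optional letter and arm suffix (e.g. 33a-5p).
_NAME_RE = re.compile(r"^(?:(?P<sp>[a-z]{3})-)?(?P<fam>mir|let)-?(?P<rest>[0-9][0-9a-z-]*)$")


def clean_mirna_name(raw: str, default_species: str = "hsa") -> str:
    """Normalize one miRNA name to the form 'hsa-miR-33a-5p'."""
    name = re.sub(r"[\s_]+", "-", raw.strip()).lower()
    match = _NAME_RE.match(name)
    if match is None:
        raise ValueError(f"unrecognized miRNA name: {raw!r}")
    species = match.group("sp") or default_species
    # This sketch always writes the mature-form spelling 'miR'.
    family = "miR" if match.group("fam") == "mir" else "let"
    return f"{species}-{family}-{match.group('rest')}"


def parse_panel(text: str) -> List[str]:
    """Split a comma-separated panel and clean every entry."""
    return [clean_mirna_name(part) for part in text.split(",") if part.strip()]


print(parse_panel("hsa-miR-33a-5p, MIR-144-3p ,let-7a-5p"))
```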

🔍 STEP 02: Multi-Source Collection

The Engine: Parallel querying of high-fidelity databases.

  • TargetScan 8.0: Thermodynamic binding predictions (Local Database).
  • miRTarBase: Gold-standard experimental validation via Harmonizome (Remote API).
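Parallel querying of the two sources could look like the following `asyncio` sketch. The two fetchers are stubs returning placeholder gene names, and merging predicted and validated targets with a set union is an assumption; the README does not say how KENRYU combines them.

```python
import asyncio
from typing import Dict, List, Set


async def fetch_predicted(mirna: str) -> Set[str]:
    """Stub for a local TargetScan lookup (placeholder data)."""
    await asyncio.sleep(0)
    return {"GENE_A", "GENE_B"}


async def fetch_validated(mirna: str) -> Set[str]:
    """Stub for a remote miRTarBase/Harmonizome call (placeholder data)."""
    await asyncio.sleep(0)
    return {"GENE_B", "GENE_C"}


async def collect_targets(mirnas: List[str]) -> Dict[str, Set[str]]:
    """Query both sources for every miRNA concurrently."""

    async def collect_one(mirna: str) -> Set[str]:
        predicted, validated = await asyncio.gather(
            fetch_predicted(mirna), fetch_validated(mirna)
        )
        return predicted | validated  # assumption: union of both sources

    gene_sets = await asyncio.gather(*(collect_one(m) for m in mirnas))
    return dict(zip(mirnas, gene_sets))


targets = asyncio.run(collect_targets(["hsa-miR-33a-5p", "hsa-miR-144-3p"]))
print(sorted(targets["hsa-miR-33a-5p"]))
```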

💎 STEP 03: Genomic Convergence

The Logic: Mathematical intersection of regulatory networks.

  • Strict Mode: Only genes common to ALL miRNAs.
  • N-1 / N-2: Robust consensus for broader discoveries.

| 🧬 04. ENRICHMENT | 📡 05. RESEARCH | 📄 06. REPORT |
|---|---|---|
| Pathway analysis | Multi-source evidence | Professional output |

📊 STEP 04: Functional Enrichment

The Context: Identifying biological impact.

  • Databases: KEGG, Reactome, WikiPathways, and GO.
  • Evidence: Automated PubMed cross-referencing for every significant pathway found.

📚 STEP 05: Multi-Source Research

The Evidence: Consolidating real-world scientific data.

  • Live queries to ClinVar (variants) and OMIM (disorders).
  • ClinicalTrials.gov integration for human-centric research insights.

✨ STEP 06: Scientific Reporting

The Result: Professional-grade academic reports.

  • Vancouver Style: Auto-generated citations for all integrated sources.
  • Multi-Format: Export to PDF, Markdown ZIP with assets, or raw TXT.
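A Markdown ZIP with assets can be assembled in a few lines. The sketch below uses Python's `zipfile` purely for illustration (the project's frontend lists JSZip for this job); the archive layout of `Report.md`, an `assets/` folder of PNGs, and a `README.md` follows the export described in this README, while the function name and the README text are assumptions.

```python
import io
import zipfile
from typing import Dict


def build_markdown_zip(report_md: str, images: Dict[str, bytes]) -> bytes:
    """Return the bytes of a ZIP holding the report, its images, and a README."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        archive.writestr("Report.md", report_md)
        for filename, data in images.items():
            # Images live in assets/ so relative links in Report.md resolve.
            archive.writestr(f"assets/{filename}", data)
        archive.writestr("README.md", "Open Report.md in any Markdown viewer.\n")
    return buffer.getvalue()


report = "# Report\n\n![Venn diagram](assets/venn.png)\n"
payload = build_markdown_zip(report, {"venn.png": b"placeholder image bytes"})
names = zipfile.ZipFile(io.BytesIO(payload)).namelist()
print(names)
```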

Consensus Modes

| Mode | Description | Use Case |
|---|---|---|
| Strict | Genes regulated by ALL miRNAs in the panel | High-confidence core targets |
| N-1 | Genes regulated by at least N-1 miRNAs | Robust regulatory networks |
| N-2 | Genes regulated by at least N-2 miRNAs | Broader pathway exploration |
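The three modes reduce to one rule: keep a gene if at least N minus a tolerance of the panel's miRNAs target it. A minimal sketch follows, assuming each miRNA has already been mapped to a set of target genes; the function name, the `tolerance` parameter, and the placeholder gene names are illustrative only.

```python
from collections import Counter
from typing import Dict, List, Set


def consensus_genes(targets: Dict[str, Set[str]], tolerance: int = 0) -> List[str]:
    """Genes targeted by at least N - tolerance miRNAs.

    tolerance=0 is Strict, 1 is N-1, 2 is N-2.
    """
    needed = max(1, len(targets) - tolerance)
    counts = Counter(gene for genes in targets.values() for gene in genes)
    return sorted(gene for gene, hits in counts.items() if hits >= needed)


panel = {
    "miR-a": {"G1", "G2", "G3"},
    "miR-b": {"G1", "G2"},
    "miR-c": {"G1", "G4"},
}
print(consensus_genes(panel, 0))  # Strict
print(consensus_genes(panel, 1))  # N-1
print(consensus_genes(panel, 2))  # N-2
```

With three miRNAs, Strict keeps only `G1`, N-1 adds `G2`, and N-2 degenerates to the union of all targets, which is why the looser modes are more useful on larger panels.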

🧩 Key Features

🔬 Bioinformatics Analysis

  • Dynamic Convergence: Venn intersection of N miRNAs, no hardcoded limit on core gene count
  • Balanced Functional Enrichment: fairness algorithm across KEGG, Reactome, WikiPathways, and GO Biological Process
  • Volcano plot · Venn diagram · STRING-DB interactome: generated on the fly per analysis
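The README does not spell out the fairness algorithm. One plausible reading is a round-robin pick across libraries so that a single library with many small p-values cannot crowd out the others. The sketch below implements that reading only; the function name, the `(term, p-value)` tuple layout, and the sample values are assumptions.

```python
from typing import Dict, List, Tuple

Term = Tuple[str, float]  # (term name, adjusted p-value)


def balanced_top_terms(by_library: Dict[str, List[Term]], limit: int) -> List[Tuple[str, str, float]]:
    """Take the best term of each library, then the second best, and so on."""
    ranked = {lib: sorted(terms, key=lambda t: t[1]) for lib, terms in by_library.items()}
    picked: List[Tuple[str, str, float]] = []
    rank = 0
    while len(picked) < limit and any(rank < len(terms) for terms in ranked.values()):
        for lib, terms in ranked.items():
            if rank < len(terms) and len(picked) < limit:
                name, p_value = terms[rank]
                picked.append((lib, name, p_value))
        rank += 1
    return picked


results = {
    "KEGG": [("k1", 0.001), ("k2", 0.002), ("k3", 0.003)],
    "GO": [("g1", 0.04)],
}
print(balanced_top_terms(results, 3))
```

A plain sort by p-value would return three KEGG terms here; the round-robin keeps the single GO term in the top three.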

📡 Multi-source Research

| Source | What it provides |
|---|---|
| PubMed | Scientific articles per gene + biological context, year-filtered with cascade fallback |
| OMIM | Mendelian Inheritance in Man: associated hereditary diseases |
| ClinVar | Reported pathogenic / likely pathogenic variant count |
| ClinicalTrials.gov | Active clinical trials where the gene symbol appears in the title |

⚑ Robustness & Performance

  • 429-aware Retries: automatic NCBI eUtils rate-limit handling with exponential backoff
  • Persistent Cache: local_db/analysis_cache.json stores translations, PubMed results, and enrichment across sessions
  • ES→EN Mapping: auto-translates biological terms from Spanish presets for effective PubMed queries
  • Filter Cascade: strict → expanded → minimal year window fallback
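Retrying on HTTP 429 with exponential backoff follows a common pattern, sketched below. This is not KENRYU's implementation: the `send` callable stands in for a real HTTP request, and the attempt count, base delay, and jitter are assumed values.

```python
import random
import time
from typing import Callable, Tuple


def get_with_backoff(
    send: Callable[[], Tuple[int, str]],
    max_attempts: int = 5,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Call send() until it returns HTTP 200, backing off after each 429."""
    for attempt in range(max_attempts):
        status, body = send()
        if status == 200:
            return body
        if status != 429 or attempt == max_attempts - 1:
            raise RuntimeError(f"request failed with HTTP {status}")
        delay = base_delay * (2 ** attempt)  # 0.5 s, 1 s, 2 s, ...
        sleep(delay + random.uniform(0, delay / 10))  # small jitter
    raise RuntimeError("max_attempts must be at least 1")


# Simulated server: two rate-limit responses, then success.
responses = iter([(429, ""), (429, ""), (200, "ok")])
waits = []
result = get_with_backoff(lambda: next(responses), sleep=waits.append)
print(result, len(waits))
```

Injecting `sleep` keeps the example fast to run and makes the delays easy to inspect.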

📄 Report Generation

  • Professional A4 Editor: WYSIWYG preview with smart height-based pagination
  • Vancouver Bibliography: [N] Authors. Title. Journal. Year. PMID: X. URL
  • Multi-type Citations: PMID / OMIM / ClinVar / NCT with correct identifier per source
  • PDF Export: via window.print() with optimized @media print CSS, footer anchored to each A4 page
  • Markdown ZIP Export: Report.md + /assets/ PNGs + README.md, compatible with Pandoc, Obsidian, and VS Code
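The bibliography template above maps directly onto a small formatter. In this sketch the dictionary keys, the six-author cutoff before "et al" (a common Vancouver convention), and the sample record are assumptions; the record is placeholder data, not a real article.

```python
from typing import Dict, List, Union

Reference = Dict[str, Union[str, int, List[str]]]


def format_reference(index: int, ref: Reference) -> str:
    """Render '[N] Authors. Title. Journal. Year. PMID: X. URL'."""
    authors = list(ref.get("authors", []))
    if len(authors) > 6:
        author_text = ", ".join(authors[:6]) + ", et al"
    else:
        author_text = ", ".join(authors)
    title = str(ref["title"]).rstrip(".")
    pmid = str(ref["pmid"])
    url = f"https://pubmed.ncbi.nlm.nih.gov/{pmid}/"
    return f"[{index}] {author_text}. {title}. {ref['journal']}. {ref['year']}. PMID: {pmid}. {url}"


example = {
    "authors": ["Doe J", "Roe A"],
    "title": "Example title.",
    "journal": "J Example",
    "year": 2020,
    "pmid": "00000000",
}
citation = format_reference(1, example)
print(citation)
```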

📸 Screenshots

  • Main interface: Volcano plot, Venn diagram, and STRING-DB interactome generated live. (Note: the clinical interface displays analytical metrics and parameter presets in Spanish for localized environments.)
  • Bioinformatics charts: Functional enrichment visualization
  • Gene sidebars: Quick access to OMIM, ClinVar, and trial data
  • Research results: PubMed · OMIM · ClinVar · ClinicalTrials consolidated
  • Narrative report: Academic synthesis with Vancouver bibliography. The automated export functions render structured clinical narratives, cross-referenced literature insights, and citation indices.

🏗️ Architecture

                    ┌─────────────────────────────────┐
                    │   Frontend (HTML / JS / CSS)    │
                    │  · Paged A4 WYSIWYG Editor      │
                    │  · PDF · Markdown ZIP · TXT     │
                    └────────────┬────────────────────┘
                                 │  POST /api/v1/analyze
                                 ▼
                    ┌─────────────────────────────────┐
                    │  FastAPI Backend - kenryu_engine│
                    │  · Venn Intersection Engine     │
                    │  · Functional Enrichment        │
                    │  · Multi-source Research        │
                    └────────────┬────────────────────┘
                                 │
       ┌──────────┬──────────────┼───────────────┬──────────────────┐
       ▼          ▼              ▼               ▼                  ▼
 ┌──────────┐ ┌──────────┐ ┌─────────┐ ┌─────────────┐ ┌────────────────┐
 │TargetScan│ │miRTarBase│ │ Enrichr │ │NCBI PubMed  │ │ OMIM · ClinVar │
 │ v8 local │ │(Harmoniz)│ │KEGG/GO… │ │  eUtils     │ │  ClinicalTrials│
 └──────────┘ └──────────┘ └─────────┘ └─────────────┘ └────────────────┘
                                 │
                                 ▼
                    ┌─────────────────────────────────┐
                    │  Persistent Cache               │
                    │  local_db/analysis_cache.json   │
                    └─────────────────────────────────┘

🛠️ Tech Stack

Backend

Python · FastAPI · httpx · Matplotlib · Pandas

Frontend

HTML5 · JavaScript · JSZip · FontAwesome

Deployment

Docker · Hugging Face · GitHub Pages


🗄️ Integrated APIs & Databases

| Source | Function | Type |
|---|---|---|
| 🧬 TargetScanHuman v8.0 | Thermodynamic prediction of binding sites | Local (zip) |
| 🔬 miRTarBase via Harmonizome | Experimental interaction validation (CLIP-seq, Luciferase, WB) | Remote API |
| 📊 Enrichr | Functional enrichment: KEGG / Reactome / WikiPathways / GO | Remote API |
| 📚 NCBI PubMed eUtils | Scientific articles per gene + context with cascade + retries | Remote API |
| 🏥 NCBI OMIM | Associated hereditary diseases by gene symbol | Remote API |
| 🧪 NCBI ClinVar | Reported pathogenic / likely pathogenic variants | Remote API |
| 💊 ClinicalTrials.gov v2 | Active clinical trials filtered by gene symbol in title | Remote API |
| 🔗 STRING-DB | Protein-protein interaction network generation | Remote API |
| 🌐 MyGene.info | Clinical gene annotation | Remote API |
| 🔀 MyMemory API | Real-time ES↔EN biological term translation | Remote API |

📦 Installation

Prerequisites

  • Python 3.11+
  • pip
  • Docker (optional, for containerized deployment)

Local Setup

# 1. Clone the repository
git clone https://github.com/abrangel/Kenryu.git
cd Kenryu

# 2. Create virtual environment (recommended)
python3.11 -m venv venv
source venv/bin/activate      # Linux / macOS
# venv\Scripts\activate       # Windows

# 3. Install dependencies
pip install -r requirements.txt

# 4. Start the server
uvicorn kenryu_engine:app --host 0.0.0.0 --port 7860 --reload

# 5. Open in browser: http://localhost:7860

Docker Deployment

docker build -t kenryu .
# Quote the volume path so directories containing spaces still work
docker run -p 7860:7860 -v "$(pwd)/local_db:/app/local_db" kenryu

The local_db volume persists the analysis cache across container restarts.
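A file-backed cache of this kind can be very small. The sketch below is a guess at the general shape rather than the project's code: the `JsonCache` class, its key format, and the write-to-temp-then-rename step are assumptions. Only the `local_db/analysis_cache.json` path comes from this README.

```python
import json
import os
import tempfile
from pathlib import Path
from typing import Any


class JsonCache:
    """Dictionary persisted to a single JSON file."""

    def __init__(self, path: Path) -> None:
        self.path = Path(path)
        try:
            self.data = json.loads(self.path.read_text(encoding="utf-8"))
        except (FileNotFoundError, json.JSONDecodeError):
            self.data = {}  # start empty if the file is missing or corrupt

    def get(self, key: str, default: Any = None) -> Any:
        return self.data.get(key, default)

    def set(self, key: str, value: Any) -> None:
        self.data[key] = value
        self.path.parent.mkdir(parents=True, exist_ok=True)
        # Write to a temporary file first so a crash cannot truncate the cache.
        tmp_path = self.path.with_suffix(".tmp")
        tmp_path.write_text(json.dumps(self.data), encoding="utf-8")
        os.replace(tmp_path, self.path)


with tempfile.TemporaryDirectory() as workdir:
    cache_file = Path(workdir) / "local_db" / "analysis_cache.json"
    JsonCache(cache_file).set("translation:colesterol", "cholesterol")
    # A fresh instance simulates a restart that reuses the mounted volume.
    reloaded = JsonCache(cache_file).get("translation:colesterol")

print(reloaded)
```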

Hugging Face Spaces

The repo is ready for HF Spaces: the Dockerfile exposes port 7860 and mounts local_db for persistent caching. To deploy, connect the Space to this GitHub repository.


🚀 Usage

Via Web Interface

  1. Open the production Space or http://localhost:7860
  2. Enter miRNAs separated by commas: hsa-miR-33a-5p, hsa-miR-144-3p, hsa-miR-758-3p
  3. Select year cutoff (PubMed evidence age filter)
  4. Select consensus mode (Strict / N-1 / N-2)
  5. Click Execute

Via REST API

curl -X POST "http://localhost:7860/api/v1/analyze" \
  -H "Content-Type: application/json" \
  -d '{
    "mirnas": ["hsa-miR-33a-5p", "hsa-miR-144-3p", "hsa-miR-758-3p"],
    "years": 10,
    "mode": "strict"
  }'

📋 Example JSON Response

{
  "common_genes": ["ABCA1", "KPNA3", "SCN1A"],
  "gene_research": {
    "ABCA1": {
      "pubmed": [{"pmid": "...", "term": "...", "year_window": "10y"}],
      "omim":   [{"omim_id": "600046", "title": "..."}],
      "clinvar": {"count": 201, "url": "..."},
      "trials":  [{"nct_id": "NCT01456650", "title": "..."}]
    }
  },
  "enrichment": ["..."],
  "scientific_synthesis": "...",
  "venn_plot":    "data:image/png;base64,...",
  "volcano_plot": "data:image/png;base64,...",
  "ppi_plot":     "data:image/png;base64,..."
}
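The same request can be sent from Python with only the standard library. The endpoint, payload fields, and `data:image/png;base64,...` plot format are taken from the examples above; the helper names, the timeout, and the output file name are assumptions of this sketch.

```python
import base64
import json
from typing import List
from urllib import request


def build_request(mirnas: List[str], years: int = 10, mode: str = "strict",
                  base_url: str = "http://localhost:7860") -> request.Request:
    """Prepare the POST request for /api/v1/analyze without sending it."""
    payload = json.dumps({"mirnas": mirnas, "years": years, "mode": mode})
    return request.Request(
        f"{base_url}/api/v1/analyze",
        data=payload.encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def decode_data_uri(uri: str) -> bytes:
    """Turn a 'data:image/png;base64,...' string into raw bytes."""
    header, _, encoded = uri.partition(",")
    if not header.endswith(";base64"):
        raise ValueError("expected a base64 data URI")
    return base64.b64decode(encoded)


def analyze(mirnas: List[str], timeout: float = 300.0, **options) -> dict:
    """Send the request to a running server and return the parsed JSON."""
    with request.urlopen(build_request(mirnas, **options), timeout=timeout) as resp:
        return json.load(resp)


def save_venn_plot(result: dict, path: str = "venn.png") -> None:
    """Write the Venn diagram from a response to a PNG file."""
    with open(path, "wb") as handle:
        handle.write(decode_data_uri(result["venn_plot"]))
```

With the server running locally, `result = analyze(["hsa-miR-33a-5p", "hsa-miR-144-3p", "hsa-miR-758-3p"])` followed by `save_venn_plot(result)` would fetch an analysis and store its Venn diagram.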

📁 Repository Structure

Kenryu/
├── kenryu_engine.py              # FastAPI backend + bioinformatics logic (~1700 lines)
├── Dockerfile                    # HF Spaces / Docker configuration
├── requirements.txt
│
├── static/
│   ├── index.html                # A4 editor main interface
│   ├── script.js                 # Frontend logic (pagination, PDF/MD export)
│   └── style.css                 # Dark mode · gold/teal accents
│
├── data/
│   ├── targetscan_full.json.zip  # TargetScan v8.0 indexed DB (Git LFS, ~8 MB)
│   └── hsa-miR-*.txt             # Pre-processed per-miRNA example files
│
└── local_db/                     # Runtime cache (auto-created, not in repo)
    └── analysis_cache.json

🗺️ Roadmap

✅ Implemented

  • Dynamic Venn convergence: Strict / N-1 / N-2 modes
  • Balanced functional enrichment: KEGG · Reactome · WikiPathways · GO
  • Visualizations: Venn · Volcano · STRING-DB interactome
  • Vancouver-style bibliography with multi-source citation types
  • Multi-source research: PubMed · OMIM · ClinVar · ClinicalTrials
  • Versioned persistent cache across sessions
  • 429-aware retries with exponential backoff for NCBI eUtils
  • ES→EN mapping for effective PubMed queries
  • Professional Markdown ZIP export with separate image assets
  • A4 PDF footer correctly anchored across all pages
  • Integrated genomic evidence narrative in report body

🔜 Planned

  • Support for miRNA isoforms (isomiRs)
  • DisGeNET integration for gene-disease association
  • Structured JSON export for interoperability with other tools
  • Side-by-side miRNA panel comparison
  • Cross-expression heatmap (genes × miRNAs)
  • Batch mode: process multiple panels from CSV
  • TargetScan v9.0 integration when available

🤝 Contributing

Contributions are welcome. To report bugs, suggest features, or submit pull requests:

  1. Fork the repository
  2. Create a branch: git checkout -b feat/feature-name
  3. Commit with descriptive messages following the conventions below
  4. Push: git push origin feat/feature-name
  5. Open a Pull Request

Commit conventions: feat: · fix: · docs: · refactor: · chore:

When reporting bugs, please include steps to reproduce, expected vs. actual behavior, browser console logs (for frontend errors), and HF Space logs (for backend errors).


📜 License

This project is distributed under a Free Academic License for Research and Education. For commercial use, contact the author.

KENRYU integrates data from public sources (NCBI, Enrichr, OMIM, ClinVar, ClinicalTrials.gov, STRING-DB) subject to their respective terms of service. Users are responsible for complying with each database's policies.


👤 Author

Cesar Manzo: Clinical Bioinformatics · Genomic Analysis · Translational Medicine


🙏 Acknowledgements

  • TargetScanHuman: Lewis Lab, Whitehead Institute
  • miRTarBase: Chou et al., Nucleic Acids Research
  • NCBI: National Center for Biotechnology Information
  • Enrichr: Ma'ayan Lab, Mount Sinai
  • ClinicalTrials.gov: U.S. National Library of Medicine
  • STRING-DB: European Molecular Biology Laboratory

KENRYU Bioinformatics Engine · 2026
Made with scientific rigor for the genomic research community.
