DeepSynth: Multi-Modal Generative Deep Learning for De Novo Antibiotic Molecule Design to Combat Antimicrobial Resistance
This repository hosts the official research and implementation of DeepSynth, a generative deep learning framework for designing new antibiotic molecules de novo (from scratch). By combining structural molecular biology with machine learning, DeepSynth aims to accelerate the discovery of potent drug candidates against Antimicrobial Resistance (AMR), with the goal of reducing wet-lab costs and cutting screening timelines from years to days.
Antimicrobial resistance is one of the greatest threats to global health, and traditional drug discovery is not keeping pace with rapidly evolving resistant bacteria. DeepSynth addresses this problem through two core techniques:
- Multi-Modal Generative Modeling: Combines SMILES strings, a textual molecular representation, with molecular graphs processed by Graph Neural Networks (GNNs) to capture both the chemical syntax and the 2D/3D structural properties of molecules.
- Reinforcement Learning for Optimization: Employs a reward-driven optimization loop that penalizes predicted toxicity and structural instability while rewarding high predicted binding affinity to protein targets in specific bacterial strains.
Key features:
- SMILES & Graph Hybrid Embeddings: Processes molecules concurrently as 1D sequence tokens (using Transformer-based chemical LLMs) and 2D/3D molecular graphs.
- Target-Specific Binding Prediction: Uses deep equivariant neural networks to predict binding affinity against bacterial protein targets (e.g., enzymes involved in cell wall synthesis).
- ADMET Property Screening: Built-in prediction filters for Absorption, Distribution, Metabolism, Excretion, and Toxicity (ADMET) that flag candidates with unfavorable predicted safety or pharmacokinetic profiles.
- Bio-Benchmarking: Direct integration with specialized molecular benchmarks like MoleculeNet and repositories such as ChEMBL and PubChem.
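To illustrate the 1D sequence side of the hybrid embeddings listed above, here is a minimal sketch of regex-based SMILES tokenization. It is not DeepSynth's tokenizer: the pattern and the names `tokenize_smiles` and `build_vocab` are hypothetical, introduced only for this example. The pattern covers common SMILES symbols and does not check chemical validity.

```python
import re

# Simplified SMILES token pattern (an assumption for illustration, not the
# project's tokenizer). Order matters: bracket atoms and two-letter halogens
# must be tried before single-letter atoms.
SMILES_TOKEN = re.compile(
    r"\[[^\]]+\]"            # bracket atoms such as [NH3+] or [C@@H]
    r"|Br|Cl"                # two-letter organic-subset atoms
    r"|[BCNOPSFI]"           # one-letter aliphatic atoms
    r"|[bcnops]"             # aromatic atoms
    r"|%\d{2}"               # two-digit ring-closure labels
    r"|\d"                   # one-digit ring-closure labels
    r"|[()=#\-+/\\.:~@*$]"   # branches, bonds, and other symbols
)


def tokenize_smiles(smiles: str) -> list[str]:
    """Split a SMILES string into tokens, raising on unrecognized characters."""
    tokens = SMILES_TOKEN.findall(smiles)
    # findall silently skips unmatched characters, so compare the round trip.
    if "".join(tokens) != smiles:
        raise ValueError(f"Unrecognized characters in SMILES: {smiles!r}")
    return tokens


def build_vocab(corpus: list[str]) -> dict[str, int]:
    """Map each distinct token to an integer id; 0 is reserved for padding."""
    vocab = {"<pad>": 0}
    for smiles in corpus:
        for token in tokenize_smiles(smiles):
            vocab.setdefault(token, len(vocab))
    return vocab


if __name__ == "__main__":
    print(tokenize_smiles("ClC[NH3+]"))  # ['Cl', 'C', '[NH3+]']
```

The resulting integer ids are the kind of input a Transformer-style sequence model consumes. The graph side of the embedding would need a cheminformatics toolkit to parse atoms and bonds, which this sketch leaves out.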
Repository structure:
├── src/
│   ├── data_loaders/     # Custom pipelines for ChEMBL, ZINC, and PubChem datasets
│   ├── models/           # GNNs, Diffusion/GAN models, and Chemical Transformers
│   ├── screening/        # ADMET toxicity filters and virtual screening loops
│   └── optimization/     # Reinforcement learning and reward function configurations
├── data/                 # MoleculeNet benchmark preprocessing scripts
├── evaluations/          # Docking simulation scripts and chemical property distributions
├── notebooks/            # Exploratory molecular analysis and chemical visualizations
├── Literature_Review/    # Team research matrices and BibTeX reference files
└── README.md
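This README does not show the reward function configurations under src/optimization/ or the filters under src/screening/. The sketch below shows one way a reward that favors binding affinity and penalizes toxicity and instability could be combined with a hard screening cutoff. Everything in it is an assumption made for illustration: the class and function names, the weights, the cutoff, and the premise that each predictor returns a score in [0, 1].

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CandidateScores:
    """Hypothetical per-molecule predictions, each assumed to lie in [0, 1]."""
    binding_affinity: float  # higher is better
    toxicity: float          # higher is worse
    instability: float       # higher is worse


@dataclass(frozen=True)
class RewardConfig:
    """Illustrative weights and cutoff; the values are placeholders."""
    affinity_weight: float = 1.0
    toxicity_weight: float = 0.5
    instability_weight: float = 0.5
    max_toxicity: float = 0.7  # hard screening cutoff


def passes_screen(scores: CandidateScores, config: RewardConfig) -> bool:
    """Hard filter: reject candidates whose predicted toxicity is too high."""
    return scores.toxicity <= config.max_toxicity


def reward(scores: CandidateScores, config: RewardConfig) -> float:
    """Weighted affinity minus toxicity and instability penalties."""
    if not passes_screen(scores, config):
        return -1.0  # fixed penalty for candidates that fail the screen
    return (
        config.affinity_weight * scores.binding_affinity
        - config.toxicity_weight * scores.toxicity
        - config.instability_weight * scores.instability
    )


if __name__ == "__main__":
    config = RewardConfig()
    candidates = {
        "candidate_a": CandidateScores(0.8, 0.2, 0.1),
        "candidate_b": CandidateScores(0.9, 0.9, 0.1),  # fails the screen
    }
    # Rank candidates from highest to lowest reward.
    ranked = sorted(candidates, key=lambda k: reward(candidates[k], config), reverse=True)
    print(ranked)
```

With the default placeholder values, any candidate that passes the screen scores at least -0.85, so the fixed -1.0 penalty always ranks screened-out candidates last. A reinforcement learning loop would use this scalar as the per-molecule reward signal for the generator.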