This issue tracks the implementation, configuration, execution, and validation of the DEEPSYNTH-AMP (De novo Evolutionary Ensemble Policy Synthesis-Yaw Network Transfer Heuristic for Antimicrobial Molecular Peptide Design) pipeline.
The primary objective is to implement a lightweight, diffusion-based reinforcement learning pipeline that simultaneously optimizes antimicrobial peptides (AMPs) for three core objectives: Antibacterial Activity (MIC), Safety (Low Human Toxicity), and Chemical Synthesizability using a Pareto-based multi-objective optimization framework.
🎯 Project Scope & Core Innovations
Weight-Free Multi-Objective Optimization: Generates a Pareto front of non-dominated candidates, eliminating arbitrary reward weights.
Synthesizability-Aware Diffusion: Addresses the laboratory manufacturing gap by explicitly integrating synthesizability constraints.
Democratized Design: A fully open-source pipeline, reproducible on a single GPU within one week for under $10 in compute cost.
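The weight-free selection described above can be sketched as a non-dominated filter over the three objectives. This is a minimal illustration under stated assumptions, not the pipeline's actual implementation: the `Candidate` class, its field names, the convention that higher is better for every objective, and the demo values are all hypothetical.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Candidate:
    """Hypothetical container; all scores are assumed to be 'higher is better'."""
    sequence: str
    activity: float          # assumed: higher = stronger antibacterial activity
    safety: float            # assumed: higher = lower predicted human toxicity
    synthesizability: float  # assumed: higher = easier to synthesize

    def objectives(self) -> Tuple[float, float, float]:
        return (self.activity, self.safety, self.synthesizability)


def dominates(a: Candidate, b: Candidate) -> bool:
    """True if a is at least as good as b on every objective and strictly better on one."""
    pairs = list(zip(a.objectives(), b.objectives()))
    return all(x >= y for x, y in pairs) and any(x > y for x, y in pairs)


def pareto_front(candidates: List[Candidate]) -> List[Candidate]:
    """Keep every candidate that no other candidate dominates (O(n^2) scan)."""
    return [c for c in candidates if not any(dominates(o, c) for o in candidates)]


# Illustrative values only, not measured data.
demo = [
    Candidate("KKLLKKLLKK", 0.9, 0.4, 0.7),
    Candidate("GIGAVLKVLT", 0.6, 0.8, 0.6),
    Candidate("AAAAAAAAAA", 0.5, 0.3, 0.5),  # worse than both others on all objectives
]

front = pareto_front(demo)
print([c.sequence for c in front])
```

No reward weights appear anywhere: the first two demo candidates trade activity against safety, so neither dominates the other and both stay on the front.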
🗺️ Implementation Milestones & Task Checklist
Module A: Data & Model Preparation
Step 1: Environment Configuration
Install Python 3.10+, PyTorch (with CUDA support), HuggingFace transformers, RDKit, and AiZynthFinder.
Verification: All packages import successfully; GPU is correctly detected by PyTorch.
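One way to run the Step 1 verification is a small script that tries each import and then asks PyTorch whether a GPU is visible. This is a sketch: the helper name `check_environment` is hypothetical, and the module names in `REQUIRED` are assumptions about how each package is imported in your environment.

```python
import importlib
import sys

# Assumed top-level import names for the packages listed in Step 1.
REQUIRED = ["torch", "transformers", "rdkit", "aizynthfinder"]


def check_environment(modules=REQUIRED):
    """Return a dict mapping each module name to whether it imported cleanly."""
    report = {}
    for name in modules:
        try:
            importlib.import_module(name)
            report[name] = True
        except Exception:  # missing package or a broken native dependency
            report[name] = False

    # Only query CUDA if torch was requested and imported successfully.
    report["cuda"] = False
    if report.get("torch"):
        import torch
        report["cuda"] = torch.cuda.is_available()
    return report


if __name__ == "__main__":
    status = check_environment()
    status["python>=3.10"] = sys.version_info >= (3, 10)
    for name, ok in status.items():
        print(f"{name:15s} {'OK' if ok else 'MISSING'}")
```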
Step 2: Dataset Acquisition and Processing
Download public datasets: DRAMP 3.0 (20,234), ToxinPred (2,000), PeptideAtlas (10,000), and AiZynth Database (500K rules).
Apply processing rules: keep sequences between 5 and 50 amino acids (AA) long, remove duplicates, precompute retrosynthetic synthesizability scores, and split the data into 70% Train / 15% Val / 15% Test.
Verification: Training set contains 15,000+ sequences; synthesizability scores range from 0 to 1.
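The length filter, deduplication, and 70/15/15 split from Step 2 can be sketched in plain Python. The function names, the whitespace and upper-case normalization, and the fixed shuffle seed are assumptions made for illustration; synthesizability scoring is left out because it depends on the retrosynthesis tooling.

```python
import random


def preprocess(sequences, min_len=5, max_len=50):
    """Keep unique sequences whose length is within [min_len, max_len]."""
    seen = set()
    kept = []
    for seq in sequences:
        seq = seq.strip().upper()  # assumed normalization before deduplication
        if min_len <= len(seq) <= max_len and seq not in seen:
            seen.add(seq)
            kept.append(seq)
    return kept


def split_dataset(sequences, train=0.70, val=0.15, seed=0):
    """Shuffle deterministically, then cut into train / val / test partitions."""
    shuffled = list(sequences)
    random.Random(seed).shuffle(shuffled)
    n_train = round(len(shuffled) * train)
    n_val = round(len(shuffled) * val)
    return (
        shuffled[:n_train],
        shuffled[n_train:n_train + n_val],
        shuffled[n_train + n_val:],  # remainder becomes the test set
    )


# Illustrative input: one too-short entry, one duplicate, one too-long entry.
raw = ["ACDK", "ACDEFGHIK", "acdefghik", "A" * 51, "KKLLKKLLKK"]
clean = preprocess(raw)
train_set, val_set, test_set = split_dataset(clean)
print(clean, len(train_set), len(val_set), len(test_set))
```

Fixing the seed keeps the split reproducible across runs, which matters when the verification threshold is stated on the training set size.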
Step 3: Model Loading and Initialization
Load the pretrained EvoDiff model (38M parameters) from the Hugging Face Hub.
Freeze base layers and add LoRA adapters (Rank=8, Alpha=16) for parameter-efficient fine-tuning.
Verification: 100 generated test peptides pass regular expression (regex) validation for standard amino acids.
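The regex check in the Step 3 verification could look like the sketch below. It assumes "standard amino acids" means the 20 canonical one-letter codes in upper case; the helper names are hypothetical, and the sample list stands in for real model output.

```python
import re

# The 20 canonical amino acids, one-letter codes (assumed definition of "standard").
VALID_PEPTIDE = re.compile(r"[ACDEFGHIKLMNPQRSTVWY]+")


def is_valid_peptide(seq: str) -> bool:
    """True only if the whole string consists of canonical residue codes."""
    return VALID_PEPTIDE.fullmatch(seq) is not None


def validate_batch(sequences):
    """Return (all_valid, list_of_invalid_sequences) for a batch of peptides."""
    invalid = [s for s in sequences if not is_valid_peptide(s)]
    return len(invalid) == 0, invalid


# Placeholder strings standing in for generated peptides.
samples = ["KKLLKKLLKK", "GIGAVLKVLT", "KKLLBKK", ""]
ok, bad = validate_batch(samples)
print(ok, bad)
```

Using `fullmatch` rather than `search` matters here: a sequence with a single non-standard character such as `B`, `X`, or a gap symbol is rejected instead of partially matching.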
Module B: Training & Optimization
… pytest; rewards range dynamically between -0.5 and 1.0.
… 3e-4 and clip range of 0.2. Save model checkpoints every 5 epochs.
Module C: Evaluation & Selection
… README.md.
📊 Expected Performance Benchmarks
The generated output metrics should ideally align with or exceed the following target values based on proposal estimations:
⚡ Risk Mitigation Matrix
If execution encounters any bottlenecks, refer to these approved mitigation strategies:
🕒 Estimated Effort Summary