The accurate prediction of changes in binding free energy ($\Delta\Delta G$) upon mutation is a central problem in computational antibody engineering.
Given a wild-type antibody-antigen complex with a resolved 3D structure (PDB) and a specific single-point mutation (defined by chain, residue index, wild-type amino acid, and mutant amino acid), the task is to predict the resulting change in binding free energy, $\Delta\Delta G = \Delta G_{\text{mut}} - \Delta G_{\text{wt}}$.
A negative $\Delta\Delta G$ indicates improved binding affinity; a positive value indicates weakened binding.
To ensure that performance gains are attributable to structural reasoning rather than data leakage or simple residue propensities, this project employs a two-stage modeling strategy.
Before implementing geometric deep learning, we establish a "lower bound" of performance using interpretable linear models (Linear/Logistic Regression) and tree-based ensembles (Random Forest/XGBoost).
Rationale:
- Leakage Detection: Historical benchmarks on datasets like SKEMPI have frequently suffered from data leakage, where models memorize complex-specific biases rather than learning biophysical rules (1). High performance by a linear model often indicates improper train/test splits (e.g., random splitting rather than complex-level splitting).
- Signal Quantification: Recent studies suggest that sequence-only and simple statistical features can achieve significant correlations on $\Delta\Delta G$ prediction tasks (2). This baseline quantifies how much variance in the AB-Bind dataset can be explained by residue identity and physicochemical descriptors alone, without explicit 3D coordinates.
- Interpretability: Linear weights provide immediate sanity checks against physicochemical intuition (e.g., penalties for burying hydrophilic residues or introducing steric clashes via volume changes).

Baseline features:
- Wild-type and Mutant amino acid identities (One-Hot).
- Physicochemical property shifts (Volume, Hydrophobicity, Charge, Polarity).
- Interface vs. Non-interface positioning flags.
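As a concrete illustration, the baseline feature vector for one mutation can be assembled as below. This is a minimal sketch, not the project's actual code: the hydropathy table is the standard Kyte-Doolittle scale, the charge table treats only D/E/K/R as charged (histidine simplified to neutral), and `mutation_features` is a hypothetical helper name.

```python
# Sketch of baseline feature engineering for a single point mutation.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

HYDROPATHY = {  # Kyte-Doolittle hydropathy scale
    "A": 1.8, "C": 2.5, "D": -3.5, "E": -3.5, "F": 2.8,
    "G": -0.4, "H": -3.2, "I": 4.5, "K": -3.9, "L": 3.8,
    "M": 1.9, "N": -3.5, "P": -1.6, "Q": -3.5, "R": -4.5,
    "S": -0.8, "T": -0.7, "V": 4.2, "W": -0.9, "Y": -1.3,
}
CHARGE = {"D": -1, "E": -1, "K": 1, "R": 1}  # all other residues treated as 0

def one_hot(aa: str) -> list:
    """20-dimensional one-hot encoding of a residue identity."""
    return [int(aa == a) for a in AMINO_ACIDS]

def mutation_features(wt: str, mut: str, is_interface: bool) -> list:
    """WT/mutant one-hots + physicochemical property shifts + interface flag."""
    d_hydro = HYDROPATHY[mut] - HYDROPATHY[wt]
    d_charge = CHARGE.get(mut, 0) - CHARGE.get(wt, 0)
    return one_hot(wt) + one_hot(mut) + [d_hydro, d_charge, float(is_interface)]

feats = mutation_features("A", "D", is_interface=True)
# Layout: 20 WT one-hot, 20 mutant one-hot, then [d_hydro, d_charge, flag]
```

Volume and polarity shifts would extend the tail of the vector in the same way, using any standard residue-volume table.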
Upon validation of the dataset splits and baselines, we implement a geometric deep learning architecture.
Graph Construction: Following the philosophy of frameworks such as DeepRank-GNN (3), the protein interface is transformed into a graph $G = (\mathcal{V}, \mathcal{E})$:
- Nodes ($\mathcal{V}$): Interface residues (defined by a distance cutoff, typically 8-10 Å) from both the antibody and antigen. Features include amino acid type, chain identity (Ab/Ag), and atomic coordinates.
- Edges ($\mathcal{E}$): Spatial neighbors within the defined cutoff. Edges are annotated with Euclidean distances and categorical flags for intra-chain vs. inter-chain interactions.
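A minimal numpy sketch of this construction, under the assumptions that residues are represented by single (e.g. C-alpha) coordinates and a 10 Å cutoff is used; `build_interface_graph` is an illustrative helper, not the repo's function:

```python
import numpy as np

def build_interface_graph(coords, chains, cutoff=10.0):
    """Build residue-level edges within `cutoff` Angstrom.

    coords: (N, 3) array of representative atom positions (e.g. C-alpha).
    chains: length-N list of chain labels ("Ab" or "Ag").
    Returns (2, E) edge indices, edge distances, and inter-chain flags.
    """
    diff = coords[:, None, :] - coords[None, :, :]
    dist = np.linalg.norm(diff, axis=-1)           # (N, N) pairwise distances
    i, j = np.where((dist < cutoff) & (dist > 0))  # exclude self-loops
    inter_chain = np.array([chains[a] != chains[b] for a, b in zip(i, j)])
    return np.stack([i, j]), dist[i, j], inter_chain

# Toy example: two antibody residues, one antigen residue 8 Angstrom away.
coords = np.array([[0.0, 0.0, 0.0], [4.0, 0.0, 0.0], [12.0, 0.0, 0.0]])
chains = ["Ab", "Ab", "Ag"]
edges, dists, inter = build_interface_graph(coords, chains)
```

The `inter` flags directly give the intra-chain vs. inter-chain edge annotation described above.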
Architecture:
- A message-passing GNN (e.g., Graph Convolutional Networks or Graph Attention Networks) propagates features across the interface graph.
- The architecture specifically encodes the mutation site, allowing the network to learn the localized perturbation in the structural environment.
- The readout layer pools node representations to regress the scalar $\Delta\Delta G$.
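The message-passing and readout steps can be sketched in numpy as a stand-in for the actual PyTorch implementation; the weight matrices here are random placeholders and the mean-neighbor aggregation is a simplification of the attention-weighted scheme.

```python
import numpy as np

rng = np.random.default_rng(0)

def mp_layer(h, edges, W):
    """One message-passing step: each node averages incoming neighbor
    features, then applies a residual connection, linear map, and ReLU."""
    n = h.shape[0]
    agg = np.zeros_like(h)
    deg = np.zeros(n)
    for s, t in edges.T:                     # directed edge s -> t
        agg[t] += h[s]
        deg[t] += 1
    agg /= np.maximum(deg, 1)[:, None]       # mean over neighbors
    return np.maximum((h + agg) @ W, 0.0)    # residual + linear + ReLU

def readout(h, w_out):
    """Mean-pool node embeddings and regress the scalar ddG."""
    return float(h.mean(axis=0) @ w_out)

h = rng.normal(size=(4, 8))                     # 4 interface residues, 8 features
edges = np.array([[0, 1, 1, 2], [1, 0, 2, 1]])  # undirected pairs as two directions
W = rng.normal(size=(8, 8)) * 0.1
w_out = rng.normal(size=8)
ddg_pred = readout(mp_layer(h, edges, W), w_out)
```

Stacking several such layers before the readout gives each node a receptive field spanning the mutation site's structural environment.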
AB-Bind
- Source: Sirin et al., AB-Bind: Antibody binding mutational database for computational affinity predictions
- Composition: 1,101 mutants across 32 unique antibody-antigen complexes with experimentally determined $\Delta\Delta G$ values.
- Processing: Raw data is sourced via submodule from `3D-GNN-over-antibody-antigen/data/external/AB-Bind-Database`. Processed data (`3D-GNN-over-antibody-antigen/data/processed/ab_bind_with_labels.csv`) includes PDB IDs, chain mappings, normalized mutation strings, and discrete labels (Improved/Neutral/Worsened) for classification tasks.
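For illustration, a normalized mutation string of the assumed form `<chain>:<wtAA><index><mutAA>` (e.g. `A:Y101W`; the pipeline's exact normalization may differ) can be parsed like this:

```python
import re
from typing import NamedTuple

class Mutation(NamedTuple):
    chain: str
    wt: str
    index: int
    mut: str

# Assumed normalized format "<chain>:<wtAA><residue index><mutAA>", e.g. "A:Y101W".
_MUT_RE = re.compile(r"^(?P<chain>\w+):(?P<wt>[A-Z])(?P<idx>\d+)(?P<mut>[A-Z])$")

def parse_mutation(s: str) -> Mutation:
    """Split a normalized mutation string into its structured parts."""
    m = _MUT_RE.match(s.strip())
    if m is None:
        raise ValueError(f"Unrecognized mutation string: {s!r}")
    return Mutation(m["chain"], m["wt"], int(m["idx"]), m["mut"])

mut = parse_mutation("A:Y101W")
```

The structured tuple is what downstream steps need to locate the mutated residue in the PDB and assign the wild-type/mutant feature columns.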
SKEMPI 2.0
- Source: Jankauskaite et al., SKEMPI 2.0: an updated benchmark of changes in protein-protein binding energy
- Composition: 7,085 mutants covering a diverse range of general protein-protein interactions (PPIs).
- Utility: Used to assess the transferability of features learned on antibody interfaces to general PPIs.
- Geng, C., et al. (2019). ISPRED4: interaction sites PREDiction in protein structures with a refinement strategy. Bioinformatics. (Context: Evaluation of leakage in PPI datasets).
- Dehghanpoor, R., et al. (2018). ProAffiMuSeq: sequence-based prediction of protein-protein binding affinity change upon mutation. Bioinformatics.
- Réau, M., et al. (2023). DeepRank-GNN: a graph neural network framework to learn patterns in protein-protein interfaces. Bioinformatics.
- Sirin, S., et al. (2016). AB-Bind: Antibody binding mutational database for computational affinity predictions. Protein Science.
- Jankauskaite, J., et al. (2019). SKEMPI 2.0: an updated benchmark of changes in protein-protein binding energy, kinetics and thermodynamics upon mutation. Bioinformatics.
Dependencies for the current scripts are declared in `requirements.txt` and include `biopython` and `torch` in addition to the pandas/scikit-learn stack; install them with `pip install -r requirements.txt`.
The complex-level CV pipeline additionally uses pyyaml for config parsing.
Enhanced feature engineering adds per-mutation structural context (mutation/neighbor counts, chain type, partner contact density, interface flags, and distance-to-partner) to 3D-GNN-over-antibody-antigen/data/processed/ab_bind_features.csv.
- Run `python -m src.data.prepare_ab_bind` to consume `3D-GNN-over-antibody-antigen/data/processed/ab_bind_with_labels.csv`, engineer physicochemical shift statistics plus the structural context above, and build group-consistent train/validation/test splits. The script writes `3D-GNN-over-antibody-antigen/data/processed/ab_bind_features.csv` and `3D-GNN-over-antibody-antigen/data/processed/ab_bind_splits.json`.
- Execute `python -m src.data.build_interface_graphs` to parse the PDBs under `data/external/AB-Bind-Database`, retain residues around each mutation, and serialize per-mutation interface graphs (with solvent proxies, atomic counts, and B-factor cues) to `3D-GNN-over-antibody-antigen/data/graphs/ab_bind_graphs.pkl`. The parser now preprocesses files such as `3nps.pdb` to fill missing occupancy columns so every structure yields features.
- Use `python -m src.baselines.train_baselines` to fit ridge, random forest, and gradient-boosting regressors on the engineered features; per-split metrics are saved to `3D-GNN-over-antibody-antigen/reports/metrics/baseline_metrics.csv`.
- Run `python -m src.gnn.train_gnn` (optionally pass `--patience N` for early stopping) to load the serialized graphs, train a lightweight attention-weighted message-passing GNN, and log epoch-level metrics to `3D-GNN-over-antibody-antigen/reports/metrics/gnn_metrics.csv`.
- Execute `python scripts/compare_models.py` to merge baseline and GNN predictions and write comparison metrics/plots under `results/model_comparison/`.
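The group-consistent splitting used above can be sketched in pure Python: whole complexes, never individual mutants, are assigned to train/validation/test. The fractions, seed, and `complex_level_split` name here are illustrative assumptions, not the pipeline's actual configuration.

```python
import random

def complex_level_split(complex_ids, frac_val=0.15, frac_test=0.15, seed=0):
    """Assign each mutation record to a split by its PDB complex,
    so no complex appears in more than one split (no leakage)."""
    complexes = sorted(set(complex_ids))
    random.Random(seed).shuffle(complexes)
    n = len(complexes)
    n_test = max(1, int(n * frac_test))
    n_val = max(1, int(n * frac_val))
    test_c = set(complexes[:n_test])
    val_c = set(complexes[n_test:n_test + n_val])
    return [
        "test" if cid in test_c else "val" if cid in val_c else "train"
        for cid in complex_ids
    ]

# Toy example: 6 mutants spread over 4 complexes.
cids = ["1ABC", "1ABC", "2DEF", "3GHI", "3GHI", "4JKL"]
splits = complex_level_split(cids)
```

The key invariant is that two mutants of the same complex always land in the same split; a random per-row split would violate this and inflate test metrics.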
Future steps should compare the new GNN metrics against the baselines and extend the structural graphs to SKEMPI 2.0 for transfer evaluation.
To quantify variance across complex-wise splits (and avoid leakage), we run GroupKFold at the complex level. The CV pipeline reuses the same feature table and baseline hyperparameters, and optionally stacks GBT + fixed GNN predictions (no per-fold GNN retraining).
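The optional stacking step can be sketched as a simple linear blend fit over the two base models' predictions (ordinary least squares in numpy; `p_gbt` and `p_gnn` are synthetic stand-ins for the real out-of-fold outputs):

```python
import numpy as np

def fit_stack(p_gbt, p_gnn, y):
    """Fit y ~ a * p_gbt + b * p_gnn + c by least squares."""
    X = np.column_stack([p_gbt, p_gnn, np.ones_like(p_gbt)])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef

def predict_stack(coef, p_gbt, p_gnn):
    """Apply the fitted blend to new base-model predictions."""
    X = np.column_stack([p_gbt, p_gnn, np.ones_like(p_gbt)])
    return X @ coef

# Synthetic sanity check: the target is an exact blend of the base models.
rng = np.random.default_rng(1)
p_gbt = rng.normal(size=50)
p_gnn = rng.normal(size=50)
y = 0.7 * p_gbt + 0.3 * p_gnn + 0.1
coef = fit_stack(p_gbt, p_gnn, y)
```

Because the GNN is fixed (no per-fold retraining), only these three blend coefficients are refit per fold, which keeps the CV cheap.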
Ranges from the 5-fold × 3-repeat run:
- random_forest: Pearson -0.01–0.51, Spearman -0.02–0.54, MAE 1.16–1.97 kcal/mol
- gbt: Pearson 0.05–0.47, Spearman 0.09–0.54, MAE 1.19–2.12 kcal/mol
- stack_gbt_gnn: Pearson 0.05–0.47, Spearman 0.09–0.54, MAE 1.13–1.87 kcal/mol
Stacking improves MAE in many folds but does not consistently improve correlation versus GBT.
R² remains close to zero on this small, noisy dataset; these are baseline-level results, not state of the art.
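The reported metrics can be reproduced with numpy alone (Spearman computed as Pearson correlation of ranks; this mirrors, but does not reuse, the project's metric code, and the simple double-argsort ranking ignores tie handling):

```python
import numpy as np

def pearson(a, b):
    """Pearson correlation coefficient of two 1-D arrays."""
    a, b = np.asarray(a, float), np.asarray(b, float)
    a, b = a - a.mean(), b - b.mean()
    return float((a @ b) / np.sqrt((a @ a) * (b @ b)))

def spearman(a, b):
    """Spearman correlation: Pearson of ranks (ties not averaged here)."""
    rank = lambda x: np.argsort(np.argsort(x)).astype(float)
    return pearson(rank(a), rank(b))

def mae(y, p):
    """Mean absolute error, in the target's units (kcal/mol here)."""
    return float(np.mean(np.abs(np.asarray(y) - np.asarray(p))))

def r2(y, p):
    """Coefficient of determination; can go below 0 for poor fits."""
    y, p = np.asarray(y, float), np.asarray(p, float)
    return float(1 - np.sum((y - p) ** 2) / np.sum((y - y.mean()) ** 2))

y_true = [0.5, -1.2, 2.0, 0.0]
y_pred = [0.6, -1.0, 1.5, 0.2]
```

Note that a model can rank mutations well (high Spearman) while its R² stays near zero, which is exactly the pattern seen in the CV ranges above.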
How to reproduce:

```shell
# Run complex-level CV (uses config/complex_cv.yaml)
python scripts/run_complex_cv.py --config config/complex_cv.yaml

# Generate CV summary plots
python scripts/plot_complex_cv_results.py
```

- Best tested GNN (v1): InterfaceGNN, hidden_dim=128, layers=3, dropout=0.0, target standardization + distribution-weighted loss.
- Comparative results:
| Model | Test MAE (kcal/mol) | Test RMSE (kcal/mol) | Test R² |
|---|---|---|---|
| Gradient boosting | 0.80 | 0.99 | 0.22 |
| Random forest | 0.89 | 1.07 | 0.01 |
| Ridge | >1.0 | >1.3 | <0 |
| InterfaceGNN (v1) | 0.74 | 1.14 | -0.02 |
Takeaway: the GNN pipeline trains and matches the MAE of strong tabular baselines but still trails in R²; next steps are richer node/edge features and light architecture tweaks on GPU (3-4 layers, 128-256 channels).
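The "target standardization + distribution-weighted loss" recipe used by the best GNN run can be sketched as follows. This is a numpy illustration under assumed details: the quantile binning and inverse-frequency weighting are plausible implementations, not the repo's exact formulation.

```python
import numpy as np

def standardize(y):
    """Z-score targets; keep (mu, sigma) to invert predictions later."""
    y = np.asarray(y, float)
    mu, sigma = float(y.mean()), float(y.std())
    return (y - mu) / sigma, mu, sigma

def distribution_weights(y, n_bins=5):
    """Weight each sample by the inverse frequency of its ddG bin, so
    rare large-|ddG| mutations are not drowned out by neutral ones."""
    y = np.asarray(y, float)
    edges = np.quantile(y, np.linspace(0, 1, n_bins + 1))
    bins = np.clip(np.digitize(y, edges[1:-1]), 0, n_bins - 1)
    counts = np.bincount(bins, minlength=n_bins)
    w = 1.0 / counts[bins]
    return w * len(y) / w.sum()  # normalize to mean weight 1

def weighted_mse(y, pred, w):
    """Per-sample weighted squared-error loss."""
    return float(np.mean(w * (np.asarray(y) - np.asarray(pred)) ** 2))

y = np.array([-0.2, 0.0, 0.1, 0.05, 3.5])  # one rare, strongly worsening mutant
w = distribution_weights(y, n_bins=2)
```

Training on standardized targets and inverting (`pred * sigma + mu`) at evaluation time keeps the loss scale stable across folds while still reporting MAE/RMSE in kcal/mol.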
- `3D-GNN-over-antibody-antigen/reports/metrics/gnn_metrics.csv`: replace with a clean GPU run (InterfaceGNN, 128d/3 layers, lr=1e-3, standardized targets, dist_weighted loss), then rerun the comparison plot.
- `scripts/compare_models.py`: run `python scripts/compare_models.py` after updating metrics to regenerate `results/model_comparison/` outputs.
- `src/data/build_interface_graphs.py` (v2 feature enrichment): load `data/processed/ab_bind_features.csv` and inject physicochemical features (hydrophobicity/volume/charge/polarity deltas, structural stats) into `node_features` (and optionally `edge_attr`), keyed by `sample_id`.
- `src/gnn/train_gnn.py` / `InterfaceGNN`: keep 3-4 layers and 128-256 channels; continue using clamped `edge_attr`/weights and a small readout. If graph features are enriched, adjust input dimensions accordingly.
- Suggested GPU sweep (v2): `hidden_dim` in {128, 256}, layers in {3, 4}, lr in {5e-4, 1e-3}, with target standardization + dist_weighted loss.