Motivation: Computational tools that predict peptide binding by major histocompatibility complex (MHC) proteins play an essential role in current approaches to harness adaptive immunity to fight viral pathogens and cancers. However, there are >22,000 known class-I MHC allelic variants, and it is unknown how well binding preferences are predicted for most alleles. We introduce a machine learning framework that enables state-of-the-art MHC binding prediction along with per-allele estimates of predictive performance.
If you use MHCGlobe or MHCPerf in your research, please cite:
E. Glynn, D. Ghersi, & M. Singh, Toward equitable major histocompatibility complex binding predictions, Proc. Natl. Acad. Sci. U.S.A. 122 (8) e2405106122, https://doi.org/10.1073/pnas.2405106122 (2025).
MHCGlobe and MHCPerf are both easily accessible for model inference and re-training.
- Download the mhcglobe git repository containing the code:
    git clone https://github.com/ejglynn/mhcglobe.git
- Update the mhcglobe_dir variable in src/paths.py with the full path to your mhcglobe folder.
- Download the two pickle files available on Zenodo and place them in the data folder.
- From the mhcglobe folder, create and activate a Python3 virtual environment with the following commands:
    python3 -m pip install --user --upgrade pip
    python3 -m pip install --user virtualenv
    python3 -m venv env
    source env/bin/activate
- Install prerequisites in the virtual environment (the scikit-learn package replaces the deprecated sklearn name on PyPI):
    pip3 install jupyter pandas scipy scikit-learn tensorflow tqdm
- From the mhcglobe folder, start Jupyter:
    jupyter notebook
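If the notebook fails on import, a quick sanity check of the environment may help narrow things down. The following is a minimal sketch, not part of the MHCGlobe code base: the helper names `missing_modules` and `count_pickles` are hypothetical, the module list assumes the usual import names for the prerequisites, and the `.pkl`/`.pickle` suffixes are an assumption about how the Zenodo files are named.

```python
import importlib.util
from pathlib import Path

# Assumed import names for the prerequisites. The import name can differ
# from the pip package name (scikit-learn is imported as "sklearn").
REQUIRED_MODULES = ["pandas", "scipy", "sklearn", "tensorflow", "tqdm"]


def missing_modules(names):
    """Return the top-level module names that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


def count_pickles(data_dir):
    """Count files in data_dir with a pickle-like suffix (assumed: .pkl or .pickle)."""
    data_dir = Path(data_dir)
    if not data_dir.is_dir():
        return 0
    return sum(1 for p in data_dir.iterdir() if p.suffix in {".pkl", ".pickle"})


if __name__ == "__main__":
    # Run from the mhcglobe folder with the virtual environment activated.
    missing = missing_modules(REQUIRED_MODULES)
    print("Missing modules:", missing or "none")
    print("Pickle-like files in data/:", count_pickles("data"))
```

Run from the mhcglobe folder with the virtual environment active, this should report no missing modules and, if the suffix assumption holds, two files in the data folder.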
In your browser, click MHCGlobe_User_Notebook.ipynb to open and interact with the notebook.
To speed things up, output files have already been provided in the output folder. If you want to recompute these files, simply delete or rename the output folder.
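Renaming is the safer of the two options, since the provided files stay available for comparison. A minimal sketch of setting the folder aside under a timestamped name follows; the function `set_aside_output` and the backup naming scheme are hypothetical and not part of the repository.

```python
from datetime import datetime
from pathlib import Path


def set_aside_output(repo_dir, folder_name="output"):
    """Rename <repo_dir>/<folder_name> to a timestamped backup folder.

    Returns the backup path, or None if the folder does not exist.
    """
    output_dir = Path(repo_dir) / folder_name
    if not output_dir.is_dir():
        return None
    stamp = datetime.now().strftime("%Y%m%d_%H%M%S")
    backup_dir = output_dir.with_name(f"{folder_name}_backup_{stamp}")
    output_dir.rename(backup_dir)  # keeps the original files under the new name
    return backup_dir
```

Calling `set_aside_output(".")` from the mhcglobe folder before opening the notebook would move the provided results out of the way so that they are recomputed.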