ani_engine

A workflow to train or evaluate ANI-related networks.

Install

Install torchani first. If you work with parquet datasets, use the following (currently not working on HiPerGator because of a CUDA driver issue):

conda create -n cudf -c https://roitberg.chem.ufl.edu/projects/conda-packages-uf-gainesville -c rapidsai -c nvidia -c defaults -c conda-forge sandbox_cudf python=3.8

Otherwise, install torchani with:

conda create -n ani -c https://roitberg.chem.ufl.edu/projects/conda-packages-uf-gainesville -c pytorch -c nvidia -c defaults -c conda-forge sandbox python=3.8

Then install ani_engine and its dependencies:

conda activate ani  # or cudf
git clone git@github.com:roitberg-group/ani_engine.git
cd ani_engine
pip install -e .
pip install -r test_requirements.txt

After installation, an executable script (ani_engine) is available on your PATH.

$ which ani_engine
/home/richard/program/anaconda3/envs/cudf/bin/ani_engine
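The same check can be scripted, for example in a job script that should fail early when the wrong conda environment is active. This is a minimal sketch that uses only the standard library; `find_executable` is a hypothetical helper name, not part of ani_engine.

```python
import shutil
from typing import Optional


def find_executable(name: str = "ani_engine") -> Optional[str]:
    """Return the absolute path of `name` if it is on PATH, else None."""
    return shutil.which(name)


path = find_executable()
if path is None:
    print("ani_engine not found; is the conda environment (ani or cudf) activated?")
else:
    print(f"ani_engine found at {path}")
```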

Features

train
- reads a config file; see configs for details
- ensemble training
- wandb support - Example Report 1
- customize your own engine and model
convert h5 datasets to parquet format
eval
- quickly evaluate a builtin model, a trained model, or a trained ensemble model on a dataset file or a folder of datasets
- save evaluation results as csv (example of ani2x model for comp6v1 dataset: misc/eval/ani2x-Overall-comp6v1.csv)
  - the csv is sorted by mean_abs_error_kcal_mol in descending order
  - only available for parquet format datasets

Examples

0. init

Initialize ani_run. The following command creates an ani_run folder:

ani_engine init

folder structure:

ani_run/
├── ani_run   # scripts, custom models and engines
│   ├── engines
│   │   └── __init__.py
│   ├── __init__.py
│   └── models
│       └── __init__.py
├── configs   # config files
├── datasets
├── logs
└── setup.py

A simple demo is available at ani_run_demo.
It is recommended to keep this folder under source control (for example, git).
Always run ani_engine commands from within the top-level ani_run folder.
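Since commands are expected to run from the top-level ani_run folder, a small guard can catch the common mistake of running from the inner ani_run package directory. The sketch below is a hypothetical helper, not part of ani_engine; the expected entries are taken from the folder structure shown above.

```python
from pathlib import Path
from typing import List

# Entries copied from the folder structure created by `ani_engine init`.
EXPECTED_ENTRIES = ("ani_run", "configs", "datasets", "logs", "setup.py")


def missing_entries(root: Path) -> List[str]:
    """Return the expected entries that are absent from `root`."""
    return [name for name in EXPECTED_ENTRIES if not (root / name).exists()]


def looks_like_ani_run_root(root: Path) -> bool:
    """True when every expected entry exists directly under `root`."""
    return not missing_entries(root)


missing = missing_entries(Path.cwd())
if missing:
    print(f"current directory is missing: {missing}")
else:
    print("current directory looks like a top-level ani_run folder")
```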

1. download

Download a dataset with torchani:

torchani download --help

2. train

ani_engine train configs/1x-energy.yaml

Useful options:

--name: overrides general.name in the config file.
--use_wandb: flag to enable wandb. Before using it, register a wandb account and sign in with wandb login; see the wandb quickstart documentation. Let Richard know your wandb account email so that you can be added to our roitberg-group organization.
--mode: one of {run, test, debug}. This controls which folder (e.g. logs/run or logs/debug) the log and checkpoint files are saved to. Anything in logs/debug should be safe to delete.

Check more options with ani_engine train --help.

3. ensemble train

Prepare the ensemble configs:

ani_engine prepare_ensembles configs/1x-energy-ensemble.yaml -n 8 --mode=run

output

=> custom config_options:
{'general;mode': 'run'}

=> prepareing
=> using following configs for each ensemble
logs/run/20210808_213427-395a1814/0/config.yaml
logs/run/20210808_213427-395a1814/1/config.yaml
logs/run/20210808_213427-395a1814/2/config.yaml
logs/run/20210808_213427-395a1814/3/config.yaml
logs/run/20210808_213427-395a1814/4/config.yaml
logs/run/20210808_213427-395a1814/5/config.yaml
logs/run/20210808_213427-395a1814/6/config.yaml
logs/run/20210808_213427-395a1814/7/config.yaml

Train a member with its generated config:

ani_engine train logs/run/20210808_213427-395a1814/0/config.yaml

Check more options with ani_engine prepare_ensembles --help.
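The example above trains member 0 only. Assuming every member is trained the same way from its own config.yaml, the commands for all members can be generated with a short script. This is a sketch: `ensemble_train_commands` is a hypothetical helper, and the run directory is the one from the sample output above.

```python
import shlex
from pathlib import PurePosixPath
from typing import List


def ensemble_train_commands(ensemble_dir: str, n_members: int) -> List[List[str]]:
    """Build one `ani_engine train` command per member config."""
    root = PurePosixPath(ensemble_dir)
    return [
        ["ani_engine", "train", str(root / str(i) / "config.yaml")]
        for i in range(n_members)
    ]


commands = ensemble_train_commands("logs/run/20210808_213427-395a1814", 8)
for cmd in commands:
    # Printing only; to launch, use subprocess.run(cmd, check=True) instead.
    print(shlex.join(cmd))
```

Printing the commands rather than running them keeps the script safe to try; how the members are scheduled (sequentially, or one job per GPU) is left to the user.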

4. h52pq

Convert an h5 dataset into parquet format.

# convert one h5 file
ani_engine h52pq datasets/ani1x/ANI-1x-wB97X-6-31Gd.h5
# or a folder containing multiple h5 files
ani_engine h52pq datasets/ani1x/

5. eval

Evaluate a trained or builtin model on a specified dataset.

ani_engine eval config_path data_path

config_path can be one of:
- a torchani builtin model: ani1x, ani1ccx, ani2x
- a config file: logs/debug/20210819_210426-769a2f31/config.yaml
- a log dir containing config.yaml: logs/debug/20210819_210426-769a2f31/
- a log dir of an ensemble training run: logs/debug/20210819_210426-769a2f31/
data_path can be one of:
- a single h5 or pq dataset file
- a directory of datasets; specify the file extension with --ext=h5 or --ext=pq
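To make the two data_path forms concrete, the sketch below expands a path into a list of dataset files. It illustrates the behavior described above and is not ani_engine's actual implementation; `resolve_datasets` is a hypothetical name, and sorting the results is an assumption.

```python
from pathlib import Path
from typing import List


def resolve_datasets(data_path: str, ext: str) -> List[Path]:
    """Expand data_path into dataset files (hypothetical helper).

    A file is returned as-is; a directory yields every file in it
    whose extension matches `ext` ("h5" or "pq").
    """
    path = Path(data_path)
    if path.is_file():
        return [path]
    if path.is_dir():
        return sorted(path.glob(f"*.{ext}"))
    raise FileNotFoundError(data_path)
```

For example, `resolve_datasets("datasets/comp6v1/", "pq")` would list every pq file directly inside that folder.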

examples:

ani_engine eval ani1x datasets/comp6v1/ANI-MD-Bench.pq
ani_engine eval ani1x datasets/comp6v1/ --ext=pq
ani_engine eval logs/debug/20210819_210426-769a2f31/config.yaml datasets/comp6v1/ANI-MD-Bench.pq

For pq datasets, eval saves the prediction results to a csv file, for example:

$ ani_engine eval ani1x datasets/comp6v1/ANI-MD-Bench.pq
Evaluating the following 1 datasets:
['datasets/comp6v1/ANI-MD-Bench.pq']
=> loading datasets

[1/1]: datasets/comp6v1/ANI-MD-Bench.pq
loading ['datasets/comp6v1/ANI-MD-Bench.pq']  time used: 0.03 s, peak memory used: 28.00 MB, memory used: 28.00MB
abs_error_kcal_mol description:
count    1791.000000
mean        4.518622
std         7.594079
min         0.000057
25%         0.598989
50%         1.416891
75%         3.310224
max        41.041344
Name: abs_error_kcal_mol, dtype: float64

prediction results:
                count  atoms  mean_energy_hartree  mean_pred_hartree  std_abs_error_kcal_mol  min_abs_error_kcal_mol  max_abs_error_kcal_mol  mean_abs_error_kcal_mol       dataset
C101H154N28O29    128    312         -7654.514618       -7654.475488                7.506302                4.599105               41.043289                24.554672  ANI-MD-Bench
C51H68N12O18      128    149         -3994.653553       -3994.623883                3.530049                9.905260               28.050765                18.618089  ANI-MD-Bench
C20H30O           128     51          -855.053566        -855.061703                3.279170                0.201232               15.995083                 5.240489  ANI-MD-Bench
C38H52N6O7        128    103         -2333.813382       -2333.811527                2.054335                0.025642               10.021924                 2.646591  ANI-MD-Bench
C24H33N3O4        128     64         -1399.129431       -1399.126231                1.275210                0.050889                6.324997                 2.266660  ANI-MD-Bench
C22H28N2O         127     53         -1039.608298       -1039.610291                1.080829                0.017344                5.322560                 1.524564  ANI-MD-Bench
C22H31NO          128     55          -986.664078        -986.665765                1.031409                0.006053                5.559594                 1.516545  ANI-MD-Bench
C16H28N2O4        128     50         -1036.672773       -1036.672597                1.063736                0.000057                5.246314                 1.283457  ANI-MD-Bench
C18H27NO3         128     49          -982.282336        -982.283091                0.977692                0.007971                3.976573                 1.246637  ANI-MD-Bench
C8H10N4O2         128     24          -680.186476        -680.185177                0.763155                0.023915                3.445625                 1.053451  ANI-MD-Bench
C17H21NO          128     40          -790.164569        -790.164118                0.765038                0.033761                3.286045                 1.045221  ANI-MD-Bench
C15H25N3O         128     44          -825.889368        -825.890083                0.636617                0.024524                3.005481                 0.871884  ANI-MD-Bench
C13H21NO3         128     38          -788.188306        -788.188702                0.730836                0.013122                4.013644                 0.848035  ANI-MD-Bench
C8H9NO2           128     20          -515.323039        -515.323440                0.380368                0.008054                1.945716                 0.521167  ANI-MD-Bench
csv saved at: /work/dev/ani_run/logs/eval/ani1x/ani1x-20210914_170039-ANI-MD-Bench.csv


test_rmse: 8.89
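The saved csv can be post-processed to pick out the formulas with the largest errors. The sketch below uses only the standard library and assumes the csv mirrors the printed table: the first column (with an empty header) holds the formula and the column names match those shown above. The sample rows are copied from that table; `worst_formulas` is a hypothetical helper.

```python
import csv
import io
from typing import List, TextIO, Tuple

# A few columns of three rows from the table above, laid out as the csv is assumed to be.
SAMPLE = """\
,count,atoms,mean_abs_error_kcal_mol,dataset
C101H154N28O29,128,312,24.554672,ANI-MD-Bench
C20H30O,128,51,5.240489,ANI-MD-Bench
C8H9NO2,128,20,0.521167,ANI-MD-Bench
"""


def worst_formulas(csv_file: TextIO, threshold_kcal_mol: float) -> List[Tuple[str, float]]:
    """Return (formula, mean_abs_error_kcal_mol) pairs above the threshold."""
    reader = csv.DictReader(csv_file)
    formula_col = reader.fieldnames[0]  # assumption: first column holds the formula
    result = []
    for row in reader:
        error = float(row["mean_abs_error_kcal_mol"])
        if error > threshold_kcal_mol:
            result.append((row[formula_col], error))
    return result


for formula, error in worst_formulas(io.StringIO(SAMPLE), 5.0):
    print(f"{formula}: {error:.2f} kcal/mol")
```

With a real result file, pass `open(csv_path, newline="")` in place of the `io.StringIO` object.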

6. genconfs

Generate multiple configs from a base_config file and a matrix_config file.

- set() is applied to each parameter's values, so repeated values are dropped.
- If general.name exists, it is set to general.name_{i}.
- If general.note exists, it is set to the matrix info for this config.

See the example at tests/test_config:

cd tests/test_config
ani_engine genconfs base.yaml matrix.yaml -v
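The expansion rules described above can be sketched in a few lines. This is an illustration under several assumptions, not the genconfs implementation: parameters are addressed by dotted keys, the index i starts at 0, the note is a comma-separated list of key=value pairs, and the parameter names (`optimizer.lr`, `training.batch_size`) are made up for the example. `sorted()` is added only to make the order reproducible.

```python
import copy
import itertools
from typing import Any, Dict, List


def set_dotted(config: Dict[str, Any], dotted_key: str, value: Any) -> None:
    """Set config['a']['b'] = value for dotted_key 'a.b', creating dicts as needed."""
    *parents, leaf = dotted_key.split(".")
    node = config
    for key in parents:
        node = node.setdefault(key, {})
    node[leaf] = value


def expand_matrix(base: Dict[str, Any], matrix: Dict[str, list]) -> List[Dict[str, Any]]:
    """Return one config per combination of the deduplicated matrix values."""
    keys = list(matrix)
    # set() drops repeated values, as described above.
    value_lists = [sorted(set(matrix[key])) for key in keys]
    configs = []
    for i, combo in enumerate(itertools.product(*value_lists)):
        cfg = copy.deepcopy(base)
        for key, value in zip(keys, combo):
            set_dotted(cfg, key, value)
        general = cfg.get("general", {})
        if "name" in general:
            general["name"] = f"{general['name']}_{i}"
        if "note" in general:
            general["note"] = ", ".join(f"{k}={v}" for k, v in zip(keys, combo))
        configs.append(cfg)
    return configs


base = {"general": {"name": "1x", "note": ""}, "optimizer": {"lr": 0.001}}
matrix = {"optimizer.lr": [0.001, 0.0001, 0.001], "training.batch_size": [256, 512]}
configs = expand_matrix(base, matrix)
for cfg in configs:
    print(cfg["general"]["name"], "|", cfg["general"]["note"])
```

The repeated 0.001 in the matrix is dropped, so two learning rates times two batch sizes give four configs.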
