This repo provides a toolset for building de Bruijn graphs enriched with node and edge features, along with options for graph learning on the resulting graphs.
Warning
The code here is under development. Bugs are possible.
Before running the scripts, create and activate an environment with a PyTorch version suitable for your machine. You can try the provided YAML file, but there is no guarantee it will work on your setup:
conda env create -f path/to/DBG-GNN/envs/environment.yaml
conda activate GNNs
To build de Bruijn graphs supplied with node and edge features, run:
python create_dbgs_cli.py \
--indir /path/to/dir/with/samples \
--outdir /path/to/outdir \
--kmer_len 4 \
--subkmer_len 2 \
--skip_N \
--normalization_method max \
--node_feature_method subkmer_freq_positional \
--normalize_node_features \
--threads 4 \
--verbose
For more information on the arguments, run python create_dbgs_cli.py --help.
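To make the options above more concrete, here is a minimal sketch of what a de Bruijn graph with max-normalized edge weights and sub-k-mer node features could look like. It is an illustration only and not the code used by create_dbgs_cli.py: the function names `build_dbg`, `normalize_by_max`, and `subkmer_frequencies` are hypothetical, the handling of `N` is an assumption about what --skip_N does, and the positional part of subkmer_freq_positional is not modeled.

```python
from collections import Counter


def build_dbg(sequences, kmer_len=4, skip_n=True):
    """Count edges between consecutive k-mers (hypothetical helper).

    Nodes are k-mers; an edge links each k-mer to the k-mer that starts
    one position later, so the two overlap by kmer_len - 1 characters.
    """
    edge_counts = Counter()
    for seq in sequences:
        seq = seq.upper()
        for i in range(len(seq) - kmer_len):
            src = seq[i:i + kmer_len]
            dst = seq[i + 1:i + 1 + kmer_len]
            # Assumption: k-mers containing the ambiguous base N are dropped.
            if skip_n and ("N" in src or "N" in dst):
                continue
            edge_counts[(src, dst)] += 1
    return edge_counts


def normalize_by_max(edge_counts):
    """Scale edge counts so that the most frequent edge has weight 1.0."""
    if not edge_counts:
        return {}
    max_count = max(edge_counts.values())
    return {edge: count / max_count for edge, count in edge_counts.items()}


def subkmer_frequencies(kmer, subkmer_len=2):
    """Relative frequency of each sub-k-mer inside one k-mer node."""
    subs = [kmer[i:i + subkmer_len] for i in range(len(kmer) - subkmer_len + 1)]
    counts = Counter(subs)
    return {sub: count / len(subs) for sub, count in counts.items()}


if __name__ == "__main__":
    edges = build_dbg(["ACGTACGTA"], kmer_len=4)
    print(normalize_by_max(edges))
    print(subkmer_frequencies("ACAC", subkmer_len=2))
```

In this sketch, normalizing by the maximum keeps all edge weights in the range (0, 1], which is one plausible reading of --normalization_method max; check the script's --help output for the actual behavior.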
To train the model on the generated graphs, run:
python train_gnn_cli.py \
--indir /path/to/dir/with/graphs \
--outfile /path/to/model/savefile \
--plots_outdir /path/to/dir/to/save/plots \
--verbose
Explore other parameters via python train_gnn_cli.py --help.