Check whether the method for placing vertices during Chaos Game Representation (CGR) has an impact on biological sequence classification performance.
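To make the idea of "vertex placement" concrete, here is a minimal sketch of a frequency CGR (FCGR) in plain Python. It is an illustration only and not the code in FCGR_gen.R. It assumes protein sequences over the 20 standard amino acids, vertices placed evenly on a unit circle, and a "random encoding" meaning a random assignment of amino acids to those vertices. The names `polygon_vertices`, `fcgr`, and `random_order` are hypothetical; the defaults `sf=0.865` and `res=35` mirror the `--sf` and `--res` values used in the command below.

```python
import math
import random

# Assumption: the 20 standard amino acids, in alphabetical order.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def polygon_vertices(order):
    """Place one vertex per symbol evenly on the unit circle, in the given order."""
    n = len(order)
    return {
        symbol: (math.cos(2 * math.pi * i / n), math.sin(2 * math.pi * i / n))
        for i, symbol in enumerate(order)
    }


def fcgr(seq, order=AMINO_ACIDS, sf=0.865, res=35):
    """Return a res x res count matrix of the chaos game points for seq."""
    vertices = polygon_vertices(order)
    x, y = 0.0, 0.0  # start at the centre of the polygon
    grid = [[0] * res for _ in range(res)]
    for symbol in seq:
        if symbol not in vertices:
            continue  # skip symbols outside the alphabet
        vx, vy = vertices[symbol]
        # Move a fraction sf of the way towards the symbol's vertex.
        x += sf * (vx - x)
        y += sf * (vy - y)
        # Map coordinates from [-1, 1] to a cell index in [0, res - 1].
        col = min(int((x + 1) / 2 * res), res - 1)
        row = min(int((y + 1) / 2 * res), res - 1)
        grid[row][col] += 1
    return grid


def random_order(seed):
    """Return a random assignment of amino acids to vertex positions."""
    letters = list(AMINO_ACIDS)
    random.Random(seed).shuffle(letters)
    return "".join(letters)


if __name__ == "__main__":
    seq = "MKTAYIAKQR"  # made-up example sequence
    default_grid = fcgr(seq)
    shuffled_grid = fcgr(seq, order=random_order(seed=1))
    print(sum(map(sum, default_grid)), sum(map(sum, shuffled_grid)))
```

The same sequence yields a different matrix for each vertex order, which is the effect this repository sets out to measure.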
# running a random search over different CGR encodings. This script calls FCGR_gen.R internally
python ./cnn_random_encoding_search.py \
--logfile cnn.log \
--seqfile ../data/deeploc_clean.csv \
--fcgrfile ../data/random_encoding_0865_35.csv \
--outfile ../data/cnn_res_iter.csv \
--sf 0.865 \
--res 35 \
--n 10_000

The resulting file is results/cnn_res_iter.csv.

# creating the dedicated encodings (Min, Q1...Max).
# It uses the FCGR generation script (FCGR_gen.R)
bash dedicated_encodings.sh

# running cross validation for the dedicated datasets.
# It calls the cnn_cv.py file internally
bash multiple_cvs.sh

The resulting file is results/multiple_cv_results.csv.

# running augmentation with cross validation.
# it calls the augmentation.py file
bash multiple_augmentations.sh

The resulting file is results/multiple_aug_results.csv.