chaconlab/PiFoldDB

PiFoldDB

PiFoldDB is a curated CATH 4.3 dataset for PiFold, an update of the CATH 4.2 dataset of Ingraham et al. (NeurIPS 2019). Compared with that dataset, this version uses better structures (PDB-REDO), covers more chains, and is built on the latest CATH release. Gaps are kept and marked with "-", missing regions are marked as "X" with NaN coordinates, and tags and cases with large missing regions have been removed.

Datasets

Preprocessed data and splits are available in cathPi.tgz:

  • chain_set.jsonl: maximum sequence length 500 aa
  • chain_set_splits.json: Test: 1422, Train: 18960, Validation: 1436
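
The two files listed above could be read along the following lines. This is a minimal sketch, not the authors' loader. It assumes the layout of the original CATH 4.2 release: one JSON object per line in chain_set.jsonl with `name` and `seq` fields, and top-level `train`, `validation` and `test` lists of chain names in chain_set_splits.json. Those field names, and the helper names `load_chains`, `load_splits`, `split_chains` and `missing_fraction`, are assumptions made for illustration; check them against the extracted files.

```python
import json

# Assumed split keys; not confirmed by this README.
SPLIT_KEYS = ("train", "validation", "test")


def load_chains(jsonl_path, max_len=500):
    """Read one JSON record per line, keyed by the (assumed) "name" field."""
    chains = {}
    with open(jsonl_path) as fh:
        for line in fh:
            line = line.strip()
            if not line:
                continue
            entry = json.loads(line)
            # The README states a maximum length of 500 aa; enforce it defensively.
            if len(entry["seq"]) <= max_len:
                chains[entry["name"]] = entry
    return chains


def load_splits(splits_path):
    """Read the splits file and keep only the three (assumed) split lists."""
    with open(splits_path) as fh:
        raw = json.load(fh)
    return {key: raw.get(key, []) for key in SPLIT_KEYS}


def split_chains(chains, splits):
    """Group chain records by split, skipping names absent from `chains`."""
    return {
        key: [chains[name] for name in names if name in chains]
        for key, names in splits.items()
    }


def missing_fraction(seq):
    """Fraction of positions marked as a gap ("-") or missing region ("X")."""
    if not seq:
        return 0.0
    return sum(ch in "-X" for ch in seq) / len(seq)
```

After extracting cathPi.tgz, `split_chains(load_chains("chain_set.jsonl"), load_splits("chain_set_splits.json"))` would return a dictionary of record lists whose sizes can be compared with the split counts listed above.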
