Skip to content

PKUliujl/GeoSeqBuilder

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

35 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

GeoSeqBuilder

GeoSeqBuilder, an user-friendly tool package based on deep learning, is used for protein backbone-fixed sequence design and simultaneously gives highly accurate side-chain conformations.

Alt text

The dependent packages for GeoSeqBuilder are listed in the requirement.txt.

INSTALLATION

  1. Users can download it by git clone https://github.com/PKUliujl/GeoSeqBuilder.git (without pretrained model's parameters in the params directory due to the large size), alternatively, it is available to access our disk with Access Code 2xWE to get these files;
  2. Change the PATH of dssp in the line 158 and 511 to your own path if the given dssp does not work.

Before using GeoSeqBuilder, please check whether the dependent packages are available in your environment (see requirement.txt). If not, using pip or conda to install them.

USAGE

To better use GeoSeqBuilder toolkit for backbone-fixed protein sequence design and side-chain conformation prediction, please add some of
these parameters

optional arguments:
  -h, --help            show this help message and exit
  --purpose {0,1,2}     0 is used only for sequence design, 1 is chosen only for side-chain conformation prediction, and 2 involves
                        performing both tasks simultaneously. default: 0
  --inputPATH INPUTPATH, -iP INPUTPATH
                        The directory path should contain the pdb file
  --chainID CHAINID     The given chain for design or repacking
  --inputfile INPUTFILE, -i INPUTFILE
                        A pdb file under inputPATH, eg. T1024_A.pdb.
  --ST ST               Sampling temperature(scaling factor) belongs to (0,1). Larger values increase the number of interation steps for
                        convegence. Only used when --purpose is 0 or 2. default: 0.1
  --n N                 The number of designed sequences. Only activated when --purpose is 0 or 2. default: 10
  --SM SM               Starting mode with different sequence initializations (how many residues are randomly masked). default = 100(%)
                        masked, i.e. random initialization)
  --Fixed FIXED         Those residues are expected to remain unchanged with names following the format
                        {residuenumber_chainid_resname,residuenumber_chainid_resname}, e.g. 1_A_W,10_A_C Please note that the residuenumber
                        is renumbered begining at 0.
  --outputPATH OUTPUTPATH, -oP OUTPUTPATH
                        the directory path of the outputfile. default: inputPATH
  --verbose {0,1}       Display the intermediate sequences
  --noCYS {0,1}         Do NOT generate Cys as far as possible
  

EXAMPLE

For repacking,

python run_GeoSeqBuilder.py --purpose 1 --inputPATH examples/ --inputfile 3mpc_A.pdb --outputPATH ./ --chainID A   

For design,

python run_GeoSeqBuilder.py --purpose 0 --inputPATH examples/ --inputfile 3mpc_A.pdb --outputPATH ./ --chainID A

or,

python run_GeoSeqBuilder.py --purpose 2 --inputPATH examples/ --inputfile 3mpc_A.pdb --outputPATH ./ --chainID A

Reminder: Only the regular pdb format files are accepted as inputs. If many GPUs are available in your machine, please specify the param CUDA_VISIBLE_DEVICES which you use.

Feel free to contact me via email at liujl@stu.pku.edu.cn for other issues.

CITATION

If you find GeoSeqBuilder useful in your research, please cite it:

@article{liu2024all,
  title={All-atom protein sequence design based on geometric deep learning},
  author={Liu, Jiale and Guo, Zheng and Zhang, Changsheng and Lai, Luhua},
  journal={bioRxiv},
  pages={2024--03},
  year={2024},
  publisher={Cold Spring Harbor Laboratory}
}

About

No description, website, or topics provided.

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages