This is the accompanying repository for Refining Embedding-Based Binding Predictions by Leveraging AlphaFold2 Structures. If your input files are in exactly the same format as the sample files, you can run the scripts' main methods directly. Otherwise, we recommend calling the implemented functions from your own code.
The application consists of three independent modules:
- bindViz.py uses PyMol to visualize binding residues on the protein 3D structures. It outputs a .gif file of a rotating protein annotated with its binding residues.
- bindAdjust.py modifies the small, metal and nuclear binding probabilities of all residues based on the distances between them.
- bindRefine.py identifies one or several sections of the protein with the highest average binding probability.
usage: bindViz.py [-h] -p PREDSDIR -t TRUES -o OUTDIR [-lm LIGANDMAP]
[-lt LIGANDTYPE] [-c CUTOFF] [-res RESOLUTION] [-fpr FPR]
[-s]
This tool visualizes binding residue on the 3D structure of proteins. Make
sure PyMol is installed on your system.
required arguments:
-p PREDSDIR, --predsdir PREDSDIR
directory containing predictions in specific format,
see sample file. If your predictions are not available
in this specific format. Please use the functions
directly. (default: None)
-t TRUES, --trues TRUES
file containing known binding residues, see sample
file. If your true values are not available in this
specific format. Please use the functions directly.
(default: None)
-o OUTDIR, --outdir OUTDIR
output directory for visualizations (default: None)
optional arguments:
-lm LIGANDMAP, --ligandmap LIGANDMAP
ligand map, maps UniProt sequences to PDB structures.
If no map is provided, stored map will be used.
(default: None)
-lt LIGANDTYPE, --ligandtype LIGANDTYPE
ligand type to analyse, options are: small, metal and
nuclear (default: small)
-c CUTOFF, --cutoff CUTOFF
cutoff used when converting float binding
probabilities to binary predictions (default: 0.5)
-res RESOLUTION, --resolution RESOLUTION
resolution of the render, number of pixels for height
and width, always a square (default: 500)
-fpr FPR, --fpr FPR frames per rotation - number of frames generated for
one full rotation of the protein, more frames lead to
longer render times (default: 12)
-s, --spectrum visualize probabilities as continuous color spectrum
instead of binary predictions (default: False)
python3 bindViz.py -o <outdir> -lt small -p files/example_input/predictions -t files/example_input/binding_residues_2.5_small.txt -res 1000 -fpr 10
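The `-fpr` option controls how far the protein turns between consecutive frames of the .gif. The following is a minimal sketch of that relationship, assuming the frames are spaced evenly over one full 360-degree turn. The function name `rotation_angles` is hypothetical and is not taken from bindViz.py.

```python
from typing import List


def rotation_angles(fpr: int) -> List[float]:
    """Return the cumulative rotation angle (in degrees) for each frame.

    Assumption: frames are evenly spaced over one full 360-degree rotation.
    """
    if fpr <= 0:
        raise ValueError("fpr must be a positive integer")
    step = 360.0 / fpr  # degrees turned between two consecutive frames
    return [i * step for i in range(fpr)]


if __name__ == "__main__":
    # With -fpr 10, each frame is turned 36 degrees further than the previous one.
    print(rotation_angles(10))
```

Fewer frames per rotation give a choppier animation but shorter render times, which matches the trade-off described in the help text.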
It is important that PyMol is running and that the following packages are installed in the environment:
- imageio
- tqdm
- pymol
Color legend for binary predictions:
- Blue = True Positives
- Red = False Positives
- Light Blue = False Negatives
- Grey = True Negatives
Color legend for the probability spectrum (-s):
- Cyan = low probability
- Red = high probability
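The two color schemes above can be illustrated with a short sketch. It assumes that a residue counts as predicted binding when its probability is at or above the cutoff, and that the spectrum is a linear blend from cyan to red. Both are assumptions, and the names `classify_residues`, `spectrum_rgb` and `BINARY_COLORS` are hypothetical. The color strings are plain labels and how bindViz.py itself assigns colors is not documented here.

```python
from typing import List, Set, Tuple

# Colors from the legend; the strings are illustrative labels only.
BINARY_COLORS = {"TP": "blue", "FP": "red", "FN": "light blue", "TN": "grey"}


def classify_residues(probs: List[float], true_binding: Set[int],
                      cutoff: float = 0.5) -> List[str]:
    """Label each residue (0-based index) as TP, FP, FN or TN."""
    labels = []
    for idx, prob in enumerate(probs):
        predicted = prob >= cutoff      # assumption: values at the cutoff count as binding
        actual = idx in true_binding    # known binding residue?
        if predicted and actual:
            labels.append("TP")
        elif predicted:
            labels.append("FP")
        elif actual:
            labels.append("FN")
        else:
            labels.append("TN")
    return labels


def spectrum_rgb(prob: float) -> Tuple[float, float, float]:
    """Blend linearly from cyan (low probability) to red (high probability)."""
    p = min(max(prob, 0.0), 1.0)        # clamp to [0, 1]
    return (p, 1.0 - p, 1.0 - p)


if __name__ == "__main__":
    labels = classify_residues([0.9, 0.7, 0.2, 0.1], true_binding={0, 2})
    print([BINARY_COLORS[label] for label in labels])
    print(spectrum_rgb(0.25))
```

Raising the cutoff with `-c` moves residues from the predicted-binding classes (TP, FP) to the predicted-non-binding classes (FN, TN).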
python3 bindAdjust.py -o <outdir> -p files/example_input/predictions -d files/example_input/distance_maps -C 15
bindAdjust.py requires the following packages:
- tqdm
- numpy
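To give an intuition for adjusting binding probabilities based on the distances between residues, here is a minimal sketch that replaces each residue's probability with the mean over all residues within a distance cutoff. This is only one plausible scheme and is not necessarily what bindAdjust.py implements. The function name `smooth_probabilities` and the `cutoff` parameter are hypothetical, and the meaning of the `-C` option is not documented in this README.

```python
from typing import List, Sequence


def smooth_probabilities(probs: Sequence[float],
                         dist: Sequence[Sequence[float]],
                         cutoff: float = 15.0) -> List[float]:
    """Average each residue's probability with its spatial neighbours.

    probs  : binding probabilities for one ligand type, one value per residue
    dist   : square matrix of pairwise residue distances
    cutoff : hypothetical distance threshold defining a neighbour
    """
    n = len(probs)
    adjusted = []
    for i in range(n):
        # A residue is always part of its own neighbourhood.
        neighbours = [probs[j] for j in range(n)
                      if j == i or dist[i][j] <= cutoff]
        adjusted.append(sum(neighbours) / len(neighbours))
    return adjusted


if __name__ == "__main__":
    probs = [1.0, 0.0, 0.0]
    dist = [[0, 5, 30],
            [5, 0, 30],
            [30, 30, 0]]
    print(smooth_probabilities(probs, dist, cutoff=15.0))
```

Under this scheme the same function would be applied separately to the small, metal and nuclear probabilities.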
usage: bindRefine.py [-h] -p PREDSDIR -o OUTDIR -d DISTANCEMAP
(-eps EPSILON | -k K) [-l LAYER]
This tool identifies a section of a protein with the highest average ligand
binding probability.
required arguments:
-p PREDSDIR, --predsdir PREDSDIR
directory containing predictions in specific format,
see sample file. If your predictions are not available
in this specific format. Please use the functions
directly. (default: None)
-o OUTDIR, --outdir OUTDIR
output directory (default: None)
-d DISTANCEMAP, --distancemap DISTANCEMAP
directory containing protein distance maps, see sample
file of distance map for required file structure and
name. (default: None)
-eps EPSILON, --epsilon EPSILON
set this value if you want to use epsilon mode of
bindRefine (default: None)
-k K, --k K set this value if you want to use k mode of bindRefine
(default: None)
optional arguments:
-l LAYER, --layer LAYER
index of distance map layer. Options: 0 N, 1 C-alpha,
2 C-beta and 3 backbone C distances (default: 3)
python3 bindRefine.py -o <outdir> -p files/example_input/predictions -d files/example_input/distance_maps -k 10
bindRefine.py requires the following packages:
- tqdm
- numpy
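The idea of finding the section with the highest average binding probability can be sketched as follows. The sketch assumes that in k mode a section is a residue plus its nearest neighbours (k residues in total), and that in epsilon mode it is every residue within distance epsilon of a center residue. These interpretations, and the function name `best_section`, are assumptions and may differ from bindRefine.py. The input `dist` stands for a single layer of the distance map that has already been selected.

```python
from typing import List, Optional, Sequence, Tuple


def best_section(probs: Sequence[float],
                 dist: Sequence[Sequence[float]],
                 k: Optional[int] = None,
                 eps: Optional[float] = None) -> Tuple[List[int], float]:
    """Return (residue indices, mean probability) of the best neighbourhood."""
    if (k is None) == (eps is None):
        raise ValueError("set exactly one of k or eps")
    if k is not None and k < 1:
        raise ValueError("k must be at least 1")

    n = len(probs)
    best_members: List[int] = []
    best_mean = -1.0
    for center in range(n):
        if k is not None:
            # k mode: the k residues closest to the center (center included).
            order = sorted(range(n), key=lambda j: dist[center][j])
            members = order[:k]
        else:
            # epsilon mode: all residues within eps of the center.
            members = [j for j in range(n)
                       if j == center or dist[center][j] <= eps]
        mean = sum(probs[j] for j in members) / len(members)
        if mean > best_mean:
            best_members, best_mean = sorted(members), mean
    return best_members, best_mean


if __name__ == "__main__":
    positions = [0, 1, 2, 10]  # toy 1D coordinates
    dist = [[abs(a - b) for b in positions] for a in positions]
    probs = [0.1, 0.9, 0.8, 0.2]
    print(best_section(probs, dist, k=2))
    print(best_section(probs, dist, eps=1.5))
```

Requiring exactly one of `k` and `eps` mirrors the mutually exclusive `-k` and `-eps` options in the usage text.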
If you want to credit us, feel free to cite:
Endres, L., Olenyi, T., Erckert, K., Weißenow, K., Rost, B., Littmann, M. Refining Embedding-Based Binding Predictions by Leveraging AlphaFold2 Structures.