Very often in structural biology it is necessary to determine the contacts between two molecules. Most of us in the group have written a snippet of code to compute precisely that, or have adapted the Biopython functionality or one of the tools in pdbtools. The Python script presented here is a Biopython-based variety that gives you the intermolecular contacts and also annotates the interface neighborhood. An example of the program output is given in the Figure below:
Download
You can download it from here. There are three files inside:
- GetInterfaces.py – the main source/runnable file
- README.txt – instructions, very similar to this post (quite a lot copy/pasted)
- 1A2Y.pdb – the PDB file used in the example to practice on.
Requirements:
You need Biopython (if you are from OPIG or any other bioinformatics group, it is most likely already installed on your machine). You can download it from here.
How to use it?
As a bare minimum, you need to provide the PDB file(s) and the chains between which you want to examine contacts.
Input options:
- --f1 : first PDB file [Required]
- --f2 : second PDB file (if the contacts are to be calculated within the same molecule, just submit the same PDB file in both cases) [Required]
- --c1 : chains to be used for the first molecule [Required]
- --c2 : chains to be used for the second molecule [Required]
- --c : contact cutoff for intermolecular contacts (Optional, set to 4.5 Å if not supplied on input)
- --i : interface neighbor cutoff for the intramolecular neighborhood of the contacting interface (Optional, set to 10.0 Å if not supplied on input). Set this option to zero (0.0) if you only want to get the intermolecular contacts in the interface, without the interface neighborhood.
- --jobid : name for the output folder (set to out_<random number> if not supplied on input)
An example which you can just copy, paste and run from the directory containing the Python script:
python GetInterfaces.py --f1 1A2Y.pdb --f2 1A2Y.pdb --c1 AB --c2 C --c 4.5 --i 10.0 --jobid example_output
The above command calculates the contacts between the antibody in 1A2Y (chains A and B) and the antigen (chain C). The contact distance is set to 4.5 Å and the interface neighborhood distance to 10 Å. All the output files are saved in the folder out_example_output.
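To make the two cutoffs concrete, here is a minimal sketch of the kind of calculation involved. It is not the code from GetInterfaces.py: the function names, the plain-dictionary input and the choice of all-atom distances with an inclusive cutoff are assumptions made for illustration, and the coordinates are toy values rather than anything taken from 1A2Y.

```python
import math

def min_distance(atoms_a, atoms_b):
    """Smallest distance between any atom of one residue and any atom of another."""
    return min(math.dist(a, b) for a in atoms_a for b in atoms_b)

def classify_residues(own, other, contact_cutoff=4.5, interface_cutoff=10.0):
    """Label residues of `own` as contacts ('C') or interface neighborhood ('I').

    `own` and `other` map (chain, residue id) -> list of (x, y, z) atom coordinates.
    Hypothetical helper: assumes all-atom distances and an inclusive cutoff (<=).
    """
    # Contacts: residues of `own` close to any residue of the OTHER molecule.
    contacts = [
        res for res, atoms in own.items()
        if any(min_distance(atoms, o) <= contact_cutoff for o in other.values())
    ]
    labels = {res: "C" for res in contacts}

    # Neighborhood: non-contact residues of the SAME molecule close to a contact.
    if interface_cutoff > 0:
        for res, atoms in own.items():
            if res in labels:
                continue
            if any(min_distance(atoms, own[c]) <= interface_cutoff for c in contacts):
                labels[res] = "I"
    return labels

# Toy coordinates, one atom per residue, purely for illustration.
antibody = {("A", 1): [(0.0, 0.0, 0.0)],
            ("A", 2): [(6.0, 0.0, 0.0)],
            ("A", 3): [(30.0, 0.0, 0.0)]}
antigen = {("C", 10): [(0.0, 0.0, 3.0)]}

print(classify_residues(antibody, antigen))
# {('A', 1): 'C', ('A', 2): 'I'}
```

Residue A1 lies 3 Å from the antigen, so it is a contact. A2 is too far from the antigen but within 10 Å of A1, so it falls in the neighborhood. A3 is far from everything and gets no label. Passing interface_cutoff=0.0 leaves only the contacts.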
Output
The output folder is placed in the current directory. If you specify the output folder name (--jobid), it will be saved under the name ‘out_[whateveryoutyped]’, otherwise it will be ‘out_[randomgeneratednumber]’. The program tells you at the end where it saved all the files.
Output files:
- molecule_1.pdb – the first supplied molecule with b-factor fields of contacts set to 100 and interface neighborhood set to 50
- molecule_2.pdb – the second supplied molecule with b-factor fields of contacts set to 100 and interface neighborhood set to 50
- molecule_1.txt – whitespace delimited file specifying the contacts and interface neighborhood in the first molecule in the format: [chain] [residue id] [contact ‘C’ or interface residues ‘I’]
- molecule_2.txt – whitespace delimited file specifying the contacts and interface neighborhood in the second molecule in the format: [chain] [residue id] [contact ‘C’ or interface residues ‘I’]
- molecule_1_constrained.pdb – the first molecule, which is constrained only to the residues in the interface.
- molecule_2_constrained.pdb – the second molecule, which is constrained only to the residues in the interface.
- parameters.txt – the contact distance and neighborhood distance used for the particular run.
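If you want to use the molecule_1.txt / molecule_2.txt files downstream, the three-column format described above is easy to read back in. The sketch below is only an illustration: the function name is made up, the sample lines are invented rather than real output for 1A2Y, and residue ids are kept as strings on the assumption that they may carry insertion codes.

```python
def parse_interface_file(lines):
    """Split '[chain] [residue id] [C|I]' lines into contacts and neighborhood.

    Hypothetical helper, not part of GetInterfaces.py. Lines that do not have
    exactly three whitespace-delimited fields are skipped.
    """
    contacts, neighborhood = [], []
    for line in lines:
        fields = line.split()
        if len(fields) != 3:
            continue
        chain, residue_id, label = fields
        if label == "C":
            contacts.append((chain, residue_id))
        elif label == "I":
            neighborhood.append((chain, residue_id))
    return contacts, neighborhood

# Invented sample lines in the documented format.
sample = ["A 32 C", "A 33 I", "", "B 100 C"]
contacts, neighborhood = parse_interface_file(sample)
print(contacts)
print(neighborhood)

# With a real run you would pass an open file instead, for example:
# with open("out_example_output/molecule_1.txt") as handle:
#     contacts, neighborhood = parse_interface_file(handle)
```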
Hello,
Nice code. It’s very useful. I have two questions. How does this method compare with others that do interface calculation, like FoldX or PDBePISA? And why do you call “contact residues” the residues that other programs call interface residues?
Thanks!
Hi,
I have not benchmarked it against any other programs, although I don’t think it will be faster than others. In any case, I did not envisage this program being used for running thousands of batch jobs over and over, so it is definitely not optimized for such uses. I use it for benchmarking protein-protein contact prediction methods, where I need to know what the native intermolecular contacts are. It is just a simple script which calculates residue distances and produces a colored structure that can be readily examined.
So, an explanation:
contact residue: residues of the two molecules that are within a certain distance of each other (the --c option). These are the RED residues on the right side of the Figure in the post. Thus these are only INTERMOLECULAR.
interface residue: residues in the neighborhood of the contact residues (the --i option). These are the GREEN residues on the right side of the Figure in the post. Thus these are only INTRAMOLECULAR.
The intuition for this is the following: contact residues physically interact with the other molecule, while the interface residues form the neighborhood of the contact residues (i.e. they might contribute to binding in some other way, for instance by providing a framework for the interface).
Hello,
Thanks for your answer, but I think that I wrote something wrong. I wanted to ask you why, in this script, the contact residues are the residues that other programs call interface residues. I wanted to know why such a difference exists. With your explanation my definition of interface residues has changed. Before this I thought that interface residues were the residues that interact directly with residues in the other chain.
Thanks, and sorry if I’m a little lost.
Sorry for the confusion. Many programs refer to interfaces differently, using different cutoffs etc. In most cases, what I call ‘contacts’ here is understood as the interface everywhere else. However, I was trying to distinguish between residues that actually engage in intermolecular contacts (here called simply ‘contacts’) and the residues that are not engaged in intermolecular contacts but are still close to those on the same molecule that do (here called the ‘interface neighborhood’). Residues in the neighborhood can still have an effect on the binding, for example through allostery.
For most applications, when you want to get the interface CONTACTS without the neighborhood, just run the program with the option --i 0. This will not add any intramolecular neighborhood to your results.
thanks,
Ok. Now I understand all.
Thanks.