Selectivity is an important trait to consider when designing small molecule probes for chemical biology. If you wish to use a small molecule to study a particular protein, but that small molecule is fairly promiscuous in its binding habits, there are risks that any effects you observe may be due to it binding other proteins with similarly shaped binding pockets, instead of your protein of interest.
In this situation, you may decide to investigate the so-called ‘bump and hole’ methodology. This technique, which has been successfully employed in a variety of circumstances (e.g. Science. 2014 October 31; 346(6209): 638–641. doi:10.1126/science.1249830.), involves redesigning a small molecule probe to include extra steric bulk – a ‘bump’ – and correspondingly introducing a mutation into the protein of interest that swaps a bulkier residue for a smaller one – a ‘hole’. The new ‘bump’ in the small molecule will prevent it from binding properly to its original targets, but the mutated – ‘holed’ – protein will still accommodate it.
Under the right circumstances, simple human inspection of a crystal structure containing the original protein and ligand combination will yield plenty of potential bump/hole combination ideas to try. If there is a sterically bulky side chain cosying up next to a portion of your molecule that looks like it could be relatively easy to chemically modify, you don’t necessarily need anything more systematic before starting the relevant synthesis and mutation in the lab, then testing in vitro.
But if no immediately obvious routes present themselves, taking a more methodical approach in silico in the first instance has the potential to shave many dead ends off any proposed lab work, and even to present routes that may not have been obvious by inspection alone.
Assuming there is a crystal structure of the small molecule bound to the protein of interest that can be used as a basis, docking can provide a high throughput way to test different combinations of ligand modifications and protein mutations. FoldX, a software suite by CRG, can be used to evaluate proposed mutations, providing energy readouts that give a good indication of the mutant protein’s stability relative to the parent. Once a 3D coordinate file for the mutant protein has been generated, the ‘bumped’ ligand can be docked into it and the predicted binding energy of the pair evaluated.
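As a rough illustration of the mutation step, here is a minimal Python sketch that writes out one mutation entry per alternative residue at a single position. It assumes the mutant-list format I understand FoldX's BuildModel command to accept (wild-type residue, chain, residue number, new residue, then a semicolon), so do check it against the FoldX manual for your version. The residue, chain and position used here are hypothetical placeholders.

```python
# Standard one-letter codes for the twenty proteinogenic amino acids.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def mutant_lines(wild_type: str, chain: str, position: int) -> list[str]:
    """Return one mutation entry per residue that differs from the wild type.

    Assumed entry format: <wild type><chain><position><new residue>;
    e.g. "LA94G;" for Leu94 on chain A mutated to Gly.
    """
    if len(wild_type) != 1 or wild_type not in AMINO_ACIDS:
        raise ValueError(f"unknown residue code: {wild_type!r}")
    return [
        f"{wild_type}{chain}{position}{new};"
        for new in AMINO_ACIDS
        if new != wild_type
    ]


if __name__ == "__main__":
    # Hypothetical example: every possible replacement for Leu94 on chain A.
    for line in mutant_lines("L", "A", 94):
        print(line)
```

The printed lines can be saved to a text file and handed to the mutation software; the wild type itself is deliberately left out of the list, since it is the starting structure.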
“But why bother redocking a small molecule whose binding conformation is, essentially, already known?” I hear you ask. A very good point, and this is why constrained docking is so useful for this particular scenario. GOLD, by CCDC, allows users to provide a ‘scaffold’ when docking, which will bias its docking results towards those where a certain portion of your molecule is held in a user-defined position. In this case, a conserved portion of the ligand – i.e., a section that is exactly the same in both the original ligand and the bumped ligand – can be used to hold that section of the bumped ligand in the correct position and conformation during docking. This ensures that the computational power is used effectively, by exploring conformations of the modified section of the ligand, rather than trying to accurately redock a section whose binding conformation is already known to us.
With this in mind, I find it most effective to build any ‘bumped’ ligand 3D coordinate files using the ligand from the original crystal structure as a template – e.g. using PyMOL’s ‘builder’ function. This will not only preserve the coordinates of the conserved section (which, in theory, doesn’t matter), but it will also preserve any ring conformations present (which docking software often has more trouble with). Contrast this with, for example, converting a SMILES string into a .mol2 file using RDKit, Open Babel, or otherwise: the conformation of the conserved section given by this may well be different to that seen in the crystal structure, and again it makes no sense for the docking software to attempt to recreate something for which we already have the answer.
Another feature that GOLD offers is residue flexibility. How extensively you use this feature – if at all – will depend on your specific system. But allowing the mutated residue’s side chain to explore multiple conformations along with the unconstrained part of the ligand will provide more insight than leaving it in the ‘default’ position assigned by FoldX.
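If you want a quick sanity check that the conserved section really has stayed put after editing, something like the following sketch would do. It is not part of any of the packages mentioned above: it simply compares coordinates of named atoms, and it assumes you have already parsed both ligand files into dictionaries of atom name to (x, y, z), with atom names kept consistent between the two. The atom names and coordinates in the example are made up.

```python
import math

Coord = tuple[float, float, float]


def conserved_rmsd(
    template: dict[str, Coord],
    bumped: dict[str, Coord],
    conserved: list[str],
) -> float:
    """RMSD over the named conserved atoms, with no superposition applied.

    A value of (near) zero means the conserved section of the bumped ligand
    still sits exactly where it does in the crystal structure.
    """
    if not conserved:
        raise ValueError("no conserved atoms given")
    squared = 0.0
    for name in conserved:
        squared += sum((a - b) ** 2 for a, b in zip(template[name], bumped[name]))
    return math.sqrt(squared / len(conserved))


if __name__ == "__main__":
    # Hypothetical coordinates: C9 is the new 'bump', C1 and C2 are conserved.
    crystal = {"C1": (0.0, 0.0, 0.0), "C2": (1.5, 0.0, 0.0)}
    edited = {"C1": (0.0, 0.0, 0.0), "C2": (1.5, 0.0, 0.0), "C9": (2.2, 1.3, 0.0)}
    print(conserved_rmsd(crystal, edited, ["C1", "C2"]))
```

Because no alignment is performed, a ligand rebuilt from SMILES would generally give a large value here even if its internal geometry were fine, which is exactly the situation worth catching before docking.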
Once GOLD has returned its best docking attempts for a set of ‘bumped’ ligands and ‘holed’ proteins, any suitable forcefield – such as the one built into AutoDock 4 – can be used to predict their binding energies, and inspection of the full grid of ligands vs proteins will highlight any exciting avenues to be explored in the lab. From my own personal experience, I would encourage anyone attempting this method to be exhaustive in their docking, pitching all twenty potential mutants for any given position against a set of all the chemical modifications you can think of/easily make 3D coordinate files for. Don’t limit your testing to the bumps and holes that you think will complement each other, but do the full exhaustive grid, as some predicted interactions with good binding energies may well end up being between combinations that wouldn’t have appeared intuitive.
Furthermore, whilst taking synthetic feasibility into account when proposing chemical modifications is obviously sensible, don’t be frightened of testing ideas in silico that might be hard to make in the lab – after all, it won’t add much extra time to your computational workflow. Besides, spending time making the right compound from the get-go, however tricky the synthesis, may end up proving quicker than sticking to easier syntheses but having to do loads of them because none of your compounds work properly.
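To make the 'full grid' idea concrete, here is a small Python sketch of how the results might be collated once you have a predicted binding energy for each protein/ligand pair. The function name, the labels and the numbers are all placeholders of mine rather than output from any real run, and the extra column (the energy difference relative to the wild-type protein for the same ligand) is simply one convenient way of spotting pairs where the bump is tolerated by the hole but not by the parent protein.

```python
from itertools import product


def rank_grid(proteins, ligands, energies, wild_type):
    """Rank every scored protein/ligand pair, most favourable energy first.

    energies maps (protein, ligand) -> predicted binding energy, where a
    lower (more negative) value is better. Each returned row is
    (energy, energy minus wild-type energy for that ligand, protein, ligand).
    Pairs without a score, or without a wild-type reference, are skipped.
    """
    rows = []
    for protein, ligand in product(proteins, ligands):
        energy = energies.get((protein, ligand))
        reference = energies.get((wild_type, ligand))
        if energy is None or reference is None:
            continue
        rows.append((energy, energy - reference, protein, ligand))
    return sorted(rows)


if __name__ == "__main__":
    # Placeholder labels and energies, purely for illustration.
    proteins = ["WT", "L94A", "L94G"]
    ligands = ["parent", "ethyl"]
    energies = {
        ("WT", "parent"): -9.0,
        ("WT", "ethyl"): -6.5,
        ("L94A", "parent"): -8.0,
        ("L94A", "ethyl"): -9.5,
        ("L94G", "ethyl"): -8.5,
    }
    for energy, delta, protein, ligand in rank_grid(proteins, ligands, energies, "WT"):
        print(f"{protein:>5} {ligand:<7} {energy:6.1f} {delta:+6.1f}")
```

Looping over the full product of proteins and ligands, rather than a hand-picked list of pairs, is what keeps the unintuitive combinations in view.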