gpgpu | Oxford Protein Informatics Group

As the clock speed of computer Central Processing Units (CPUs) began to plateau, their data and task parallelism was expanded to compensate. These days (2013) it is not uncommon to find upwards of a dozen processing cores on a single CPU, with each core capable of performing 8 calculations as a single operation. Graphics Processing Units (GPUs) were originally intended to assist CPUs by providing hardware optimised to speed up the rendering of highly parallel graphical data into a frame buffer. As graphical models became more complex, it became difficult to provide a single piece of hardware which implemented an optimised design for every model and every calculation the end user might desire. Instead, GPU designs evolved to be more readily programmable and to exhibit greater parallelism. Top-end GPUs are now equipped with over 2,500 simple cores and can be programmed using dedicated frameworks such as CUDA or OpenCL. This newfound programmability gave users the freedom to take non-graphics tasks which would otherwise have saturated a CPU for days and run them on the highly parallel hardware of the GPU. This technique proved so effective for certain tasks that GPU manufacturers have since begun to tweak their architectures to be suitable not just for graphics processing but also for more general purpose tasks, thus beginning the evolution of the General Purpose Graphics Processing Unit (GPGPU).

Improvements in data capture and model generation have caused an explosion in the amount of bioinformatic data which is now available, and this data is increasing in volume faster than CPUs are increasing in either speed or parallelism. An example of this can be found here, which displays a graph of the number of proteins stored in the Protein Data Bank per year. To process this vast volume of data, many of the common tools for structure prediction, sequence analysis, molecular dynamics and so forth have now been ported to the GPGPU. The following tools are now GPGPU enabled and offer significant speed-up compared to their CPU-based counterparts:

Application	Description	Expected Speed-Up or Throughput	Multi-GPU Support
Abalone	Models molecular dynamics of biopolymers for simulations of proteins, DNA and ligands	4-29x	No
ACEMD	GPU simulation of molecular mechanics force fields, implicit and explicit solvent	160 ns/day GPU version only	Yes
AMBER	Suite of programs to simulate molecular dynamics on biomolecules	89.44 ns/day JAC NVE	Yes
BarraCUDA	Sequence mapping software	6-10x	Yes
CUDASW++	Open source software for Smith-Waterman protein database searches on GPUs	10-50x	Yes
CUDA-BLASTP	Accelerates NCBI BLAST for scanning protein sequence databases	10x	Yes
CUSHAW	Parallelized short read aligner	10x	Yes
DL-POLY	Simulates macromolecules, polymers, ionic systems, etc. on a distributed memory parallel computer	4x	Yes
GPU-BLAST	Local search with fast k-tuple heuristic	3-4x	No
GROMACS	Simulation of biochemical molecules with complicated bond interactions	165 ns/day DHFR	No
GPU-HMMER	Parallelized local and global search with profile Hidden Markov models	60-100x	Yes
HOOMD-Blue	Particle dynamics package written from the ground up for GPUs	2x	Yes
LAMMPS	Classical molecular dynamics package	3-18x	Yes
mCUDA-MEME	Ultrafast scalable motif discovery algorithm based on MEME	4-10x	Yes
MUMmerGPU	An open-source high-throughput parallel pairwise local sequence alignment program	13x	No
NAMD	Designed for high-performance simulation of large molecular systems	6.44 ns/day STMV 585x 2050s	Yes
OpenMM	Library and application for molecular dynamics for HPC with GPUs	Implicit: 127-213 ns/day; Explicit: 18-55 ns/day DHFR	Yes
SeqNFind	A commercial GPU Accelerated Sequence Analysis Toolset	400x	Yes
TeraChem	A general purpose quantum chemistry package	7-50x	Yes
UGENE	Open-source Smith-Waterman for SSE/CUDA, suffix array based repeats finder and dotplot	6-8x	Yes
WideLM	Fits numerous linear models to a fixed design and response	150x	Yes

It is important to note, however, that because GPGPUs handle floating point arithmetic differently from CPUs, results can and will differ between architectures, making an exact, bit-for-bit comparison impossible. Instead, interval arithmetic may be useful to sanity-check that the results generated on the GPU are consistent with those from a CPU-based system.
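To make that last point a little more concrete, here is a minimal Python sketch (it needs Python 3.9+ for `math.nextafter`). It is only an illustration under stated assumptions, not how any of the tools above validate their output: the left-to-right sum stands in for a simple CPU loop, the tree-style sum stands in for a parallel GPU reduction, and every function name is made up for this example. Each interval is widened outward by one floating-point step after every addition, so it is guaranteed to contain the exact result; two correct computations of the same quantity must therefore produce overlapping intervals.

```python
import math


def widen(lo, hi):
    """Push both bounds outward by one floating-point step to absorb rounding."""
    return math.nextafter(lo, -math.inf), math.nextafter(hi, math.inf)


def interval_add(a, b):
    """Add two (lo, hi) intervals; the result encloses the exact sum."""
    return widen(a[0] + b[0], a[1] + b[1])


def sequential_sum(values):
    """Left-to-right accumulation (assumed stand-in for a CPU loop)."""
    total = 0.0
    for v in values:
        total += v
    return total


def pairwise_sum(values):
    """Tree-style reduction (assumed stand-in for a parallel GPU reduction)."""
    if len(values) == 1:
        return values[0]
    mid = len(values) // 2
    return pairwise_sum(values[:mid]) + pairwise_sum(values[mid:])


def sequential_interval(values):
    """Same order as sequential_sum, but carrying guaranteed bounds."""
    acc = (0.0, 0.0)
    for v in values:
        acc = interval_add(acc, (v, v))
    return acc


def pairwise_interval(values):
    """Same order as pairwise_sum, but carrying guaranteed bounds."""
    if len(values) == 1:
        return (values[0], values[0])
    mid = len(values) // 2
    return interval_add(pairwise_interval(values[:mid]),
                        pairwise_interval(values[mid:]))


def overlaps(a, b):
    """True if the two intervals share at least one point."""
    return a[0] <= b[1] and b[0] <= a[1]


values = [0.1, 0.2, 0.3]
cpu_like = sequential_sum(values)
gpu_like = pairwise_sum(values)
cpu_box = sequential_interval(values)
gpu_box = pairwise_interval(values)

print("point results identical:", cpu_like == gpu_like)
print("intervals consistent:  ", overlaps(cpu_box, gpu_box))
```

For this input the two point results differ in the last bit, purely because the additions are grouped differently, yet the two intervals still overlap. If the intervals from a CPU run and a GPU run did not overlap, that would suggest a genuine bug rather than ordinary rounding noise.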

GPGPUs for bioinformatics