Today sees the start of the Computational Biology and Innovation PhD Symposium at University College Dublin. Matthias Gerstgrasser will be giving a presentation in tomorrow’s (Wednesday’s) session.
Title and abstract are:
Parallelising Sequential Metropolis-Hastings: Implementing MCMC in multi-core and GPGPU environments.
Markov Chain Monte Carlo (MCMC) techniques have become popular in recent years as an efficient way to sample from complex posterior distributions in Bayesian statistics. In computational biology these methods have a wide range of applications, and they lend themselves in particular to parameter estimation in models of complex biological systems. The Metropolis-Hastings (MH) algorithm is one widely used routine in this context. (1)
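For readers unfamiliar with the routine, the classical single-threaded MH loop the abstract refers to can be sketched as follows. This is a generic illustration, not the authors' implementation; the function names and the standard-normal example target are my own choices.

```python
import math
import random

def metropolis_hastings(log_target, initial, n_samples, proposal_sd=1.0, seed=0):
    """Minimal random-walk Metropolis-Hastings sampler (illustrative sketch).

    log_target: log of the (unnormalised) target density, e.g. a posterior.
    """
    rng = random.Random(seed)
    x = initial
    samples = []
    for _ in range(n_samples):
        # Propose a new state from a symmetric Gaussian random walk.
        proposal = x + rng.gauss(0.0, proposal_sd)
        # The acceptance probability uses only the ratio of target densities,
        # so the normalising constant of the posterior is never needed.
        log_alpha = log_target(proposal) - log_target(x)
        if math.log(rng.random()) < log_alpha:
            x = proposal
        samples.append(x)
    return samples

# Example: sample from a standard normal target.
draws = metropolis_hastings(lambda x: -0.5 * x * x, initial=0.0, n_samples=5000)
```

The sequential dependence is visible here: each iteration reads the state `x` produced by the previous one, which is what makes a naive parallelisation of the loop impossible.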
Our research focuses on employing the computational power provided by multi-core CPUs and general-purpose graphics processing units (GPGPUs) to speed up this algorithm. Both multi-core and GPGPU architectures offer vast computing power compared to traditional single-core environments, but tapping into these resources presents additional complexities. Yet current computer systems rely increasingly on higher core counts rather than on per-core performance to deliver improvements in computing power, a trend that is almost certain to continue. While (2) provides a GPGPU algorithm applicable to Independent Metropolis-Hastings (IMH), a parallel implementation of general MH instances has proven difficult due to the inherently sequential nature of the algorithm. In our own research, we are investigating possible speedups in automated model fitting and parameter estimation in large phenotype arrays of brewer’s yeast and other microorganisms. Our findings, however, would be equally applicable to other problems in systems biology.
We show how, for some types of target distributions, we can leverage independence in the structure of these distributions to partially parallelise the running of the MH algorithm. We further discuss how this approach can be implemented efficiently on multi-core CPUs as well as in GPGPU environments. In both cases we divide the workload of computing the acceptance probability in the MH algorithm’s main loop among several threads. We also replicate the remaining instructions of the loop across these threads in order to minimise the overhead incurred by thread creation, synchronisation and deletion. More importantly, in GPGPU environments this modification greatly reduces data transfers between the GPU and main memory. Both our implementations show a significant speedup over a single-threaded classical MH algorithm for computationally expensive target distributions. We discuss the limitations of these implementations and the conditions necessary for them to provide a measurable speedup over single-threaded implementations.
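The idea of splitting the acceptance-probability computation can be sketched as follows, assuming a target whose log-density factorises over independent data blocks (a Gaussian likelihood here, purely for illustration). This is my own sketch under those assumptions, not the authors' code; it uses Python threads to show the structure, whereas a real implementation would use native threads or GPU kernels for the expensive per-block evaluations.

```python
import math
import random
from concurrent.futures import ThreadPoolExecutor

def log_block_likelihood(params, block):
    """Log-likelihood of one independent data block (hypothetical Gaussian model)."""
    mu = params
    return sum(-0.5 * (y - mu) ** 2 for y in block)

def parallel_log_target(params, blocks, pool):
    # Each worker evaluates one independent factor of the target;
    # the partial results are summed to give the full log-density.
    partials = pool.map(lambda b: log_block_likelihood(params, b), blocks)
    return sum(partials)

def parallel_mh(blocks, initial, n_samples, proposal_sd=0.5, n_workers=4, seed=0):
    """MH sampler whose acceptance-probability computation is split across workers."""
    rng = random.Random(seed)
    x = initial
    samples = []
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        # Cache the current state's log-density so each iteration
        # only needs one parallel evaluation (for the proposal).
        log_px = parallel_log_target(x, blocks, pool)
        for _ in range(n_samples):
            proposal = x + rng.gauss(0.0, proposal_sd)
            log_pp = parallel_log_target(proposal, blocks, pool)
            if math.log(rng.random()) < log_pp - log_px:
                x, log_px = proposal, log_pp
            samples.append(x)
    return samples

# Example: six observations split into three independent blocks.
blocks = [[1.9, 2.1], [2.0, 2.2], [1.8, 2.0]]
draws = parallel_mh(blocks, initial=0.0, n_samples=3000)
```

Note that only the density evaluation is parallelised; the propose/accept decision remains a single sequential chain, which is why the abstract describes the parallelisation as partial.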
In conclusion, we compare the performance of parallelising a single instance of the MH algorithm with that of running several instances in parallel, on either a multi-core CPU or in a GPGPU environment. The latter approach is particularly suited to the common situation of estimating, for example, parameters from a number of distinct, but similar, experiments. We show how GPGPU computing can be used in these situations to provide an even greater speedup compared to single-threaded implementations.
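The alternative strategy of running independent chains, one per experiment, can be sketched as below. Again this is an illustrative outline with invented data, not the authors' implementation; on a GPU each chain would map to its own group of threads, while here a thread per chain stands in for that structure.

```python
import math
import random
from concurrent.futures import ThreadPoolExecutor

def run_chain(data, n_samples=2000, proposal_sd=0.5, seed=0):
    """Random-walk MH for the mean of one experiment (Gaussian model, flat prior).

    Returns the posterior-mean estimate after discarding burn-in.
    """
    rng = random.Random(seed)
    log_target = lambda mu: sum(-0.5 * (y - mu) ** 2 for y in data)
    x = 0.0
    samples = []
    for _ in range(n_samples):
        proposal = x + rng.gauss(0.0, proposal_sd)
        if math.log(rng.random()) < log_target(proposal) - log_target(x):
            x = proposal
        samples.append(x)
    return sum(samples[500:]) / len(samples[500:])

# Three distinct but similar experiments, each fitted by its own chain.
experiments = [[0.9, 1.1, 1.0], [2.8, 3.1, 3.0], [4.9, 5.2, 5.0]]
with ThreadPoolExecutor() as pool:
    estimates = list(pool.map(run_chain, experiments))
```

Because the chains never communicate, this scheme parallelises perfectly across experiments, in contrast to the partial within-chain parallelisation sketched earlier.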
1. Wilkinson, D. J. Stochastic Modelling for Systems Biology, 2006.
2. Jacob, P., Robert, C. P., Smith, M. H. 2011; arXiv:1010.1595v3.