# Gaussian process regression for ABC inference

**Supervisor:** David Balding

**Available for:** PhD only

**Location:** Melbourne Integrative Genomics, University of Melbourne

**Project title:** Gaussian process regression for ABC inference

**Description:** ABC, or Approximate Bayesian Computation, has revolutionised statistical inference, initially in population genetics but later spreading to other application areas, by allowing principled albeit approximate statistical inference under complex models that can have large numbers of latent (usually nuisance) variables. The key idea is very simple: (1) simulate data under the model; (2) if the simulated data are sufficiently similar to the observed data, then retain the parameters of interest from the simulation; (3) repeat until the set of retained values is large enough for accurate inferences.
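The three steps can be sketched in a few lines of Python. This is only a toy illustration under assumptions that are not part of the project description: the model is Normal(mu, 1) with unknown mean, the prior on mu is Uniform(-5, 5), similarity is measured by the distance between sample means, and the names `rejection_abc`, `n_accept` and `tolerance` are hypothetical.

```python
import random
import statistics


def rejection_abc(observed, n_accept=300, tolerance=0.1, seed=0):
    """Toy rejection ABC for the mean of a Normal(mu, 1) model.

    Assumptions (illustrative only): Uniform(-5, 5) prior on mu and the
    sample mean as the single summary statistic.
    """
    rng = random.Random(seed)
    n = len(observed)
    s_obs = statistics.fmean(observed)  # summary of the observed data
    accepted = []
    # (3) repeat until enough parameter values have been retained
    while len(accepted) < n_accept:
        mu = rng.uniform(-5.0, 5.0)  # draw the parameter from the prior
        # (1) simulate a dataset under the model
        simulated = [rng.gauss(mu, 1.0) for _ in range(n)]
        # (2) retain mu if simulated and observed summaries are close
        if abs(statistics.fmean(simulated) - s_obs) < tolerance:
            accepted.append(mu)
    return accepted


# Synthetic "observed" data generated with true mean 2.0
data_rng = random.Random(1)
observed = [data_rng.gauss(2.0, 1.0) for _ in range(20)]

posterior_sample = rejection_abc(observed)
print(len(posterior_sample), statistics.fmean(posterior_sample))
```

The retained values approximate a sample from the posterior distribution of mu. A smaller `tolerance` gives a better approximation at the cost of more simulations per retained value.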

There are many variations on this basic algorithm; for example, retained values can be weighted rather than simply accepted or rejected. In particular, there are many ways to quantify the similarity between observed and simulated datasets, often using summary statistics. My colleagues and I previously proposed a regression approach in which we viewed a parameter that is the target of inference as the dependent variable in a regression, with each simulated dataset a realisation of a high-dimensional predictor variable. The simulations provide the training data from which a regression model can be fitted, and the task is then to use the fitted model to predict the unobserved parameter value corresponding to the observed dataset.
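As a rough illustration of the regression view, the sketch below fits a kernel-weighted linear regression of the simulated parameter on a single summary statistic and reads off the prediction at the observed summary. Everything specific here is an assumption made for the example: the toy Normal(theta, 1) model, the Uniform(-5, 5) prior, the one-dimensional summary (in place of a high-dimensional predictor), the Epanechnikov kernel, the bandwidth, and the hypothetical function name `local_linear_posterior_mean`.

```python
import random
import statistics


def local_linear_posterior_mean(thetas, summaries, s_obs, bandwidth):
    """Weighted least-squares fit of theta on (s - s_obs).

    The intercept estimates E[theta | s = s_obs]. Weights come from an
    Epanechnikov kernel, so simulations far from s_obs are ignored.
    """
    w, x, y = [], [], []
    for theta, s in zip(thetas, summaries):
        u = (s - s_obs) / bandwidth
        if abs(u) < 1.0:
            w.append(1.0 - u * u)  # kernel weight
            x.append(s - s_obs)    # predictor centred at the observed summary
            y.append(theta)        # dependent variable: the parameter
    sw = sum(w)
    xbar = sum(wi * xi for wi, xi in zip(w, x)) / sw
    ybar = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxx = sum(wi * (xi - xbar) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - xbar) * (yi - ybar) for wi, xi, yi in zip(w, x, y))
    slope = sxy / sxx
    intercept = ybar - slope * xbar
    return intercept, slope


# Training data: parameters drawn from the prior and simulated summaries
rng = random.Random(0)
s_obs = 2.0  # summary statistic of a hypothetical observed dataset
thetas, summaries = [], []
for _ in range(5000):
    theta = rng.uniform(-5.0, 5.0)
    sim = [rng.gauss(theta, 1.0) for _ in range(20)]
    thetas.append(theta)
    summaries.append(statistics.fmean(sim))

post_mean, slope = local_linear_posterior_mean(thetas, summaries, s_obs, bandwidth=0.5)
print(post_mean, slope)
```

Because the fit uses every simulation near the observed summary rather than only exact or near-exact matches, it can tolerate a wider acceptance window than plain rejection.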

We originally fitted a local-linear regression for the posterior mean of the parameter. In this project we will investigate more sophisticated modelling approaches, including Gaussian process regression (GPR) models in the parameter space. Note that GPR has been used in the data space, for example in synthetic regression, but we will investigate ways to model the posterior distribution of the parameter using Gaussian processes, which are a convenient class of models for which efficient software is already available. If successful, this project could generate another major step forward in statistical inference under complex models in many application areas.
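To give a flavour of what a Gaussian process regression of the parameter on a summary statistic might look like, here is a minimal NumPy sketch. It is one possible reading of the idea, not the method the project will develop. The toy Normal(theta, 1) model, the Uniform(-5, 5) prior, the squared-exponential kernel, the fixed hyperparameters (`length_scale`, `signal_var`, `noise_var`) and the function names are all assumptions made for illustration; in practice the hyperparameters would be estimated, and a Gaussian predictive distribution may be a poor approximation to the true posterior.

```python
import numpy as np


def rbf_kernel(a, b, length_scale, signal_var):
    """Squared-exponential covariance between the rows of a and the rows of b."""
    sq_dist = (
        np.sum(a**2, axis=1)[:, None]
        + np.sum(b**2, axis=1)[None, :]
        - 2.0 * a @ b.T
    )
    return signal_var * np.exp(-0.5 * np.maximum(sq_dist, 0.0) / length_scale**2)


def gp_predict(s_train, theta_train, s_new,
               length_scale=2.0, signal_var=10.0, noise_var=0.05):
    """Zero-mean GP regression of theta on s with fixed hyperparameters.

    Returns the predictive mean and variance of theta at each row of s_new.
    """
    k_tt = rbf_kernel(s_train, s_train, length_scale, signal_var)
    k_nt = rbf_kernel(s_new, s_train, length_scale, signal_var)
    chol = np.linalg.cholesky(k_tt + noise_var * np.eye(len(s_train)))
    alpha = np.linalg.solve(chol.T, np.linalg.solve(chol, theta_train))
    mean = k_nt @ alpha
    v = np.linalg.solve(chol, k_nt.T)
    # latent-function variance plus noise variance
    var = signal_var - np.sum(v**2, axis=0) + noise_var
    return mean, var


# Training data: parameters from the prior, each with a simulated summary
rng = np.random.default_rng(0)
n_sims, n_obs = 400, 20
theta_train = rng.uniform(-5.0, 5.0, size=n_sims)
sims = rng.normal(theta_train[:, None], 1.0, size=(n_sims, n_obs))
s_train = sims.mean(axis=1, keepdims=True)  # shape (n_sims, 1)

s_obs = np.array([[2.0]])  # summary of a hypothetical observed dataset
mean, var = gp_predict(s_train, theta_train, s_obs)
print(mean[0], var[0])
```

The predictive mean plays the role of the posterior mean estimate from the local-linear fit, while the predictive variance gives a first, Gaussian approximation to posterior uncertainty.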