## Building Intelligent Probabilistic Systems

In the HIPS group, we are interested in building intelligent algorithms. What makes a system intelligent? Our philosophy is that "intelligence" means making decisions under uncertainty, adapting to experience, and discovering structure in high-dimensional noisy data. The unifying theme for research in these areas is developing new approaches to statistical inference: uncovering the coherent structure that we cannot directly observe and using it for exploration and to make decisions or predictions. We develop new models for data, new tools for performing inference, and new computational structures for representing knowledge and uncertainty.

A perpetual challenge in statistical modelling is trying to find the *parsimonious complexity* in the data: balancing simplicity in our explanations of the world with the flexibility required to capture the rich variation that occurs in real data. One remarkable class of mathematical tools for balancing these extremes is Bayesian nonparametric models, which enable one to specify an infinite-dimensional model while still manipulating it tractably on a finite computer. Such models mean that our explanations for the world can grow in complexity precisely to the extent that the data allow it.

Modern machine learning methods have proved remarkably successful at inferring statistical structure from data, something that any intelligent system must be able to do. However, there is a disconnect between how our algorithms are represented in computer hardware and what we understand about the hardware of natural neural systems. In particular, we are still trying to understand how action potentials (neural spikes) can be used to implement adaptive computation. An ongoing project in the HIPS group is to try to formalize such computation in terms of powerful statistical objects called *point processes*.

Many of the empirical successes of machine learning can be characterized as "simple discrimination functions applied to complex representations". The question is: how do we automatically find these representations? In probabilistic modelling, we view this as a problem of finding *latent variables*, which provide a simpler and often lower-dimensional representation of our high-dimensional data. In the HIPS group, we are constantly developing new ways to construct these kinds of models and apply them in different domains.

Powerful mathematical models and representations are only useful if we can perform the computation necessary to manipulate them. In the context of intelligent probabilistic systems, this can often be viewed as the problem of performing statistical inference. We are interested in building new computational tools that enable this inference, most often by developing new Monte Carlo methods, with potential impact not only within computer science and statistics but also across the broader sciences, such as biology and physics.
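
To give a flavour of what a Monte Carlo method for inference looks like, here is a minimal sketch of a random-walk Metropolis sampler. It is a textbook algorithm shown for illustration only, not one of the methods developed in the group. The names `log_target` and `metropolis`, the standard normal target, the step size, and the seed are all assumptions made for this example.

```python
import math
import random


def log_target(x):
    """Unnormalized log-density of a standard normal (illustrative target)."""
    return -0.5 * x * x


def metropolis(log_density, n_samples, step=1.0, x0=0.0, seed=0):
    """Random-walk Metropolis sampler for a one-dimensional density.

    Only the unnormalized log-density is needed, which is why this family
    of methods is useful for posteriors with unknown normalizing constants.
    """
    rng = random.Random(seed)
    x = x0
    log_p = log_density(x)
    samples = []
    for _ in range(n_samples):
        # Propose a Gaussian perturbation of the current state.
        proposal = x + rng.gauss(0.0, step)
        log_p_new = log_density(proposal)
        # Accept with probability min(1, p(proposal) / p(x)).
        accept_prob = math.exp(min(0.0, log_p_new - log_p))
        if rng.random() < accept_prob:
            x, log_p = proposal, log_p_new
        # On rejection the chain repeats the current state.
        samples.append(x)
    return samples


samples = metropolis(log_target, n_samples=20000)
mean = sum(samples) / len(samples)
var = sum((s - mean) ** 2 for s in samples) / len(samples)
print(f"sample mean = {mean:.3f}, sample variance = {var:.3f}")
```

For this target the sample mean and variance should come out close to 0 and 1. In a real model, `log_target` would be replaced by the log of the prior times the likelihood.
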

## Recent News

## "Firefly Monte Carlo" Wins Best Paper at UAI

Dougal Maclaurin's paper "Firefly Monte Carlo: Exact MCMC with Subsets of Data" has won the Microsoft Best Paper Award at this year's Conference on Uncertainty in Artificial Intelligence (UAI). Congrats Dougal!

## Five Papers at ICML 2014

The HIPS group co-authored five papers to appear at this year's International Conference on Machine Learning (ICML).

## Netflix Using Spearmint for Bayesian Optimization

As reported by Wired magazine and on the Netflix tech blog, Netflix has been experimenting with deep learning tools for making recommendations. Moreover, they've been using our software Spearmint to set the hyperparameters with a cluster of machines on Amazon EC2.

## Recent Publications

Firefly Monte Carlo: Exact MCMC with Subsets of Data.
Thirtieth Conference on Uncertainty in Artificial Intelligence (UAI). 2014. { arXiv:1403.5693 [stat.ML] | PDF }

Accelerating MCMC via Parallel Predictive Prefetching.
Thirtieth Conference on Uncertainty in Artificial Intelligence (UAI). 2014. { arXiv:1403.7265 [stat.ML] | PDF | Code }

Bayesian Optimization with Unknown Constraints.
Thirtieth Conference on Uncertainty in Artificial Intelligence (UAI). 2014. { arXiv:1403.5607 [stat.ML] | PDF }

A Physiological Time Series Dynamics-Based Approach to Patient Monitoring and Outcome Prediction.
IEEE Journal of Biomedical and Health Informatics. 2014. { PDF }

Input Warping for Bayesian Optimization of Non-Stationary Functions.
Thirty-First International Conference on Machine Learning (ICML). 2014. { arXiv:1402.0929 [stat.ML] | PDF }