Fourth Workshop on

BAYESIAN INFERENCE IN STOCHASTIC PROCESSES

Villa Monastero, Varenna (LC), Italy

June 2-4, 2005

TALKS



Matthew J. Beal and Yee Whye Teh

Efficient Sampling Strategies for the Hierarchical Dirichlet Process: with Application to the Infinite Hidden Markov Model and its Variants

We consider time series data modelling using Hidden Markov Models having an a priori unknown number of hidden states. We show that the Infinite Hidden Markov Model [1] can be recast in the framework of Hierarchical Dirichlet Processes (HDPs). The HDP framework [2] considers problems involving related groups of data: each (fixed) group of data is modelled by a DP mixture model, with the common base measure of the DPs being itself distributed according to a global DP. The base measure being discrete w.p.1 ensures that the group DPs share atoms (despite being countably infinite). [2] presents two sampling schemes for posterior inference in the HDP.
We cast sequential data in the grouped data framework by assigning observations to groups, where the groups are indexed by the value of the previous state variable in the sequence; then the current state and its emission distributions define a group-specific mixture model. Thus the hidden state sequence implicitly defines a partition into groups, and induces constraints in the posterior that make the sampling methods proposed in [2] quite inefficient. We present novel effective MCMC methods to overcome this problem in the iHMM and its variants, and present results in text and bioinformatics domains.
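
As a rough illustration of how a shared discrete base measure ties the groups together, the sketch below draws truncated stick-breaking weights for the global DP and then one transition row per previous state, each row being a Dirichlet draw centred on the global weights. The truncation level K, the concentration values and all function names are illustrative assumptions. This simulates from a truncated prior only; it is neither one of the samplers of [2] nor the MCMC methods of the talk.

```python
import random

def stick_breaking(gamma, K, rng):
    """Truncated stick-breaking weights with concentration gamma (K values summing to 1)."""
    weights, remaining = [], 1.0
    for _ in range(K - 1):
        v = rng.betavariate(1.0, gamma)
        weights.append(remaining * v)
        remaining *= 1.0 - v
    weights.append(remaining)  # leftover mass goes to the last atom (truncation)
    return weights

def dirichlet(params, rng):
    """Dirichlet draw obtained by normalising independent gamma variates."""
    # The floor keeps every gamma shape strictly positive (an implementation safeguard).
    draws = [rng.gammavariate(max(a, 1e-8), 1.0) for a in params]
    total = sum(draws)
    return [d / total for d in draws]

def sample_transition_matrix(gamma, alpha, K, rng):
    """Global weights beta, then one row per previous state, all sharing the same K atoms."""
    beta = stick_breaking(gamma, K, rng)
    rows = [dirichlet([alpha * b for b in beta], rng) for _ in range(K)]
    return beta, rows
```

Each row plays the role of one group in the grouped-data view: the group is indexed by the previous state, and all rows put mass on the same atoms because they share the weights `beta`.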

[1] Beal, M.J., Ghahramani, Z. and Rasmussen, C.E. (2001). The Infinite Hidden Markov Model. In Advances in Neural Information Processing Systems 14:577-584, eds. T. Dietterich, S. Becker, Z. Ghahramani, MIT Press, 2002.

[2] Teh, Y.W., Jordan, M.I., Beal, M.J. and Blei, D.M. Hierarchical Dirichlet Processes. To appear in Advances in Neural Information Processing Systems 17, MIT Press, 2005. For more details see Technical Report 653, UC Berkeley Statistics, 2004. Also submitted to JASA.

Alexandros Beskos

Exact inference for diffusion processes using MCMC

We present a new methodology for parametric inference about diffusions given a set of discrete time observations. The method is based on an algorithm for the exact simulation of the stochastic model and involves imputation of the missing paths among the data. The exact simulation algorithm allows for carrying out a simple Gibbs step when sampling from the full conditional of the missing paths without the necessity of any kind of approximation.

Paul G. Blackwell and K. J. Harris

Bayesian radio-tracking: inference for diffusions in heterogeneous environments

In ecology and zoology, the radio tracking of animals is an important source of data on movement, behaviour and habitat use. Dependence between successive observations on an individual needs to be allowed for in any statistical analysis; since observations are not necessarily made at equal time intervals, a natural approach is to regard the true underlying movement process as a diffusion.
In this talk, we will discuss a range of fully-parametric diffusion models that capture key features of realistic patterns of animal movement. These include the two-dimensional Ornstein-Uhlenbeck process (Dunn and Gipson, 1977, Biometrics 33:85-101), diffusions driven by an underlying Markov chain representing behaviour (Blackwell 2003, Biometrika 90:613-627) and diffusions with different properties in different discrete regions of space (Blackwell and Harris, work in progress); the latter is an extension to two (or more) dimensions of continuous-time threshold auto-regressive processes.
We will describe and illustrate techniques for carrying out fully Bayesian inference for these models, using a Markov Chain Monte Carlo approach.

Paolo Bulla and Pietro Muliere

Bayesian Nonparametric Estimation for Reinforced Markov Renewal Processes

Starting from the definitions and the properties of reinforced renewal processes and reinforced Markov renewal processes, we characterize, via exchangeability and de Finetti's representation theorem, a prior that consists of a family of Dirichlet distributions on the space of Markov transition matrices and beta-Stacy processes on distribution functions. Then, we show that this family is conjugate and give some estimation results.

Pier Luigi Conti

Bayesian nonparametric inference on solutions of renewal equations

Renewal processes play an important role as models of stochastic systems. Generally speaking, many important characteristics associated with renewal processes (e.g., the "standard" renewal function, the distribution function of the latest/first renewal before/after time t, etc.) are obtained as solutions of a general integral equation, the so-called "renewal equation". To be concrete, let $(X_i;\ i \geq 1)$ be a sequence of r.v.s ("inter-arrival times") i.i.d. conditionally on their d.f. F. A general renewal equation is an integral equation of the form

$\displaystyle M(t) = m(t) + \int_{-\infty}^{+\infty} M(t - x)\, dF(x)$

m( . ) being an appropriate function depending on F. As a prior law for F, we assume first a Dirichlet process. Under appropriate regularity conditions, consistency of the posterior law of M( . ) is shown. Then, the weak convergence of the posterior of M( . ), when appropriately rescaled, to a Gaussian process is proved. Due to the possible unboundedness of M( . ), this point is discussed in some detail. Approximations to the actual posterior law of M( . ) are also discussed. Finally, some results are extended to more general prior laws.
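
For intuition about the object being estimated, here is a minimal numerical sketch that solves a renewal equation on a grid when F is known. It assumes non-negative inter-arrival times, so the integral reduces to the interval [0, t], and it uses a simple right-endpoint Riemann-Stieltjes sum; the function name and the grid settings are illustrative. It involves no prior or posterior and is not part of the results of the talk.

```python
import math

def solve_renewal(F, m, t_max, n_steps):
    """Approximate M(t) = m(t) + int_0^t M(t - x) dF(x) on a regular grid.

    Assumes F(0) = 0 and F supported on [0, infinity).
    """
    h = t_max / n_steps
    grid = [k * h for k in range(n_steps + 1)]
    # dF[j] is the mass that F assigns to the interval (grid[j-1], grid[j]].
    dF = [0.0] + [F(grid[j]) - F(grid[j - 1]) for j in range(1, n_steps + 1)]
    M = [0.0] * (n_steps + 1)
    for k in range(n_steps + 1):
        # Only earlier values of M are needed, so the recursion is explicit.
        conv = sum(M[k - j] * dF[j] for j in range(1, k + 1))
        M[k] = m(grid[k]) + conv
    return grid, M

def exp_cdf(t, rate=1.0):
    """Distribution function of an exponential inter-arrival time."""
    return 1.0 - math.exp(-rate * t) if t > 0 else 0.0
```

Taking m = F gives the standard renewal function. With exponential inter-arrival times of rate 1 the exact answer is M(t) = t, so `solve_renewal(exp_cdf, exp_cdf, 5.0, 500)` should return a final value close to 5, slightly below it because of the discretisation.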

Lehel Csato and Manfred Opper

Efficient Gaussian Process Inference

Gaussian processes (GP) have gained popularity among researchers in the machine learning community. Despite their theoretical simplicity and generality, the implementation of GP inference usually suffers from (1) intractability of the Bayesian posterior or the predictive distribution for all but the simplest likelihood models and (2) a computation time that is cubic in the size of the data set.
In the presentation, the proposed solution to the problem of intractability is a sequential approximation of the intractable posterior with a GP. The same approximation allows for an estimation of the (not necessarily Gaussian) predictive distribution. With the approximation technique we estimate the marginal data likelihood, or evidence, which in turn can be used to adjust the parameters of the kernel function of the GP.
The prohibitive scaling of the computation time is addressed by projecting the GP into a low-dimensional subspace; this projection makes the computational time cubic with respect to the size of the basis or anchor set, making GP inference applicable to large data sets.

1. Cornford D, Csato L, Evans D, Opper M. (2004): Bayesian analysis of the Scatterometer Wind Retrieval Inverse Problem: Some New Approaches, Journal of the Royal Statistical Society B, Vol. 66, pp. 1--17.
2. Csato L, Opper M (2002): Sparse Gaussian Processes, Neural Computation, 14/3, The MIT Press.
3. Csato L (2002): Gaussian Processes -- Iterative sparse approximations, Ph.D thesis, Neural Computing Research Group, Aston University.
4. Csato L (2004): Sparse Gaussian Processes Matlab Toolbox www.tuebingen.mpg.de/~csatol/OGP

Alan Gelfand

Multivariate Spatial Process Modelling

Models for the analysis of multivariate spatial data are receiving increased attention these days. In many applications it will be preferable to work with multivariate spatial processes to specify such models. A critical specification in providing these models is the cross covariance function. Constructive approaches for developing valid cross-covariance functions offer the most practical strategy for doing this. These approaches include separability, kernel convolution or moving average methods, and convolution of covariance functions. We review these approaches but take as our main focus the computationally manageable class referred to as the linear model of coregionalization (LMC). We introduce a fully Bayesian development of the LMC. We offer clarification of the connection between joint and conditional approaches to fitting such models including prior specifications. However, to substantially enhance the usefulness of such modelling we propose the notion of a spatially varying LMC (SVLMC) providing a very rich class of multivariate nonstationary processes with simple interpretation.
We illustrate the use of our proposed SVLMC with application to more than 600 commercial property transactions in three quite different real estate markets, Chicago, Dallas and San Diego. Bivariate nonstationary process models are developed for income from and selling price of the property.

Simon Godsill and Gary Yang

Inference for Gaussian and non-Gaussian continuous-time ARMA processes

In many physical science and engineering applications a continuous-time stochastic process model is more appropriate than an approximate discrete time model. Here we present results for Bayesian inference in continuous-time ARMA models using Monte Carlo sampling procedures. The target distributions are challenging for MC exploration and we demonstrate effective performance using specially designed proposal functions and annealed samplers, which yield superior performance to simpler approximation methods from the literature. We discuss also how to extend from the Gaussian case to symmetric $\alpha$-stable Levy-driven ARMA processes through use of an augmented model based on the scale mixture representation of the symmetric stable law. This leads to a simulation-exact method for certain cases, while an approximate discretisation is required in the general case.

Andrew Golightly and Darren J. Wilkinson

Bayesian Sequential Inference for Nonlinear Multivariate Diffusions

We extend recently developed simulation-based sequential algorithms to the Bayesian analysis of partially and discretely observed diffusion processes. Typically, since observations arrive at discrete times, yet the model is formulated in continuous time, it is natural to work with the first order Euler approximation. As the interobservation times are usually too large to be used as a time step, it is necessary to augment the observed low-frequency data with the introduction of m - 1 latent data points in between every pair of observations (Pedersen, 1995). Markov chain Monte Carlo (MCMC) methods can then be used to sample the posterior distribution of the latent data and model parameters. Unfortunately, if the amount of augmentation is large, high dependence between parameters and missing data results in arbitrarily slow rates of convergence of basic algorithms such as Gibbs samplers.
We propose a simulation filter, exploiting the diffusion bridge construct of Durham & Gallant (2002), which allows on-line estimation of parameters and state (Liu & West, 2001) and doesn't break down as m increases. We apply the method to the estimation of parameters governing some interesting nonlinear multivariate diffusions.
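
To make the augmentation concrete, the following sketch simulates a scalar diffusion between two observation times with m Euler steps, which creates m - 1 latent points, and evaluates the Gaussian Euler transition density used for each sub-step. The function names and the scalar setting are illustrative assumptions; the diffusion bridge construct and the simulation filter of the talk are not reproduced here.

```python
import math
import random

def euler_path(x0, drift, diffusion, delta, m, rng):
    """Take m Euler steps of size delta / m from x0; returns m + 1 points.

    The m - 1 interior points correspond to the latent data between two observations.
    """
    dt = delta / m
    path = [x0]
    for _ in range(m):
        x = path[-1]
        noise = rng.gauss(0.0, 1.0)
        path.append(x + drift(x) * dt + diffusion(x) * math.sqrt(dt) * noise)
    return path

def euler_log_density(x_next, x, drift, diffusion, dt):
    """Log of the Gaussian transition density implied by a single Euler step."""
    mean = x + drift(x) * dt
    var = diffusion(x) ** 2 * dt
    return -0.5 * math.log(2.0 * math.pi * var) - (x_next - mean) ** 2 / (2.0 * var)
```

Summing `euler_log_density` over consecutive points of an augmented path gives the approximate complete-data log-likelihood that an MCMC scheme of the kind described above would work with.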

Durham, G. B. & Gallant, R. A. (2002). Numerical techniques for maximum likelihood estimation of continuous-time diffusion processes. Journal of Business and Economic Statistics 20, 279-316.

Liu, J. & West, M. (2001). Combined parameter and state estimation in simulation-based filtering. In Sequential Monte Carlo Methods in Practice, A. Doucet, N. de Freitas & N. Gordon, eds.

Daniela Golinelli

Bayesian inference in a hidden stochastic two-compartment model for feline hematopoiesis

In this paper we describe a hidden two-compartment process that has been adopted to model the kinetics of feline hematopoietic stem cells in continuous time. Because of the experimental design and the data collection scheme, making inference in such a model is extremely difficult. We introduce an RJMCMC algorithm that allows us to obtain an estimate of the posterior distribution of the parameters of interest. We show the performance of the algorithm on both simulated and real data. In particular we apply the introduced algorithm to the case of multiple cats, or multiple realizations.

Mathieu Kessler, Omiros Papaspiliopoulos and Rui Paulo

Objective priors for spatial Gaussian processes

We address the issue of deriving objective priors for inference based on the observation of a spatial Gaussian process on a finite grid. We consider the reference approach suggested by Bernardo, where an essential step consists in the replication of the experiment to reach perfect estimation asymptotically. We argue that replication of the experiment is not the natural way to reach perfect estimation in this stochastic process context and investigate the effect of considering infill asymptotics instead.

Athanasios Kottas, Jason Duan and Alan E. Gelfand

Bayesian Nonparametric Spatio-temporal Modeling for Disease Incidence Data

We propose a Bayesian nonparametric approach to modeling disease incidence data, which are typically available as rates or counts for specified regions, and are collected over time. We develop a hierarchical formulation using spatial random effects modeled with a Dirichlet process prior. The Dirichlet process is centered around a normal distribution, which is defined by first assuming a Gaussian process model for the underlying spatial surface, and then using block averaging of this Gaussian process to the areal units determined by the regions in the study. We employ a dynamic formulation for the spatial random effects to extend the model to spatio-temporal settings. Posterior inference is implemented with an efficient Gibbs sampler, which utilizes strategically introduced latent variables. We illustrate the methodology with data on lung cancer incidences for all 88 counties in the state of Ohio over an observation period of 21 years.

Samuel Kou, Sunney Xie and Jun Liu

Bayesian Analysis of Stochastic Models in Single Molecule Biophysics

Recent technological advances allow scientists for the first time to follow a biochemical process on a single molecule basis, which, unlike traditional macroscopic experiments, raises many challenging data-analysis problems and calls for a sophisticated statistical modeling and inference effort. This paper provides the first likelihood-based analysis of the single-molecule fluorescence lifetime experiment, in which the conformational dynamics of a single DNA hairpin molecule are of interest. The conformational change is modeled as a continuous-time two-state Markov chain, which is not directly observable and has to be inferred from changes in photon emissions from a dye attached to the DNA hairpin molecule. In addition to the hidden Markov structure, the presence of molecular Brownian diffusion further complicates the matter. We show that a closed-form likelihood function can be obtained and a Metropolis-Hastings algorithm can be applied to compute the posterior distribution of the parameters of interest. The data augmentation technique is utilized to handle both the Brownian diffusion and the issue of model discrimination. Our results increase the estimating resolution severalfold. The success of this analysis indicates there is an urgent need to bring modern statistical techniques to the analysis of data produced by modern technologies.

Hedibert Lopes and Carlos Carvalho

Factor models with time-varying loadings and regime switching

In this article we use factor models to describe a certain class of covariance structures for financial time series models. More specifically, we concentrate on situations where the factor variances are modeled by a multivariate stochastic volatility structure. We build on previous work by allowing the factor loadings, in the factor model structure, to have a time-varying structure and to capture changes in asset weights over time, motivated by applications with multiple time series of daily exchange rates.
The time-varying factor loadings and the common-factor and specific-factor volatilities are modeled through Markov switching processes. Posterior inference is performed by specially designed Markov chain Monte Carlo and sequential Monte Carlo algorithms.
We explore and discuss potential extensions of the models presented here in the area of prediction. This discussion leads to open issues on real time implementation and natural model comparisons.

Ramses H. Mena and Stephen G. Walker

Representation of some Markov Models via Predictive Distributions

Predictive distributions that arise from Bayesian settings constitute a useful tool for modelling dependency in time series analysis. This technique is of particular interest in non-linear and non-Gaussian cases. In this work we discuss an approach to model stationary processes via predictive distributions. In particular, some developments within the continuous time framework are presented. The underlying approach provides alternative ways to simulate and estimate well-known continuous time processes.

Philip O'Neill and N. Demiris

Bayesian inference for stochastic epidemics using random graphs

We consider the problem of Bayesian inference for a class of stochastic epidemic models in which a population of individuals is partitioned into groups (such as households). Potentially infectious contacts occur both within groups, and between groups, as governed by Poisson processes of different rates. Given data on final outcome, namely the individuals who are ever infected during an outbreak, we are interested in inferring information about the infection rates. The main obstacle to this task is that the likelihood is intractable, which suggests that data augmentation may be profitable. Here we proceed by using a certain random graph construction that contains information about the eventual outcome, and use realisations of such graphs within a Markov chain Monte Carlo setting to perform the desired inference. The methods are illustrated with data on influenza outbreaks. The methods are very general and extensions will be discussed.

Andrea Ongaro

A new random re-labelling scheme for posterior calculations of symmetric prior processes

In previous work (Ongaro 2003, JSPI) the connection between a certain random re-labelling scheme (known as size-biased sampling) and a general class P of processes to be used as nonparametric prior distributions was established. This led to representations of the class particularly suitable for inferential purposes and allowed the derivation of relatively simple results on the posterior and predictive distributions and on the structure of a sample of observations from the class. Here we aim at extending such a framework by considering a more general class of prior processes. In particular, the class P is enlarged so that it essentially coincides with the class of so-called symmetrically distributed random probability measures (Kallenberg, 1986, Random measures). Consequently, we have constructed a new extension of the size-biased re-labelling scheme which allows us to generalise the results obtained for P to the enlarged

Jesus Palomo and David Dunson

Bayesian structural equation models with latent variables

Structural equation models (SEMs) provide a general framework for modeling of multivariate data, particularly in settings in which measured variables are designed to measure one or more latent variables. In implementing SEM analyses, it is typically assumed that the model structure is known and that the latent variables have normal distributions. To relax these assumptions, this article proposes a semiparametric Bayesian approach. Categorical latent variables with an unknown number of classes are accommodated using Dirichlet process (DP) priors, while DP mixtures of normals allow continuous latent variables to have unknown distributions. Robustness to the assumed SEM structure is accommodated by choosing mixture priors, which allow uncertainty in the occurrence of certain links within the path diagram. A Gibbs sampling algorithm is developed for posterior computation. The methods are illustrated using biomedical and social science examples.

Sara Pasquali

Bayesian and Classical Inference for a Stochastic Predator-Prey System

In this work we study the problem of parameter estimation in a discretely observed predator-prey model described by a system of stochastic differential equations. A classical approach based on the maximum likelihood function and a Bayesian approach based on an MCMC algorithm are compared.
The estimator based upon the ML function converges in mean square to the true value of the parameter, when the final time and the number of observations go to infinity and the interval between two consecutive observations goes to zero.
In the Bayesian framework, taking a normal prior distribution for the parameter to be estimated, we obtain a normal posterior distribution whose mean is close to the classical estimator when an improper prior is chosen. Then we introduce latent data between two consecutive actual observations. An MCMC method based on the Metropolis-Hastings algorithm is applied to sample from the posterior distribution of the latent data. The Bayesian approach provides a reasonable posterior distribution for the parameter even when only a relatively small number of observations is available.
Numerical results are presented.
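
The link between the normal posterior and the classical estimator can be illustrated in a simplified setting. The sketch assumes a scalar diffusion dX = theta a(X) dt + sigma dW with known sigma, a drift that is linear in theta, and the Euler approximation of the likelihood; none of these details is stated in the abstract, and the function names are hypothetical. Under these assumptions a normal prior on theta gives a normal posterior, and letting the prior variance grow recovers the discretised maximum likelihood estimator.

```python
def mle_drift(xs, a, dt):
    """Discretised ML estimate of theta for dX = theta * a(X) dt + sigma dW."""
    num = sum(a(x0) * (x1 - x0) for x0, x1 in zip(xs, xs[1:]))
    den = sum(a(x0) ** 2 * dt for x0 in xs[:-1])
    return num / den

def normal_posterior(xs, a, dt, sigma, prior_mean, prior_var):
    """Posterior mean and variance of theta under a N(prior_mean, prior_var) prior."""
    # Under the Euler likelihood each increment is N(theta * a(x) * dt, sigma^2 * dt).
    score = sum(a(x0) * (x1 - x0) for x0, x1 in zip(xs, xs[1:])) / sigma ** 2
    info = sum(a(x0) ** 2 * dt for x0 in xs[:-1]) / sigma ** 2
    precision = 1.0 / prior_var + info
    mean = (prior_mean / prior_var + score) / precision
    return mean, 1.0 / precision
```

With a very large `prior_var` the posterior mean is numerically indistinguishable from `mle_drift`, while a tight prior pulls the estimate towards `prior_mean`.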

Havard Rue and Sara Martino

Approximative Deterministic Bayesian Inference for Hierarchical Gaussian Markov Random Field Models

Hierarchical Gaussian Markov random field (GMRF) models are often used in statistical applications, and include dynamic and temporal models, model-based geostatistical and spatial models, and spatio-temporal models. Most of these have a posterior of the following form,

where are a low-dimensional hyperparameter of the GMRF , and are pointwise observations of {xi :  i }.

Although Bayesian inference about is possible using MCMC techniques, preferably block-MCMC, we consider in this talk ways to do approximative deterministic inference for and xi, i = 1,..., n, conditioned on . Such approximations can be derived from tractable Gaussian and non-Gaussian approximations of and have several advantages over simulation-based inference:

  • Lack of (large) Monte Carlo error
  • Extremely fast
  • No burn-in/convergence problems
The disadvantage is a small bias.

Alexandra M. Schmidt, Bruno Sansò and Aline A. Nobre

Spatio-temporal models based on spatial discrete convolutions

We consider a class of models for spatio-temporal processes based on convolving spatial independent processes with a discrete kernel that is represented by a lower triangular matrix. This triangular matrix represents the Cholesky decomposition of an auto-regressive process of order p. We find that the proposed families of models provide a rich variety of covariance structures. These include covariance functions that are stationary and separable in space and time as well as time dependent non-separable and non-isotropic ones. We also address, under the proposed model, the problem of missing data and the problem of covariates which are measured at different locations from the response variable (spatial misalignment). A real data example, based on measurements of PM10 in the city of Rio de Janeiro, will be presented.

David Stephens and Matthew Gander

Inference for Volatility Models driven by Levy Processes

We extend the currently most popular models for the volatility of financial time series, Ornstein-Uhlenbeck stochastic processes, to more general non-Ornstein-Uhlenbeck models. In particular, we investigate means of making the correlation structure in the volatility process more flexible. We implement a method for introducing quasi long-memory into the volatility model without recourse to superposition. We demonstrate that the models can be fitted to real share price returns data, and that results indicate that for the series we study, the long-memory aspect of the model is not supported.

Osnat Stramer and Jun Yan

Parametric inference for partially observed diffusion processes: a comparison study

Diffusion models described by stochastic differential equations are used extensively in many areas of science, such as engineering, hydrology, financial economics and physics. While the models are formulated in continuous time, the data are recorded at discrete points in time. For most diffusion models the likelihood function is unavailable. We describe four alternative approaches to inference for diffusion models:
  • Bayesian inference using Gibbs sampling and data augmentation (Elerian et al., 2001; Roberts & Stramer, 2001).
  • Simulated maximum likelihood estimation (Pedersen, 1995; Durham & Gallant, 2002).
  • Analytical approximation of the likelihood (Ait-Sahalia, 2002 and Ait-Sahalia, 2005).
  • Efficient method of moments (Gallant & Long, 1997).
All the above methods are computationally heavy. We review all four methods in terms of
  • what models each method can handle,
  • theoretical justification of the method,
  • accuracy,
  • speed.
We perform a set of Monte Carlo experiments to compare the performance of these approaches on two different types of models:
  • Simple models like the square-root/Cox-Ingersoll-Ross models that can be solved explicitly and hence can be estimated via maximum likelihood.
  • Models like the preferred model for short term interest rates of Ait-Sahalia (1996) that cannot be solved explicitly and hence cannot be estimated via maximum likelihood.
In addition, we also provide a numerical technique to improve the computational efficiency of the simulated maximum likelihood method proposed by Durham & Gallant (2002).

Jonathan Stroud, Michael Johannes and Nicholas Polson

Sequential Parameter Estimation in Stochastic Volatility Models with Jumps

This paper analyzes the sequential learning problem for both parameters and states in stochastic volatility models with jumps. We describe the existing methods, the particle filter and the practical filter, and then extend these algorithms to incorporate jumps. We analyze the performance of both approaches using both simulated and S&P 500 index return data. On both types of data, we find that both algorithms are effective in sequential learning of the jump parameters, although sensitivity analysis indicates that the practical filter performs marginally better. These conclusions are similar to those in Stroud, Polson and Muller (2004) regarding stochastic volatility models.

Elisa Varini and Renata Rotondi

A sequential particle filter for a continuous-time state-space model

A new state-space model for the analysis of earthquake sequences is proposed: a pure jump Markov process with observations from marked point processes. We considered three marked point processes for seismic sequences, each of them having a different physical foundation: the Poisson process, the stress release model and the Epidemic-Type Aftershock-Sequence model. We assumed that a time-magnitude seismic sequence is composed of a series of realizations of these three models (the observed process). The dynamics of their activation times are driven by a hidden pure jump Markov process (the state process). Inference for such a complex and rich model is carried out by exploiting a Bayesian sequential Monte Carlo method in order to estimate the model parameters and to approximate the filtering distribution. We examined a simulated data set, focusing our attention on the properties of the estimation methods proposed and on their sensitivity to the assessment of some particular parameters. When the model parameters are known, a solution of the filtering problem is proposed by exploiting the innovation method in the context of the martingale representation of a point process. This method can also be used to improve the approximation of the filtering distribution obtained by the sequential Monte Carlo procedure when the model parameters are unknown.

Krisztina Vasas, Peter Elek and Laszlo Markus

A regime switching time series model of daily river flows

We report on fitting a two-state regime switching autoregressive model for daily water discharge series registered at monitoring sites of River Tisza and of its tributaries. One peculiarity of the model is that the noise sequence switches distribution according to the exponential and normal law, the rising regime being governed by the exponential part. In the case of Markovian regime switching, the estimation can be carried out by a simple implementation of the MCMC algorithm. However, as change times of the regimes in hydrological series are known to deviate from the geometric distribution, Markov-modelling is not entirely satisfactory. When generalisations of the model are considered, more sophisticated estimation algorithms should be chosen. In some particular non-Markovian cases, either the efficient method of moments algorithm or reversible jump MCMC may be of help. In the talk we also analyse the joint behaviour of the main river and of its tributaries, by examining the pattern of regimes and the multivariate distribution of the noise sequences at different sites.
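
A small simulation may help to fix ideas about the Markovian version of such a model. The sketch assumes a first-order autoregression in each regime, exponential shocks in the rising regime and zero-mean normal shocks in the falling regime, with a two-state Markov chain driving the switches. The order of the autoregression, all parameter values and the function name are illustrative assumptions, not the fitted model of the talk.

```python
import random

def simulate_regime_switching_ar(n, phi, p_stay, exp_mean, normal_sd, rng, x0=1.0):
    """Simulate n steps of a two-regime AR(1) series.

    State 1 is the rising regime (exponential shocks), state 0 the falling regime
    (normal shocks). phi[s] is the AR coefficient in state s and p_stay[s] the
    probability of remaining in state s at the next step.
    """
    xs, states = [x0], [0]
    for _ in range(n):
        s = states[-1]
        if rng.random() > p_stay[s]:
            s = 1 - s  # Markov switch to the other regime
        if s == 1:
            shock = rng.expovariate(1.0 / exp_mean)  # non-negative shock
        else:
            shock = rng.gauss(0.0, normal_sd)
        xs.append(phi[s] * xs[-1] + shock)
        states.append(s)
    return xs, states
```

Under this Markov assumption the time spent in each regime is geometrically distributed, which is the feature the abstract describes as not entirely satisfactory for hydrological series.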

Mike West

Stochastic Search & MCMC on "Big" Model Spaces

I will present and discuss approaches to stochastic computation in "big" and "sparse" models - large, sparse graphical models, regression variable models with many candidate predictors, and large-scale, sparse factor models. Varieties of MCMC methods and non-MCMC evolutionary stochastic search methods will be discussed, compared and exemplified. Questions arise about convergence characteristics of the resulting stochastic processes - whether MCMC or stochastic optimisation is the goal - questions that challenge Bayesian probabilists and algorithm developers alike.
Critical questions of Bayesian model/prior specification - issues that are simply central to scalability of Bayesian technologies as models/parameter spaces increase in dimension - will also be discussed.

Darren J. Wilkinson

Bayesian Inference for Biochemical Network Dynamics

This talk will give an overview of one of the key problems in the new science of Systems Biology - inference for the rate parameters underlying complex stochastic kinetic biochemical network models, using partial, discrete and noisy time course measurements of the system state. The basic problem will be introduced, highlighting the importance of stochastic modelling for effective estimation, and then a range of approaches to Bayesian inference will be reviewed and compared. Some approaches recognise the discrete nature of the underlying molecular dynamics, whilst others use a diffusion approximation to give a non-linear multivariate stochastic differential equation representation.

Some "weird" formulas are due to bugs in the software transforming LaTeX and Word into HTML.