Fourth Workshop on
BAYESIAN INFERENCE IN STOCHASTIC
PROCESSES
Villa Monastero, Varenna (LC), Italy
June 2-4, 2005
Matthew J. Beal and Yee Whye Teh
|
Efficient Sampling Strategies for the Hierarchical Dirichlet Process:
with Application to the Infinite Hidden Markov Model and its Variants
We consider time series data modelling using Hidden Markov Models
having an a priori unknown number of hidden states. We show that the
Infinite Hidden Markov Model [1] can be recast in the framework of
Hierarchical Dirichlet Processes (HDPs). The HDP framework [2]
considers problems involving related groups of data: each (fixed)
group of data is modelled by a DP mixture model, with the common base
measure of the DPs being itself distributed according to a global DP.
The base measure being discrete w.p.1 ensures that the group DPs share
atoms (despite being countably infinite). [2] presents two sampling
schemes for posterior inference in the HDP.
We cast sequential data in the grouped data framework by assigning
observations to groups, where the groups are indexed by the value of
the previous state variable in the sequence; then the current state
and its emission distributions define a group-specific mixture model.
Thus the hidden state sequence implicitly defines a partition into
groups, and induces constraints in the posterior that make the
sampling methods proposed in [2] quite inefficient. We present novel
effective MCMC methods to overcome this problem in the iHMM and its
variants, and present results in text and bioinformatics domains.
[1] Beal, M.J., Ghahramani, Z. and Rasmussen, C.E. (2001). The
Infinite Hidden Markov Model. In Advances in Neural Information
Processing Systems 14:577-584, eds. T. Dietterich, S. Becker,
Z. Ghahramani, MIT Press, 2002.
[2] Teh, Y.W., Jordan, M.I., Beal, M.J. and Blei, D.M. Hierarchical
Dirichlet Processes. To appear in Advances in Neural Information
Processing Systems 17, MIT Press, 2005. For more details see
Technical Report 653, UC Berkeley Statistics, 2004. Also submitted to
JASA.
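As a rough illustration of how a discrete global measure lets the group
DPs share atoms, the following sketch draws weights from a finite
truncation of the HDP. The truncation level, the function name and the
concentration parameters gamma and alpha are assumptions made for this
example; it is not one of the sampling schemes of [2] or of the talk.

```python
import numpy as np

def sample_truncated_hdp_weights(num_groups, truncation, gamma, alpha, rng):
    """Draw global and group-level mixture weights from a truncated HDP."""
    # Global stick-breaking weights: beta_k = v_k * prod_{l<k} (1 - v_l).
    v = rng.beta(1.0, gamma, size=truncation)
    v[-1] = 1.0  # close the last stick so the truncated weights sum to one
    remaining = np.concatenate(([1.0], np.cumprod(1.0 - v[:-1])))
    beta = v * remaining
    # Each group's weights are Dirichlet(alpha * beta) over the SAME atoms,
    # which is what makes the groups share mixture components.
    pi = rng.dirichlet(alpha * beta, size=num_groups)
    return beta, pi
```

Under the recasting described above, where groups are indexed by the
previous state, row j of pi can be read as the transition distribution
out of hidden state j.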
Exact inference for diffusion processes using MCMC
We present a new methodology for parametric inference about
diffusions given a set of discrete time observations. The
method is based on an algorithm for the exact simulation
of the stochastic model and involves imputation of the
missing paths among the data. The exact simulation algorithm
allows for carrying out a simple Gibbs step when sampling
from the full conditional of the missing paths without the
necessity of any kind of approximation.
Paul G. Blackwell and K. J. Harris
|
Bayesian radio-tracking: inference for diffusions in heterogeneous
environments
In ecology and zoology, the radio tracking of animals is an important
source of data on movement, behaviour and habitat use. Dependence between
successive observations on an individual needs to be allowed for in any
statistical analysis; since observations are not necessarily made at equal
time intervals, a natural approach is to regard the true underlying
movement process as a diffusion.
In this talk, we will discuss a range of fully-parametric diffusion models
that capture key features of realistic patterns of animal movement. These
include the two-dimensional Ornstein-Uhlenbeck process (Dunn and Gipson,
1977, Biometrics 33:85-101), diffusions driven by an underlying Markov
chain representing behaviour (Blackwell 2003, Biometrika 90:613-627) and
diffusions with different properties in different discrete regions of
space (Blackwell and Harris, work in progress); the latter is an extension
to two (or more) dimensions of continuous-time threshold auto-regressive
processes.
We will describe and illustrate techniques for carrying out fully Bayesian
inference for these models, using a Markov Chain Monte Carlo approach.
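To illustrate why a diffusion copes naturally with unequal observation
intervals, here is a minimal sketch that simulates a two-dimensional
Ornstein-Uhlenbeck process exactly at arbitrary times. It assumes an
isotropic model dX = -b (X - mu) dt + sigma dW; the parameter names are
illustrative and the richer models of the talk are not covered.

```python
import numpy as np

def simulate_ou_2d(times, x0, mu, b, sigma, rng):
    """Simulate an isotropic 2-D OU process at arbitrary observation times."""
    times = np.asarray(times, dtype=float)
    mu = np.asarray(mu, dtype=float)
    path = np.empty((len(times), 2))
    path[0] = x0
    for k in range(1, len(times)):
        dt = times[k] - times[k - 1]
        decay = np.exp(-b * dt)
        # Exact Gaussian transition over a gap of any length dt.
        mean = mu + decay * (path[k - 1] - mu)
        var = sigma**2 / (2.0 * b) * (1.0 - decay**2)
        path[k] = mean + np.sqrt(var) * rng.standard_normal(2)
    return path
```

Because the transition density is available in closed form, the same
expressions give the likelihood of irregularly spaced fixes directly.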
Paolo Bulla and Pietro Muliere
|
Bayesian Nonparametric Estimation for Reinforced Markov Renewal Processes
Starting from the definitions and the properties of reinforced renewal
processes and reinforced Markov renewal processes, we characterize, via
exchangeability and de Finetti's representation theorem, a prior that
consists of a family of Dirichlet distributions on the space of Markov
transition matrices and beta-Stacy processes on distribution functions.
Then, we show that this family is conjugate and give some estimate results.
Bayesian nonparametric inference on solutions of renewal equations
Renewal processes play an important role as models of stochastic
systems. Generally speaking, many important characteristics
associated with renewal processes (e.g., the "standard" renewal
function, the distribution function of the latest/first renewal
before/after time t, etc.) are obtained as solutions of a
general integral equation, the so-called ``renewal equation''. To
be concrete, let $(X_i;\ i \geq 1)$ be a sequence of r.v.s
(``inter-arrival times'') i.i.d. conditionally on their d.f. $F$.
A general renewal equation is an integral equation of the form
$$M(t) = m(t) + \int_0^t M(t - x)\, dF(x),$$
$m(\cdot)$ being an appropriate function depending on
$F$. As a prior law for $F$, we assume first a Dirichlet process.
Under appropriate regularity conditions, consistency of the
posterior law of $M(\cdot)$ is shown. Then, the weak convergence
of the posterior of $M(\cdot)$, when appropriately rescaled, to
a Gaussian process is proved. Due to the possible unboundedness of
$M(\cdot)$, this point is discussed in some detail.
Approximations to the actual posterior law of $M(\cdot)$ are
also discussed. Finally, some results are extended to more general
prior laws.
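For a given d.f. F, a renewal equation of the form
M(t) = m(t) + int_0^t M(t - x) dF(x) can be solved numerically on a
grid. The sketch below uses a simple right-endpoint Riemann-Stieltjes
sum; the function name and the grid step h are assumptions, and the
scheme is only first-order accurate in h.

```python
import numpy as np

def solve_renewal_equation(m, F, t_max, h):
    """Solve M(t) = m(t) + int_0^t M(t - x) dF(x) on the grid t_k = k h."""
    n = int(round(t_max / h))
    t = h * np.arange(n + 1)
    dF = np.diff(F(t))            # dF[j - 1] = F(t_j) - F(t_{j-1})
    m_vals = m(t)
    M = np.empty(n + 1)
    M[0] = m_vals[0]
    for k in range(1, n + 1):
        # Discrete convolution: sum over j = 1..k of M[k - j] * dF[j - 1].
        M[k] = m_vals[k] + np.dot(M[k - 1::-1], dF[:k])
    return t, M
```

Taking m = F gives the standard renewal function; for exponential
inter-arrival times with rate 1 the solution should be close to
M(t) = t, which makes a convenient check. Applying the solver to draws
of F from a posterior would give draws of M, although the abstract's
results are analytical rather than simulation-based.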
Lehel Csato and Manfred Opper
|
Efficient Gaussian Process Inference
Gaussian processes (GP) have gained popularity among researchers in the
machine learning community. Despite their theoretical simplicity and
generality, the implementation of GP inference usually suffers from (1) the
intractability of the Bayesian posterior or the predictive distribution for
all but the simplest likelihood models and (2) a time to solution that is
cubic in the size of the data set.
In the presentation, the proposed solution to the problem of intractability
is a sequential approximation of the intractable posterior with a GP. The
same approximation allows for an estimation of the -- not necessarily
Gaussian -- predictive distribution. With the approximation technique we
estimate the marginal data likelihood, or evidence, which in turn
can be used to adjust the parameters of the kernel function of the GP.
The prohibitive scaling of the computation time is addressed by projecting the
GP into a low-dimensional subspace. This projection makes the computational
time cubic with respect to the size of the basis or anchor set, making
GP inference applicable to large data sets.
1. Cornford D, Csato L, Evans D, Opper M. (2004): Bayesian analysis of
the Scatterometer Wind Retrieval Inverse Problem: Some New Approaches,
Journal of the Royal Statistical Society B, Vol. 66, pp. 1--17.
2. Csato L, Opper M (2002): Sparse Gaussian Processes, Neural Computation,
14/3, The MIT Press.
3. Csato L (2002): Gaussian Processes -- Iterative sparse approximations,
Ph.D. thesis, Neural Computing Research Group, Aston University.
4. Csato L (2004): Sparse Gaussian Processes Matlab Toolbox
www.tuebingen.mpg.de/~csatol/OGP
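To make the scaling argument concrete, the sketch below compares the
exact GP regression mean with a mean obtained by projecting onto a
small anchor set. It uses the generic subset-of-regressors formula with
a Gaussian likelihood and a squared-exponential kernel, all of which
are assumptions for illustration; it is not the sequential algorithm of
the talk or the toolbox listed above.

```python
import numpy as np

def rbf_kernel(a, b, lengthscale=1.0):
    """Squared-exponential kernel between two sets of 1-D inputs."""
    d = a[:, None] - b[None, :]
    return np.exp(-0.5 * (d / lengthscale) ** 2)

def full_gp_mean(x, y, x_new, noise_var):
    """Exact GP regression mean; the n x n solve costs O(n^3)."""
    K = rbf_kernel(x, x)
    weights = np.linalg.solve(K + noise_var * np.eye(len(x)), y)
    return rbf_kernel(x_new, x) @ weights

def sparse_gp_mean(x, y, x_new, anchors, noise_var):
    """Projected (subset-of-regressors) mean; the m x m solve costs O(m^3)."""
    K_mm = rbf_kernel(anchors, anchors)
    K_mn = rbf_kernel(anchors, x)
    A = noise_var * K_mm + K_mn @ K_mn.T      # m x m system
    return rbf_kernel(x_new, anchors) @ np.linalg.solve(A, K_mn @ y)
```

When the anchor set equals the training inputs the two means coincide;
with m much smaller than n the linear solve depends on m only, at the
price of an approximation error.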
Multivariate Spatial Process Modelling
Models for the analysis of multivariate spatial data are receiving increased
attention these days. In many applications it will be preferable to work with
multivariate spatial processes to specify such models. A critical
specification in providing these models is the cross covariance function.
Constructive approaches for developing valid cross-covariance functions offer
the most practical strategy for doing this. These approaches include
separability, kernel convolution or moving average methods, and convolution
of covariance functions. We review these approaches but take as our main
focus the computationally manageable class referred to as the linear model
of coregionalization (LMC). We introduce a fully Bayesian development of the
LMC. We offer clarification of the connection between joint and conditional
approaches to fitting such models including prior specifications. However, to
substantially enhance the usefulness of such modelling we propose the notion
of a spatially varying LMC (SVLMC) providing a very rich class of multivariate
nonstationary processes with simple interpretation.
We illustrate the use of our proposed SVLMC with application to more than 600
commercial property transactions in three quite different real estate markets,
Chicago, Dallas and San Diego. Bivariate nonstationary process models are
developed for income from and selling price of the property.
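The constructive nature of the LMC can be seen in a few lines: if
Y(s) = A w(s) with independent unit-variance processes w_j, the joint
covariance is a sum of Kronecker products and is valid by construction.
The sketch below assumes exponential correlation functions and a fixed
(not spatially varying) matrix A; names are illustrative.

```python
import numpy as np

def lmc_covariance(coords, A, ranges):
    """Joint covariance of p variables at n sites under an LMC.

    Y(s) = A w(s), where the w_j are independent unit-variance processes
    with exponential correlation exp(-d / ranges[j]). Rows and columns
    are ordered site by site, with the p variables adjacent.
    """
    d = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
    n, p = len(coords), A.shape[0]
    cov = np.zeros((n * p, n * p))
    for j, range_j in enumerate(ranges):
        R_j = np.exp(-d / range_j)            # n x n spatial correlation
        T_j = np.outer(A[:, j], A[:, j])      # p x p rank-one coregionalization
        cov += np.kron(R_j, T_j)
    return cov
```

The within-site covariance is A A^T, so a lower-triangular A
corresponds to a conditional (one variable given the other)
specification. An SVLMC would let A depend on the site, which this
sketch does not attempt.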
Simon Godsill and Gary Yang
|
Inference for Gaussian and non-Gaussian continuous-time ARMA processes
In many physical science and engineering applications a continuous-time
stochastic process model is more appropriate than an approximate
discrete time model.
Here we present results for Bayesian inference in continuous-time ARMA
models using Monte Carlo sampling procedures. The target distributions
are challenging for MC exploration and
we demonstrate effective performance using specially designed proposal
functions and annealed samplers, which yield superior performance to
simpler approximation methods from the
literature. We discuss also how to extend from the Gaussian case to
symmetric $\alpha$-stable Levy-driven ARMA processes through use of
an augmented model based on the scale mixture representation of the
symmetric stable law.
This leads to a simulation-exact method for certain cases, while an
approximate discretisation is required in the general case.
Andrew Golightly and Darren J. Wilkinson
|
Bayesian Sequential Inference for Nonlinear Multivariate Diffusions
We extend recently developed simulation-based sequential algorithms to the
Bayesian analysis of partially and discretely observed diffusion processes.
Typically, since observations arrive at discrete times, yet the model is
formulated in continuous time, it is natural to work with the first order
Euler approximation. As the interobservation times are usually too large to
be used as a time step, it is necessary to augment the observed low-frequency
data with the introduction of m - 1 latent data points in between every pair
of observations (Pedersen, 1995). Markov chain Monte Carlo (MCMC) methods can
then be used to sample the posterior distribution of the latent data and model
parameters. Unfortunately, if the amount of augmentation is large, high
dependence between parameters and missing data results in arbitrarily slow
rates of convergence of basic algorithms such as Gibbs samplers.
We propose a simulation filter, exploiting the diffusion bridge construct
of Durham & Gallant (2002), which allows on-line estimation of parameters and
state (Liu & West, 2001) and doesn't break down as m increases. We apply the
method to the estimation of parameters governing some interesting nonlinear
multivariate diffusions.
Durham, G. B. & Gallant, R. A. (2002). Numerical techniques for maximum
likelihood estimation of continuous-time diffusion processes. Journal of
Business and Economic Statistics 20, 279-316.
Liu, J. & West, M. (2001). Combined parameter and state estimation in
simulation-based filtering. In Sequential Monte Carlo Methods in Practice,
A. Doucet, N. de Freitas & N. Gordon, eds.
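The imputation of m - 1 latent points between two observations can be
sketched with the modified diffusion bridge proposal associated with
Durham & Gallant (2002). The version below is for a scalar diffusion;
the function name and the callable sigma (the diffusion coefficient)
are assumptions. It only generates proposals: in a filter or MCMC
scheme they would still have to be weighted or accepted against the
Euler densities.

```python
import numpy as np

def modified_diffusion_bridge(x_start, x_end, total_time, m, sigma, rng):
    """Propose m - 1 latent points between two observations of a scalar diffusion.

    sigma is a function returning the diffusion coefficient at a state.
    """
    dt = total_time / m
    path = np.empty(m + 1)
    path[0] = x_start
    for k in range(m - 1):
        steps_left = m - k
        # Pull the path linearly towards the end point ...
        mean = path[k] + (x_end - path[k]) / steps_left
        # ... with a variance that shrinks to zero as the end point nears.
        var = sigma(path[k]) ** 2 * dt * (steps_left - 1) / steps_left
        path[k + 1] = mean + np.sqrt(var) * rng.standard_normal()
    path[m] = x_end
    return path
```

Because the proposal is conditioned on the next observation, its
quality does not degrade as m grows, in contrast to forward Euler
simulation that ignores the end point.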
Bayesian inference in a hidden stochastic two-compartment model for feline
hematopoiesis
In this paper we describe a hidden two-compartment process that has been
adopted to model the kinetics of feline hematopoietic stem cells in continuous
time. Because of the experimental design and the data collection scheme, making
inference in such a model is extremely difficult. We introduce an RJMCMC
algorithm that allows us to obtain an estimate of the posterior distribution
of the parameters of interest. We show the performance of the algorithm on
both simulated and real data. In particular we apply the introduced algorithm
to the case of multiple cats, or multiple realizations.
Mathieu Kessler, Omiros Papaspiliopoulos and Rui Paulo
|
Objective priors for spatial Gaussian processes
We address the issue of deriving objective priors for inference based on the
observation of a spatial Gaussian process on a finite grid. We consider the
reference approach suggested by Bernardo, where an essential step consists
in the replication of the experiment to reach perfect estimation
asymptotically. We argue that replication of the experiment is not the
natural way to reach perfect estimation in this stochastic process context
and investigate the effect of considering infill asymptotics instead.
Athanasios Kottas, Jason Duan and Alan E. Gelfand
|
Bayesian Nonparametric Spatio-temporal Modeling for Disease
Incidence Data
We propose a Bayesian nonparametric approach to modeling disease
incidence data, which are typically available as rates or counts for
specified regions, and are collected over time. We develop a
hierarchical formulation using spatial random effects modeled with a
Dirichlet process prior. The Dirichlet process is centered around a
normal distribution, which is defined by first assuming a Gaussian
process model for the underlying spatial surface, and then using block
averaging of this Gaussian process to the areal units determined by
the regions in the study. We employ a dynamic formulation for the
spatial random effects to extend the model to spatio-temporal
settings. Posterior inference is implemented with an efficient Gibbs
sampler, which utilizes strategically introduced latent variables. We
illustrate the methodology with data on lung cancer incidences for all
88 counties in the state of Ohio over an observation period of 21 years.
Samuel Kou, Sunney Xie and Jun Liu
|
Bayesian Analysis of Stochastic Models in Single Molecule Biophysics
Recent technological advances allow scientists for the first time to
follow a biochemical process on a single molecule basis, which, unlike
traditional macroscopic experiments, raises many challenging data-analysis
problems and calls for a sophisticated statistical modeling and inference
effort. This paper provides the first likelihood-based analysis of the
single-molecule fluorescence lifetime experiment, in which the
conformational dynamics of a single DNA hairpin molecule is of interest.
The conformational change is modeled as a continuous-time two-state Markov
chain, which is not directly observable and has to be inferred from
changes in photon emissions from a dye attached to the DNA hairpin
molecule. In addition to the hidden Markov structure, the presence of
molecular Brownian diffusion further complicates the matter. We show that
a closed-form likelihood function can be obtained and a Metropolis-Hastings
algorithm can be applied to compute the posterior distribution of the
parameters of interest. The data augmentation technique is utilized to
handle both the Brownian diffusion and the issue of model discrimination.
Our results increase the estimating resolution several-fold. The
success of this analysis indicates there is an urgent need to bring
modern statistical techniques to the analysis of data produced by
modern technologies.
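The hidden part of such a model, a continuous-time two-state Markov
chain, has transition probabilities in closed form, which is one
ingredient of a tractable likelihood. The sketch below computes them
for switching rates k12 and k21; these names are assumptions, and the
photon-emission and Brownian-diffusion components of the actual model
are not represented.

```python
import numpy as np

def two_state_transition_matrix(k12, k21, t):
    """Transition probabilities P(t) of a two-state continuous-time Markov chain."""
    total = k12 + k21
    decay = np.exp(-total * t)
    stationary = np.array([k21, k12]) / total
    # P(t) = Pi + exp(-(k12 + k21) t) (I - Pi), each row of Pi being stationary.
    Pi = np.tile(stationary, (2, 1))
    return Pi + decay * (np.eye(2) - Pi)
```

The matrix relaxes from the identity at t = 0 to the stationary law at
rate k12 + k21, so only the sum of the rates sets the time scale.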
Hedibert Lopes and Carlos Carvalho
|
Factor models with time-varying loadings and regime switching
In this article we use factor models to describe a certain class of covariance
structures for financial time series models. More specifically, we concentrate
on situations where the factor variances are modeled by a multivariate
stochastic volatility structure. We build on previous work by allowing the
factor loadings, in the factor model structure, to have a time-varying
structure and to capture changes in asset weights over time motivated by
applications with multiple time series of daily exchange rates.
The time-varying structure of the factor loadings, as well as the common factor
and specific factor volatilities, is modeled through Markov switching processes.
Posterior inference is performed by specially designed Markov chain Monte Carlo
and sequential Monte Carlo algorithms.
We explore and discuss potential extensions of the models presented here in the
prediction area. This discussion leads to open issues on real time
implementation and natural model comparisons.
Ramses H. Mena and Stephen G. Walker
|
Representation of some Markov Models via Predictive Distributions
Predictive distributions that arise from Bayesian settings constitute
a useful tool for modelling dependency in time series analysis. This technique
is of particular interest in non-linear and non-Gaussian cases. In this work we
discuss an approach to model stationary processes via predictive distributions.
In particular, some developments within the continuous time framework are
presented. The underlying approach provides alternative ways to simulate and
estimate well-known continuous time processes.
Philip O'Neill and N. Demiris
|
Bayesian inference for stochastic epidemics using random graphs
We consider the problem of Bayesian inference for a class of stochastic
epidemic models in which a population of individuals is partitioned into
groups (such as households). Potentially infectious contacts occur both
within groups, and between groups, as governed by Poisson processes of
different rates. Given data on final outcome, namely the individuals who
are ever infected during an outbreak, we are interested in inferring
information about the infection rates. The main obstacle to this task is
that the likelihood is intractable, which suggests that data augmentation
may be profitable. Here we proceed by using a certain random graph
construction that contains information about the eventual outcome, and use
realisations of such graphs within a Markov chain Monte Carlo setting to
perform the desired inference. The methods are illustrated with data on
influenza outbreaks. The methods are very general and extensions will be
discussed.
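The idea that a random graph carries the information about the final
outcome can be sketched as follows: draw a directed graph of potential
infectious contacts and take everyone reachable from the initial cases.
The sketch assumes a fixed infectious period of one time unit (so that
edges are independent), a per-pair local rate lam_local and a global
rate lam_global shared over the population; the names are illustrative
and this is not the construction or sampler used in the talk.

```python
import numpy as np

def final_outcome(households, lam_local, lam_global, initial, rng):
    """Ever-infected set from a directed random graph of potential contacts.

    households[i] is the group label of individual i; the infectious period
    is fixed at one time unit (an assumption of this sketch).
    """
    households = np.asarray(households)
    n = len(households)
    same = households[:, None] == households[None, :]
    rate = lam_global / n + lam_local * same
    p_edge = 1.0 - np.exp(-rate)              # P(at least one contact i -> j)
    np.fill_diagonal(p_edge, 0.0)
    edges = rng.random((n, n)) < p_edge       # edges[i, j]: i would infect j
    infected = np.zeros(n, dtype=bool)
    infected[list(initial)] = True
    frontier = list(initial)
    while frontier:                           # graph search from the initial cases
        i = frontier.pop()
        for j in np.flatnonzero(edges[i] & ~infected):
            infected[j] = True
            frontier.append(int(j))
    return infected
```

In a data-augmentation scheme the graph would be treated as missing
data and updated alongside the rates, subject to being consistent with
the observed final outcome.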
A new random re-labelling scheme for posterior calculations of symmetric
prior processes
In previous work (Ongaro 2003, JSPI) the connection between a certain random
re-labelling scheme (known as size-biased sampling) and a general class P of
processes to be used as nonparametric prior distributions was established.
This led to representations of the class particularly suitable for
inferential purposes and allowed the derivation of relatively simple results
on the posterior and predictive distributions and on the structure of a
sample of observations from the class. Here we aim at extending such a
framework by considering a more general class of prior processes. In
particular, the class P is enlarged so that it essentially coincides with
the class of so-called symmetrically distributed random probability measures
(Kallenberg, 1986, Random measures). Consequently, we have constructed a
new extension of the size-biased re-labelling scheme which allows us to
generalise the results obtained for P to the enlarged class.
Jesus Palomo and David Dunson
|
Bayesian structural equation models with latent variables
Structural equation models (SEMs) provide a general framework for modeling
of multivariate data, particularly in settings in which measured variables
are designed to measure one or more latent variables. In implementing SEM
analyses, it is typically assumed that the model structure is known and that
the latent variables have normal distributions. To relax these assumptions,
this article proposes a semiparametric Bayesian approach. Categorical latent
variables with an unknown number of classes are accommodated using Dirichlet
process (DP) priors, while DP mixtures of normals allow continuous latent
variables to have unknown distributions. Robustness to the assumed SEM
structure is accommodated by choosing mixture priors, which allow uncertainty
in the occurrence of certain links within the path diagram. A Gibbs
sampling algorithm is developed for posterior computation. The methods are
illustrated using biomedical and social science examples.
Bayesian and Classical Inference for a Stochastic Predator-Prey System
In this work we study the problem of parameter estimation in a
discretely observed predator-prey model described by a system of
stochastic differential equations. A classical approach based on the
maximum likelihood function and a Bayesian approach based on an
MCMC algorithm are compared.
The estimator based upon the ML function converges in mean square
to the true value of the parameter, when the final time and the
number of observations go to infinity and the interval between two
consecutive observations goes to zero.
In the Bayesian framework, considering a prior normal
distribution for the parameter to be estimated, we obtain a normal
posterior distribution whose mean is close to the classical estimator
when an improper prior is chosen. Then
we introduce latent data between two consecutive actual
observations. An MCMC method based on the Metropolis-Hastings
algorithm is applied to sample from the posterior distribution of
the latent data. The Bayesian approach provides a reasonable
posterior distribution for the parameter even when only a
relatively small number of observations is available.
Numerical results are presented.
Havard Rue and Sara Martino
|
Approximative Deterministic Bayesian Inference for Hierarchical
Gaussian Markov Random Field Models
Hierarchical Gaussian Markov random field (GMRF) models are often used
in statistical applications, and include dynamic and temporal models,
model-based geostatistical and spatial models, and spatio-temporal
models. Most of these have a posterior of the following form,
$$\pi(x, \theta \mid y) \propto \pi(\theta)\, \pi(x \mid \theta) \prod_i \pi(y_i \mid x_i),$$
where $\theta$ is a low-dimensional hyperparameter of the GMRF $x$,
and $y$ are pointwise observations of $\{x_i\}$.
Although Bayesian inference about $(x, \theta)$ is possible using MCMC
techniques, preferably block-MCMC, we consider in this talk ways to do
approximative deterministic inference for $\theta$ and $x_i$,
$i = 1, \ldots, n$, conditioned on $y$. Such approximations can be derived
from tractable Gaussian and non-Gaussian approximations of
$\pi(x \mid \theta, y)$ and have several
advantages over simulation-based inference:
- Lack of (large) Monte Carlo error
- Extremely fast
- No burn-in/convergence problems
The disadvantage is a small bias.
Alexandra M. Schmidt, Bruno Sansò and Aline A. Nobre
|
Spatio-temporal models based on spatial discrete convolutions
We consider a class of models for spatio-temporal processes based on
convolving spatial independent processes with a discrete kernel that is
represented by a lower triangular matrix. This triangular matrix represents
the Cholesky decomposition of an auto-regressive process of order p. We find
that the proposed families of models provide a rich variety of covariance
structures. These include covariance functions that are stationary and
separable in space and time as well as time dependent non-separable and
non-isotropic ones. We also address, under the proposed model, the problem
of missing data and the problem of covariates which are measured at
different locations from the response variable (spatial misalignment). A
real data example, based on measurements of PM10 at the city of Rio de
Janeiro, will be presented.
David Stephens and Matthew Gander
|
Inference for Volatility Models driven by Levy Processes
We extend the currently most popular models for the volatility of
financial time series, Ornstein-Uhlenbeck stochastic processes, to
more general Non-Ornstein Uhlenbeck models. In particular, we
investigate means of making the correlation structure in the
volatility process more flexible. We implement a method for
introducing quasi long-memory into the volatility model without
recourse to superposition. We demonstrate that the models can be fitted to
real share price returns data, and that results indicate that for
the series we study, the long-memory aspect of the model is not
supported.
Osnat Stramer and Jun Yan
|
Parametric inference for partially observed diffusion processes: a
comparison study
Diffusion models described by stochastic differential equations are used extensively in many
areas of science: engineering, hydrology, financial economics and physics. While the models
are formulated in continuous-time, the data are recorded at discrete points in time. For
most diffusion models the likelihood function is unavailable. We describe four alternative
approaches to inference of diffusion models.
All the above methods are computationally heavy. We review all four methods
in terms of
We perform a set of Monte Carlo experiments to compare the performance of these approaches on two different types of models:
In addition, we also provide a numerical technique to improve the computational efficiency
of the simulated maximum likelihood method proposed by Durham & Gallant (2002).
Jonathan Stroud, Michael Johannes and Nicholas Polson
|
Sequential Parameter Estimation in Stochastic Volatility Models with Jumps
This paper analyzes the sequential learning problem for both
parameters and states in stochastic volatility models with jumps.
We describe the existing methods, the particle filter and the practical filter,
and then extend these algorithms to incorporate jumps. We analyze
the performance of both approaches using both simulated and S&P 500
index return data. On both types of data, we find that both algorithms
are effective in sequential learning of the jump parameters, although
sensitivity analysis indicates that the practical filter performs
marginally better. These conclusions are similar to those in Stroud,
Polson and Muller (2004) regarding stochastic volatility models.
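As background to the particle filter mentioned above, here is a minimal
bootstrap filter for a basic stochastic volatility model without jumps,
x_t = mu + phi (x_{t-1} - mu) + sigma_v eta_t and y_t = exp(x_t / 2) eps_t.
The parameters are treated as known and the names are assumptions; the
jump components and the sequential parameter learning studied in the
paper are not included.

```python
import numpy as np

def bootstrap_filter_sv(y, mu, phi, sigma_v, num_particles, rng):
    """Filtered mean of log-volatility x_t in a basic SV model (no jumps)."""
    # Initialise from the stationary distribution of the AR(1) state.
    x = mu + sigma_v / np.sqrt(1.0 - phi**2) * rng.standard_normal(num_particles)
    filtered_mean = np.empty(len(y))
    for t, y_t in enumerate(y):
        # Propagate particles through the state equation.
        x = mu + phi * (x - mu) + sigma_v * rng.standard_normal(num_particles)
        # Weight by the observation density y_t ~ N(0, exp(x_t)), up to a constant.
        log_w = -0.5 * (x + y_t**2 * np.exp(-x))
        w = np.exp(log_w - log_w.max())
        w /= w.sum()
        filtered_mean[t] = np.dot(w, x)
        # Multinomial resampling back to equal weights.
        x = x[rng.choice(num_particles, size=num_particles, p=w)]
    return filtered_mean
```

Extending the state with jump times and sizes, and with the unknown
parameters, is where the algorithms compared in the paper differ.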
Elisa Varini and Renata Rotondi
|
A sequential particle filter for a continuous-time state-space model
A new state-space model for the analysis of earthquake sequences is proposed:
a pure jump Markov process with observations from marked point processes.
We considered three marked point processes for seismic sequences,
each of them having a different physical foundation: the Poisson process,
the stress release model and the Epidemic-Type Aftershock-Sequence model.
We assumed that a time-magnitude seismic sequence is composed of a series
of realizations of these three models (the observed process).
The dynamics of their activation times are driven by a hidden pure jump
Markov process (the state process).
Inference for such a complex and rich model is carried out by
exploiting a Bayesian sequential Monte Carlo method in order to
estimate the model parameters and to approximate the filtering
distribution.
We examined a simulated data set focusing our attention on the properties
of the estimation methods proposed and on their sensitivity to the
assessment of some particular parameters.
When the model parameters are known, a solution of the filtering problem
is proposed by exploiting the innovation method in the context of the
martingale representation of a point process.
This method can also be used to improve the approximation of the
filtering distribution obtained by the sequential Monte Carlo procedure
when the model parameters are unknown.
Krisztina Vasas, Peter Elek and Laszlo Markus
|
A regime switching time series model of daily river flows
We report on fitting a two-state regime switching autoregressive model
for daily water discharge series registered at monitoring sites on the River Tisza
and its tributaries. One peculiarity of the model is that the noise
sequence switches between the exponential and the normal distribution,
the rising regime being governed by the exponential part. In the case of
Markovian regime switching, the estimation can be carried out by a simple
implementation of the MCMC algorithm. However, as change times of the regimes
in hydrological series are known to deviate from the geometric distribution,
Markov-modelling is not entirely satisfactory. When generalisations of the
model are considered, more sophisticated estimation algorithms should be
chosen. In some particular non-Markovian cases, either the efficient method
of moments algorithm or reversible jump MCMC may be of help. In the talk we
also analyse the joint behaviour of the main river and of its tributaries,
by examining the pattern of regimes and the multivariate distribution of the
noise sequences at different sites.
Stochastic Search & MCMC on "Big" Model Spaces
I will present and discuss approaches to stochastic computation in "big" and
"sparse" models - large, sparse graphical models, regression variable models
with many candidate predictors, and large-scale, sparse factor models.
Varieties of MCMC methods and non-MCMC evolutionary stochastic search methods
will be discussed, compared and exemplified. Questions arise about convergence
characteristics of the resulting stochastic processes - whether MCMC or
stochastic optimisation is the goal - questions that challenge Bayesian
probabilists and algorithm developers alike.
Critical questions of Bayesian model/prior specification - issues that
are simply central to scalability of Bayesian technologies as models/parameter
spaces increase in dimension - will also be discussed.
Bayesian Inference for Biochemical Network Dynamics
This talk will give an overview of one of the key problems in the new
science of Systems Biology - inference for the rate parameters
underlying complex stochastic kinetic biochemical network models,
using partial, discrete and noisy time course measurements of the
system state. The basic problem will be introduced, highlighting the
importance of stochastic modelling for effective estimation, and then
a range of approaches to Bayesian inference will be reviewed and
compared. Some approaches recognise the discrete nature of the
underlying molecular dynamics, whilst others use a diffusion
approximation to give a non-linear multivariate stochastic
differential equation representation.
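The discrete stochastic kinetic view can be illustrated with the
Gillespie algorithm, which simulates the exact molecular dynamics one
reaction event at a time. The sketch below is generic: the
stoichiometry matrix stoich and the hazard function hazards are
supplied by the user and are assumptions of this example, not a model
taken from the talk.

```python
import numpy as np

def gillespie(x0, stoich, hazards, t_max, rng):
    """Simulate a stochastic kinetic model by the Gillespie algorithm.

    stoich[r] is the state change caused by reaction r; hazards(x) returns
    the vector of reaction hazards at state x.
    """
    t, x = 0.0, np.array(x0, dtype=int)
    times, states = [t], [x.copy()]
    while True:
        h = hazards(x)
        total = h.sum()
        if total <= 0.0:
            break                              # no reaction can fire
        t += rng.exponential(1.0 / total)      # waiting time to the next event
        if t > t_max:
            break
        r = rng.choice(len(h), p=h / total)    # which reaction fires
        x = x + stoich[r]
        times.append(t)
        states.append(x.copy())
    return np.array(times), np.array(states)
```

For instance, a stochastic Lotka-Volterra system could use
stoich = [[1, 0], [-1, 1], [0, -1]] with hazards
(c1 x1, c2 x1 x2, c3 x2) for hypothetical rate constants c1, c2, c3.
Inference is hard precisely because only partial, noisy snapshots of
such a trajectory are observed.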
Some ``weird'' formulas are due to bugs in the software
transforming Latex and Word into HTML.