It is wonderful to have this new edition of the ISBA newsletter before us.
One of the sections that we plan to have in the new edition is called the
``Students' Corner''. We hope to include in this section abstracts of
dissertations by students who are currently doing research (like those from
ISDS, right after this note). This would not
only serve as a common platform where students can interact with one another
on common problems, but would also give newer students a good indication of
the current trends in (Bayesian) statistical research.
Students are welcome to discuss other problems that they have
come across in their academic or professional work.
We would also like to include plans and suggestions for common activities
such as meetings or creating web-pages where we might post discussions on
problems that we face. These problems would cover theoretical and
methodological issues as well as problems in Bayesian data analysis. Some of
the problems most widely faced by the modern-day statistician pertain
to software.
Since heavy computing is indispensable
for analysing models in the Bayesian framework, many students have queries on
such issues. We feel that the Students' Corner can serve as the much
sought-after platform where students can exchange their thoughts and
experiences on these issues.
We would like to seek your
co-operation in making the ``Students' Corner'' a really useful section of
the newsletter.
We plan to conduct most of our
communications through e-mail. We particularly encourage students to
send comments on what they would like to see in this section and
suggestions on how we can improve it.
(Dissertations available at www.isds.duke.edu/people/alumni.html).
A statistical framework is proposed to automate image feature identification and therefore facilitate the image understanding tasks of registration and segmentation. Features are delineated using an atlas image, and a probability distribution is defined on the locations and variations in appearance of these features in new images from the class exemplified by the atlas. The predictive distribution defined on feature locations in a new image from the class essentially balances the two notions that, while each individual feature in the new image should appear similar to its atlas representation, contiguous groups of features should also remain faithful to their spatial relationships in the atlas image. A joint hierarchical model on feature locations facilitates reasonable spatial deformations from the atlas configuration, and several local image measures are explored to quantify feature appearance. The hierarchical structure of the joint distribution on feature locations allows fast and robust density maximization and straightforward Markov Chain Monte Carlo simulation. Model hyperparameters can be estimated using training data in the form of manual feature observations.
Given maximum a posteriori estimates, an analysis is performed on in vitro mouse brain Magnetic Resonance images to automatically segment the hippocampus. The model is also applied to time-gated Single Photon Emission Computed Tomography cardiac images to reduce motion artifacts and increase the signal-to-noise ratio.
Multiple EEG signals recorded under different ECT (electroconvulsive therapy) conditions are analyzed using TVAR models. Decompositions of these series and summaries of the evolution of functions of the TVAR parameters over time, such as characteristic frequency, amplitude and modulus trajectories of the latent, often quasi-periodic processes, are helpful in obtaining insights into the common structure driving the multiple series. From the scientific viewpoint, characterizing the system structure underlying the EEG signals is a key factor in assessing the efficacy of ECT treatments. Factor models that assume a time-varying AR structure on the factors and dynamic regression models that account for time-varying instantaneous lead/lag and amplitude structures across the multiple series are also explored. Issues of posterior inference and implementation of these models using Markov chain Monte Carlo (MCMC) methods are discussed.
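As a toy illustration of the decomposition idea (this is not the dissertation's code, and the function name is made up for the example): for an AR(2) component with complex characteristic roots, the root modulus and angle give the persistence and characteristic frequency of the implied quasi-periodic process. In a TVAR model the same mapping would be applied to the time-varying coefficients at each time point, yielding the modulus and frequency trajectories mentioned above.

```python
import numpy as np

def ar2_characteristic(phi1, phi2, dt=1.0):
    """Return (modulus, frequency) of the complex root pair of an AR(2).

    For y_t = phi1*y_{t-1} + phi2*y_{t-2} + e_t with complex roots
    r = m*exp(+/- i*2*pi*f*dt), m is the persistence of the latent
    quasi-periodic component and f its characteristic frequency.
    """
    roots = np.roots([1.0, -phi1, -phi2])      # roots of z^2 - phi1*z - phi2
    r = roots[np.argmax(np.abs(roots.imag))]   # pick one complex root
    modulus = float(np.abs(r))
    freq = float(np.abs(np.angle(r)) / (2.0 * np.pi * dt))
    return modulus, freq

# Coefficients chosen so the implied oscillation has modulus 0.95 and
# frequency 0.1 cycles/sample: phi1 = 2*m*cos(2*pi*f), phi2 = -m^2.
m, f = ar2_characteristic(2 * 0.95 * np.cos(2 * np.pi * 0.1), -0.95 ** 2)
```

Applying this root mapping along a sequence of estimated TVAR coefficients is what produces frequency and modulus trajectories over time.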
Decompositions of the scalar components of multivariate time series are presented. Similar to the univariate case, the state-space representation of a VAR(p) model implies that each univariate element of a vector process can be decomposed into a sum of latent processes where every characteristic modulus and frequency component appears in the decomposition of each univariate series, while the phase and amplitude of each latent component vary in magnitude across the univariate elements. Simulated data sets and portions of a multi-channel EEG data set are analyzed here in order to illustrate the multivariate decomposition techniques.
In Bayesian inference an imprecise input is typically represented by a class of probability measures on the unknown quantities. In the previous literature on parametric robustness, most efforts have concentrated on classes of prior distributions for a finite collection of unknown parameters, while the distribution of the data conditional on those parameters is assumed to be known exactly. Robustness with respect to misspecification of the sampling model has received less attention owing to the mathematical difficulties. In the nonparametric context the two aspects are no longer separate; they are identified through a single unknown infinite-dimensional parameter.
We examine here nonparametric analysis within the context of Exchangeable Tree processes, which fall in the general class of exchangeable processes and represent a subclass of Tail-free processes. Indeed, Exchangeable Trees constitute a general class of processes and contain as special cases Dirichlet processes and Polya Trees, two of the most widely used nonparametric priors. We propose a predictive interpretation for an imprecise prior input that leads us to formulate a general solution for the global robustness investigation. We are then able to quantify the range of linear functionals of the conditional predictive distribution after some data have been collected.
The larger framework implied by enlarging the scope to an infinite-dimensional parameter leads us to expect less robustness. Some troublesome phenomena, such as dilation, arise, deviating from the usual patterns of parametric robustness. We are able to compare how these are affected by the prior inputs and to quantify how robustness can be improved by restricting attention to particular subclasses.
Finally, a different problem is approached: simulation from mixture
distributions whose components are supported on spaces of different
dimensions. A novel approach is considered that reduces the problem to
simulating from a single target distribution that is absolutely continuous
with respect to Lebesgue measure on the largest support among the components.
This approach is suggested by an alternative representation of the simulation
goal in the simplified situation where the mixture consists of only a
one-dimensional component and a degenerate component. Tools for a suitable
generalization to arbitrarily nested components, and to an arbitrary number
of them, are provided. Two alternative methods are thus derived, and one of
them is successfully employed in analyzing both simulated data and a real
data set.
The new approach is designed to avoid the numerical integration needed to
evaluate the relative weight of each component, and in the case of nested
components it represents an alternative to currently available MCMC methods
such as the reversible jump algorithm (Green, 1995) and the composite-space
approach (Carlin and Chib, 1995). The distinguishing feature of the proposed
method is that it requires neither proposals for jumping between components
of different dimensions nor the specification of pseudopriors, which allows
for a more automatic implementation. Furthermore, it is argued that in the
actual implementation of a Markov chain that simulates from the absolutely
continuous target distribution, one can automatically build a chain that can
move from any component to any other, which may improve the speed of
convergence. Finally, standard convergence diagnostics for absolutely
continuous stationary distributions can be used to assess the mixing
behaviour.
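To make the simplified situation concrete, here is a minimal sketch (the function name and parameters are illustrative, not taken from the dissertation) of the baseline setting: a mixture of a degenerate component, i.e. a point mass, and a one-dimensional absolutely continuous component.

```python
import numpy as np

def sample_mixture(n, w, x0=0.0, mu=0.0, sigma=1.0, seed=None):
    """Draw n samples from w*delta_{x0} + (1-w)*N(mu, sigma^2):
    a point mass (degenerate component) mixed with a one-dimensional
    absolutely continuous component.
    """
    rng = np.random.default_rng(seed)
    at_point = rng.random(n) < w             # component indicators
    draws = rng.normal(mu, sigma, size=n)    # continuous component
    draws[at_point] = x0                     # overwrite with the point mass
    return draws

draws = sample_mixture(100_000, w=0.3, seed=0)
# The relative weight of the degenerate component can be read directly from
# the frequency of exact hits at x0, since the continuous component puts
# probability zero on that point.
w_hat = float(np.mean(draws == 0.0))
```

This direct sampler is only the toy setting the abstract alludes to; the dissertation's contribution is replacing such mixed-dimension targets with a single absolutely continuous one, avoiding the weight computation and trans-dimensional proposals.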