September 3-7, 2012
models for zero-inflated data: an application to the abundance data of two crustaceans’
species in the
In the ecological field,
abundance data are often characterized by the zero inflation of population
distributions. In this work we consider two commercial species belonging to the
faunistic category of crustaceans, abundant in the
North-Western Ionian Sea, namely Parapenaeus longirostris (Lucas, 1846) and Aristaeomorpha
foliacea (Risso, 1816).
Biological data concerning the two shrimp species were collected during trawl
surveys carried out from 1995 to 2006 as part of the international program
MEDITS (International bottom trawl survey in the
Bayesian networks in detecting key structure features of a spring system in exploring proteins conformations
In our work we developed a mathematical model for exploring the conformational movements of proteins. The model can be used to identify key residues (amino acids, groups of atoms) that have the greatest impact on the transition from one conformation to another, or to detect intermediate conformations, that is, a reaction path between two input conformations. We will employ a Bayesian network to infer which real-valued parameters of our mathematical model have the greatest impact on its efficiency. The Bayesian network thus represents the influence of the model attributes on the quality of the solutions returned by the method. To find the conditional probabilities for the network, we will use numerical results obtained from a software implementation of our model. Numerous tests over different combinations of attribute values should yield a good estimate of the Bayesian network parameters. Our mathematical model of protein conformations is implemented by means of spring systems, which are represented by a graph G := (V, E) embedded in the Euclidean space R^3. The principal task of a spring system is to assume a required mechanical behaviour (physical locations of network nodes, regarded as output) in response to suitable physical stimuli (displacements of control nodes, regarded as input). In Czoków and Schreiber, we show how such systems can be implemented. Each atom or group of atoms of a given protein is represented by a node of V, and the virtual bonds between them constitute the set of springs E. To accomplish this objective, the expected mechanical behaviour of the spring system is defined by the conformations of the given protein.
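As a minimal sketch of the spring-system representation described above, the following Python snippet stores a graph G = (V, E) with node coordinates in R^3 and evaluates its total elastic energy. The Hookean energy form, the stiffness and rest-length values, and the name `spring_energy` are assumptions made for illustration; the abstract does not specify them.

```python
import math

def spring_energy(positions, springs):
    """Total elastic energy of a spring system.

    positions: dict mapping node -> (x, y, z) coordinates in R^3
    springs:   list of (node_a, node_b, rest_length, stiffness)
    Assumption: Hookean springs, E = 1/2 * k * (distance - rest_length)^2.
    """
    total = 0.0
    for a, b, rest_length, stiffness in springs:
        distance = math.dist(positions[a], positions[b])
        total += 0.5 * stiffness * (distance - rest_length) ** 2
    return total

# Hypothetical three-node system: A-B is relaxed, A-C is stretched by 1.
positions = {"A": (0.0, 0.0, 0.0), "B": (1.0, 0.0, 0.0), "C": (0.0, 2.0, 0.0)}
springs = [("A", "B", 1.0, 1.0), ("A", "C", 1.0, 2.0)]
energy = spring_energy(positions, springs)
```

In such a sketch, moving a control node changes `positions`, and the remaining nodes would be relaxed towards a minimum of this energy.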
Chen, W. Z., Li, C. H., Su, J. G., Wang, C. X., Xu, X. J.: Identification of key residues for protein conformational transition using elastic network model
Czoków, M., Schreiber, T.: Adaptive Spring Systems for Shape Programming. Proc. of the 10th International Conference on Artificial Intelligence and Soft Computing,
Bayesian spatial analysis of FRAP-images
Fluorescence Recovery after Photobleaching (FRAP; see, for example, Sprague and McNally, 2005) is a method used in biology to investigate in vivo the binding behaviour of molecules in a cell nucleus. To this end, the molecules are tagged fluorescently, a part of the nucleus of the cell of interest is bleached, and the recovery of the bleached part is observed by taking pictures of the nucleus at predefined time intervals. The aim is to obtain information about the speed of movement of the unbleached molecules in order to characterise their binding behaviour. More specifically, one wants to learn whether there are one or more binding sites for the molecule, as well as the duration of residence at a specific binding site. To date, analysis of FRAP data has been performed either for only the bleached part of the cell nucleus (for example, in Sprague et al., 2004), or for both the bleached part and the unbleached part of the cell nucleus separately (for example, in Phair et al., 2004), or for a finer subdivision of the cell nucleus into a small number of disjoint parts (for example, in Beaudouin et al., 2006).
Our goal is to perform a spatial analysis of FRAP data at the pixel level. We plan to incorporate the concentration of unbleached molecules of interest in neighbouring pixels into the fit of the concentration curve per pixel, which is obtained by the imaging of the cell nucleus, to account for diffusion. Moreover, we plan to model one binding reaction per pixel. By solving the differential equations based on the compartment model that describes the change of the concentration of unbleached molecules in each pixel in the cell nucleus we aim to get a nonlinear regression equation per pixel by which we can model this concentration at any time during the recovery.
For each pixel, we intend to obtain estimates of the on- and off-rates of the binding reaction, which provide information about the binding behaviour of the molecules, as well as estimates of the volume of the compartments of interest, by applying an MCMC algorithm with Gibbs and/or Metropolis-Hastings update steps.
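To illustrate what a Metropolis-Hastings update step looks like, the sketch below samples the off-rate of a toy single-exponential recovery curve for one pixel. The curve `1 - exp(-k_off * t)`, the Gaussian noise with known standard deviation, the flat prior on positive rates and all numerical values are assumptions chosen for illustration; they are not the compartment model or the sampler of the abstract.

```python
import math
import random

def recovery(t, k_off):
    # Toy recovery curve (an assumption, not the authors' compartment model).
    return 1.0 - math.exp(-k_off * t)

def log_likelihood(k_off, times, observed, sigma):
    # Gaussian measurement noise with known standard deviation sigma.
    squared = sum((y - recovery(t, k_off)) ** 2 for t, y in zip(times, observed))
    return -squared / (2.0 * sigma ** 2)

def metropolis(times, observed, sigma, n_iter, step, start, rng):
    current = start
    current_ll = log_likelihood(current, times, observed, sigma)
    chain = []
    for _ in range(n_iter):
        proposal = current + rng.gauss(0.0, step)  # symmetric random-walk proposal
        if proposal > 0.0:  # flat prior on positive rates; other proposals are rejected
            proposal_ll = log_likelihood(proposal, times, observed, sigma)
            accept_prob = math.exp(min(0.0, proposal_ll - current_ll))
            if rng.random() < accept_prob:
                current, current_ll = proposal, proposal_ll
        chain.append(current)
    return chain

# Synthetic data for a single pixel (hypothetical values).
rng = random.Random(1)
true_k_off, sigma = 0.5, 0.02
times = [0.5 * i for i in range(1, 31)]
observed = [recovery(t, true_k_off) + rng.gauss(0.0, sigma) for t in times]

chain = metropolis(times, observed, sigma, n_iter=5000, step=0.05, start=1.0, rng=rng)
posterior_mean = sum(chain[1000:]) / len(chain[1000:])  # discard burn-in
```

In a spatial analysis, one such update would be run per pixel and per parameter, with the neighbouring pixels entering the likelihood through the diffusion term.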
Joel Beaudouin et al. Dissecting the Contribution of Diffusion and Interactions to the Mobility of Nuclear Proteins. Biophysical Journal, 90:1878-1894, 2006.
Robert D. Phair et al. Global Nature of Dynamic Protein-Chromatin Interactions In Vivo: Three-Dimensional Genome Scanning and Dynamic Interaction Networks of Chromatin Proteins. Molecular and Cellular Biology, 24(14):6393-6402, 2004.
Volker J. Schmid et al. A Bayesian Hierarchical Model for the Analysis of a Longitudinal Dynamic Contrast-Enhanced MRI Oncology Study. Magnetic Resonance in Medicine, 61:163-174, 2009.
Brian L. Sprague and James G. McNally. FRAP analysis of binding: proper and fitting. Trends in Cell Biology, 15(2):84-91, 2005.
Brian L. Sprague, Robert L. Pego, Diana A. Stavreva, and James G. McNally. Analysis of Binding Reactions by Fluorescence Recovery after Photobleaching. Biophysical Journal, 86:3473-3495, 2004.
Luca Ferreri and Mario Giacobini
A discrete stochastic model of the transmission cycle of the tick borne encephalitis virus
Tick borne encephalitis (TBE)
is an emergent zoonosis transmitted by ticks in several
woodland areas of the
TBE is naturally maintained by a cycle involving hard ticks belonging to Ixodes spp. as vectors and mice as host animals. Hard ticks need only one complete blood meal to moult. Furthermore, immature ticks (larvae and nymphs) usually feed on small vertebrates, while adult ticks prefer large mammals. However, the main route of transmission of the TBE virus is from infected nymphs to larvae co-feeding on the same mouse.
In this work we formulate a discrete stochastic model that describes this transmission cycle. In particular, we consider a stochastic contact network structure to describe the potential number of transmissions from nymphs to larvae over different months of the year. From this mathematical model we have obtained some analytical results, which we intend to validate by stochastic simulations in the future.
Diversification in a dynamic landscape
Allopatric speciation is
often viewed as a slow and gradual process. Over time reproductive isolation is
achieved due to a lack of gene-flow as a result of the formation of a
geographical barrier. Because geographical changes are relatively slow, allopatric speciation is usually not associated with the
generation of, or dynamic changes in, biodiversity. In contrast, environmental
factors such as water level or temperature might rapidly change the
distribution of viable habitat. Here we study the effect of such changing
environmental factors on diversification in a model containing allopatric speciation due to dynamical environmental
factors and sympatric speciation. We use the cichlids in
Many problems of current interest in science and engineering rely on the
ability to perform inference in high-dimensional spaces. A very common strategy,
which has been successfully applied in a broad variety of complex problems, is
An important drawback of the importance sampling approach, and of PMC in particular, is that its performance depends heavily on the choice of the proposal distribution (or importance function) used to generate the samples and compute the weights. When the variable of interest is high-dimensional or the proposal is very wide with respect to the target, the importance weights degenerate, leading to an extremely low number of representative samples.
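Weight degeneracy is commonly quantified by the effective sample size, approximated as the reciprocal of the sum of squared normalized weights. The sketch below shows this diagnostic on two hypothetical sets of log-weights; the function names and the numbers are illustrative assumptions and are not taken from the proposed scheme.

```python
import math

def normalized_weights(log_weights):
    # Subtract the maximum before exponentiating for numerical stability.
    largest = max(log_weights)
    unnormalized = [math.exp(lw - largest) for lw in log_weights]
    total = sum(unnormalized)
    return [u / total for u in unnormalized]

def effective_sample_size(weights):
    # Standard approximation: ESS = 1 / sum_i w_i^2 for normalized weights.
    return 1.0 / sum(w * w for w in weights)

# 100 equally weighted samples: every sample is representative.
uniform_ess = effective_sample_size(normalized_weights([0.0] * 100))

# One sample dominates all others: the weights have degenerated.
degenerate_ess = effective_sample_size(normalized_weights([0.0] + [-50.0] * 99))
```

An effective sample size close to the number of samples indicates a well-matched proposal, while a value close to 1 indicates that a single sample carries almost all the weight.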
We propose a novel PMC scheme that is based on a simple proposal update and introduces a technique that avoids degeneracy of the importance weights and increases the efficiency of the PMC scheme when drawing from a poorly informative proposal.
As a practical application, we have applied the proposed algorithm to the challenging problem of estimating the rate parameters in stochastic kinetic models (SKMs). Such models describe the time evolution of the populations of a set of species that evolve according to a set of chemical reactions and exhibit autoregulatory behaviour. We specialize the proposed algorithm to SKMs and present numerical results based on a simple SKM known as the predator-prey model.
Laura Martin Fernandez, Ettore Lanzarone, Joaquin Miguez, Sara Pasquali and Fabrizio Ruggeri
Particle filter estimation in a stochastic predator-prey model
Parameter estimation and population tracking in predator-prey systems
are critical problems in ecology. In this paper we consider a stochastic predator-prey
system with a Lotka-Volterra functional response and
propose a particle filtering method for jointly estimating the behavioural parameter
representing the carrying capacity and the population biomasses using field data.
In particular, the proposed technique combines a sequential
Application of the Bayesian influence diagram framework to the ramified optimal transport problem
A tree leaf transports resources such as water and minerals from its root to its tissues. The leaf tends to maximize internal efficiency by developing an optimal transport system. This observation can be applied to the well-known NP-hard ramified optimal transport problem, where the goal is to find an optimal transport path between two given probability measures. One measure can be identified with the root (source) and the other with the tissues (target). We will present an algorithm for solving a ramified optimal transport problem within the framework of Bayesian networks. It is based on a decision strategy optimisation technique that utilises the self-annealing ideas of Chen-style stochastic optimisation and uses Xia's formulation of the cost functional. The resulting transport paths are represented as tree-shaped structures.
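To see why branching structures arise, the sketch below evaluates a cost of the form used in Xia's formulation as we understand it, namely the sum over edges of flow^alpha times edge length with 0 <= alpha < 1, for two hypothetical transport paths from one source to two targets. The coordinates, the branching point and the function name `transport_cost` are assumptions made for illustration.

```python
import math

def transport_cost(edges, alpha):
    """Cost sum_e flow_e**alpha * length_e of a transport path.

    edges: list of (start_point, end_point, flow) with points given as tuples.
    """
    return sum(flow ** alpha * math.dist(a, b) for a, b, flow in edges)

source, left, right = (0.0, 0.0), (2.0, 1.0), (2.0, -1.0)
branch = (1.0, 0.0)  # hypothetical branching point

# V-shape: each target is served by its own straight edge from the source.
v_shape = [(source, left, 0.5), (source, right, 0.5)]
# Y-shape: the mass travels together along a trunk and splits at the branch.
y_shape = [(source, branch, 1.0), (branch, left, 0.5), (branch, right, 0.5)]

v_cost = transport_cost(v_shape, alpha=0.5)
y_cost = transport_cost(y_shape, alpha=0.5)
```

With alpha = 0.5 the Y-shaped path is cheaper than the V-shaped one, because moving mass together is rewarded; with alpha = 1 the cost is linear in the flow and the straight edges are cheaper.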
Preetam Nandy and Michael Unger
Optimal perturbations for the identification of stochastic reaction dynamics
Identification of stochastic reaction dynamics inside the cell is
hampered by the low-dimensional readouts available with today’s measurement
technologies. Moreover, such processes are poorly excited by standard
experimental protocols, making identification even more ill-posed. Recent
technological advances provide means to design and apply complex extra-cellular
stimuli. Based on an information-theoretic setting we present novel
Causal network modeling for drug target discovery
The development of algorithmic approaches to the interpretation of large-scale genetic, transcriptomic, proteomic, and metabolic datasets is a key focus of computational biology. In pharmaceutical research and development, these methods are used to gain a mechanistic understanding of the biological question under study.
One such method is causal network modelling, a systematic computational analysis that identifies upstream changes in gene regulation that can serve as explanations for observed changes in experimental data. These upstream gene regulation events are identified using a directed interaction network. Different hypotheses for upstream causal events are compared by using the network model to make predictions for the observed data, then evaluating the accuracy of the predictions. The common method for making such predictions is the shortest-paths algorithm, which predicts the regulatory effect based on the net effect along the edges of the shortest path in the network between the upstream regulation event and the observed regulation event in the data.
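The shortest-path prediction described above can be sketched as a breadth-first search over a signed directed network, multiplying edge signs (+1 for activation, -1 for inhibition) along the first shortest path found. The network, the node names and the function name `predict_effect` are hypothetical and serve only as an illustration.

```python
from collections import deque

def predict_effect(edges, source, target):
    """Predict the sign of the effect of up-regulating `source` on `target`.

    edges: dict mapping node -> list of (neighbour, sign), sign in {+1, -1}.
    Returns +1, -1, or None if the target is unreachable. When several
    shortest paths exist, the first one found by the search is used.
    """
    effect = {source: 1}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node == target:
            return effect[node]
        for neighbour, sign in edges.get(node, []):
            if neighbour not in effect:
                effect[neighbour] = effect[node] * sign
                queue.append(neighbour)
    return None

# Hypothetical network: A inhibits B, B activates C (short path, net effect -1);
# A also reaches C through D and E by activations only (longer path, net effect +1).
network = {
    "A": [("B", -1), ("D", +1)],
    "B": [("C", +1)],
    "D": [("E", +1)],
    "E": [("C", +1)],
}
prediction = predict_effect(network, "A", "C")
```

In this example the prediction is determined entirely by the two-edge path through B, and the longer activating route through D and E plays no role.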
While the causal network modelling approach is promising, the use of shortest-path-based predictions is flawed. It ignores the topological complexity that is characteristic of biological networks, such as feedback loops, which have an essential impact on the net effect. It also considers only upstream hypotheses concerning individual genes, although we now know that diseases are too complex to be addressed by hitting individual targets in isolation.
To address this, we are developing a statistical approach to causal network modelling. It incorporates probabilistic modelling of the network, which can be used to capture topological complexity and quantify uncertainty, in contrast to the shortest-paths algorithm. Further, we can evaluate upstream regulation events that involve multiple genes, instead of evaluating single-gene hypotheses separately.
Hossein Farid Ghassem Nia
Bayesian decision making in computer vision with an approach to industrial automation
Computer vision is becoming mainstream in the automation and quality control industry. In some applications, it is critical to make correct decisions based on uncertain data from vision systems and to draw conclusions from the analysis of data and predefined models. In this presentation, we introduce the application of Bayesian theory in a novel computer vision system for the automation industry. In our research, we used Bayesian theory in image processing and signal analysis to find regions of interest. We also show how we developed our theory to analyse mass spectrometry data of melanoma patients. In addition, we aim to demonstrate some ongoing challenges in this project regarding the minimization of decision-making error.
Gian Marco Palamara
Statistical inference for temperature dependent logistic time series
Methods of parameter estimation are fundamental tools for assessing the predictive power of theoretical models of population dynamics. The use of simple models such as logistic growth, and the ability to infer their parameters from time series data, is emerging as a key problem in population ecology. We simulate stochastic logistic time series from different birth and death processes using the classical Gillespie algorithm. Logistic growth is the building block of more complex population models and can be used to test different methods of parameter estimation. We are able to simulate the temperature dependence of the parameters of logistic growth (namely growth rate and carrying capacity) for different birth and death processes. To the simulated time series we apply different observation processes, based on the discrete-time sampling of a typical experiment and on the spatial homogeneity of a population. We then construct different likelihood functions in order to fit different models to the simulated data.
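A minimal sketch of the Gillespie algorithm for one possible stochastic logistic process is given below. The choice of a per-capita birth rate r and a density-dependent per-capita death rate r*n/K is only one of several birth and death parameterisations with the same deterministic limit, and it is an assumption here; the parameter values and the name `gillespie_logistic` are likewise illustrative.

```python
import random

def gillespie_logistic(n0, growth_rate, capacity, t_end, rng):
    """Simulate a logistic birth-death process with the Gillespie algorithm.

    Assumed rates: births at growth_rate * n, deaths at growth_rate * n^2 / capacity.
    Returns the event times and the population size after each event.
    """
    t, n = 0.0, n0
    times, sizes = [t], [n]
    while n > 0:  # the process stops if the population goes extinct
        birth = growth_rate * n
        death = growth_rate * n * n / capacity
        total = birth + death
        t += rng.expovariate(total)  # exponential waiting time to the next event
        if t > t_end:
            break
        n += 1 if rng.random() * total < birth else -1  # choose the event type
        times.append(t)
        sizes.append(n)
    return times, sizes

times, sizes = gillespie_logistic(n0=10, growth_rate=1.0, capacity=100.0,
                                  t_end=20.0, rng=random.Random(42))
```

An observation process can then be emulated by reading `sizes` at regularly spaced sampling times and adding measurement noise.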
We find that there is a constant bias in fitting deterministic models to stochastic data. This bias depends on the choice of the correct parameterisation of the variance of the observed data and does not depend on the observation process we use. Taking into account the observation process, we are able to disentangle the intrinsic stochasticity of the biological process from the noise induced by the observation itself. Bayesian approaches are particularly convenient when dealing with incomplete data
Statistical description of functional neural networks
The aim of the presentation is to briefly discuss mainly statistical tools for describing large-scale activity-flow graphs in artificial neural networks.
Suppose we are given a recurrent artificial neural network, i.e. a set of neurons connected by synapses, with stochastic, energy-driven dynamics. During the dynamics, action potentials (or spikes) transmit signals between pairs of neurons by travelling along the synapses.
These spike travels yield spike-flow or activity-flow graphs, consisting of the synapses that took part in transmitting the information.
Due to the scale of the graphs, one must resort to a mixed statistical and random-graph-theoretical approach in order to describe the properties of the activity-flow network. Among the properties discussed are average connectivity, empirical degree distribution, characteristic and maximum path length (the diameter), and clustering coefficient. Additional features include spectral density, a 'small-world-ness' indicator, resilience to random damage, graph degeneracy, degree assortativity, etc.
The aim of such a description is two-fold. First, the properties of the model are interesting in themselves. Second, the statistical description can be compared with values reported in medical data from fMRI brain analyses and, by extension, shed some light on at least some principles of how the brain works, at least at the macroscopic level.
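Two of the graph statistics listed above, the average degree and the average local clustering coefficient, can be computed as in the following sketch. It treats the activity-flow graph as a small undirected graph stored as adjacency sets, which is a simplifying assumption; the example graph and the function names are hypothetical.

```python
def average_degree(adjacency):
    # adjacency: dict mapping node -> set of neighbouring nodes (undirected graph).
    return sum(len(neighbours) for neighbours in adjacency.values()) / len(adjacency)

def clustering_coefficient(adjacency):
    """Average local clustering coefficient; nodes of degree < 2 contribute 0."""
    total = 0.0
    for node, neighbours in adjacency.items():
        k = len(neighbours)
        if k < 2:
            continue
        # Count the edges that exist among the neighbours of this node.
        links = sum(1 for u in neighbours for v in neighbours
                    if u < v and v in adjacency[u])
        total += 2.0 * links / (k * (k - 1))
    return total / len(adjacency)

# Hypothetical graph: a triangle 0-1-2 with a pendant node 3 attached to node 2.
graph = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2}}
mean_degree = average_degree(graph)
mean_clustering = clustering_coefficient(graph)
```

For graphs at the scale discussed in the abstract, a dedicated graph library would be the more practical choice; the pure-Python version only makes the definitions explicit.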
Time-lapse fluorescence microscopy imaging has rapidly evolved in the past decade and has opened new avenues for studying intracellular processes in vivo. Such studies generate vast amounts of noisy image data that cannot be analyzed efficiently and reliably by means of manual processing. Many popular tracking techniques exist but often fail to yield satisfactory results in the case of high object densities, high noise levels, and complex motion patterns. Probabilistic tracking algorithms, based on Bayesian estimation, have recently been shown to offer several improvements over classical approaches, by better integration of spatial and temporal information, and the possibility to more effectively incorporate prior knowledge about object dynamics and image formation. We propose an improved, fully automated particle filtering algorithm for the tracking of many subresolution objects in fluorescence microscopy image sequences. It involves a new track management procedure and allows the use of multiple dynamics models. The accuracy and reliability of the algorithm are further improved by applying marginalization concepts. Experiments on synthetic as well as real image data from three different biological applications clearly demonstrate the superiority of the algorithm compared to previous particle filtering solutions.
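For readers unfamiliar with particle filtering, the sketch below implements a plain bootstrap particle filter for a single object moving as a one-dimensional random walk with noisy position measurements. It is the textbook baseline and not the improved algorithm of the abstract; the motion model, noise levels and the name `bootstrap_particle_filter` are assumptions made for illustration.

```python
import math
import random

def bootstrap_particle_filter(observations, n_particles, process_sd, obs_sd, rng):
    """Track a 1-D random walk observed with Gaussian noise."""
    particles = [rng.gauss(0.0, 1.0) for _ in range(n_particles)]  # initial guess
    estimates = []
    for y in observations:
        # Prediction: propagate every particle through the motion model.
        particles = [x + rng.gauss(0.0, process_sd) for x in particles]
        # Update: weight each particle by the likelihood of the observation.
        weights = [math.exp(-0.5 * ((y - x) / obs_sd) ** 2) for x in particles]
        total = sum(weights)
        estimates.append(sum(w * x for w, x in zip(weights, particles)) / total)
        # Resampling: draw a new particle set in proportion to the weights.
        particles = rng.choices(particles, weights=weights, k=n_particles)
    return estimates

# Synthetic track and noisy measurements (hypothetical values).
rng = random.Random(0)
truth, observations, position = [], [], 0.0
for _ in range(50):
    position += rng.gauss(0.0, 0.1)
    truth.append(position)
    observations.append(position + rng.gauss(0.0, 0.5))

estimates = bootstrap_particle_filter(observations, 500, 0.1, 0.5, rng)
rmse = math.sqrt(sum((e - t) ** 2 for e, t in zip(estimates, truth)) / len(truth))
```

Tracking many objects in image sequences additionally requires track management and an image-based likelihood, which this sketch leaves out.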
Computational Bayesian tools for modeling the aging process
Whereas the aging process is obvious in macroscopic organisms, it is not in single-celled ones. However, when monitoring the growth of rod-shaped bacterial colonies, for instance using the model organism E. coli, it becomes possible to recognize an aging mechanism. This is due to the division process, which splits the cell transversally, producing a new end per progeny cell; this new end is called the new pole, whereas the other, pre-existing end is called the old pole. The replicative age is then defined as the number of generations elapsed since the old pole arose. The older this pole, the slower its growth; thus, more damage is expected to have accumulated, corresponding to an increased physiological age. However, the replicative age accounts for a significant, yet limited, fraction of the variability observed in the physiological characteristics. Understanding the impact of the replicative age on the physiological measurements, as well as the mechanism by which the cells are rejuvenated, symmetrical or not, is possible by reconstructing a hidden quantity that would govern the physiology of the cell while fulfilling basic conservation laws. Estimation takes the form of an exploration of the approximate posterior distribution of the parameters of the constructed mathematical model. Approximate Bayesian Computation methods (ABC rejection sampler and ABC MCMC sampler) are considered in order to avoid the combinatorial cost, as well as the difficulty, of computing the distribution of the statistics on which this study relies. Results show that the method recognizes the presence and the absence of asymmetry well, but not when the asymmetry is at a low level.
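The ABC rejection sampler mentioned above can be sketched on a toy problem in which the likelihood is never evaluated: a parameter is drawn from the prior, data are simulated, and the draw is kept only if a summary statistic of the simulation is close to the observed one. The binomial simulator, the uniform prior, the tolerance and the function names are assumptions for illustration and are unrelated to the bacterial aging model.

```python
import random

def simulate_successes(p, n_trials, rng):
    # Forward simulator: number of successes in n_trials Bernoulli(p) trials.
    return sum(1 for _ in range(n_trials) if rng.random() < p)

def abc_rejection(observed, n_trials, n_draws, tolerance, rng):
    """Keep prior draws whose simulated summary lies within `tolerance` of the data."""
    accepted = []
    for _ in range(n_draws):
        p = rng.random()  # draw from a Uniform(0, 1) prior
        simulated = simulate_successes(p, n_trials, rng)
        if abs(simulated - observed) <= tolerance:
            accepted.append(p)
    return accepted

rng = random.Random(7)
accepted = abc_rejection(observed=30, n_trials=100, n_draws=5000, tolerance=2, rng=rng)
posterior_mean = sum(accepted) / len(accepted)
```

The accepted draws approximate the posterior distribution; a smaller tolerance improves the approximation at the price of a lower acceptance rate, which is the trade-off that ABC MCMC samplers try to ease.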
Modeling dynamics of forest ecosystems
Forests provide man with relevant goods and services. These include not only
timber and firewood, but also non-wood forest products, protection from hydrogeological hazards, carbon sequestration, recreation
and tourism, and habitat for animal and plant biodiversity. However, forest ecosystems
face pressures that may exceed their resistance or adaptation potential
(resilience). On the one hand, human-induced climate change is forecast to impact
photosynthesis, wood production, tree mortality, regeneration, tree species
distribution, soil properties, and frequency and severity of disturbances such
as fire or pest and pathogen outbreaks. On the other hand, changes in land use
brought about by socio-economic processes are affecting forest distribution,
composition and structure even faster than climatic forcing, e.g. by
deforestation (in developing countries) or formation of secondary woodlands
and/or spreading of invasive alien species (in developed countries for the most
part). In order to ensure the
continuity of forest services and the sustainability of forest resources in
the face of ongoing changes, scenarios of response to external drivers and
alternative pathways of resource exploitation are needed.
A Bayesian approach to infectious disease modelling using ordinary differential equations: rotavirus in Germany
Understanding infectious disease dynamics using epidemic models requires the quantification of several parameters describing the transmission process. In predictive transmission modelling, this quantification is commonly based on disconnected epidemiological studies in order to fix as many parameters as possible in advance. Because of the dependency structures inherent in any given model, this approach often leads to biased inference for the parameters left to estimate, without sufficiently assessing the uncertainty of those detached assumptions. We developed a Bayesian inference framework that lessens the reliance on external parameter quantifications through a data-driven estimation approach. We extended this idea with model averaging techniques, with a focus on the residual autocorrelation, to weaken the dependence of those estimates on the underlying model structure. We applied our methods to the modelling of age-stratified weekly rotavirus incidence data in Germany from 2001 to 2008, using a complex susceptible-infected-recovered type model that takes maternal antibodies, waning immunity, seasonality and underreporting into account. Our results not only give valuable insight into the transmission processes, but also show the severe consequences that fixing parameters beforehand has for the model-predicted dynamics.
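As a point of reference for the model class named above, the sketch below integrates the basic susceptible-infected-recovered equations with a forward Euler scheme. It omits maternal antibodies, waning immunity, seasonality, age structure and underreporting, so it is far simpler than the model of the abstract; the parameter values and the name `simulate_sir` are assumptions made for illustration.

```python
def simulate_sir(beta, gamma, s0, i0, r0, days, dt):
    """Forward Euler integration of dS/dt = -beta*S*I, dI/dt = beta*S*I - gamma*I,
    dR/dt = gamma*I, with S, I and R expressed as population fractions."""
    s, i, r = s0, i0, r0
    trajectory = [(0.0, s, i, r)]
    steps = int(round(days / dt))
    for step in range(1, steps + 1):
        new_infections = beta * s * i * dt
        new_recoveries = gamma * i * dt
        s -= new_infections
        i += new_infections - new_recoveries
        r += new_recoveries
        trajectory.append((step * dt, s, i, r))
    return trajectory

# Hypothetical transmission rate (beta) and recovery rate (gamma), per day.
trajectory = simulate_sir(beta=0.5, gamma=0.2, s0=0.99, i0=0.01, r0=0.0,
                          days=100.0, dt=0.01)
final_time, final_s, final_i, final_r = trajectory[-1]
```

In a Bayesian treatment, parameters such as `beta` and `gamma` would receive prior distributions and be estimated jointly from the incidence data instead of being fixed in advance.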