Spatio-temporal graphical modeling with innovations based on multi-scale diffusion kernel
BERNARD CHALMOND ∗
University of Cergy-Pontoise, France and CMLA, Ecole Normale Supérieure de Cachan, France
Abstract- A random field of interest is observed on an undirected spatial graph over time, thereby providing a time series of dependent random fields. We propose a general modeling procedure which has the potential to explicitly quantify the intrinsic and extrinsic fluctuations of such a dynamical system. We adopt a paradigm in which the intrinsic fluctuations correspond to a process of latent diffusion on the graph arising from stochastic interactions within the system, whereas the extrinsic fluctuations correspond to a temporal drift reflecting the effects of the environment on the system. We start with a spatio-temporal diffusion process which gives rise to the latent spatial process. This makes a bridge with the conventional Wold representation, for which the latent process represents the innovation process, and beyond that with the stochastic differential equation associated with the Fokker-Planck dynamic. The innovation process is modeled by a Gaussian distribution whose covariance matrix is defined by a multi-scale diffusion kernel. This model leads to a multi-scale representation of the spatio-temporal process. We propose a statistical procedure to estimate the multi-scale structure and the model parameters in the case of a vector autoregressive model with drift. Modeling and estimation tasks are illustrated on simulated and real biological data.
Keywords: Spatio-temporal graphical model, Spatial statistic modeling, Multi-scale heat diffusion kernel, Graph Laplacian, Intrinsic and extrinsic stochastic fluctuations, Multi-scale decomposition.
∗ E-mail: [email protected]
Spatial Statistics, http://dx.doi.org/10.1016/j.spasta.2013.11.004 (Preprint)

1. Introduction
We are interested in stochastic processes in time and space domains, in which the state of a variable at every time point is determined by the states of the variables in its spatial neighborhood, as well as by the states of a set of variables at previous time points. This vectorial process is denoted $\{Y_t,\ t = 1, ..., T\} \doteq \mathbf{Y}$. At every time point $t$, $Y_t$ is observed on an undirected spatial graph $G$ composed of $n$ nodes. The analysis of complex spatio-temporal processes from experiments is an important issue in many areas where one tries to extract information useful for the characterization of the spatial and temporal variability, in order to discover or understand the underlying physical dynamics.
This study started with the following paradigm. At each instant the system has a basal spatial activity that maintains fluctuations over time. Such fluctuations, arising from inherently probabilistic interactions within the system, are typically called intrinsic or internal. Furthermore, there could exist other fluctuations, called extrinsic or external, reflecting the influence of the environment on the system. At the beginning of this study, there was also the knowledge of the ability of the diffusion kernel to represent the dependencies of stochastic observations on a graph. Our objective is to produce a representation of the process into spatial and temporal components, taking into account the fact that the fluctuations of the dynamical system are related to the intrinsic and extrinsic effects, the basal intrinsic fluctuations being described by a multi-scale diffusion kernel.
Modeling of stochastic spatio-temporal processes has been widely studied. Most of these studies are based on a Markovian representation of autoregressive or diffusion type. In this context, modeling and estimation of the covariance matrix $K$ of the innovation process have been left in the background in the literature. Regardless of the fact that this knowledge is incomplete, it is still unclear how to extract diffusion components from experimental results and to track structural changes. We revisit these basic models, and in doing so, the intrinsic component is represented by a process whose innovations are based on the graph Laplacian of $G$, and the extrinsic component by a drift. It is from this point of view that the concepts of intrinsic and extrinsic fluctuations are defined.
Related works. The distinction between intrinsic and extrinsic depends on how the system and the environment are defined in the considered experiment. The main area where this concept has been studied is that of biological networks [11, 34, 8, 12]. Many recent studies have reported on the phenotypic variability of organisms related to an intrinsic stochasticity that operates at the basal level of gene expression. This characteristic is related to the activity of differentiated subnetworks, called modules.
These endogenous subnetworks are regulatory structures controlling processes that are intrinsic to the cell. In a recent study [6], we proposed a general method for detecting modules at fixed time, that is to say, in the case of a single random vector $Y_t$. However, in this study, the detection of modules was also performed on time series, but independently moment after moment, without taking into account the time dependence. To avoid this simplification, we propose a spatio-temporal modeling that improves our previous approach.
The classic autoregressive model is based on innovations whose covariance matrix $K$ is not theoretically restricted to a diagonal matrix [4]. A straightforward estimate of $K$ is given by the empirical covariance matrix of the residuals that result from the least-squares fitting of the model to the data [4, 37]. This estimate, which is consistent only asymptotically, has the drawback of confounding spatial and temporal components. An alternative is to impose a model on $K$, which is a means of taking the spatial dependencies into account explicitly. In this way, a mono-scale Matérn model is adopted in [15], based simply on the inter-node geographical distances, but without graphical connections.
Another modeling is based on the stochastic differential equation associated with the Fokker-Planck Markovian dynamic, as considered in [2] for social network analysis. This model has been the source of many studies where finally the modeling is reduced to a first-order autoregressive model, and furthermore with $K$ limited to a diagonal matrix [27]. This simple autoregressive model is also known under the name of Dynamic Bayesian Networks [32]. But again, the fact of ignoring spatial dependencies amounts to integrating them into the temporal dependencies,
which prevents the distinction between spatial and temporal effects. The Markovian approach that is based on the Gibbs distribution allows the modeling of the two components. It is an efficient technique to treat inverse problems in which the Markovian modeling is made on hidden random fields, and especially for Boolean networks [7, 39]. However, a main drawback of this approach is the difficulty of calibrating the hyper-parameters, which balance the different terms that compose the energy of the Gibbs distribution [5].
For the simulation of biochemical dynamical processes $\{Y_t,\ t = 1, ..., T\}$, [31] describes a novel approach to reveal the existence of a meaningful manifold which approximates the slow dynamics of the process. They search for $p < n$ new variables corresponding to the dynamically meaningful slowly varying coordinates. The method is based on a $T \times T$ connectivity matrix denoted $A(\tau)$ whose elements are weights $A_{i,j}(\tau)$ that depend on the covariance matrices of $Y_{t_i}$ and $Y_{t_j}$, and a scale parameter $\tau$. By defining $M(\tau)$ as the row stochastic matrix associated with $A(\tau)$, the solution is given by the first $p$ eigenvectors of $M(\tau)$, which approximate the eigenfunctions of the Laplacian diffusion operator over the manifold. The difficulty here is the computation of $A(\tau)$ due to the presence of the covariance matrices, which are obtained in [31] by simulating the model, not to mention the difficulty of choosing $\tau$ and $p$ as discussed in [23]. Although the initial model has no parameters, the method is in fact highly parameterized, because of the covariance matrices to estimate.
Sketch of our contribution. The first-order vector autoregressive model is a Markov model, whose expression is:
$$Y_t = \Phi_1 Y_{t-1} + W_t, \tag{1.1}$$
where $\Phi_1$ is a matrix of coefficients connecting the node activities at time $t-1$ to those at time $t$, according to a temporal directed graph $\mathcal{G}$. $W_t$ is a vectorial Gaussian noise with zero mean and covariance matrix $\sigma_W^2 I_n$. The model (1.1) is stationary when $\Phi_1$ satisfies some properties. In this case, the innovation process $\{W_t\}$ is simply there to maintain the dynamic of the temporal process $\{Y_t\}$, but it does not contain any information specific to the intrinsic phenomenon, since at every time $t$, the covariance matrix of $W_t$ is diagonal.
As introduced above, we propose to take into account at each instant the basal state of the system, which is based on the dependence structure of the spatial graph $G$. To do this, we replace the white innovation $W_t$ by a colored innovation $Z_t$ reflecting the spatial dependence. At first glance, it would suffice to replace the covariance matrix $\sigma_W^2 I_n$ of $W_t$ by a covariance matrix $K$ such that $K_{i,j} \neq 0$ for any pair of connected nodes and $K_{i,j} = 0$ for the other ones. However, its estimation poses a dimensionality problem in the subsequent data analysis, since the number $n$ of variables in $Y_t$ far exceeds the number of time points $T$. The situation is quite the opposite of classical time series analysis [16, 4]. A solution to significantly reduce the number of covariances to be estimated is to consider a parametric model for $K$. A well-known model is given by the diffusion kernel $K_\lambda = \exp(-\lambda L)$ where $L$ is the graph Laplacian of $G$ and $\lambda$ is a scale parameter [3, 21, 26, 37]. $K_\lambda$ is a discretized version of the diffusion operator appearing in the solution of the heat differential equation in $\mathbb{R}^2$. In doing so, the determination of $K_\lambda$ boils down to estimating $\lambda$. The model (1.1) is thus replaced by:
$$Y_t = \Phi_1 Y_{t-1} + Z_t, \tag{1.2}$$
where the innovation $Z_t$ is distributed as a zero mean Gaussian vector with covariance matrix $K_\lambda$.
This short presentation has a more thorough justification. We show that the spatio-temporal model (1.2) arises from the composition of two diffusion differential equations: a spatial diffusion associated with the Laplacian $L$ of parameter $\lambda$, and a temporal diffusion associated with the graph Laplacian $\mathcal{L}$ of $\mathcal{G}$, which makes $\Phi_1$ depend on a parameter $\tau$. Denoting $\mathbf{L}$ and $\boldsymbol{\mathcal{L}}$ their spatio-temporal versions, the solution of this pair of differential equations is
$$\mathbf{Y}(\lambda, \tau) = e^{-\tau \boldsymbol{\mathcal{L}}}\, \mathbf{Y}(\lambda, 0), \qquad \mathbf{Y}(\lambda, 0) = e^{-\lambda \mathbf{L}}\, \mathbf{W},$$
where $\mathbf{Y}(\lambda, 0)$ is the model of the innovation process $\{Z_t\}$ and $\mathbf{Y}(\lambda, \tau)$ the model of the process $\{Y_t\}$, which after simplification gives an expression similar to (1.2):
$$Y_t(\lambda, \tau) = \Phi_1(\tau)\, Y_{t-1}(\lambda, \tau) + Y_t(\lambda, 0).$$
There is another point of view that brings up a drift term alongside the diffusion process, in order to model the extrinsic component of the system. To do this we consider the classical stochastic differential equation associated with the Fokker-Planck Markovian dynamic [2, 31]. The discretized expression of this dynamic can be written:
$$Y_{t+dt} = Y_t + a(Y, t)\,dt + b(Y, t)\,W_t \sqrt{dt}, \tag{1.3}$$
where $b(Y, t)$ is a diffusion term and $a(Y, t)$ is a drift term whose form depends on the application. In the particular case where $a(Y, t)$ is a linear function of $Y_t$ independent of $t$, the term $Y_t + a(Y, t)\,dt$ can be rewritten $\Phi_1 Y_t$, and therefore (1.3) becomes similar to (1.2). More generally, (1.3) appears as a model able to integrate a drift term related to the experiment, of which we shall give an example.
Beyond this first modeling, our contribution provides a multi-scale extension of the graphical innovation model, as well as a statistical estimation procedure with selection of the relevant scales. The benefits of the multi-scale approach have been demonstrated for issues related to our concerns in other areas.
For instance, [1] shows that large social networks contain hierarchically organized community structures. One crucial step when studying the structure and dynamics of these networks is to identify communities. In computer vision, for shape comparison and shape matching, [33] proposes a scale-space representation of shape features based on the graph Laplacian. In our study, several scales are needed to correctly model the basal state. Therefore, we represent $K$ by a weighted sum of diffusion kernels at different scales:
$$\sum_{j=0}^{r} \sigma_j^2 K_j \quad \text{where } K_j = \exp(-\lambda_j L).$$
This model leads to a multi-scale representation of the spatio-temporal process in continuity with the one proposed in [6] for the detection of modules. A crucial point in modeling of dynamic systems is parameter inference from observed time series. Therefore in our article, the focus is also
on estimation issues. The estimation of the parameters $\sigma_j^2$ is obtained using an exact maximum likelihood principle. Without this exact likelihood, the multi-scale representation would not be consistent. In time-series analysis, $\log |K|$ is often ignored since for large $T$ its influence on the likelihood is small. This is not true for the spatial context, where we know that ignoring it can result in inconsistent estimators (see [9] Section 7.2.2, and [14] Section 5.3). In addition, the number of scales $r$ and the scales $\lambda_j$ themselves are selected using a Bayesian maximum likelihood under a quadratic constraint. Modeling and estimation tasks are detailed in Section 2. In Section 3, we illustrate this modeling on simulated and real data. To do that, a statistical hypothesis test based on a log-likelihood ratio is proposed in order to decide between two hypotheses $H_0$ and $H_1$. This procedure is implemented in two distinct situations: firstly, a test of temporal dependence, and secondly a test for drift detection.
2. Models and method
2.1. Background: Random field and diffusion process
We recall some classical results on random field models on graphs, which we call Graphical Random Fields (GRF)$^{1}$. Consider a real random vector $Y = (Y^{(1)}, ..., Y^{(n)})$ on an undirected graph $G = (V, E)$. This field is indexed by the nodes $V = \{v_1, ..., v_n\}$. The set of undirected edges $E \subset V \times V$ is such that every edge $(i, j)$ is identical to $(j, i)$, which is denoted $i \sim j$. The dependency structure between the random variables $\{Y^{(i)}\}$ depends on the topological structure given by $E$. This dependency structure is here limited to a covariance structure modeled by a diffusion kernel [21, 26, 3, 10], as follows.
We seek to represent $Y$ by a random field model on $G$, denoted $Y(\lambda)$, whose covariance structure depends on a scale parameter $\lambda > 0$. This model is obtained by equalizing the variations due to a change of scale with the spatial variations:
$$Y^{(i)}(\lambda + d\lambda) - Y^{(i)}(\lambda) = d\lambda \sum_{j \in V :\, j \sim i} \left[ Y^{(j)}(\lambda) - Y^{(i)}(\lambda) \right], \tag{2.1}$$
that is written in vector form:
$$Y(\lambda + d\lambda) - Y(\lambda) = -d\lambda\, L\, Y(\lambda), \qquad L = D - A. \tag{2.2}$$
$L$ is the Laplacian of the graph $G$. $A$ is the binary adjacency matrix (or connectivity matrix) defined by the edges: $A_{i,j} = 1_{i \sim j}$, and $D = \mathrm{diag}\{d_i\}$ where $d_i = \sum_j A_{i,j}$ is the degree of node $i$, i.e. the number of edges connected to $i$. $L$ is a symmetric positive semi-definite matrix. $A_{i,j}$ can be extended to weights different from 1. (2.2) is the discretized version of the heat differential equation, which requires choosing an initial state $Y(0)$ at scale $\lambda = 0$ $^{2}$:
$$\frac{d}{d\lambda} Y(\lambda) = -L\, Y(\lambda), \qquad Y(0) \text{ given}. \tag{2.3}$$
The solution of (2.3) is:
$$Y(\lambda) = K_\lambda Y(0), \qquad K_\lambda = e^{-\lambda L}, \tag{2.4}$$
where $\exp(M) = \sum_{i=0}^{\infty} \frac{M^i}{i!}$. $K_\lambda$ is called the diffusion kernel. The result (2.4) is also valid for directed graphs.
Consider two choices for $Y(0)$. With the first one, we consider $Y(0) = Y$, which provides a representation of $Y$ at different scales, in order to highlight specific structures of $Y$. This follows directly from the smoothing properties of the diffusion (2.4):
$$Y(\lambda) = K_\lambda Y, \qquad Y^{(i)}(\lambda) = \sum_{j \in V} K_\lambda(i, j)\, Y^{(j)}. \tag{2.5}$$
$Y(\lambda)$ is interpreted as a scale-space random field on $V \times \mathbb{R}^+$.
The second choice concerns the generation of random fields with covariance matrix $K_\lambda$. This requires that the graph is undirected, since in this case the exponential of the symmetric matrix $L$ provides a positive semi-definite matrix. Here we consider $Y(0) = W$ and therefore:
$$Y(\lambda) = K_\lambda W, \tag{2.6}$$
where the $W^{(i)}$ are i.i.d. and verify $\mathbb{E}(W^{(i)}) = 0$, $\mathrm{Var}(W^{(i)}) = \sigma_W^2$. The covariance matrix of $Y(\lambda)$ is then $\sigma_W^2 \exp(-2\lambda L)$ $^{3}$. The equation (2.3), with the initial state $W$, thus allows the construction of a random field $Y(\lambda)$ with covariance matrix $\exp(-2\lambda L)$, where the scale parameter $\lambda$ rules the range of the spatial dependence. The larger $\lambda$ is, the more the off-diagonal effects in $K_\lambda$ increase.

1 A graphical random field is indexed by any graph, undirected and/or directed. This term includes Markov networks and Bayesian networks.
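The construction (2.4)-(2.6) can be checked numerically. The following is a minimal sketch, assuming a small hypothetical path graph and NumPy; the helper names `graph_laplacian` and `diffusion_kernel` are illustrative and do not come from the paper. It computes $K_\lambda$ through the eigendecomposition of the symmetric Laplacian and compares the empirical covariance of simulated fields $K_\lambda W$ with $\sigma_W^2 \exp(-2\lambda L)$.

```python
import numpy as np

def graph_laplacian(A):
    """Graph Laplacian L = D - A of a symmetric adjacency matrix A."""
    A = np.asarray(A, dtype=float)
    return np.diag(A.sum(axis=1)) - A

def diffusion_kernel(L, lam):
    """K_lambda = exp(-lambda L), from the eigendecomposition of symmetric L."""
    nu, U = np.linalg.eigh(L)
    return (U * np.exp(-lam * nu)) @ U.T

# Hypothetical example: a path graph on 4 nodes.
A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]])
L = graph_laplacian(A)
lam, sigma_w = 0.5, 1.0
K = diffusion_kernel(L, lam)

rng = np.random.default_rng(0)
W = rng.normal(0.0, sigma_w, size=(4, 20000))  # each column is a white field W
Y = K @ W                                      # Y(lambda) = K_lambda W, as in (2.6)

cov_theory = sigma_w**2 * diffusion_kernel(L, 2 * lam)  # sigma_W^2 exp(-2 lambda L)
cov_empirical = np.cov(Y)                               # rows are the n variables
max_gap = np.abs(cov_empirical - cov_theory).max()
```

Because $L$ has zero row sums, each row of $K_\lambda$ sums to one, which is one way to read the smoothing in (2.5) as a weighted average over the graph.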
2 In the classic case of diffusion in $\mathbb{R}^2$, $\lambda$ is a time parameter.
3 To lighten the writing we shall note $K_\lambda$ instead of $K_{2\lambda}$.

2.2. Spatio-temporal GRF
2.2.1. Diffusion modeling and intrinsic innovations
Previously, the graphical random field $Y$ was observed at fixed time, i.e. at a given time $t$. Now we examine the situation where the field is observed over time, providing a time series of random fields $\{Y_1, ..., Y_T\}$. Let $\mathbf{Y} = (Y_1, ..., Y_T)$ denote the concatenation of the $T$ vectors $Y_t$. $\mathbf{Y}$ can be seen as a graphical random field on a spatio-temporal graph $\mathbf{G}$ whose node set is $\mathbf{V} = \cup_{t=1}^{T} V_t$ with $V_t = V$ for every $t$ $^{4}$. There are two types of edges, spatial edges $\mathbf{E}$ and temporal edges $\boldsymbol{\mathcal{E}}$, as follows:
$$\mathbf{E} = \cup_{t=1}^{T} E_t, \qquad \boldsymbol{\mathcal{E}} = \cup_{t=1}^{T} \mathcal{E}_t, \tag{2.7}$$
where $E_t \subset V_t \times V_t$ and $\mathcal{E}_t \subset V_{t-1} \times V_t$. The edges $(i, j) \in E_t$ are undirected: $i \sim j$, whereas the edges $(j, i) \in \mathcal{E}_t$ are directed in the direction of time: $j \to i$.
The question here is the modeling of the random field $\mathbf{Y}$ by a spatio-temporal model. To do this, we extend the heat equation (2.3), assuming that there are two scales, a spatial scale $\lambda$ and a temporal scale $\tau$, with an initial state given by a random field $\mathbf{W}$ consisting of i.i.d. random variables with zero mean and variance $\sigma_W^2$:
$$\begin{cases} \dfrac{\partial}{\partial \tau} \mathbf{Y}(\lambda, \tau) = -\boldsymbol{\mathcal{L}}\, \mathbf{Y}(\lambda, \tau) \\[4pt] \dfrac{\partial}{\partial \lambda} \mathbf{Y}(\lambda, \tau) = -\mathbf{L}\, \mathbf{Y}(\lambda, \tau) \\[4pt] \mathbf{Y}(0, 0) = \mathbf{W}. \end{cases} \tag{2.8}$$
$\mathbf{L}$ and $\boldsymbol{\mathcal{L}}$ denote respectively the Laplacian of the spatial graph $(\mathbf{V}, \mathbf{E})$ and the Laplacian of the temporal graph $(\mathbf{V}, \boldsymbol{\mathcal{E}})$. Recalling (2.4), we obtain successively for each of these two scales:
$$\text{i)} \quad \mathbf{Y}(\lambda, \tau) = e^{-\tau \boldsymbol{\mathcal{L}}}\, \mathbf{Y}(\lambda, 0), \qquad \text{ii)} \quad \mathbf{Y}(\lambda, 0) = e^{-\lambda \mathbf{L}}\, \mathbf{W}. \tag{2.9}$$
(2.9-ii) is the model (2.6) applied sequentially to each time $t$. At time scale $\tau = 0$, this equation provides $T$ independent random fields $\{Y_t(\lambda, 0) = e^{-\lambda L} W_t,\ t = 1, ..., T\}$. Both equations (2.9) lead to a model of Moving Average (MA) type, known as the Wold representation when the process is not limited in time.
Proposition 1. The process $\{Y_t(\lambda, \tau)\}$ defined in (2.9-i) is a process of MA type whose innovation process is $\{Y_t(\lambda, 0)\}$:
$$Y_t(\lambda, \tau) = \sum_{j \geq 0} \Theta_j(\tau)\, Y_{t-j}(\lambda, 0). \tag{2.10}$$
The time series $\{Y_t\}$ is modeled by the model $\{Y_t(\lambda, \tau)\}$. Moreover, we assume the process $\{Y_t\}$ is maintained by a process of latent innovations, denoted $\{Z_t\}$. We model $Z_t$ by $Y_t(\lambda, 0)$:
$$Y_t \equiv Y_t(\lambda, \tau), \qquad Z_t \equiv Y_t(\lambda, 0).$$
Hence we have a parametric model in $\lambda$ and $\tau$. $Y_t(\lambda, 0)$ and $\{\Theta_j(\tau)\}$ correspond to the spatial and temporal diffusions, respectively. At fixed time, $Y_t(\lambda, 0)$ models the intrinsic part of the process $Y_t$.

4 Bold characters are reserved to the spatio-temporal case.
Proof. To prove the property (2.10), it suffices to show that $e^{-\tau \boldsymbol{\mathcal{L}}}$ is a lower triangular matrix in (2.9-i). Without loss of generality, assume that connections are time-invariant:
$$E_t \equiv E \ \text{ and } \ \mathcal{E}_t \equiv \mathcal{E}, \quad \forall\, t. \tag{2.11}$$
Define the respective adjacency matrices $A$ and $\mathcal{A}$ of these two sets of edges. $A$ is an $n \times n$ symmetric matrix such that $A_{i,j} = 1_{i \sim j}$. $\mathcal{A}$ is a matrix of dimension $n \times 2n$ since it runs on two consecutive instants. It is composed of two blocks: $\mathcal{A} = [\check{A}, 0]$, where $\check{A}(i, j) = 1_{j \to i}$ with $j \in V_{t-1}$, $i \in V_t$, and $0$ is a matrix identically zero. Relative to $\mathbf{E}$ and $\boldsymbol{\mathcal{E}}$, the adjacency matrices are $\mathbf{A} = \mathrm{diag}[A, ..., A]$ and $\boldsymbol{\mathcal{A}} = \mathrm{diag}[\mathcal{A}, ..., \mathcal{A}]$. $\mathbf{A}$ is a block-diagonal symmetric matrix and $\boldsymbol{\mathcal{A}}$ is a lower triangular matrix, the main diagonals of the matrices $0$ lying on the main diagonal of $\boldsymbol{\mathcal{A}}$. Finally, by denoting $\mathbf{D}$ and $\boldsymbol{\mathcal{D}}$ the diagonal matrices of the adjacency degrees, the graph Laplacians are written $\mathbf{L} = \mathbf{D} - \mathbf{A}$ and $\boldsymbol{\mathcal{L}} = \boldsymbol{\mathcal{D}} - \boldsymbol{\mathcal{A}}$. Denoting $a_{i,j} = \check{A}(i, j)$, the r.h.s. of (2.9-i) is then written:
$$\exp\left( -\tau \begin{bmatrix} \ddots & & & \\ -\check{A} & \mathcal{D} & & \\ & -\check{A} & \mathcal{D} & \\ & & \ddots & \ddots \end{bmatrix} \right) \begin{bmatrix} \vdots \\ Y_{t-1}(\lambda, 0) \\ Y_t(\lambda, 0) \\ \vdots \end{bmatrix}, \qquad \check{A} = (a_{i,j})_{1 \leq i,j \leq n}, \quad \mathcal{D} = \mathrm{diag}(d_1, ..., d_n).$$
Since the matrix $\boldsymbol{\mathcal{A}}$ is lower triangular, the matrix $\exp(-\tau \boldsymbol{\mathcal{L}})$ is too. As a result, the process $\{Y_t(\lambda, \tau)\}$ defined in (2.9-i) is a process of MA type whose innovations are $\{Y_t(\lambda, 0)\}$. Since the innovations are i.i.d., this model can be seen as a generalization of the Wold representation [4]. The $n \times n$ matrices $\Theta_j(\tau)$ are extracted from the $t$-th block row of the matrix $\exp(-\tau \boldsymbol{\mathcal{L}})$, as illustrated below.
At this point, we can ask whether some confusion could exist between spatial and temporal interactions. It is known that a recurrent network with feedback loops cannot be represented by a DAG, because such a graph excludes the possibility of representing feedback loops in the graphical structure. However, if the interactions between the variables are not instantaneous, the recurrent network can be unfolded in time to obtain a directed, acyclic network (see Figure 1 in [19]). Our spatio-temporal model has both spatial and temporal connections. In order that the temporal DAG cannot be interpreted as an unfolded graph, we assume that spatial and temporal interactions occur at two distinct time scales. In other words, intrinsic interactions are instantaneous, compared to extrinsic interactions between two successive instants $t$ and $t + 1$, which are much slower.
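The block structure used in the proof can be illustrated numerically. The sketch below is only an illustration under stated assumptions: the parent matrix `A_check` is hypothetical, the degree matrix is taken as the row sums of $\check{A}$ and repeated in every diagonal block (as in the displayed matrix), and the exponential is evaluated with the truncated power series given after (2.4), which is adequate here because $\tau$ is small. It checks that the exponential is lower triangular and extracts the blocks $\Theta_j(\tau)$.

```python
import numpy as np

def expm_series(M, terms=60):
    """Truncated power series exp(M) = sum_i M^i / i! (fine for small norms)."""
    out = np.eye(M.shape[0])
    term = np.eye(M.shape[0])
    for i in range(1, terms):
        term = term @ M / i
        out = out + term
    return out

n, T, tau = 3, 4, 0.1
# Hypothetical temporal parents: A_check[i, j] = 1 when node j at t-1 points to node i at t.
A_check = np.array([[1, 1, 0],
                    [0, 1, 1],
                    [1, 0, 1]], dtype=float)
D = np.diag(A_check.sum(axis=1))          # assumed degree matrix (row sums)

shift = np.eye(T, k=-1)                   # ones on the first block sub-diagonal
L_temporal = np.kron(np.eye(T), D) - np.kron(shift, A_check)

E = expm_series(-tau * L_temporal)        # exp(-tau L) of (2.9-i)

def theta(j, t=T - 1):
    """Theta_j(tau): block (t, t-j) of exp(-tau L), read from the t-th block row."""
    return E[t * n:(t + 1) * n, (t - j) * n:(t - j + 1) * n]

theta0, theta1 = theta(0), theta(1)
```

In this time-invariant setting the blocks depend only on the lag $j$, so reading them from any block row with enough history gives the same $\Theta_j(\tau)$.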
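2.2.2. Two simplified models
• MA(1) modeling - Limited to the order 1, (2.10) provides the MA(1) spatio-temporal model:
$$Y_t = \Theta_0(\tau)\, Z_t + \Theta_1(\tau)\, Z_{t-1}, \quad \text{with } Z_t \sim N(0, K_\lambda), \tag{2.12}$$
where the innovations are assumed to be Gaussian. As an example, when $\exp(-\tau \boldsymbol{\mathcal{L}})$ is approximated by its first-order Taylor expansion, $\exp(-\tau \boldsymbol{\mathcal{L}}) \approx I_{nT} - \tau \boldsymbol{\mathcal{L}} = I_{nT} - \tau \boldsymbol{\mathcal{D}} + \tau \boldsymbol{\mathcal{A}}$, we obtain the approximation
$$\Theta_0(\tau) \approx I_n - \tau \mathcal{D}, \qquad \Theta_1(\tau) \approx \tau \check{A}. \tag{2.13}$$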
• AR(1) modeling - We rewrite the MA model as an autoregressive model, by using several known properties (see [4]). Formally, the AR(p) model is:
$$Y_t(\lambda, \tau) = \sum_{j=1}^{p} \Phi_j(\tau)\, Y_{t-j}(\lambda, \tau) + Z_t(\lambda, 0),$$
where the innovations $Z_t(\lambda, 0)$ are independent of the past, which is the case by construction. As for the MA model, $Y_t(\lambda, 0)$ and $\{\Phi_j(\tau)\}$ correspond to the spatial and temporal diffusions, respectively. In fact, it is known that for the AR(p) model there exists an equivalent MA representation whose matrices $\Theta$ are related to the matrices $\Phi$ as follows:
$$\Theta_j = \sum_{\ell=1}^{j} \Theta_{j-\ell}\, \Phi_\ell, \quad \forall j = 1, 2, ... \tag{2.14}$$
By recalling (2.12), and for the sake of parsimony by restricting to $p = 1$, the system (2.14) $^{5}$ provides $\Phi_1(\tau) = \Theta_0^{-1}(\tau)\, \Theta_1(\tau)$. Thus, an AR(1) representation of $Y_t$ is:
$$Y_t = \Phi_1(\tau)\, Y_{t-1} + Z_t, \qquad Y_0 = Z_0, \quad \text{with } Z_t \sim N(0, K_\lambda). \tag{2.15}$$
For every $i \in V_t$, the set $\pi_i = \{j \in V_{t-1} : \Phi_1(\tau)_{i,j} \neq 0\}$ represents the parents of $i$. The autoregressive model (2.15) generalizes the Dynamic Bayesian Network model [32], for which the innovations are simply the white noise $W_t$.

5 Under some appropriate conditions, the MA(1) process is invertible and possesses an infinite AR representation.
In the particular case of the example (2.13), we have in addition:
$$\Phi_1(\tau) \approx (I_n - \tau \mathcal{D})^{-1}\, \tau \check{A}. \tag{2.16}$$
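As an illustration of (2.15)-(2.16), the following minimal sketch simulates the AR(1) model with colored innovations $Z_t \sim N(0, K_\lambda)$. The graphs, the values of $\lambda$ and $\tau$, and the function names are hypothetical choices made for the example, and the degree matrix is assumed to be the row sums of $\check{A}$.

```python
import numpy as np

def sym_expm(S):
    """Exponential of a symmetric matrix through its eigendecomposition."""
    nu, U = np.linalg.eigh(S)
    return (U * np.exp(nu)) @ U.T

def ar1_coefficient(A_check, tau):
    """Phi_1(tau) ~ (I - tau D)^{-1} tau A_check, following (2.16)."""
    n = A_check.shape[0]
    D = np.diag(A_check.sum(axis=1))      # assumed degree matrix (row sums)
    return np.linalg.solve(np.eye(n) - tau * D, tau * A_check)

def simulate_ar1(Phi, K, T, rng):
    """Simulate Y_t = Phi Y_{t-1} + Z_t with Z_t ~ N(0, K) and Y_0 = Z_0, as in (2.15)."""
    n = K.shape[0]
    C = np.linalg.cholesky(K)             # K = C C^T, K is positive definite here
    Y = np.zeros((T + 1, n))
    Y[0] = C @ rng.standard_normal(n)
    for t in range(1, T + 1):
        Y[t] = Phi @ Y[t - 1] + C @ rng.standard_normal(n)
    return Y

# Hypothetical spatial graph (path on 3 nodes) and temporal parents.
A_spatial = np.array([[0, 1, 0], [1, 0, 1], [0, 1, 0]], dtype=float)
L = np.diag(A_spatial.sum(axis=1)) - A_spatial
K_lam = sym_expm(-0.5 * L)                # K_lambda with lambda = 0.5
A_check = np.array([[1, 1, 0], [0, 1, 1], [1, 0, 1]], dtype=float)
tau = 0.2

Phi1 = ar1_coefficient(A_check, tau)
rho = np.abs(np.linalg.eigvals(Phi1)).max()   # spectral radius of Phi_1
Y = simulate_ar1(Phi1, K_lam, T=200, rng=np.random.default_rng(1))
```

With these values every node has temporal degree 2, so $\Phi_1 = \check{A}/3$ and its spectral radius is below 1, which keeps the simulated trajectories bounded.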
By replacing the Laplacian $\mathcal{L}$ by its normalized version $\mathcal{L}^N = \mathcal{D}^{-1/2} \mathcal{L}\, \mathcal{D}^{-1/2}$, whose diagonal elements are equal to 1, $(I_n - \tau \mathcal{D})$ becomes $I_n (1 - \tau)$ and the new expression of $\Phi_1(\tau)$ in (2.16) is:
$$\Phi_1^N(\tau) = \frac{\tau}{1 - \tau}\, \check{A}^N. \tag{2.17}$$

2.2.3. SDE modeling and extrinsic drift
Our analysis, which began with the heat differential equation, will now continue with a stochastic differential equation (SDE). The heat equation relates to the diffusion parameters $\lambda$ and $\tau$, while the SDE relates to the time parameter $t$. This SDE is associated with the Markovian dynamic of Fokker-Planck [2]:
$$dY(t) = a(Y, t)\,dt + b(Y, t)\,dU(t), \quad t \in \mathbb{R}^+,\ ^{6}$$
where $a(Y, t)$ is a drift function, $b(Y, t)$ a diffusion matrix, and $U(t)$ is a Brownian process. Its sampled version is written as
$$Y_{t+dt} = Y_t + a(Y, t)\,dt + b(Y, t)\,W_t \sqrt{dt}, \tag{2.18}$$
where $W_t \sim N(0, I_n)$. Formally, with $dt = 1$, $a(Y, t) = (\Phi_1(\tau) - I_n) Y_t$ and $b(Y, t) = K_\lambda$, we retrieve (2.15). In this case, $a(Y, t)$, for which $(\Phi_1(\tau) - I_n)$ does not depend on $t$, is not strictly speaking a drift.
• A drift example - The advantage of the model (2.18) is the presence of a drift term which provides a means to model the extrinsic part of the system. We now present an example that will be processed further. Consider a random process $\{Y_t\}$ on a graph of size $n = 3$ such that for every observed time series, the trajectory $\{y_t^{(2)}\}$ tends to be between the two other trajectories $\{y_t^{(1)}\}$ and $\{y_t^{(3)}\}$, with a high probability. After a particular time $t_I < T$, called intervention time, the roles of $Y_t^{(1)}$ and $Y_t^{(2)}$ are reversed and therefore the trajectory $\{y_t^{(1)}\}_{t > t_I}$ tends to be between the trajectories $\{y_t^{(2)}\}_{t > t_I}$ and $\{y_t^{(3)}\}_{t > t_I}$, as illustrated in Fig.5. The effect of the intervention is an observable drift on the trajectories of $\{Y_t^{(1)}\}$ and $\{Y_t^{(2)}\}$ when $t > t_I$. The simulated data in Fig.5 have been obtained using the ad hoc model:
$$\begin{aligned} Y_{t+1}^{(1)} &= Y_t^{(1)} + (Y_t^{(2)} + Y_t^{(3)})\, c_I 1_{t > t_I} + Z_t^{(1)} \\ Y_{t+1}^{(2)} &= Y_t^{(2)} + (Y_t^{(1)} + Y_t^{(3)})\, c + Z_t^{(2)} \\ Y_{t+1}^{(3)} &= Y_t^{(3)} - (Y_t^{(1)} + Y_t^{(2)})\, c_I 1_{t > t_I} + Z_t^{(3)}, \end{aligned} \tag{2.19}$$
where the drift function is
$$a(Y, t) = \left[ (Y_t^{(2)} + Y_t^{(3)})\, c_I 1_{t > t_I},\ (Y_t^{(1)} + Y_t^{(3)})\, c,\ -(Y_t^{(1)} + Y_t^{(2)})\, c_I 1_{t > t_I} \right]^\top,$$
with $c < 0$ and $c_I > 0$. In (2.19), $Y_t^{(1)}$ and $Y_t^{(3)}$ are assumed to be negatively correlated, and the other two correlations are negligible. The correlation structure is taken into account in the covariance matrix $K_\lambda$ of the innovation process $\{Z_t\}$, while time dependence and drift are taken into account simultaneously in the function $Y_t + a(Y, t) \doteq \check{A}_t Y_t$ with
$$\check{A}_t = \begin{bmatrix} 1 & c_I 1_{t > t_I} & c_I 1_{t > t_I} \\ c & 1 & c \\ -c_I 1_{t > t_I} & -c_I 1_{t > t_I} & 1 \end{bmatrix}.$$
$\check{A}_t$ is an adjacency matrix, as $\check{A}^N$ in (2.17). Stationarity of an AR(1) process requires that the eigenvalues of $\Phi_1(\tau)$ are smaller than 1 in absolute value [4]. In the case of the drift (2.19), we consider as in (2.17)
$$\Phi_1(\tau, t) = \frac{\tau}{1 - \tau}\, \check{A}_t, \tag{2.20}$$
in which $\tau$ prevents the process from exploding after $t_I$ $^{7, 8}$.

6 In this equation, $Y(t)$ is momentarily a function of time; it should not be confused with the spatial model $Y(\lambda)$.
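The drift example can be sketched in code. The following is a minimal, hedged illustration rather than the procedure used to produce Fig.5: it reuses the numerical values quoted in the paper's footnote on Fig.5 ($c_I = -c = 0.9$, $\tau/(1-\tau) = 0.5$, and the listed entries of $K$), sets the unlisted entries of $K$ to zero as an assumption, and picks hypothetical values for $T$ and $t_I$. It also checks the identity $Y_t + a(Y, t) = \check{A}_t Y_t$.

```python
import numpy as np

def drift(y, t, c, c_I, t_I):
    """Drift function a(Y, t) of the example (2.19)."""
    ind = 1.0 if t > t_I else 0.0
    return np.array([(y[1] + y[2]) * c_I * ind,
                     (y[0] + y[2]) * c,
                     -(y[0] + y[1]) * c_I * ind])

def A_check_t(t, c, c_I, t_I):
    """Time-dependent matrix such that Y_t + a(Y, t) = A_check_t Y_t."""
    ind = 1.0 if t > t_I else 0.0
    return np.array([[1.0, c_I * ind, c_I * ind],
                     [c, 1.0, c],
                     [-c_I * ind, -c_I * ind, 1.0]])

def simulate(T, t_I, c, c_I, scale, K, rng):
    """Y_{t+1} = Phi_1(tau, t) Y_t + Z_t with Phi_1 = scale * A_check_t, as in (2.20)."""
    nu, U = np.linalg.eigh(K)
    B = U * np.sqrt(np.clip(nu, 0.0, None))   # K = B B^T, valid even if K is singular
    Y = np.zeros((T + 1, 3))
    for t in range(T):
        Phi = scale * A_check_t(t, c, c_I, t_I)
        Y[t + 1] = Phi @ Y[t] + B @ rng.standard_normal(3)
    return Y

c, c_I, scale = -0.9, 0.9, 0.5                # scale stands for tau / (1 - tau)
K = np.array([[0.9, 0.0, -0.9],
              [0.0, 0.1, 0.0],
              [-0.9, 0.0, 0.9]])              # unlisted entries set to 0 (assumption)
T, t_I = 100, 50                              # hypothetical horizon and intervention time
Y = simulate(T, t_I, c, c_I, scale, K, np.random.default_rng(2))

# Spectral radius of Phi_1 after the intervention.
rho_after = np.abs(np.linalg.eigvals(scale * A_check_t(t_I + 1, c, c_I, t_I))).max()
```

With these values the spectral radius after $t_I$ stays below 1, which is the role the text assigns to $\tau$ in (2.20).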
7 Let us comment on (2.19) in terms of expectation. Since $\mathbb{E}(Z_t) = 0$ and $\mathrm{Cov}(Y_t^{(1)}, Y_t^{(3)}) < 0$, before $t_I$ we have in the second equation $\mu_t^{(1)} \approx -\mu_t^{(3)}$ and thus $\mu_{t+1}^{(2)} \approx \mu_t^{(2)} \approx 0$, $\mu$ denoting the expectation of $Y$. In the first and third equations, after $t_I$, if $(\mu_t^{(2)} + \mu_t^{(3)}) < 0$ and $(\mu_t^{(1)} + \mu_t^{(2)}) > 0$ then $\mu_t^{(1)}$ and $\mu_t^{(3)}$ tend to decrease if $|c|$ and $c_I < 1$. It implies in the second equation that $\mu_t^{(2)}$ tends to increase and pass above the trajectory of $\mu_t^{(1)}$.
8 Fig.5 has been obtained as follows: $K$ is mono-scale with $K_{1,1} = K_{3,3} = 0.9$, $K_{2,2} = 0.1$, $K_{1,3} = -0.9$, and for the drift $c_I = -c = 0.9$, $\tau/(1 - \tau) = 0.5$. The trajectories were smoothed to improve readability.

2.3. Multi-scale modeling
2.3.1. Multi-scale GRF model at fixed time
Let us return to the situation of Section 2.1. The outstanding issue at the end of the previous modeling step (2.5) is the choice of $\lambda$. In other words, which smooth version $Y(\lambda)$ of $Y$ is the most representative of some features of $Y$? In fact, several scales may explain this profile. Therefore, the main idea consists of considering several scales $\Lambda = \{\lambda_1, ..., \lambda_q\}$ [18, 35]. Fig.1 illustrates this representation. To this end, we approximate $Y$ as a sum $\widetilde{Y}$ of $q$ independent random fields:
$$Y = \widetilde{Y} + \widetilde{Y}^{/0} = \sum_{j=1}^{q} \widetilde{Y}^{/j} + \widetilde{Y}^{/0} = \sum_{j=1}^{q} \sigma_j Y^{/j} + \sigma_0 Y^{/0}, \tag{2.21}$$
where $Y^{/j}$ denotes the component at scale $\lambda_j$, for $j = 1, ..., q$, and $Y^{/0}$ a residual white noise process corresponding to $\lambda_0 = 0$. The covariance matrix $K_j$ of each unweighted component $Y^{/j}$ is the diffusion kernel with scale $\lambda_j$, (2.4). Therefore the covariance matrix of $\widetilde{Y}^{/j}$ is $\widetilde{K}_j = \sigma_j^2 K_j$. The positive weight $\sigma_j$ is all the greater as the scale $\lambda_j$ contributes significantly to the random field $Y$. Thus, the multi-scale diffusion kernel $\bar{K}_{\bar{\sigma},\Lambda}$ of $Y$ is defined by
$$\bar{K}_{\bar{\sigma},\Lambda} = \widetilde{K}_{\sigma,\Lambda} + \widetilde{K}_0 = \sum_{j=1}^{q} \widetilde{K}_j + \widetilde{K}_0 = \sum_{j=1}^{q} \sigma_j^2 K_j + \widetilde{K}_0 = \sum_{j=1}^{q} \sigma_j^2 e^{-\lambda_j L} + \sigma_0^2 I_n, \tag{2.22}$$
where $\sigma$ denotes $\{\sigma_1, ..., \sigma_q\}$ and $\bar{\sigma} = \{\sigma_0, \sigma\}$. This is related to the additive spline models introduced by Wahba [38] chap.10, and later reintroduced under the name of multiple kernel [22] $^{9}$. The parameters $(\Lambda, \bar{\sigma}, q)$ and the components $\{\widetilde{Y}^{/j}\}$ are unknown and have to be estimated. To this end, we assume that $Y$ is distributed according to the Gaussian law
$$Y \sim N(0, \bar{K}_{\bar{\sigma},\Lambda}). \tag{2.23}$$
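The multi-scale kernel (2.22) is straightforward to assemble because all the kernels $e^{-\lambda_j L}$ share the eigenvectors of $L$. The sketch below, written under that observation, builds the kernel in the spectral domain and draws one field from (2.23); the ring graph, scales and weights are hypothetical, and `multiscale_kernel` is an illustrative name.

```python
import numpy as np

def multiscale_kernel(L, lambdas, sigmas, sigma0):
    """sum_j sigma_j^2 exp(-lambda_j L) + sigma_0^2 I_n, built from the spectrum of L."""
    nu, U = np.linalg.eigh(L)
    spectrum = sigma0**2 * np.ones_like(nu)
    for lam, s in zip(lambdas, sigmas):
        spectrum += s**2 * np.exp(-lam * nu)   # each scale reweights the same eigenvectors
    return (U * spectrum) @ U.T

# Hypothetical ring graph on 5 nodes.
n = 5
A = np.zeros((n, n))
for i in range(n):
    A[i, (i + 1) % n] = A[(i + 1) % n, i] = 1.0
L = np.diag(A.sum(axis=1)) - A

lambdas, sigmas, sigma0 = [0.5, 3.0], [1.0, 0.6], 0.2
K_bar = multiscale_kernel(L, lambdas, sigmas, sigma0)

# One draw Y ~ N(0, K_bar), as assumed in (2.23).
rng = np.random.default_rng(4)
y = np.linalg.cholesky(K_bar) @ rng.standard_normal(n)
```

The term $\sigma_0^2 I_n$ keeps the smallest eigenvalue of the kernel at or above $\sigma_0^2$, so the Cholesky factorization used for sampling is well defined.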
2.3.2. Multi-scale GRF decomposition at fixed time Component estimation is closely linked to the scale selection problem. Previously, Λ = {λ1 , ..., λq } denotes the scale domain in which there are q0 ≤ q unknown relevant scales. Given q0 and the parameters σ ¯ (their estimation will be described in a further section), the following Proposition expresses the scale components. Proposition 2. Assume the model (2.22) is over-parameterized, i.e. the decomposition Y in (2.21) depends only on q0 ≤ q unknown scales. Hence, given an observation y , the Bayesian estimation provides the scale components : yˆ/j = Kj U Dν−1ˆb , ∀j = 1, ..., q ˆb = (σ 2 D−1 + Iq )−1 U y , 0 ν 0
(2.24) (2.25)
where Dν is the diagonal matrix of the q0 largest eigenvalues ν1 ≥ ... ≥ νq0 of the covariance matrix of Y : q σ,Λ = Kj , (2.26) K j=1
(Λ) = K ¯ σ¯ ,Λ Y , which is the extension of (2.5) to the vectorial case, satisfies the heat Note that the model Y ∂ /j does not match the previous smooth version equation ∂λ Y (Λ) = −LY (Λ) for every j. In fact, the component Y 9
j
(λj ) in (2.5), since the sum Y
Spatial Statistics
j
. (λj ) does not reconstruct Y Y
http://dx.doi.org/10.1016/j.spasta.2013.11.004
Preprint
Spatio-Temporal Graphical Modeling
13
and $U$ is the matrix of the associated orthonormal eigenvectors: $\tilde K_{\sigma,\Lambda} U = U D_\nu$. The Bayesian estimation is done with respect to the prior distribution $B \sim N(0, D_\nu)$.

Proof. Firstly, since the columns of $U$ are linearly independent, we can write $\tilde Y = U B$, where $B$ is a $q_0$-random vector. Then, the previous spectral equation allows us to rewrite

$$\tilde Y = U B = \tilde K_{\sigma,\Lambda} U D_\nu^{-1} B = \sum_{j=1}^{q} \tilde K_j U D_\nu^{-1} B = \sum_{j=1}^{q} \tilde Y^{/j}, \quad \text{where } \tilde Y^{/j} = \tilde K_j U D_\nu^{-1} B, \tag{2.27}$$

which provides in particular the components (2.24). Secondly, $\tilde Y = U B$ implies the covariance matrix

$$\mathrm{Cov}(B) = U' \tilde K_{\sigma,\Lambda} U = D_\nu. \tag{2.28}$$

For a given observation $y$, the Bayesian estimation of the occurrence of $B$ consists in maximizing the log-posterior $\log p(b \mid y) = \log p(y \mid b) + \log p(b) + \mathrm{Cte}$. Given the Gaussian laws $B \sim N(0, D_\nu)$ and $\tilde Y^{/0} \sim N(0, \sigma_0^2 I_n)$, this amounts to computing

$$\hat b = \arg\max_b \; -\frac{1}{\sigma_0^2} \| y - U b \|^2 - b' D_\nu^{-1} b, \tag{2.29}$$

which provides the expression $\hat b$ in (2.25).

• Scale selection - To determine an estimate $\hat q$ of $q_0$, we perform a diagonalization of the covariance matrix $\tilde K_{\sigma,\Lambda}$, of which we retain only the $\hat q$ largest eigenvalues according to the criterion

$$\frac{\sum_{i=1}^{\hat q} \nu_i}{\sum_{i=1}^{q} \nu_i} = 1 - \epsilon, \tag{2.30}$$

where $\epsilon$ is a positive parameter chosen close to 0, typically $\epsilon = 0.01$ or $0.05$. This criterion is related to that used in Principal Component Analysis [17, 30]. It means that the dispersion of $\tilde Y$ can be approximately represented by $\hat q$ linearly independent components with an information loss determined by $\epsilon$.

From $\hat q$, we can then achieve the selection of relevant scales $\hat\Lambda$. These scales are associated with the $\hat q$ largest $\sigma_j^2$, i.e. the scales whose components are the most involved in the dispersion of $\tilde Y$. As may be seen experimentally, the estimates $\hat\sigma_j^2$ associated to scales in $\Lambda \setminus \hat\Lambda$ have values close to 0. This selection therefore allows a pruning of non-significant scales, which evokes the Ridge regression [17].
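As an illustration of Proposition 2 and of the scale selection rule, the following is a minimal NumPy sketch, not the author's implementation. It assumes $K_j = e^{-\lambda_j L}$ and $\tilde K_j = \sigma_j^2 K_j$ as in (2.26) and (2.32), reads (2.30) as "the smallest $\hat q$ whose share reaches $1 - \epsilon$", and uses hypothetical function names (`diffusion_kernel`, `select_rank`, `scale_components`) chosen only for this example.

```python
import numpy as np

def diffusion_kernel(L, lam):
    """K = exp(-lam * L) for a symmetric graph Laplacian L, via its eigendecomposition."""
    w, V = np.linalg.eigh(L)
    return (V * np.exp(-lam * w)) @ V.T

def select_rank(nu_desc, eps):
    """Smallest q_hat such that the leading eigenvalues carry a share >= 1 - eps (cf. 2.30)."""
    share = np.cumsum(nu_desc) / np.sum(nu_desc)
    return min(int(np.searchsorted(share, 1.0 - eps)) + 1, len(nu_desc))

def scale_components(y, L, lams, sig2, sig2_0, eps=0.05):
    """Return (components, b_hat, q_hat) following (2.24)-(2.26) and (2.30)."""
    q = len(lams)
    K_tilde = [s * diffusion_kernel(L, lam) for s, lam in zip(sig2, lams)]
    K_sum = sum(K_tilde)                              # (2.26)
    nu, U = np.linalg.eigh(K_sum)
    nu, U = nu[::-1], U[:, ::-1]                      # eigenvalues in descending order
    q_hat = select_rank(nu[:q], eps)                  # denominator over q eigenvalues, as in (2.30)
    nu, U = nu[:q_hat], U[:, :q_hat]
    # (2.25): the matrix sig2_0 * D^-1 + I is diagonal, so the solve is elementwise
    b_hat = (U.T @ y) / (sig2_0 / nu + 1.0)
    # (2.24): y_hat^{/j} = K_tilde_j U D^-1 b_hat
    comps = [Kj @ (U @ (b_hat / nu)) for Kj in K_tilde]
    return comps, b_hat, q_hat
```

By construction the components sum to $U \hat b$, since $\sum_j \tilde K_j U D_\nu^{-1} = U$; this gives a quick sanity check on any implementation.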
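2.3.3. Multi-scale decomposition of spatio-temporal GRF

(2.21) is extended to obtain the multi-scale representation of the time series:

$$Y_t = \tilde Y_t + \tilde Y_t^{/0} = \sum_{j=1}^{q} \tilde Y_t^{/j} + \tilde Y_t^{/0} = \sum_{j=1}^{q} \sigma_j Y_t^{/j} + \sigma_0 Y_t^{/0}, \quad t = 1, ..., T, \tag{2.31}$$

or in matrix form

$$\begin{bmatrix} Y_1 \\ \vdots \\ Y_T \end{bmatrix} = \sum_{j=0}^{q} \begin{bmatrix} \tilde Y_1^{/j} \\ \vdots \\ \tilde Y_T^{/j} \end{bmatrix} = \sum_{j=0}^{q} \sigma_j \begin{bmatrix} Y_1^{/j} \\ \vdots \\ Y_T^{/j} \end{bmatrix},$$

or simply

$$\mathbf Y = \sum_{j=0}^{q} \tilde{\mathbf Y}^{/j} = \sum_{j=0}^{q} \sigma_j \mathbf Y^{/j}.$$

With these notations, (2.31) is rewritten

$$\mathbf Y = \tilde{\mathbf Y} + \tilde{\mathbf Y}^{/0} = \sum_{j=1}^{q} \tilde{\mathbf Y}^{/j} + \tilde{\mathbf Y}^{/0} = \sum_{j=1}^{q} \sigma_j \mathbf Y^{/j} + \sigma_0 \mathbf Y^{/0}.$$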
Property 1. For $j = 1, ..., q$, each component $\mathbf Y^{/j} = (Y_1^{/j}, ..., Y_T^{/j})$ obeys (2.9) or equivalently (2.10). Its innovation process, denoted $\{Z_t^{/j}\}$, is such that $\mathrm{Cov}(Z_t^{/j}) = K_j = e^{-\lambda_j L}$.

The covariance matrix $\tilde{\mathbf K}_{\sigma,\Lambda,\tau}$ of $\tilde{\mathbf Y}$ is:

$$\tilde{\mathbf K}_{\sigma,\Lambda,\tau} = \sum_{j=1}^{q} \tilde{\mathbf K}_j = \sum_{j=1}^{q} \sigma_j^2 \mathbf K_j, \tag{2.32}$$

where $\tilde{\mathbf K}_j$ and $\mathbf K_j$ are cross-covariance matrices denoted $\tilde{\mathbf K}_j = \text{Cross-Cov}(\tilde{\mathbf Y}^{/j})$ and $\mathbf K_j = \text{Cross-Cov}(\mathbf Y^{/j})$, depending on $(\sigma_j^2, \lambda_j, \tau)$ and $(\lambda_j, \tau)$, respectively$^{10}$. Therefore, we have to determine the expression of $\mathbf K_j$. This expression depends on the lag-covariance function $\Gamma_j^{(\ell)} = \mathbb E[Y_t^{/j}\, Y_{t+\ell}^{/j\,\prime}]$ of $\mathbf Y^{/j}$, as we now illustrate in the case of the autoregressive model$^{11}$.

10 $(\sigma, \Lambda)$ refers to multi-scale, and $\tau$ to time scale. Thus, the covariance matrices $K_\lambda$, $\tilde K_{\sigma,\Lambda}$ and $\tilde{\mathbf K}_{\sigma,\Lambda,\tau}$ respectively denote the mono-scale spatial matrix, the multi-scale spatial matrix and the multi-scale spatio-temporal matrix.

• AR(1) model

Proposition 3. For the AR(1) model, the following spatio-temporal factorization, made up of a 'temporal' matrix $\Phi_\tau$ and a 'spatial' matrix $\tilde{\mathbf K}^D_{\sigma,\Lambda}$, can be written in the form:

$$\tilde{\mathbf K}_{\sigma,\Lambda,\tau} = \tilde{\mathbf K}^D_{\sigma,\Lambda}\, \Phi_\tau. \tag{2.33}$$
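Proof. Recalling Property 1, each scale component $\mathbf Y^{/j} = (Y_1^{/j}, ..., Y_T^{/j})$ is modeled by (2.15):

$$Y_t^{/j} = \Phi_1(\tau)\, Y_{t-1}^{/j} + Z_t^{/j},$$

with $Z_t^{/j} \sim N(0, K_j)$. For this process, $\mathbf K_j$ is organized in blocks of size $n \times n$ (see [4]) satisfying:

$$\mathbf K_j[t, t+\ell] = \Gamma_j^{(\ell)} = \Gamma_j^{(0)}\, \Phi_1(\tau)^{\ell}, \quad \text{for } \ell \ge 1, \quad \text{with } \Gamma_j^{(0)} = \Phi_1(\tau)\, \Gamma_j^{(0)}\, \Phi_1(\tau)' + K_j,$$

$$\mathbf K_j[t, t-\ell] = \mathbf K_j[t, t+\ell]'.$$

Therefore, $\mathbf K_j$ can be expressed as $\mathbf K_j = \mathbf K_j^D \Phi_\tau$. $\Phi_\tau$ is a Toeplitz block-matrix whose $\ell$-th upper diagonal is filled with $\Phi_1(\tau)^{\ell}$: $\Phi_\tau[t, t+\ell] = \Phi_1(\tau)^{\ell}$, and $\Phi_\tau[t, t-\ell] = \Phi_\tau[t, t+\ell]'$ for the lower diagonal. $\mathbf K_j^D$ is the $T$-diagonal block-matrix $\mathbf K_j^D = \mathrm{diag}[\Gamma_j^{(0)}, ..., \Gamma_j^{(0)}]$. Recalling (2.32), we get the matrix factorization

$$\tilde{\mathbf K}_{\sigma,\Lambda,\tau} = \sum_{j=1}^{q} \sigma_j^2 \mathbf K_j = \Big( \sum_{j=1}^{q} \sigma_j^2 \mathbf K_j^D \Big) \Phi_\tau \doteq \tilde{\mathbf K}^D_{\sigma,\Lambda}\, \Phi_\tau.$$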
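The block structure used in this proof can be checked numerically on a toy case. The sketch below is only an illustration under explicit assumptions: $\Phi_1(\tau)$ is taken as the hypothetical form $\rho\, e^{-\tau L}$ with $\rho < 1$ (the paper's (2.17) is not reproduced here), so that $\Phi_1(\tau)$ is symmetric, stable, and commutes with $K_j = e^{-\lambda_j L}$. Under these assumptions the block-by-block covariance and the product $\mathbf K_j^D \Phi_\tau$ coincide. All function names are illustrative.

```python
import numpy as np

def stationary_cov(Phi, K):
    """Solve Gamma0 = Phi Gamma0 Phi' + K by vectorisation (spectral radius of Phi must be < 1)."""
    n = Phi.shape[0]
    A = np.eye(n * n) - np.kron(Phi, Phi)      # row-major vec: vec(Phi G Phi') = (Phi kron Phi) vec(G)
    return np.linalg.solve(A, K.reshape(-1)).reshape(n, n)

def direct_cov(Phi, Gamma0, T):
    """Block matrix with K[t, t+l] = Gamma0 (Phi^l)' and K[t, t-l] = K[t, t+l]'."""
    n = Phi.shape[0]
    K = np.zeros((n * T, n * T))
    for t in range(T):
        for s in range(T):
            P = np.linalg.matrix_power(Phi, abs(s - t))
            K[t*n:(t+1)*n, s*n:(s+1)*n] = Gamma0 @ P.T if s >= t else P @ Gamma0
    return K

def factorised_cov(Phi, Gamma0, T):
    """K^D Phi_tau with K^D = diag[Gamma0, ..., Gamma0] and Phi_tau block-Toeplitz."""
    n = Phi.shape[0]
    KD = np.kron(np.eye(T), Gamma0)
    Phi_tau = np.zeros((n * T, n * T))
    for t in range(T):
        for s in range(T):
            P = np.linalg.matrix_power(Phi, abs(s - t))
            Phi_tau[t*n:(t+1)*n, s*n:(s+1)*n] = P if s >= t else P.T
    return KD @ Phi_tau
```

For a non-symmetric or non-commuting $\Phi_1(\tau)$ the two constructions generally differ in the lower blocks, which is worth keeping in mind when applying the factorization to a model with drift.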
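• An approximated calculation - From (2.27), the spatio-temporal scale components of $\tilde{\mathbf Y}$ are formally

$$\tilde{\mathbf Y}^{/j} = \tilde{\mathbf K}_j\, \mathbf U \mathbf D_\nu^{-1} \mathbf B, \quad \forall j = 1, ..., q, \tag{2.34}$$

where $(\mathbf U, \mathbf D_\nu)$ comes from the spectral decomposition of the covariance matrix $\tilde{\mathbf K}_{\sigma,\Lambda,\tau}$. To avoid the computation of the full spectral decomposition $(\mathbf U, \mathbf D_\nu)$ of $\tilde{\mathbf K}_{\sigma,\Lambda,\tau}$, we reformulate the expression of the components $\tilde{\mathbf Y}^{/j}$ by taking advantage of the factorization property (2.33). Recall that the expression $\tilde{\mathbf Y}^{/j}$ is based on $\tilde{\mathbf Y} = \mathbf U \mathbf B$ (see Proof of Proposition 2). Instead of considering this representation, we consider $\tilde{\mathbf Y} = \ddot{\mathbf U} \ddot{\mathbf B}$ where

$$\ddot{\mathbf U} = \Phi_\tau \dot{\mathbf U}, \qquad \tilde{\mathbf K}^D_{\sigma,\Lambda}\, \dot{\mathbf U} = \dot{\mathbf U} \dot{\mathbf D}_\nu. \tag{2.35}$$

We conjecture that $\ddot{\mathbf U}$ is a basis of decomposition "intermediate" between the spatial basis $\dot{\mathbf U}$ and the spatio-temporal basis $\mathbf U$. The new basis (2.35) implies $\tilde{\mathbf Y} = \ddot{\mathbf U} \ddot{\mathbf B} = \Phi_\tau \tilde{\mathbf K}^D_{\sigma,\Lambda} \dot{\mathbf U} \dot{\mathbf D}_\nu^{-1} \ddot{\mathbf B}$, and therefore with (2.32), we have

$$\tilde{\mathbf Y}^{/j} = \sigma_j^2\, \Phi_\tau \mathbf K_j^D\, \dot{\mathbf U} \dot{\mathbf D}_\nu^{-1} \ddot{\mathbf B}, \tag{2.36}$$

which requires only the computation of the spatial basis $\dot{\mathbf U}$, but not of the spatio-temporal basis $\mathbf U$.

Finally, by imposing $\mathrm{Cov}(\ddot{\mathbf B}) = \dot{\mathbf D}_\nu$, the generalization of (2.29) amounts to computing, for every time-series $\mathbf y$:

$$\hat{\mathbf b} = \arg\max_{\mathbf b} \; -\frac{1}{\sigma_0^2} \| \mathbf y - \Phi_\tau \dot{\mathbf U} \mathbf b \|^2 - \mathbf b' \dot{\mathbf D}_\nu^{-1} \mathbf b, \tag{2.37}$$

which provides the expression $\hat{\mathbf b} = (\sigma_0^2 \dot{\mathbf D}_\nu^{-1} + \ddot{\mathbf U}' \ddot{\mathbf U})^{-1} \ddot{\mathbf U}' \mathbf y$ with $\ddot{\mathbf U} = \Phi_\tau \dot{\mathbf U}$. In fact, this generalization is incomplete because, unlike (2.28), $\mathrm{Cov}(\ddot{\mathbf B})$ is not a diagonal matrix, as we can see from $\tilde{\mathbf Y} = \ddot{\mathbf U} \ddot{\mathbf B}$, since $\ddot{\mathbf U}$ is not an eigenvector basis of $\tilde{\mathbf K}_{\sigma,\Lambda,\tau}$.

11 This formulation can be extended to the non-stationary case in which $(\sigma_j, \lambda_j)$ depends on time $t$, and hence also $\Gamma_j^{(\ell)}$. For this, it suffices to denote this dependency as $(\sigma_j(t), \lambda_j(t))$ and $\Gamma_{j,t}^{(\ell)}$, which we will not do to avoid overloading the formulas.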
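The approximated calculation (2.35)-(2.37) can be sketched as follows. This is a hedged illustration rather than the paper's code: the function name `approx_components`, the argument `rank` (the number of leading eigenpairs of the block-diagonal matrix that are kept) and the variable names are assumptions made for the example, and the caller is assumed to supply $\Phi_\tau$ and the matrices $\mathbf K_j^D$.

```python
import numpy as np

def approx_components(y, Phi_tau, KD_list, sig2, sig2_0, rank):
    """Estimate b via the maximiser of (2.37) and the scale components via (2.36).

    y        : stacked time series, length n*T
    Phi_tau  : (n*T, n*T) 'temporal' block-Toeplitz matrix
    KD_list  : list of (n*T, n*T) block-diagonal matrices K_j^D
    sig2     : scale weights sigma_j^2, j = 1..q
    sig2_0   : noise variance sigma_0^2
    rank     : number of leading eigenpairs kept (an assumption of this sketch)
    """
    KD_sum = sum(s * KD for s, KD in zip(sig2, KD_list))      # sum_j sigma_j^2 K_j^D
    nu, U_dot = np.linalg.eigh(KD_sum)
    nu, U_dot = nu[::-1][:rank], U_dot[:, ::-1][:, :rank]     # spatial basis of (2.35)
    U_ddot = Phi_tau @ U_dot                                  # intermediate basis
    A = sig2_0 * np.diag(1.0 / nu) + U_ddot.T @ U_ddot
    b_hat = np.linalg.solve(A, U_ddot.T @ y)                  # closed form following (2.37)
    coef = U_dot @ (b_hat / nu)
    comps = [s * (Phi_tau @ (KD @ coef)) for s, KD in zip(sig2, KD_list)]  # (2.36)
    return comps, b_hat, U_ddot
```

Only an eigendecomposition of the block-diagonal matrix is needed, which in practice reduces to the $n \times n$ spatial block; the components again sum to $\ddot{\mathbf U} \hat{\mathbf b}$.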
2.4. Parameter estimation

• Random Fields at Fixed-Time - We consider the multi-scale spatial model (2.22)-(2.23), for which the weight parameters $\bar\sigma$ associated to a given set $\Lambda$ have to be estimated. Let $y$ be an occurrence of $Y$. The unknown weights $\{\sigma_j^2\}_{j=0}^{q}$ are estimated using the maximum likelihood principle. Let $\mathcal L(\bar\sigma \mid \Lambda) = \log(p_{\bar\sigma,\Lambda}(y))$ denote the exact log-likelihood of $\bar\sigma$ for a given $\Lambda$. Assuming the probability density $p_{\bar\sigma,\Lambda}$ is Gaussian $N(0, \bar K_{\bar\sigma,\Lambda})$, the log-likelihood is expressed, up to an additive constant and a factor 1/2, as:

$$\mathcal L(\bar\sigma \mid \Lambda) = -\log |\bar K_{\bar\sigma,\Lambda}| - y' \bar K_{\bar\sigma,\Lambda}^{-1} y. \tag{2.38}$$

The maximum likelihood estimate is computed under the constraint of positivity of the parameters $\bar\sigma$:

$$\hat\sigma(\Lambda) = \arg\max_{\bar\sigma} \mathcal L(\bar\sigma \mid \Lambda) \quad \text{under the constraint } \bar\sigma > 0. \tag{2.39}$$

For moderate sizes of $n$, non-linear programming algorithms using gradient descent techniques are operational. For larger dimensions, the computation of the determinant $|\bar K_{\bar\sigma,\Lambda}|$ and of the inverse $\bar K_{\bar\sigma,\Lambda}^{-1}$ becomes a challenging issue. In the context of images on irregular grids, we experimented with an alternative method based on Monte Carlo computation [28], which avoids direct calculation and can be adapted to our case (see also [13, 20] for other techniques). This computation is beyond the scope of our article.
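For small $n$ and few scales, the criterion (2.38) can be evaluated directly. The sketch below is a simplified stand-in for the non-linear programming step mentioned above: it replaces the gradient-based optimizer by a coarse search over a user-supplied grid of positive candidate values, which enforces the positivity constraint of (2.39) trivially. The names `log_lik` and `grid_mle` and the grid itself are assumptions of this example.

```python
import itertools
import numpy as np

def log_lik(sig2_bar, kernels, y):
    """Criterion (2.38): -log|K_bar| - y' K_bar^{-1} y.

    sig2_bar = (sigma_0^2, sigma_1^2, ..., sigma_q^2); kernels = [K_1, ..., K_q],
    with K_bar = sigma_0^2 I + sum_j sigma_j^2 K_j.
    """
    n = y.shape[0]
    K_bar = sig2_bar[0] * np.eye(n) + sum(s * K for s, K in zip(sig2_bar[1:], kernels))
    _, logdet = np.linalg.slogdet(K_bar)          # stable log-determinant
    return -logdet - y @ np.linalg.solve(K_bar, y)

def grid_mle(kernels, y, grid):
    """Exhaustive search of (2.39) over positive candidate values (illustrative only)."""
    candidates = itertools.product(grid, repeat=len(kernels) + 1)
    best = max(candidates, key=lambda s: log_lik(s, kernels, y))
    return np.array(best)
```

The cost grows as the grid size to the power $q + 1$, so this is only practical as a check or as an initialization for a proper optimizer.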
• Random Field Time Series - The space kernel is $\bar K_{\bar\sigma,\Lambda}$ and the spatio-temporal kernel is $\bar{\mathbf K}_{\bar\sigma,\Lambda,\tau}$, the latter depending on the new parameter $\tau$. The simultaneous estimation of all parameters $(\bar\sigma, \tau)$ is a heavy task. As in (2.39), it would require maximizing the likelihood

$$\mathcal L(\bar\sigma, \tau \mid \Lambda) = -\log |\bar{\mathbf K}_{\bar\sigma,\Lambda,\tau}| - \mathbf y' \bar{\mathbf K}_{\bar\sigma,\Lambda,\tau}^{-1} \mathbf y,$$

where $\bar{\mathbf K}_{\bar\sigma,\Lambda,\tau} = \tilde{\mathbf K}_{\sigma,\Lambda,\tau} + \sigma_0^2 I_{nT}$, as defined in Section 2.3.3. In practice, $(\bar\sigma, \Lambda)$ is estimated at fixed-time using (2.39)$^{12}$; then, given this estimate and the spatio-temporal model, $\tau$ is estimated from the observed time-series. So for the AR(1) model, thanks to the independence of the innovations $Z_t$, the maximum likelihood principle leads to computing the following estimate:

$$\hat\tau = \arg\max_\tau \mathcal L(\tau \mid \sigma, \Lambda), \qquad \mathcal L(\tau \mid \sigma, \Lambda) = -\sum_{t=2}^{T} \| y_t - \Phi_1(\tau)\, y_{t-1} \|^2_{\tilde K_{\sigma,\Lambda}^{-1}}, \tag{2.40}$$

where $\tilde K_{\sigma,\Lambda} = \sum_{j=1}^{q} \sigma_j^2 K_j$ is the multi-scale covariance matrix (2.26) of the innovations $Z_t$. In our experiments, $\Phi_1(\tau, \cdot)$ is given by (2.17), or by (2.20) in which it also depends on $t$.

• Summary - The procedure for estimating the parameter set $\{\bar\sigma, \Lambda, \tau, q, \mathbf b\}$ and the component set $\{\mathbf y^{/j}, j = 1, ..., q\}$ is summarized as follows. Given an observed time series $\mathbf y$ and a domain $\Lambda$:
1. Compute at fixed-time $\hat\sigma(\Lambda)$ using (2.39).
2. Given $\epsilon$, from $\tilde K_{\hat\sigma,\Lambda}$ compute $\hat q$ using (2.30) and then extract $\hat\Lambda$.
3. Compute $\hat\tau$ using (2.40), $\hat\Lambda$ and $\hat\sigma(\hat\Lambda)$.
4. Compute $\hat{\mathbf b}$ using (2.37) and $\{\hat{\mathbf y}^{/j}, j = 1, ..., \hat q\}$ using (2.36).
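Step 3 of the summary, the estimate (2.40), can be illustrated with a one-dimensional search over $\tau$. In this sketch the map `phi` returning $\Phi_1(\tau)$ is supplied by the caller, since the paper's forms (2.17) and (2.20) are not reproduced here; the grid search and the function names are likewise assumptions made for illustration.

```python
import numpy as np

def ar1_loglik(tau, ys, phi, K_inv):
    """Criterion (2.40): minus the sum over t = 2..T of ||y_t - Phi_1(tau) y_{t-1}||^2 in the K^{-1} norm.

    ys    : array of shape (T, n), one observed field per row
    phi   : callable tau -> (n, n) matrix Phi_1(tau) (hypothetical, user-supplied)
    K_inv : inverse of the multi-scale innovation covariance
    """
    R = ys[1:] - ys[:-1] @ phi(tau).T            # residual rows r_t = y_t - Phi_1(tau) y_{t-1}
    return -np.einsum("ti,ij,tj->", R, K_inv, R)

def estimate_tau(ys, phi, K_inv, taus):
    """Return the candidate in `taus` that maximises (2.40)."""
    vals = [ar1_loglik(t, ys, phi, K_inv) for t in taus]
    return taus[int(np.argmax(vals))]
```

Because the innovation covariance does not depend on $\tau$, the log-determinant term drops out and only the weighted residual sum of squares needs to be compared across candidates.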
3. Experiments

Fig.1 illustrates the multi-scale decomposition at fixed-time of a field $y_t$ that represents gene expressions of Bacillus subtilis. The underlying graph G comes from the regulatory network of the bacterium. In steady-state, gene expressions are assumed to be governed by the model (2.23). We see the structuring effects of the method in terms of gene grouping, as had already been shown for other regulatory networks [10]. The following experiments are intended to illustrate the sensitivity of the method in highlighting some temporal effects. This study is based on the use of statistical hypothesis testing [24].

12 For a single time t in case of stationarity, and for every time t otherwise.
3.1. Time effect testing

Fig.2 relates to data simulated according to the AR(1) model (2.15). The graph G = (V, E) is structured into subgraphs called regulons (Fig.2-a, see also Fig.3). Each regulon has a main node, called regulator, connected to other nodes and regulators (Fig.2-b). The temporal graph (V, E) translates the connections between the regulators. Fig.2-c shows an observation $y_t$ of the random field, and Fig.2-d depicts the profile of the first four observations $y_1, ..., y_4$.

To highlight the effect of the temporal component, we consider the hypothesis test [$H_0: \tau = \tau_0$ versus $H_1: \tau \ne \tau_0$], where $H_0$ imposes a restriction on $\tau$. For instance, $\tau_0 = 0$ allows us to test the significance of the one-lag AR model. The classical procedure of hypothesis testing requires a specific statistic whose distribution under $H_0$ is known. The well-known likelihood ratio statistic is $LR = 2 \log(\hat\ell_1 / \hat\ell_0) = 2(\hat{\mathcal L}_1 - \hat{\mathcal L}_0)$, where $\hat\ell_i$ is the maximized likelihood under $H_i$. Under $H_0$, this statistic asymptotically has a $\chi^2$ distribution with degrees of freedom equal to the number of restrictions imposed under $H_0$. In the case of large sample size, $H_0$ is rejected if LR is greater than a $\chi^2$ critical value. For instance, to test the significance of a $p_1$-th-order AR coefficient matrix against a lower $p_0$-th-order, the likelihood ratio is $LR = \log(|\hat K_0| / |\hat K_1|)$, where $\hat K_i$ is the estimate of the covariance matrix of the innovation process under $H_i$ [4, 16].

In our case, the full estimation of $\tilde K_{\sigma,\Lambda}$ is not required, since this matrix depends only on $\sigma$, which is given by Step 1 of the Summary. Therefore, given $\tilde K_{\sigma,\Lambda}$, the likelihood ratio is simply $LR = 2\,(\mathcal L(\hat\tau \mid \sigma, \Lambda) - \mathcal L(\tau_0 \mid \sigma, \Lambda))$. In our simulations, T can be chosen large. However, in the case of small sample size, the $\chi^2$ approximation becomes too coarse. Critical values can then be obtained using a Monte Carlo procedure, which is close to the popular bootstrap. Let $\{z^b, b = 1, ..., B\}$ be a set of B independent samples of Z drawn from the distribution $N(0, \tilde K_{\hat\sigma,\hat\Lambda})$, and $\{y^b, b = 1, ..., B\}$ the associated time-series generated under $H_0$ using the AR model. For each time-series we compute the likelihood ratio $LR^b$, which provides a sample of the LR statistic under $H_0$. From the histogram of this sample, the critical value is then computed for a given significance level. Unfortunately, the computational cost of this procedure is a major drawback.

We now deal with real data. The graph G = (V, E) in Fig.3 is a small subgraph extracted from the regulatory network of B. subtilis. This graph is composed of four regulons, respectively identified by four colors. A time series $y_1, ..., y_T$ of gene expressions has been acquired on G over T = 11 time points. Fig.4 shows the simplified gene expressions for this time series, as obtained in [6]. This treatment was done independently at each time t using a procedure that does not take time dependence into account; indeed, it is based on (2.24). At each time t, the nodes are grouped into modules such that the expression profile of each module differs from that of its neighbors, relative to the Laplacian neighborhood structure L, as can be glimpsed in Fig.2-d. This procedure is an extension of Lindeberg's blob detection algorithm [25] to non-regular graphs. The modules are organized around the local extrema of the scale components obtained as follows:

$$\operatorname*{argopt}_{j,\,k,\,v' \in N_v^k} \; \lambda_j\, (L\, y_t^{/j})_{v'},$$
where $N_v^k \subset V$ denotes the neighboring nodes of v of order k, (k = 1, ..., κ). k = 1 means the nearest neighbors (NN), k = 2 means the nearest neighbors of v to which their NN are added, and so on. The expression of the regulons fluctuates over time, and therefore so does the shape of the detected modules. In [6], a hypothesis was advanced which postulates that regulon expressions from time $t_I = 7$ were influenced by a change in the nutritional environment of the bacteria, which should imply that the module configurations become quite stable from this instant. Nevertheless, by examining Fig.4, we see that between t = 7 and t = 11 this hypothesis is contradicted at time t = 8. By repeating the same procedure as in [6] but using the model with time dependence (2.15, 2.36), the module configurations at time t = 8 become similar to the configurations at times t = 7, 9, 10, 11. This result is not so surprising, since it is well known that Markovian dependence can greatly improve the sensitivity of the detection of a low signal, especially if it is surrounded by salient signals.
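The Monte Carlo calibration of the likelihood-ratio test described in this section can be sketched as follows. This is an illustrative outline under stated assumptions, not the procedure used for the reported experiments: the Gaussian log-likelihood is written with its factor 1/2 so that $2(\mathcal L_1 - \mathcal L_0)$ is the usual LR statistic, $\hat\tau$ is found by a grid search, the first field of each simulated series is drawn from the innovation distribution, and `phi` is a caller-supplied, hypothetical map $\tau \mapsto \Phi_1(\tau)$.

```python
import numpy as np

def loglik(tau, ys, phi, K_inv):
    """Gaussian AR(1) log-likelihood in tau, up to a constant (innovation covariance fixed)."""
    R = ys[1:] - ys[:-1] @ phi(tau).T
    return -0.5 * np.einsum("ti,ij,tj->", R, K_inv, R)

def lr_stat(ys, phi, K_inv, taus, tau0):
    """LR = 2 (max_tau L(tau) - L(tau0)), the maximum being taken over the grid `taus`."""
    best = max(loglik(t, ys, phi, K_inv) for t in taus)
    return 2.0 * (best - loglik(tau0, ys, phi, K_inv))

def simulate_ar1(T, Phi, K_chol, rng):
    """Generate y_t = Phi y_{t-1} + z_t with z_t ~ N(0, K), K = K_chol K_chol'."""
    n = K_chol.shape[0]
    ys = np.zeros((T, n))
    ys[0] = K_chol @ rng.standard_normal(n)
    for t in range(1, T):
        ys[t] = Phi @ ys[t - 1] + K_chol @ rng.standard_normal(n)
    return ys

def mc_critical_value(T, phi, K, taus, tau0, alpha=0.05, B=200, seed=0):
    """Empirical (1 - alpha) quantile of LR over B series simulated under H0: tau = tau0."""
    rng = np.random.default_rng(seed)
    K_inv, K_chol = np.linalg.inv(K), np.linalg.cholesky(K)
    stats = [lr_stat(simulate_ar1(T, phi(tau0), K_chol, rng), phi, K_inv, taus, tau0)
             for _ in range(B)]
    return float(np.quantile(stats, 1.0 - alpha))
```

$H_0$ is then rejected when the LR statistic of the observed series exceeds the returned value. The cost is B likelihood maximizations, which reflects the computational drawback noted above.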
3.2. Drift on image sequence

Fig.5 shows a SDE simulation for the drift (2.19). This drift has the effect of swapping the relative values of the variables Y(1) and Y(2) after the time $t_I$. These variables are respectively identified by green and red colors. Hypothesis testing carried out on [$H_0: c_I = 0$ versus $H_1: c_I \ne 0$] is used to decide on the presence of a drift. The log-likelihood in (2.40) is rewritten $\mathcal L(\tau \mid \sigma, \Lambda, c_I)$, where $c_I$ recalls that $\Phi_1(\tau, t)$ depends on $c_I$ when $t > t_I$. The likelihood ratio becomes $LR = 2(\mathcal L(\hat\tau \mid \sigma, \Lambda, c_I) - \mathcal L(\hat\tau \mid \sigma, \Lambda, 0))$. This test can be generalized in order to also estimate $t_I$. For this, LR is computed for each time $t_I = t_{min}, ..., t_{max}$. From the resulting series $\{LR(t_I)\}$, we select $\hat t_I = \arg\max_{t_I} LR(t_I)$, and then we decide that $\hat t_I$ is an intervention time if $LR(\hat t_I)$ is greater than a given critical value.

We now briefly present the application that has motivated the introduction of the model (2.19). Fig.7 shows a time-series of T = 20 images acquired from a cell in suspension submitted to a dielectric field in a micro-fluidic device [29, 40, 36]. Dielectric field cage (DFC) technology has been demonstrated as a useful tool for the manipulation of living cells in suspension. DFC can be used to spatially position non-adherent cells. It is also possible to perform cell fusion: if two cells are put into contact, the application of a high-intensity electric pulse can lead to membrane rupture and subsequent fusion of the cells. In this context, there is a need for a quantitative analysis of membrane dynamics. On the images shown in Fig.7, the nodes of a graph G describing the structure of the membranes are overlaid. The fluorescent signals acquired on the n = 13 nodes over time compose a spatio-temporal GRF, which is represented by a SDE model of type (2.19). The structure of this graph is organized, as above, around three regulons. We see in Fig.6 that the mean intensity of the red regulon tends to become predominant compared to that of the green regulon. The model (2.19) is therefore appropriate for modeling the effects of the intervention on the three regulators, these effects being diffused over the whole graph. A detailed presentation of this work will be given in a future article.
4. Concluding remarks

At the beginning of this study, we had the multi-scale graphical model that we developed for module detection at fixed-time. Subsequently, our experiments raised the issue of modeling a temporal process whose dynamics are maintained by an intrinsic latent process governed at every instant by the previous spatial model. Here, we show that this extension follows naturally from the spatio-temporal heat differential equation, its solution being a Wold representation whose innovations consist of the latent process. This property allows us to revisit classic models in conjunction with the Wold representation and also to add a drift term. A procedure is then described for estimating both the model parameters and the multi-scale components of the process. This provides a general framework for obtaining a parametric representation of the phenomenon under consideration. In particular, this allows one to perform quantified analyses on the basis of the estimated parameters, as we have illustrated with statistical hypothesis testing.

Some remarks can be made on two main assumptions underlying the model, namely, the knowledge of an invariant graph and the stationarity. The choice of the graph depends on the task at hand. However, temporally rewiring networks could be needed for capturing the dynamic interactions between variables. This difficult problem is accentuated by the small size of the time series. Stationarity is also required. Nevertheless, the drift component introduces a temporary change that a stationary model without drift would interpret as a non-stationarity. In our model, $A_t$ is a time-dependent matrix. This is also the case in [32], in which a time-varying dynamic Bayesian network is proposed for modeling the varying directed dependency structures, but without drift. Furthermore, in that article the time-varying graph is seen as a non-stationarity. Finally, a delicate point is worth noting. It concerns the efficient computation of estimates, although the issues raised, such as the inversion of Toeplitz matrices, are old.
Acknowledgments

The Editor and referees are gratefully thanked. Their comments have improved the manuscript. The author is grateful to Alain Trouvé and Yong Yu for the experience we shared on the multiscale decomposition of images, which was an inspiration, as well as Benno Schwikowski and Xiaoyi Chen for valuable discussions about the adaptation of Bacillus subtilis to nutritional environments. I also warmly thank Philippa Gergaud for editing the English of this manuscript.
References

[1] Yong-Yeol Ahn, James P. Bagrow, and Sune Lehmann. Link communities reveal multiscale complexity in networks. Nature, 466(5):761–764, 2010.
[2] H. T. Banks, K. L. Rehm, and K. L. Sutton. Conversion of a dynamic social network stochastic differential equation model to Fokker-Planck model. Technical report, Center for Research in Scientific Computation, North Carolina State University, 2009.
[3] Mikhail Belkin and Partha Niyogi. Semi-supervised learning on Riemannian manifolds. Machine Learning, 56:209–239, 2004.
[4] George E. P. Box, Gwilym M. Jenkins, and Gregory C. Reinsel. Time Series Analysis: Forecasting and Control. John Wiley and Sons, 2008.
[5] Bernard Chalmond. Modeling and Inverse Problems in Image Analysis. Springer-Verlag, 2003.
[6] Bernard Chalmond and Xiaoyi Chen. A graphical modeling to scan network activity at modular level. Technical report, Institut Pasteur / Cergy-Pontoise University, 2012. Submitted.
[7] Bernard Chalmond, François Coldefy, Etienne Goubet, and Blandine Lavayssière. Coherent 3-D echo detection for ultrasonic imaging. IEEE Transactions on Signal Processing, 25(3):592–612, 2003.
[8] Bor-Sen Chen and Wei-Sheng Wu. Robust filtering circuit design for stochastic gene networks under intrinsic and extrinsic molecular noises. Mathematical Biosciences, 211:342–355, 2008.
[9] Noel A. Cressie. Statistics for Spatial Data. John Wiley and Sons, 1993.
[10] Guro Dorum, Lars Snipen, Margrete Solheim, and Solve Saebo. Smoothing gene expression data with network information improves consistency of regulated genes. Statistical Applications in Genetics and Molecular Biology, 10(1), 2011.
[11] Michael B. Elowitz, Arnold J. Levine, Eric D. Siggia, and Peter S. Swain. Stochastic gene expression in a single cell. Science, 297:1183–1186, 2002.
[12] B. F. Finkenstadt, D. J. Woodcock, M. Komorowski, C. V. Harper, J. R. E. Davis, M. R. H. White, and D. A. Rand. Quantifying intrinsic and extrinsic noise in gene transcription: an application to single cell imaging data. Technical report, Centre for Research in Statistical Methodology, Warwick University, 2012.
[13] Jerome Friedman, Trevor Hastie, and Robert Tibshirani. Sparse inverse covariance estimation with the graphical lasso. Biostatistics, 9:432–441, 2008.
[14] Carlo Gaetan and Xavier Guyon. Spatial Statistics and Modeling. Springer-Verlag, 2009.
[15] C. A. Glasbey and D. J. Allcroft. A spatiotemporal auto-regressive moving average model for solar radiation. Journal of the Royal Statistical Society: Series C (Applied Statistics), 57(3):343–355, 2008.
[16] James D. Hamilton. Time Series Analysis. Princeton University Press, 1994.
[17] Trevor Hastie, Robert Tibshirani, and Jerome Friedman. The Elements of Statistical Learning. Springer, 2009.
[18] Lasse Holmstrom, Leena Pasanen, Reinhard Furrer, and Stephan R. Sain. Scale space multiresolution analysis of random signals. Computational Statistics and Data Analysis, 55:2840–2855, 2011.
[19] Dirk Husmeier. Sensitivity and specificity of inferring genetic regulatory interactions from microarray experiments with dynamic Bayesian networks. Bioinformatics, 19:2271–2282, 2003.
[20] Harri Kiiveri and Frank de Hoog. Fitting very large sparse Gaussian graphical models. Computational Statistics and Data Analysis, 56:2626–2636, 2012.
[21] Risi Imre Kondor and John Lafferty. Diffusion kernels on graphs and other discrete input spaces. In Morgan Kaufmann, editor, International Conference on Machine Learning, pages 315–322, 2002.
[22] Gert R. G. Lanckriet, Tijl De Bie, Nello Cristianini, Michael I. Jordan, and William Stafford Noble. A statistical framework for genomic data fusion. Bioinformatics, 20(16):2626–2635, 2004.
[23] Ann B. Lee and Larry Wasserman. Spectral connectivity analysis. Journal of the American Statistical Association, 105, 2010.
[24] Erich Leo Lehmann and Joseph P. Romano. Testing Statistical Hypotheses. Springer, 2004.
[25] Tony Lindeberg. Feature detection with automatic scale selection. International Journal of Computer Vision, 30(2):77–116, 1998.
[26] Bojan Mohar. Some applications of Laplace eigenvalues of graphs. In G. Hahn and G. Sabidussi, editors, Graph Symmetry: Algebraic Methods and Applications, volume Ser. C 497, pages 225–275. Kluwer, 1997.
[27] Chris J. Oates and Sach Mukherjee. Network inference and biological dynamics. Annals of Applied Statistics, 6(3):1209–1235, 2012.
[28] R. Kelley Pace and James P. LeSage. A sampling approach to estimate the log determinant used in spatial likelihood problems. Journal of Geographical Systems, 11:209–225, 2009.
[29] Olivier Renaud, Jose Vina, Yong Yu, Christophe Machu, Alain Trouvé, Hans Van der Voort, Bernard Chalmond, and Spencer Louis Shorte. High-resolution imaging of living cells in flow suspension using axial-tomography: 3-D imaging flow cytometry. Biotechnology Journal, 3(1):53–62, 2008.
[30] Ioannis D. Schizas and Georgios B. Giannakis. Covariance eigenvector sparsity for compression and denoising. IEEE Transactions on Signal Processing, 60(5):2408–2421, 2012.
[31] Amit Singer, Radek Erban, Ioannis G. Kevrekidis, and Ronald R. Coifman. Detecting intrinsic slow variables in stochastic dynamical systems by anisotropic diffusion maps. PNAS, 106(38):16090–16095, 2009.
[32] Le Song, Mladen Kolar, and Eric P. Xing. Time-varying dynamic Bayesian networks. In NIPS, pages 1732–1740, 2009.
[33] Jian Sun, Maks Ovsjanikov, and Leonidas Guibas. A concise and provably informative multi-scale signature based on heat diffusion. In Eurographics Symposium on Geometry Processing, volume 28. Blackwell Publishing, 2009.
[34] Sorin Tanase-Nicola, Patrick B. Warren, and Pieter Rein ten Wolde. Signal detection, modularity and the correlation between extrinsic and intrinsic noise in biochemical networks. Physical Review Letters, 97(6), 2006.
[35] Kevin Thon, Havard Rue, Stein Olav Skrovseth, and Fred Godtliebsen. Bayesian multiscale analysis of images modeled as Gaussian Markov random fields. Computational Statistics and Data Analysis, 56:49–61, 2012.
[36] Guilhem Velve-Casquillas, Maël Le Berre, Matthieu Piel, and Phong T. Tran. Microfluidic tools for cell biological research. Nano Today, 5(1):28–47, 2010.
[37] S. V. Vishwanathan, Alexander J. Smola, and René Vidal. Binet-Cauchy kernels on dynamical systems and its application to the analysis of dynamic scenes. International Journal of Computer Vision, 73(1):95–119, 2007.
[38] Grace Wahba. Spline Models for Observational Data. SIAM, 1990.
[39] Zhi Wei and Hongzhe Li. A hidden spatial-temporal Markov random field model for network-based analysis of time course gene expression data. Annals of Applied Statistics, 2:408–429, 2008.
[40] Yong Yu, Alain Trouvé, Jiaping Wang, and Bernard Chalmond. An integrated statistical approach for volume reconstruction from unregistered sequential slices. Inverse Problems, 24(4):58–74, 2008.
[Figure: graph plots of the Bacillus subtilis regulatory network with nodes labeled by gene names. Recoverable panel titles: "data X", "component at scale1", "component at scale2", "component at scale3", "component at scale4", "component at scale5", followed by panels (a) and (b). The caption is not recoverable from the extracted text.]
clpQ yneF codV ylmA
yfmJ yqzD accD
yerB
nfrA
ilvA yufO
ahrC
ftsX tkt
yusE med ylmH
ylmD
yqcF
yrkO
yrkN
appC
salA
mcpA
ycgM
hag senS
yuiB
pbpF
ilvD
fabL
yufP
sspE
guaB
yqzC divIVA
degR
cheC
yurJ
rocD rocE
yrrL
ylmE
pdxT yrkQ
ykfC ycgA
ywcH dnaN
clpQ yneF codV ylmA
yfmJ yqzD accD
yerB
nfrA
ilvA yufO
yrkP
pdxS
ylmG accAftsE yvhJ
comZ metQ
ykaA
metS
yvyE
yppE yuxH
ykfD
metN
ypmP frlB frlN yufN frlD yuiA
csrA
yngB
ykcC
sepF
bkdR
rocB lytF
ahrC
ftsX tkt
yusE med ylmH
ylmD
yqcF
yrkN
yrkO
yqzC divIVA
yppD spoIIE
kinC
clpY ykfB
yusD
yfhP
yfmS yoaH
frlR
yjcQ yvyC
ywlC yydA argD
yfmT
fliS
motA
yjfB flhP mcpC fliD
fliT
yclH
yycA
ykcB
bkdAAbuk
rocR
rocC
fliZ
frlM frlO
yppD
pgdS
flhO tlpB mcpBlytD
tlpC
carB argC argG
fliK
sigD
flgC cheB fliJ fliY fliL fliP cheA flgB fliM cheY flhA ylxF ycgN fliI flgE flhF cheW fliE fliR fliF cheD fliQ
ywfH yufQ
metP ykfA
flhB fliG
yisY
dnaA motB
cheV
yngA
bkdB bkdAB
ylxH fliH
glcU yraE
ptb
lpdV
appB
yoaR gdh
yraD
yfmT
fliS yjcP hemAT
cotG yurS
cotU
yfjR yclI
bcd
appF yfkQ
ykoV
spoIIE
yusD yfmS
yvyC
argF
cotH cotZ
lytA lytC
ydfS
yqfX sspD yrrD spoVAC ydfR gerKC adhB sspJ splA ykoU
yoaH yjcQ
argD
cotX
lytB
rocA
gerKA sspK
yfkR
kinC ykfB
motA
yjfB flhP mcpC fliD yydA
yybK
spo0A
yscB
appD
sspA yraF
gerKB
clpY
dnaA frlR
ywlC
tlpA
cotB
cgeE cgeC cotV ftsY cgeByxeE
sspL yraG splB
gerBA
nfo
pdxT comZ
metQ
yfhP
ybdO
codY
cwlH
cotW cgeD
yybM
lysA
spoVAD sspP
yrkP
pdxS
yhaM yozL ruvB aprX
sipV
ydiPyneB
cotM cotY
sspG
yclJ yjcN yybL yybN
gerBB ypzA
yhcV
spoVAF
ykaA
metS
yvyE
ylmG accAftsE yvhJ
ydgH
yorB
cgeA
sigL ybdN
ybaK
yndF
sspO yndE yndD ylaJ
ybxH
yrrL yppE yuxH
parE yqjX tagC yhaZ ligA ruvA yqzHyhaO
ppsA rocG ynzD
skfE
yusN
exoA sspM
gerBC
ccpB
sepF
bkdR
rocB lytF
ykfD
metN
ypmP frlB frlN yufN frlD yuiA
yycA
yngB
ykcC
rocR
rocC
fliZ
frlM frlO
csrA
ykcB
bkdAAbuk
pdaA
fliK
sigD
flgC cheB fliJ fliY fliL fliP cheA flgB fliM cheY flhA ylxF ycgN fliI flgE flhF cheW fliE fliR fliF cheD fliQ
ywfH yufQ
metP ykfA
flhB fliG
yisY
yhjE dnaE
gerE
ylbA
skfH
ylqB
yitF yngA
bkdB bkdAB
ylxH fliH
glcU
yraD yraE
ptb
lpdV
appB
yoaR gdh
ykoV
yqfX sspD yrrD spoVAC ydfR gerKC adhB sspJ splA ykoU
ydiO ynzC
xkdA levR
yvqJ
skfG
yqcG yqxJ
yxbC yxbD
spoIIGA
csgA sspI
yvdQ sleB
yclI
ydfS nfo
yneA
levG
lytR
yjdG
yocD pnbA
skfC
skfF
asnH
sinI
yxnB
yitG
spoVAA
yteA
lytC
gerKA sspK
yfkR
spoVAD sspP
levE sacC levD
ykuL
xynD
yrpD
yobB
slrR skfB
yxbB
dppE dppCkapB yxaM
dppA sigA
spoVAB
lytA
appD
sspA yraF
gerKB gerBA
dppD
bacB
bacF
ydgG
sdpA yoqM yfmI
bacC bacA bacE bacD
hutM ackA hutH hutI hutG acsA
ywbF ydeJ
ppsD
racA
kinB ymaE
scoC
yxkC
lytB
sspL yraG splB
ybxH ccpB
yphF
hutP yxaJ
yomJ yjcM sdpB
rapA phrA dppB
yvyF yvyG yuiC
yhjD
sda
uvrC pcrA dinB
yydI dnaG spo0FkinA
ykuV
flgL gabP
flgM
sspF yozQ sspB yqfU
spo0A rocA
pdaA
yolC hutU
yydJ
ydjG
epr
flgK
sigG
yhcQ ypeB
ykvR yorC
levF
ptkA
yxaL
yflB
ydjH
rapC nprE rapE
gerAC sspN spoIVB
ansZ
yitF
yhcV
spoVAF
uvrB
parC
lexA
acoBacoL acoA
epsK
yydH
yhjC
oppD
yhcN
gerAB gerAA
cotH cotZ
sspH
epsI epsC epsB
aprE spoIIAA spoIIAB
dacF
oppF
gerD
tlp yckD
cotY cwlH
cotX
deoR
licR
uvrX uvrA yerH
yydG yqxI
rok
oppA oppB oppC gpr
ycbG
ycbCycbD
spoVT pbpG
tlpA
gerBB ypzA
sspO yndE yndD ylaJ
yabT yqhH seaA sacX
yhaM yozL ruvB aprX
sipV
ydiPyneB
cotM
yybM
ybdO
codY yscB
spoVAB
exoA sspM
ydgH
yorB
cotW cotB
cgeE cgeC cotV ftsY cgeByxeE
yndF
csgA sspI
yvdQ sleB gerBC
ydiO
dnaE
parE yqjX tagC yhaZ ligA ruvA yqzHyhaO
sspG
cgeD
nupC
licA
hbs yfmG acoC
ynzC
gerE
yclJ yjcN yybL yybN
licB
yozM
acoR
ykuU
yolB
epsJ
cgeA
sigL ybdN
lysA yusN
yteA
yneA
xkdA levR
ppsA rocG ynzD
ybaK
ylqB
yitG
spoVAA sspF yozQ sspB yqfU
levE sacC levD
ykuL
ylbA
skfH
skfE spoIIGA
licH
ctaB nrdE
bdbA
bdbB
epsO
yvqJ
skfG
yqcG yqxJ
yxbC yxbD
sinI
yxnB
licC
glpF
albC
sunA
tasA
epsH
lytR
yocD pnbA
skfC
skfF
asnH dppA
bacF
yopL malP
glpT
glpP
glpK
gmuA rbsA
albD
albE albA albB sboX sboA
yolJ
yorD
leuB sipW
sinR
rapG
yobB
slrR skfB
yxbB
dppE dppCkapB yxaM
bacB
yxkC
pdp galK
glpD
gmuGgmuB gntK gmuD rbsC
albF
xylR
treA
yvmB ymaB
albG
sunT
ilvH leuD leuC
epsA
yddJ
xynB
msmX
dltD
dltC phy
recA ilvB
abrB ilvC
leuA epsG
yheJ
ydjI
licT
bglH
xynPyxiE
xylB
kdgA
cimH
dltAdltB
phrE
spoVG epsE
bdhA
epsF
epsN
ycbJ
iolB treR
malA
yclK
csn
epsL
epsM
yokK ybaJ
glpQ
gmuR
spoIIQ spoIIR
yjbA arsR
xylA
kdgK
treP sucD nrdI
gntR
gmuC yvnA rbsR amyE gmuE
phrC
yocH
yrzI
yphE
yokL
gmuF
rbsD
dltE
pbpH
ytfJ yhfM
lonB
yhfW
iolT
odhA
cydC nrdF
gntZ
skfA
epsD phrK
ctaO ywcE
yvcA
yuaB
yqzG
arsC ylbB yphA
ydgG
sdpA yoqM yfmI
kinB ymaE
scoC flgM
xynD
yrpD yjdG
racA
phrA dppB
yvyF yvyG yuiC
levG
acsA
ywbF ydeJ
ppsD
sdpB
rapA
flgL gabP
hutM ackA hutH hutI hutG
yxaJ
yomJ
spo0FkinA
ykuV
epr ansZ
sigG
yvzA yvcB
rsfA
ksgA
ypfB nasB
sacY
sda
uvrC pcrA dinB
yydI
ydjG
rapC nprE rapE
gerAC sspN spoIVB
flgK
sspC
hutU
epsK
yydH
yhjC
oppD
yhcN
gerAB gerAA
ykvR yorC
levF
yolC
epsI epsC epsB
aprE spoIIAA spoIIAB
dacF
oppF
gerD
tlp yckD
spoVT pbpG
yhcQ ypeB
ureB ureA
ylbC ytfI
yodF
yyaC
oppA oppB oppC gpr
ycbG
ureC
sigF
glnM
gltB
yveA
lexA
acoBacoL acoA
rbsK
ctaC
ctaF ctaD ctaG
srfAAcomS srfAB
spo0E
yweA
yvrO
tnrA
ycsF yycC ywdI wapA
bpr levB yhjD
uvrX uvrA yerH
parC
acoC
ycbCycbD
yvrN
ycsI nasC lipC sipT glnP ywdK
sacB
gltC
uvrB
hbs
yydG yqxI yolB
rok
ctaE
cydB
rbsB gntP
cwlS
srfAC srfAD
yokI xynC
yqaP
yvrP
glnR kipA
deoR
licR
ppsB
degU
ykzB
nupC
licA
yozM
acoR
ykuU
epsH
yabT yqhH seaA sacX
licB
rapH
kipR ywlG comFA
licH
ctaB nrdE
bdbA
bdbB
sigH yokJ
cccA qcrA
cwlD
yttP
abh
rapK
comA
cypX
odhB
bglS bglP
resD qcrC
ylaE
ppsC
degQ
comK glnA
kipI
licC
glpF
albC
sunA
tasA
epsJ
ywlF
ywrD nasA
acuC
iolD
dctP
yobO
yerI
pel
qcrB
ycsG yqzE glnH
ssbA
yopL malP
glpT
glpP
glpK
albD
albE albA albB sboX sboA
yolJ
yfmG
sinR
rapG
yolA
iolR
iolG
iolI
ccpA ycdA
ydeH
phrF
dacC
iolJ acuA yvcI ywdA
iolC iolF
yxkF
kdgR
yngC
ppsE
alsT
noc dprA sbcD yisB
comEC
gmuA rbsA
albF
xylR
kduD cotD
kduI
sdpI
ywoF
pdp galK
glpD
gmuGgmuB gntK gmuD rbsC
yddJ
epsO
ycbJ
xynB
treA
yorD
sipW
ydjI
spoIIQ spoIIR
yjbA arsR
xylB
kdgA
yvmB ymaB
araL
iolH iolE
cydA
phrG
iolS
sacA acuB
pta
araE
rapF
yncM
yydF
yttA
glnQ ywdJ
citM araB
nagA
yhaR kdgT
nasD
yxeD antE
yhcM csbX
yycB
maf comGD
nin comEB comGF
albG
sunT
ilvH leuD leuC
epsA
leuB
ybaJ
glpQ
comGG comEA
msmX
dltD
dltC phy
recA ilvB
abrB ilvC
leuA epsG
yheJ yokK
ytfJ yhfM
lonB
yhfW
bofC yxxG
licT
bglH
xynPyxiE comN
cimH
dltAdltB
phrE
spoVG epsE
bdhA
epsF
epsN yqzG
yyaC arsC ylbB yphA
xylA
iolB
sacP araA
cotC
sdpR
vpr
lytE
ftsZ
ftsA
sacT
ypiF
araN araD lcfB
etfA
cydD
citB
phrI
spoVS ywhH
citT
yflN
abfA
fadB lcfA
fadE
araQ
ycnK
guaC
pucK pucL pucM
citZ
fadN
dhbC ycxA yuzA pucJ
comGB addAglcR rsmG comFB comGC comC
fadA
icd mdh
araR
sucC
yjdB stoA
sbcC
treR
malA
yclK
csn
epsL
epsM
rsfA
ksgA
ypfB nasB
katX rpsR rpsF comFC comGE yhjB ybdK
nasE
cstA
araM araP
ctpB
nasF
dhbE yqxD besA dhbB dhbA
ssbB nucA comGA
bglS
kdgK
gmuR
yesM
etfB
ccpC
dhbF
tuaF
iolR
acuC
odhA
treP sucD nrdI
gntR
gmuC yvnA rbsR amyE gmuE
phrC
yocH
yrzI
yphE ylbC ytfI
yodF
yveA
odhB
cydC nrdF
gmuF
rbsD
dltE epsD phrK
ctaO ywcE
yvcA
yuaB
glnM
gltB
cydB
gntZ
skfA
spo0E
yweA
yvrO
tnrA
wapA
bpr levB
sacY
rbsK
ctaC
ctaF ctaD ctaG
srfAAcomS srfAB
yqaP
yvrP
glnR kipA
ycsF yycC ywdI
sacB
gltC
ctaE
srfAC srfAD
yokI xynC
yokJ
ppsB
degU
ykzB
lipC sipT glnP ywdK
rapK
comA
glnA
ywrD
ssbA
abh
ppsC
alsT
noc dprA sbcD
cccA qcrA
rbsB gntP
cwlS
yttP
galT
fadR
yflA
tuaH
addB
iolT
araL
iolJ acuA yvcI ywdA
bglP
resD cwlD
qcrB
fadM uxaC
resA resE resC
plsX
iolG
iolD
dctP
yobO qcrC
ylaE
acdA
yfkN
nsrR
ylbP
tuaG
abnA rpoE
minD
yqhB
ykoL
iolI
yerI
pel
ywoF
yisB
comEC
yolA
ppsE
fadH fadG
spoIVCB
perR
yknY
hmp
fabD
iolH
iolC
kdgR
ccpA ycdA
ydeH
phrF
dacC
yknZ
fabG
iolF
yxkF
yttA
glnQ ywdJ
fadF
spoIIIC
yoaW
yxzE
pspA
ymzB yvgO yxaB
radC
minC
pucR
sacA acuB
pta nagA
kduD cotD
kduI
sdpI
yngC
exuR
ypuD
rsiW ybfO
yvgN
iolS
citM
iolE cydA
phrG
yncM
yydF
walR
yqhQ
arfM
ahpF
hemB
sigX
araB
araE
rapF
yhcM csbX
yycB
maf comGD
nin comEB comGF
comN
yxjJ
sacP araA
cotC
yhaR kdgT
nasD
yxeD antE
ywhH
yknX yknW ybfP
pucG
sacT
ypiF
araN araD lcfB
etfA
sdpR
vpr
lytE
ftsZ
ftsA spoVS
sbcC
yrrS ywnJ spoIIP
pucI
yflN
abfA
fadB lcfA
fadE fadN
citB
phrI
guaC
pucK pucL pucM
citZ
araQ
ycnK
pucJ
comGB addAglcR rsmG comFB comGC comC
pbpI
feuA
narG
katA
hemC
fnr
phoP
ytxG yvyD ytxH feuC
fapR
ahpC
scoA
resB
yjdB stoA
dhbC ycxA yuzA
ssbB nucA comGA
fadA
icd mdh
ctpB
nasF
dhbE yqxD besA dhbB dhbA
spo0M
ytxJ
mrgA hemA
yjmC exuT yxjC mmgA mmgC
yjmD uxuA mmgB
ybfM
mcsA
fur
ycnJ
pucD pucC pucB
citT
araR
sucC
araP
ccpC
dhbF
tuaF
clpC acpA
ybbA
cstA
araM
yxjF
ctaA narK
psd
bdbD
mcsB
feuB
plsX
tuaH
bdbC
zosA
hemL
uxaA scoB
rsbV rsbW rsbX
tagA
pssA rsiX
pucA pucFpucH pucE
fadR yesM
etfB
resA resE resC
fabG
tuaG
tagF
fabHA fabHB fabI
galT
ylbP resB
ykoL
tagE
ydjM tagD yjeA
fadM uxaC
yqhB hmp
fabD
tagB ykvT cwlO
acdA
yfkN
nsrR
uxaB
iseA
rpoE
minD
sigX
pucR
hemX
fhuD
yfhC
narI
hemD rapD pbpX
spoIVCB
perR minC
bltR
narJ narH
phoR
ywjB abnA
spoIIIC
yoaW
yknY
yvgN
blt
phoD
yqhP yknZ
feuB
pucG
pucI
bltD
sigM
yxjI
fhuG
fadH fadG
yxzE
pspA
ymzB yvgO yxaB
radC
yqhQ
ykuO
ypuD
rsiW ybfO
ybbA feuA
fadF
yrhH
guaD
yfiY ykuP yhfQ ycgT yclQ yclN
ahpF
hemB
yfmC yclP yknX
yknW ybfP
yxjJ
fapR ycnJ
pucA pucFpucH pucE
csoR
katA
hemC
phoP
ytxG yvyD ytxH
fabHA fabHB fabI
ahpC
scoA
fnr
ymfH ymfF ydfK ymfD
spoIIID
yjbC
yfhA
fhuC
hemA
yjmC exuT yxjC mmgA mmgC
yjmD uxuA mmgB
ybfM
mcsA
acpA
ywjB
pucD pucC pucB
yxjF
ctaA narK
psd
bdbD
mcsB
arfM
fhuB yfmD
yclO yxeB
ykuN
narG mrgA
ylxX
ywbO
csbB pstS tuaC
tuaB
tatCD
yfmF yfiZ
yfmE
narI
zosA
hemL
uxaA scoB
rsbV rsbW rsbX
ydjM tagD
ykvT cwlO iseA
yfhC
yfmC yclP
uxaB
yndA
mta
spoIID
murBsbp metA
tatAD
pstA
hemD rapD pbpX
yqfC yokU coxA spoIIIAD
spoIIIAG
ylxW
pstBA pstC
pstBB
yusV
copA narH
phoR
yxjI
phoA
divIB
murF ytpA
tuaE
ykvI
spoIIIAH bofA spoVE spoIIIAE
cotJC yebC
divIC
bcrC
tuaD
yodT
ysnD
yfnD
ydhF
cotJB yesK
ycgR ycgQ
phoB
ctsR
ywjA
guaD
ytrH sodF
spsK
mreD
mreB
fabF
spsJcwlJ ytvI kamA
ponA ytpB
ugtP
plsC
spoVK yyaD yobW
yhaX spoIIIAB
ydaH
yqjL yceC
sigI
yfkM
yotD
yfhA
fhuC
yitC ydcC ypjB
spoVB yodP
spoIVCA murGspoVD cotJA spoIIIAA spoIIIAC spoIIIAF yesJ
sigE ypbG
ykuT
copZ
sigM
yjbC
phoD
fhuB yfmD
yclO yxeB
ykuN
sigB
rnr
yoaA
narJ
yfmF
copA
yrhH
usd yqfZ yknT
yunB
yhaL coaX
yhdK
ywtF
ywaC
clpXlonA
bltR
csbB pstS tuaC
tuaB
tatCD
yusR
yheC ydhDyqfD
cotO
ypuA
ftsH
sigW yceE
ispDyceH
ywbO
pstA pstC
pstBB
ywjA yusV
yheD
spmA yngG
yhjR
ycgF
yjcA
yceD yceG yceF yacL
ysxC
pstBA copZ yotD
yodQ yjdH
ycgG
yhdL
secDF bmr
ysnF
disA
purR
glgA
ysxE prkA
yodR
yjfA
radA
pbuO
cotE
dacB
yngE
mbl glgD
yabQ
yngI
purF
spoIIID
ypqA ispG
glgP
spmB
ytrI
yfhD
yfkS
gerM
spoVID
yabR
spoIIM
spoIVFA
ddl
bmrR
ywmF
blt
metA
divIC
bcrC
ydaT
spoVR
yqxA
yngF glgB
yjaV asnO
safA
ydjP mreC
recU
hprT
ylxP
trxA
purH
yngJ
yteV nucB
yodS
yocL spoVMspoIVA
yeaA
yaaH
clpP
purM
yybI yhbH sqhC ylbJ spoIVFBykvUglgC yhxC
yuzC
yqeZ
yqfB
ysdB
yfkT
yhdF
cypC
yitD
ydcA ytxC
ydbT
yteJ
yozO pbpE
tilS
yocB
clpE
purL
purS purN
ymfF ydfK ymfD
yoaG
yfhM
gsiB yflH
yxiS
pbuX glyA
ymfH
mta
spoIID spoIIIAG
murBsbp
ytpA
yfkM
purDpurK
spoIIIAH bofA spoVE spoIIIAE
ylxW
yckC
yvlB mreBH rsgI ybfQ
yqfA
racX
yfhL
rsbRD
ydeC ctc
yhxD
mgsR
yhcO yobJ yjoB
yvlA
ythQ yvlD
rodA yxnA
csbA
ykgA
ygxB
purE
nusB
cotJC
divIB
murF
ysxC pbuO
cotJB
yebC
folD
xpt
yabJ
purC
spoIIIAD
yesK
ycgR ycgQ
phoB
purA pbuG yndA
coxA
aag
purB
yqfC yokU
ydaH
yqjL yceC
sigI
ywmF purN
purH
ykvI
spoIVCA murGspoVD cotJA spoIIIAA spoIIIAC spoIIIAF yesJ
sigE ypbG
ypuB ydaS
yoxB
yxbG
nadE
yhdN
yraA
ydbS
opuE yaaI ispF
ytkL
nhaX purQ
spsK
yhaL coaX
yhdK
ywtF
ywaC
yhaX spoIIIAB
yhdL
ypuA
ftsH
sigW yceE
ispDyceH
ywlB
ydaD csbC
yxzF ywiE
yfhE yodT
ysnD
sodF yfnD
secDF bmr
ysnF
yfhD ylxP
clpP
purS pbuX glyA purM
ytrH kamA
cotO
sppA xpaC
katE
ykgB
yfkH
aldY
ytvI
yjcA
bmrR hprT
yuaF yaaNydjO ywrE yoaF fosB yvlC ythP
yycD
yuaI
ywsB
ohrB
spsJcwlJ
yodP
yjfA
ddl
tilS
yocB
cypC
mreC
recU
ysdB
yfkT
yhdF yxiS
purL
yaaH
yfhM
gsiB
spoVB
yngI
yfhL
rsbRD
yflH
yvrE
ydaG
yflT bmrU
spoVK yyaD yobW
yfhF
yvaK
ywjC
yfkD
ywzA
ykzI
yjgC
yoxC ydhK
yitC
yunB
era
yfkI
ydcC ypjB
ycgF
ytrI
ycbP
ybyB ywmE
ywtG sodA
gspA
yqfZ yknT
yheC ydhDyqfD
yabQ
spoIVFA
ydeC ctc
yhxD
mgsR
nusB
safA
yeaA
ykgA
ygxB
purE
yabJ
purC
usd
ydaP
yfhK
yocK
yusR
ydaE yjgB
dps
yjgD
yheD
spmA yngG
ydjP
rodA
nhaX
purA pbuG
yodQ yjdH yhjR
yjzE
yxkO gtaB ytaB
yitT glgA
ysxE prkA ycgG
yugU
ydaF
ycdG
ydbD
cotE
dacB
yngE
mbl glgD yodR
yoxB
ispF
ypuB ydaS
racX
ypqA ispG
glgP
spmB
yfkJ
yerD gerM
spoVID
yabR
spoIIM
cdd
csbD
spoVR
yqxA
yngF glgB
yjaV asnO
gabD
ycdF yngJ
yteV nucB
yodS
yocL spoVMspoIVA
ytkL
nadE
yhdN
yybI yhbH sqhC ylbJ spoIVFBykvUglgC yhxC
yuzC
yqeZ
yqfB
opuE
ydaD csbC
yitD
ytxC
ydbT
yteJ
yozO pbpE
yaaI
yfkH ydaG
yflT bmrU aldY
yraA
yoaG yqfA sppA xpaC
katE
ykgB
ydcA
mreBH rsgI ybfQ
ydbS
yycD
yuaI yvrE
yckC
yvlB ythQ yvlD
yvaK
ywjC
yfkD
ywzA
yfkI
ohrB yfhE
ydaP
yfhK
era
yhcO
yvlA
yfhF yjgB
ycbP
yobJ yjoB
yvlC ythP
ydaE
yugU
dps
yjgD
purQ
gabR
yjzE
yfkJ yxkO gtaB
ytaB
yitT yocK ybyB ywmE
ywtG
yuaF yaaNydjO ywrE yoaF fosB
gabD
ycdF csbD yerD ydaF
ycdG
ydbD
sodA
gspA
yjcP hemAT mcpA
motB
cheV degR
pgdS
flhO tlpB mcpBlytD
tlpC fliT
argF carB argC
argJ
carA
argG
argB argH
(c)
argJ
carA
argB argH
(d)
Figure 1. Multi-scale decomposition, at a fixed time, of gene expression of Bacillus subtilis, as presented in [6]. Node size and color are related to node degree (rendered with the Cytoscape viewer, www.cytoscape.org). (a) Original data. (b) Multi-scale decomposition profiles. (c) Scale component for λ = 2. (d) Scale component for λ = 16.
Spatial Statistics
http://dx.doi.org/10.1016/j.spasta.2013.11.004
Preprint
[Figure 2 image: panels (a) to (d); graph with nodes numbered 1 to 60 and additional nodes labeled M1 to M5; the profile plot is titled "GRF time series, T=4".]
Figure 2. Simulated data. (a) Graph G (colors identify regulons). (b) The regulators of the regulons and their connections. (c) An observation y_t of the graphical random field. (d) Profiles of the first four observations y_1, ..., y_4 of a time series (the nodes are arranged in an arbitrary order while respecting the grouping around the regulons, as shown by the color segments at the bottom of the figure, which locate the regulons).
[Figure 3 image: graph with numbered nodes; four nodes also carry gene labels: 36 (codY), 37 (comK), 176 (degU) and 199 (rok).]
Figure 3. A small graph extracted from the graph of Fig. 1. It is organized around four regulons.
[Figure 4 image: five graph panels with numbered nodes: (a) t=7, (b) t=8, (c) t=9, (d) t=10, (e) t=11.]
Figure 4. Module detection on a time series of length T = 11. As in [6], this analysis was performed without taking time dependence into account. In this case, the detection at time t = 8 differs from that at times t = 7, 9, 10, 11. By contrast, with the spatio-temporal model, this difference disappears.
[Figure 5 image: time-series plot; horizontal axis from 0 to 50, vertical axis from −20 to 15.]
Figure 5. Drift simulation. The drift has the effect of swapping the relative positions of the variables Y(1) (green) and Y(2) (red) after the intervention time t_I = 25.
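The kind of behavior shown in Figure 5 can be illustrated with a minimal sketch. It assumes a two-variable autoregressive model of order 1 with a piecewise-constant drift, y_t = A y_{t-1} + d_t + eps_t, in which the drift d_t switches at the intervention time. The function name, the matrix A, the drift values, the noise level and the seed are all hypothetical choices made for illustration; they are not the settings used to produce the figure.

```python
import numpy as np


def simulate_var1_with_drift(T=50, t_int=25, a=0.5, sigma=0.5, seed=0):
    """Simulate a 2-variable VAR(1) with a drift that switches at t_int.

    Assumed illustrative model: y_t = A y_{t-1} + d_t + eps_t,
    with eps_t i.i.d. Gaussian noise of standard deviation sigma.
    """
    rng = np.random.default_rng(seed)
    A = a * np.eye(2)                 # hypothetical autoregressive matrix
    d_before = np.array([-3.0, 3.0])  # hypothetical drift up to t_int
    d_after = np.array([3.0, -3.0])   # hypothetical drift after t_int
    y = np.zeros((T + 1, 2))          # y[0] is the initial state
    for t in range(1, T + 1):
        d = d_before if t <= t_int else d_after
        y[t] = A @ y[t - 1] + d + sigma * rng.standard_normal(2)
    return y


if __name__ == "__main__":
    y = simulate_var1_with_drift()
    # Compare the average levels of the two variables on each side of t_int.
    print("mean before:", y[15:26].mean(axis=0))
    print("mean after: ", y[40:].mean(axis=0))
```

With these assumed values each variable settles near d/(1 - a), so the two series exchange their relative positions a few steps after t_int. Whether Y(1) starts above or below Y(2) in the original figure is not recoverable from the caption; the ordering chosen here is arbitrary.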
[Figure 6 image: time-series plot; horizontal axis from 0 to 20, vertical axis from 40 to 180.]
Figure 6. Time series of the mean value of the three regulons, computed on the graph G shown in Fig. 7.
[Figure 7 image: sequence of 20 image frames, numbered 1 to 20.]
Figure 7. Time series of images, and overlaid graph G describing the structure of the membranes. This graph is organized around three regulons (red, blue and green).