Advanced data, signal and image processing tools for biomedical and medical applications
Ali Mohammad-Djafari
Groupe Problèmes Inverses, Laboratoire des Signaux et Systèmes, UMR 8506 CNRS - SUPELEC - Univ Paris Sud 11
Supélec, Plateau de Moulon, 91192 Gif-sur-Yvette, FRANCE.
[email protected] http://djafari.free.fr http://www.lss.supelec.fr
SPIS2015, December 16-17, 2015, Amirkabir University of Technology (AUT)
Summary 1
◮ Data, signals, images in biological and medical applications
  ◮ Individual cells, populations of cells, small animals, humans
  ◮ In vitro and in vivo
◮ A great number of data, variables, time series, signals, images, ...
  ◮ Gene expression, hormones, temperature, ECG, EMG, ...
  ◮ Tomographic images (X rays, PET, SPECT, MRI), 3D body volume, fMRI
  ◮ Holographic, multi- and hyper-spectral images, ...
◮ Need for visualization tools
  ◮ Multicomponent, multivariate and multidimensional data
  ◮ Time domain
  ◮ Transformed domain: Fourier, Wavelets, time-frequency, ...
  ◮ Scatter plots, histograms, statistics, ...
Summary 2
◮ Modeling time series
  ◮ Parametric: superposition of sinusoids, Gaussian shapes, ...
  ◮ Non-parametric: Fourier, Wavelets, time-frequency, time-scale, ...
  ◮ Probabilistic: Moving Average (MA), Autoregressive (AR), ARMA, Markovian models, ...
◮ Modeling images
  ◮ Simple Markovian models (intensity, texture, ...)
  ◮ Hierarchical Markovian models with hidden variables of contours and regions
◮ Modeling the relation between observed data and unknowns
  ◮ Linear / non-linear
  ◮ Training and test data
Summary 3
◮ Simple analysis: estimating periods, computing harmonic components, spectra, ...
◮ Multicomponent/multivariate data analysis: dimension reduction with PCA, FA, ICA and Sparse PCA for extracting the main factors
◮ Multicomponent/multivariate discriminant analysis with classification: LDA, EDA, RDA and Sparse LDA for finding the most discriminant factors
◮ Blind source separation
◮ Correlation (Pearson or Spearman) computation and dependency graph visualization
◮ Modelling input-output relations
1D data: one variable
◮ Data: x_i, i = 1, · · · , M
◮ 1D plot, mean, median, variance
◮ No order: exchangeable
◮ Histogram, probability distribution
◮ Statistical modelling: expected value, variance, mode, median, higher order moments, entropy
◮ Parametric, semi-parametric and non-parametric modelling
◮ Parameter estimation: MM, ML, Bayesian
◮ Model selection: AIC, BIC, ...
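A minimal sketch of these 1D summaries (empirical moments, histogram, and the Gaussian ML fit) with NumPy; the data x here are simulated for illustration, not the slide's data.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(loc=0.0, scale=1.0, size=200)     # M = 200 exchangeable samples (illustrative)

# Empirical summaries: expected value, variance, median, a higher-order moment
mean, var, med = x.mean(), x.var(ddof=1), np.median(x)
skew = ((x - mean) ** 3).mean() / x.std() ** 3   # third standardized moment

# Histogram as a crude estimate of the probability distribution
counts, edges = np.histogram(x, bins=30, density=True)

# For a Gaussian model, the ML estimates coincide with the empirical mean and variance
mu_ml, sigma2_ml = x.mean(), x.var()
print(f"E={mean:.3f}  V={var:.3f}  median={med:.3f}  skewness={skew:.3f}")
```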
1D data (Gaussian)
[Figure: simulated Gaussian samples, their histogram and the fitted probability density; empirical mean E = 0.12323, variance V = 0.9727.]
1D data (Gamma)
[Figure: simulated Gamma samples, their histogram and the fitted probability density; empirical mean E = 1.0291, variance V = 1.0623.]
1D signals: Time series
◮ 1D signal / time series: x_i = f(t_i)
◮ In general not exchangeable
◮ Time representation f(t), Fourier transform and Fourier representation F(ν), autocorrelation function R(τ), power spectral density S(ν)
◮ Stationary and non-stationary
◮ STFT, time-frequency, time-scale, Wavelets, ...
◮ Smoothing, noise removal, filtering
◮ Periodic signals, estimation of the period, Fourier series
◮ Modeling:
  ◮ Sum of sinusoids model and parameter estimation
  ◮ Moving average (MA) model
  ◮ Autoregressive (AR) model
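A short sketch of the basic representations listed above, computed on an assumed hourly signal with a 24 h component; the signal, sampling step and noise level are illustrative.

```python
import numpy as np

dt = 1.0                                  # sampling step in hours (assumed)
t = np.arange(0, 96, dt)
f = np.sin(2 * np.pi * t / 24) + 0.2 * np.random.default_rng(1).normal(size=t.size)

# Fourier representation F(nu) on a frequency grid in cycles/hour
F = np.fft.rfft(f)
nu = np.fft.rfftfreq(t.size, d=dt)

# Autocorrelation function R(tau), normalized so that R(0) = 1
fc = f - f.mean()
R = np.correlate(fc, fc, mode="full")[fc.size - 1:]
R /= R[0]

# Power spectral density S(nu) taken as |F(nu)|^2 (periodogram, up to a scale factor)
S = np.abs(F) ** 2
```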
1D signals f(t) and its FT F(ν)
[Figure: signal f(t) over 96 hours; modulus of its Fourier transform |F(ν)|² (frequency in 1/day); autocorrelation function R(τ); spectral density S(ν).]
Multi-component, multi-variate, multi-dimensional data
◮ bi-component: {(x_i, y_i), i = 1, · · · , M}: 2M elements
◮ bi-variate: {x_i, i = 1, · · · , M}, {y_j, j = 1, · · · , N}: M + N elements
◮ bi-dimensional: images x_{i,j}, i = 1, · · · , M, j = 1, · · · , N: M × N elements
[Figure: two data sets x_i, y_j shown as time series, as a scatter plot, and a 2D image.]
Bi-component or bi-variate data
◮ 2D distribution: joint probability distribution p(x, y)
◮ Conditionals p(x|y), p(y|x)
◮ Marginal distributions p(x), p(y)
◮ Expected values E(X), E(Y), variances V(X), V(Y), covariances, higher order moments, entropy
◮ Independence tests
◮ Copula, ...
[Figure: two data sets x_i, y_j shown as time series and as a scatter plot in the (x, y) plane.]
Bivariate data
◮ Joint, marginal and conditional probability density functions
[Figure: joint probability density of (x1, x2) shown as a surface, together with the scatter of the data.]
Images: Space, Fourier and Wavelets representations
[Figure: an image and its representations in the space, Fourier and wavelet domains.]
Sparse images (Fourier and Wavelets domain)
[Figure: an image, its Fourier coefficients and its wavelet coefficients (bands 1-3, 4-6, 7-9), together with the histograms of the image values, the Fourier coefficients and the wavelet coefficients; the transform-domain histograms are sharply peaked around zero, i.e. sparse.]
Multi-variate data and signals
◮ Multi-variate data:
  ◮ Scatter plots, correlation coefficients (Pearson and Spearman)
  ◮ Multivariate probability distribution, means, variances, covariances
  ◮ Joint, marginal and conditional probability density functions
◮ Multi-variate signals:
  ◮ Multi-variate time series
  ◮ Auto- and inter-correlation functions
  ◮ Auto- and inter-power spectral densities
◮ Need for advanced visualization, dimensionality reduction, factor analysis, modelling, parameter estimation, classification, ... and knowledge extraction
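A small sketch, assuming two synthetic components of a multivariate signal, of the Pearson/Spearman coefficients and of the inter-correlation (cross-correlation) function between them; all values are placeholders.

```python
import numpy as np
from scipy import stats, signal

rng = np.random.default_rng(2)
t = np.arange(96.0)
# Two illustrative components sharing a 24 h rhythm, plus noise
g1 = np.sin(2 * np.pi * t / 24) + 0.3 * rng.normal(size=t.size)
g2 = np.cos(2 * np.pi * t / 24) + 0.3 * rng.normal(size=t.size)

# Pearson and Spearman correlation coefficients between the two components
r_pearson, _ = stats.pearsonr(g1, g2)
r_spearman, _ = stats.spearmanr(g1, g2)

# Inter-correlation function between g1 and g2, as a function of the lag
g1c, g2c = g1 - g1.mean(), g2 - g2.mean()
xcorr = signal.correlate(g1c, g2c, mode="full") / (g1c.std() * g2c.std() * t.size)
lags = signal.correlation_lags(t.size, t.size, mode="full")
```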
Temperature and activity time series before, during and after some treatment
[Figure: temperature and activity recorded over 48 hours, before, during and after the treatment (panels labelled BOD: before / during / after).]
Simple questions for 1D time series
◮ Main question: has something changed during and after the medical action?
◮ In this study: effects of the circadian cycle on cancer cells.
  1. Are there any periodic components in these signals? Yes/No (detection)? With what confidence?
  2. If yes, how many?
  3. What are those components (periods p_i or frequencies ν_i, amplitudes a_i)?
◮ When questions 1 and 2 are answered, the problem becomes easier: parameter estimation
◮ Trying to answer all three questions at the same time: semi- or non-parametric modelling
◮ Biologists always need uncertainties −→ Bayesian inference
Simple analysis tools may not be successful even in very simple cases
[Figure: three cases — one sinusoid, two sinusoids + noise, three sinusoids + noise — each shown as a signal versus time (hours) and as its estimated spectrum versus period (hours).]
Classical methods: Spectral estimation S(ω)?
◮ Fast Fourier Transform (FFT): g(t) −→ FFT −→ F(ω) −→ S(ω) = |F(ω)|²
  ◮ Advantages: well known and understood, fast
  ◮ Drawbacks: linear in the frequencies ν, but not equidistant in the periods:
      ν = [0, · · · , N − 1] −→ p = [∞, 1, · · · , 1/(N − 1)]
◮ Autocorrelation function γ(τ):
  ◮ If g(t) is periodic, then γ(τ) is also periodic, but much smoother
  ◮ γ(0) = 1, γ(τ) ≤ γ(0), ∀τ
◮ Power spectral density: γ(τ) −→ FFT −→ S(ω)
◮ Autoregressive (AR), Moving Average (MA) and ARMA models
◮ Non-stationary GARCH models
◮ Sum of sinusoidal components
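A short illustration, under assumed sampling values, of why the FFT grid is equidistant in frequency but not in period, so that long periods such as 24 h are only coarsely sampled.

```python
import numpy as np

N, dt = 96, 1.0                                  # 96 hourly samples (assumed, for illustration)
nu = np.fft.rfftfreq(N, d=dt)                    # equidistant frequency grid, cycles/hour
periods = np.where(nu > 0, 1.0 / nu, np.inf)     # corresponding periods in hours

# Successive accessible periods around the circadian range are far from equidistant:
around_24h = periods[(periods > 10) & (periods < 100)]
print(around_24h)                                # e.g. 96, 48, 32, 24, 19.2, 16, ...
```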
Parametric, Semi- and Non-Parametric models
◮ Parametric:
    g(t) = Σ_{k=1}^{K} a_k sin(2πν_k t + φ_k) + ε(t),   θ = {a_k, φ_k, ν_k}
    g(t) = Σ_{k=1}^{K} a_k cos(2πν_k t) + b_k sin(2πν_k t) + ε(t),   θ = {a_k, b_k, ν_k}
    g(t) = Σ_{k=1}^{K} c_k exp[j2πν_k t] + ε(t),   θ = {c_k, ν_k},   t = 0, · · · , T
◮ Semi-parametric: ν_k = kν_0, ν_0 = 1/T, K = T −→ DFT
◮ Non-parametric: ν_k fixed in a given interval with a given precision, so K is fixed but can be as large as necessary:
    g(t) = Σ_{k=1}^{K} c_k exp[j2πν_k t] + ε(t),   θ = {c_k}   (linear model)
Can we propose a unifying approach for all these problems? My answer is Yes:
◮ Identify what you are looking for (the unknowns f)
◮ Identify what the data are (the observations g)
◮ Consider the errors (modeling and measurement): ε
◮ Write the forward model relating them: g = Hf + ε
◮ Write the expression of the likelihood p(g|f)
◮ Translate your prior knowledge on the unknowns into p(f)
◮ Use the Bayes rule:
    p(f|g) = p(g|f) p(f) / p(g) ∝ p(g|f) p(f)
◮ Infer on f using the posterior p(f|g):
  ◮ Maximum A Posteriori (MAP): f̂ = arg max_f {p(f|g)}
  ◮ Posterior Mean (PM): f̂ = ∫ f p(f|g) df
Estimating Periodic Components: Inverse Problems Approach
    g(t) = Σ_{k=1}^{K} c_k exp[j2πν_k t] + ε(t),   θ = {c_k}   (linear model)
Slight change of notation: use periods p_n in place of the frequencies ν_k and f_n in place of c_k:
    g(t) = Σ_{n=1}^{N} f_n exp[j2πt/p_n] + ε(t),   t = mΔt,   m = 1, · · · , M
Defining the vectors g = [g_1, · · · , g_M]′, ε = [ε_1, · · · , ε_M]′, f = [f_1, · · · , f_N]′ and the matrix H with entries H_{m,n} = exp[j2πmΔt/p_n], we obtain
    g = Hf + ε
The objective is to infer f.
Inverse Problems Approach
    g = Hf + ε
Bayesian approach:
◮ Assign the likelihood: p(g|f)
◮ Assign the prior law: p(f)
◮ Use the Bayes rule: p(f|g) ∝ p(g|f) p(f)
◮ MAP: f̂ = arg max_f {p(f|g)} = arg min_f {J(f)}
◮ Assuming Gaussian noise and a Gaussian prior:
    p(f|g) = N(f|f̂, Σ̂) with Σ̂ = (H′H + λI)⁻¹ and f̂ = arg min_f {J(f)},
    J(f) = ‖g − Hf‖² + λ‖f‖²
◮ Other priors (Generalized Gaussian, Student-t or Cauchy):
    J(f) = ‖g − Hf‖² + λ Ω(f)
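A minimal sketch of this quadratic (Gaussian-prior) MAP estimate: build H from an assumed grid of candidate periods, then solve the regularized normal equations. The period grid, λ and the simulated signal are all illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(3)
dt, M = 1.0, 96                                   # hourly samples over 4 days (assumed)
t = np.arange(1, M + 1) * dt
g = np.cos(2 * np.pi * t / 24) + 0.3 * rng.normal(size=M)   # toy data with a 24 h period

# Candidate periods p_n (hours) and the forward matrix H_{m,n} = exp(j 2π t_m / p_n)
periods = np.arange(8.0, 33.0, 1.0)
H = np.exp(2j * np.pi * t[:, None] / periods[None, :])

# MAP / regularized least-squares estimate: f_hat = (H^H H + λ I)^{-1} H^H g
lam = 1.0
A = H.conj().T @ H + lam * np.eye(periods.size)
f_hat = np.linalg.solve(A, H.conj().T @ g)

amplitudes = np.abs(f_hat)                        # spectrum on the period grid
print(periods[np.argmax(amplitudes)])             # should be close to 24
```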
Bayesian estimation with priors enforcing sparsity
◮ Sparsity: for any periodic signal, the spectrum is a set of Diracs
◮ Biological signals related to clock genes: a few independent oscillators
◮ The spectrum has a few non-zero elements in any given interval:
    g(mΔt) = Σ_{n=1}^{N} f_n exp[−j2πmΔt/p_n] + ε(mΔt),   m = 1, · · · , M
◮ g = Hf + ε with f sparse
◮ The question is now: how to translate sparsity? Two solutions: L1 regularization and Bayesian sparsity enforcing priors.
◮ Three main options in the Bayesian approach: Generalized Gaussian, Student-t, mixture models
Bayesian estimation with priors enforcing sparsity
◮ g = Hf + ε with f sparse
◮ To translate this information, use the heavy-tailed Student-t prior with its hierarchical structure and hidden variables:
    St(f_j|ν) ∝ exp[ −((ν+1)/2) log(1 + f_j²/ν) ]
◮ Infinite Gaussian Scaled Mixture (IGSM) property:
    St(f_j|ν) = ∫_0^∞ N(f_j|0, 1/z_j) G(z_j|α, β) dz_j,   with α = β = ν/2
◮ Hierarchical prior model:
    p(f_j|z_j) = N(f_j|0, 1/z_j),   p(z_j) = G(z_j|α, β)
    p(f|z) = Π_j p(f_j|z_j),   p(z) = Π_j p(z_j),   p(g|f) = N(g|Hf, σ_ε² I)
    −→ p(f, z|g) ∝ p(g|f) p(f|z) p(z)
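A rough sketch of an alternate-optimization scheme for this hierarchical model (not the author's VBA algorithm): update f by a weighted ridge solve given z, then update each z_j from its Gamma conditional. The generic sparse problem and all parameter values are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(4)
M, N = 100, 50
H = rng.normal(size=(M, N))
f_true = np.zeros(N)
f_true[[5, 20, 35]] = [3.0, -2.0, 1.5]           # sparse ground truth (toy example)
sigma = 0.1
g = H @ f_true + sigma * rng.normal(size=M)

# Student-t / IGSM hyperparameters: alpha = beta = nu/2
nu = 1.0
alpha = beta = nu / 2.0

z = np.ones(N)                                   # hidden precision variables z_j
for _ in range(50):
    # f-step: weighted ridge solve, f = (H'H/sigma^2 + diag(z))^{-1} H'g/sigma^2
    A = H.T @ H / sigma**2 + np.diag(z)
    f = np.linalg.solve(A, H.T @ g / sigma**2)
    # z-step: posterior mean of Gamma(alpha + 1/2, beta + f_j^2/2)
    z = (alpha + 0.5) / (beta + 0.5 * f**2)

print(np.round(f, 2))                            # large entries should appear near indices 5, 20, 35
```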
Results on simulated and real activity data
[Figure: activity signal CT 502 B3 (before, during and after treatment) plotted against the number of days, with the period spectra estimated by the proposed method (VBA) and by the FFT; panels labelled Data BEFORE / DURING / AFTER, Proposed method and FFT, with amplitude plotted against period in hours.]
Dimension reduction: PCA, Factor Analysis, ICA
◮ M variables g(t) are observed. They are redundant. Can we express them with N ≤ M factors f? How many factors (principal components, independent components) can describe the observed data?
    g_i(t) = Σ_{j=1}^{N} a_{ij} f_j(t) + ε_i(t)   ⇔   g(t) = A f(t) + ε(t)
    A: (M × N) loading matrix, N ≤ M;   f(t): factors, sources
◮ How to find both A and the factors f(t)?
◮ Bayesian methods:
    (Â, f̂) = arg max_{(A,f)} {p(A, f|g)} = arg min_{(A,f)} {−ln p(g|A, f) − ln p(A) − ln p(f)}
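For orientation, a non-Bayesian sketch using scikit-learn's classical PCA, FactorAnalysis and FastICA on an assumed data matrix G (rows = time points, columns = the M observed variables); the Bayesian JMAP/VBA estimation of A and f described above is not what these library calls implement.

```python
import numpy as np
from sklearn.decomposition import PCA, FactorAnalysis, FastICA

rng = np.random.default_rng(5)
T, M, N = 200, 13, 2                          # time points, observed variables, factors (illustrative)
F = rng.normal(size=(T, N))                   # hidden factors f(t)
A = rng.normal(size=(M, N))                   # loading matrix
G = F @ A.T + 0.1 * rng.normal(size=(T, M))   # observations g(t) = A f(t) + noise

pca = PCA(n_components=N).fit(G)
fa = FactorAnalysis(n_components=N).fit(G)
ica = FastICA(n_components=N, random_state=0).fit(G)

factors_pca = pca.transform(G)                # estimated factors (principal components)
loadings_fa = fa.components_.T                # estimated (M x N) loading matrix
```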
How to determine the number of factors
◮ When N is given:
    p(A, f|g) ∝ p(g|A, f) p(A) p(f)
  ◮ Different choices for p(A) and p(f), and different methods to estimate both A and f: JMAP, EM, Variational Bayesian Approximation
◮ When N is not known:
  ◮ Model selection
  ◮ Bayesian or maximum likelihood methods
  ◮ To determine the number of factors, we perform the analysis with different numbers of factors N and use two criteria: the negative log-likelihood −ln p(g|A, N) of the observations and the DFE (degrees of freedom error) ((N − M)² − (N + M))/2, related to the AIC and BIC model selection criteria.
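A small sketch of this selection loop, using the probabilistic-PCA likelihood that scikit-learn's PCA.score provides as a stand-in for −ln p(g|A, N); the data matrix and the true number of factors are assumptions.

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(6)
T, M, true_N = 200, 13, 2
G = rng.normal(size=(T, true_N)) @ rng.normal(size=(true_N, M)) + 0.1 * rng.normal(size=(T, M))

for N in range(1, 8):
    pca = PCA(n_components=N).fit(G)
    neg_loglik = -pca.score(G) * T            # score() is the average log-likelihood per sample
    dfe = ((N - M) ** 2 - (N + M)) / 2        # degrees-of-freedom error from the slide
    print(f"N={N}  -logL={neg_loglik:10.1f}  DFE={dfe:6.1f}")
```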
Factor Analysis: 13 variables, 2 factors
[Figure: loadings of the genes (P53, Bax, Mdm2, Bcl2, Ccnb2, Ccna2, Wee1, DBP, UGT, Top1, CE2, Bmal1, Per2, Rev) on the two components, computed from the time series and from their FT amplitudes.]
Factor Analysis: Time series, number of factors
[Figure: gene loadings obtained with 1 to 7 factors, together with the −log-likelihood and DFE curves used to select the number of factors.]
Sparse PCA
◮ In classical PCA, FA and ICA, one looks for principal (uncorrelated or independent) components.
◮ In Sparse PCA or FA, one looks for the loading matrix A with the sparsest components.
◮ This can be imposed via the prior p(A), and leads to the selection of the fewest variables.
[Figure: gene loadings obtained by PCA and by Sparse PCA (SPCA) with 2 and 3 components.]
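A brief sketch using scikit-learn's SparsePCA (an L1-penalized formulation, not the Bayesian prior p(A) mentioned above) on an assumed data matrix G.

```python
import numpy as np
from sklearn.decomposition import PCA, SparsePCA

rng = np.random.default_rng(7)
T, M = 200, 13
G = rng.normal(size=(T, 3)) @ rng.normal(size=(3, M)) + 0.1 * rng.normal(size=(T, M))

pca = PCA(n_components=2).fit(G)
spca = SparsePCA(n_components=2, alpha=1.0, random_state=0).fit(G)

# SparsePCA loadings contain exact zeros, i.e. each component uses few variables
print("PCA loadings:\n", np.round(pca.components_, 2))
print("Sparse PCA loadings:\n", np.round(spca.components_, 2))
```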
Discriminant Analysis
◮ When we have data and classes, the question to answer is: what are the most discriminant factors?
◮ There are many variants:
  ◮ Linear Discriminant Analysis (LDA),
  ◮ Quadratic Discriminant Analysis (QDA),
  ◮ Exponential Discriminant Analysis (EDA),
  ◮ Regularized LDA (RLDA), ...
◮ One can also ask for the sparsest linear discriminant factors (SLDA)
◮ Deterministic point of view (geometrical distances)
◮ Probabilistic point of view (mixture densities)
◮ Mixture of Gaussians models: each class is modelled by a Gaussian pdf
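A minimal sketch of LDA with scikit-learn, assuming a labelled data matrix (e.g. gene profiles with three classes); it projects the data onto the most discriminant directions and classifies the samples.

```python
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

rng = np.random.default_rng(8)
# Toy labelled data: 3 classes with shifted means in a 13-dimensional space
X = np.vstack([rng.normal(loc=c, size=(30, 13)) for c in (0.0, 1.0, 2.0)])
y = np.repeat([1, 2, 3], 30)

lda = LinearDiscriminantAnalysis(n_components=2).fit(X, y)
Z = lda.transform(X)                      # samples projected on the 2 most discriminant factors
print(lda.score(X, y))                    # training accuracy (use a held-out test set in practice)
```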
Discriminant Analysis: Time series, Colon
[Figure: colon time-series data projected on the discriminant factors, colour-coded by class (1, 2, 3), together with the gene loadings of the factors.]
Sparse Discriminant Analysis: Time series, colon
What are the sparsest discriminant factors?
[Figure: colon time-series data projected on the sparse discriminant factors, together with the gene loadings of those factors.]
LDA and SLDA study on time series: 1: before, 2: during, 3: after
[Figure: LDA and SLDA projections of the time-series data, colour-coded by class (1: before, 2: during, 3: after).]
Dependency graphs
◮ The main objective here is to show the dependencies between variables
◮ Three different measures can be used: Pearson ρ, Spearman ρ_s and Kendall τ
◮ In this study we used ρ_s
◮ A table of pairwise ρ_s values is computed and displayed in different forms: Hinton diagram, adjacency table and graphical network representation
[Figure: Hinton diagram, adjacency table and network graph of the pairwise Spearman correlations between the genes.]
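A compact sketch, assuming a data matrix X with one column per gene, that computes the pairwise Spearman table and turns it into an adjacency matrix and a graph (here with networkx); the threshold and gene list are illustrative choices.

```python
import numpy as np
import networkx as nx
from scipy.stats import spearmanr

rng = np.random.default_rng(9)
genes = ["P53", "Bax", "Mdm2", "Bcl2", "Per2", "Bmal1", "Rev"]
X = rng.normal(size=(100, len(genes)))          # placeholder data: samples x genes

rho, _ = spearmanr(X)                           # pairwise Spearman correlation table
adjacency = (np.abs(rho) > 0.5) & ~np.eye(len(genes), dtype=bool)   # threshold is arbitrary

G = nx.Graph()
G.add_nodes_from(genes)
for i in range(len(genes)):
    for j in range(i + 1, len(genes)):
        if adjacency[i, j]:
            G.add_edge(genes[i], genes[j], weight=rho[i, j])
print(G.edges(data=True))
```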
Graph of Dependencies: Colon, Class 1
[Figure: Hinton diagram, adjacency table and network graph of the pairwise Spearman correlations between the genes, computed from the time series (top) and from their FT amplitudes (bottom).]
Classification tools
◮ Supervised classification
  ◮ K nearest neighbours methods
  ◮ Needs training data
  ◮ Must be careful to measure the performance of the classification on a different set of data (test set)
◮ Unsupervised classification
  ◮ Mixture models
  ◮ Expectation-Maximization methods
  ◮ Bayesian versions of EM
  ◮ Bayesian Variational Approximation (VBA)
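A short sketch of both routes with scikit-learn: a K-nearest-neighbours classifier evaluated on a held-out test set, and a Gaussian mixture fitted by EM for unsupervised clustering. The data are simulated placeholders.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(10)
X = np.vstack([rng.normal(loc=c, size=(50, 4)) for c in (0.0, 2.0, 4.0)])
y = np.repeat([0, 1, 2], 50)

# Supervised: K nearest neighbours, with performance measured on a separate test set
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)
knn = KNeighborsClassifier(n_neighbors=5).fit(X_tr, y_tr)
print("test accuracy:", knn.score(X_te, y_te))

# Unsupervised: Gaussian mixture model fitted by EM
gmm = GaussianMixture(n_components=3, random_state=0).fit(X)
labels = gmm.predict(X)                          # cluster assignments
```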
Classification tools
[Figure: individual time series plotted against time in hours, grouped by unsupervised classification into clusters; each panel lists the indices of the series assigned to that cluster.]
Input-Output modeling using training data and test data
◮ Linear models: g_k = A f_k + ε_k,   k = 1, · · · , K
◮ Bayesian framework, MAP estimation with hyperparameter estimation:
    p(A|{g_k, f_k}) ∝ Π_k p(g_k|A, f_k) p(A)
◮ Gaussian priors for ε_k and for A, and the MAP solution:
    Â = arg max_A {p(A|{g_k, f_k})}
    Â = (Σ_k g_k f_k′)(Σ_k f_k f_k′ + λI)⁻¹
◮ Other priors can enforce sparsity or block-sparsity of the prediction matrix A
◮ See the poster of Mircea Dumitru et al. for weight-loss prediction from the expression of the two genes Bmal1 and Rev-erb-alpha
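A small sketch of this closed-form MAP/ridge estimate of A from training pairs {(f_k, g_k)}, with simulated data and an illustrative λ; predicting the output for a held-out input shows how the model would be used on test data.

```python
import numpy as np

rng = np.random.default_rng(11)
M, N, K = 3, 5, 40                         # output dim, input dim, training pairs (illustrative)
A_true = rng.normal(size=(M, N))
F = rng.normal(size=(N, K))                # training inputs f_k as columns
G = A_true @ F + 0.05 * rng.normal(size=(M, K))   # training outputs g_k = A f_k + noise

# MAP estimate with Gaussian priors: A_hat = (sum g_k f_k') (sum f_k f_k' + lambda I)^{-1}
lam = 0.1
A_hat = (G @ F.T) @ np.linalg.inv(F @ F.T + lam * np.eye(N))

# Use the estimated A to predict the output for a new (test) input
f_test = rng.normal(size=N)
g_pred = A_hat @ f_test
```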
Conclusions
◮ A lot remains to be done to answer the questions of biologists
◮ Forward modeling and Bayesian inference are natural tools to answer these questions
◮ Very often the questions are ill-posed inverse problems which need prior knowledge
◮ Appropriate translation of prior knowledge into prior laws is very important
◮ Careful computational algorithms have to be developed
◮ Careful presentation and interpretation of the inference results are very important
◮ Constant dialogue between "biologists" and "data and signal processors" is of great importance.