Advanced data, signal and image processing tools for biomedical and medical applications
Ali Mohammad-Djafari
Groupe Problèmes Inverses, Laboratoire des Signaux et Systèmes, UMR 8506 CNRS - SUPELEC - Univ Paris Sud 11
Supélec, Plateau de Moulon, 91192 Gif-sur-Yvette, FRANCE.
[email protected]
http://djafari.free.fr http://www.lss.supelec.fr

SPIS2015, December 16-17, 2015, Amirkabir University of Technology (AUT)

A. Mohammad-Djafari, SPIS2015, December 16-17, Amirkabir University (Polytechnique), Tehran.

Summary 1

◮ Data, signals, images in biological and medical applications
  ◮ Individual cells, populations of cells, small animals, humans
  ◮ In vitro and in vivo
◮ A great number of data, variables, time series, signals, images, ...
  ◮ Gene expression, hormones, temperature, ECG, EMG, ...
  ◮ Tomographic images (X rays, PET, SPECT, MRI), 3D body volume, fMRI, holographic, multi- and hyper-spectral images, ...
◮ Need for visualization tools
  ◮ Multicomponent, multivariate and multidimensional
  ◮ Time domain
  ◮ Transformed domain: Fourier, wavelets, time-frequency, ...
  ◮ Scatter plots, histograms, statistics, ...


Summary 2

◮ Modeling time series
  ◮ Parametric: superposition of sinusoids, Gaussian shapes, ...
  ◮ Non-parametric: Fourier, wavelets, time-frequency, time-scale, ...
  ◮ Probabilistic: Moving Average (MA), Autoregressive (AR), ARMA, Markovian models, ...
◮ Modeling images
  ◮ Simple Markovian models (intensity, texture, ...)
  ◮ Hierarchical Markovian models with hidden variables of contours and regions
◮ Modeling the relation between observed data and unknowns
  ◮ Linear / Nonlinear
  ◮ Training and test data


Summary 3

◮ Simple analysis: estimating periods, computing harmonic components, spectra, ...
◮ Multicomponent/multivariate data analysis: dimension reduction; PCA, FA, ICA, Sparse PCA for dimension reduction and main factor extraction
◮ Multicomponent/multivariate discriminant analysis with classification: LDA, EDA, RDA, Sparse LDA for finding the most discriminant factors
◮ Blind source separation
◮ Correlation (Pearson or Spearman) computation and dependency graph visualization
◮ Modelling input-output relations


1D data: one variable

◮ Data: x_i, i = 1, · · · , M
◮ 1D plot, mean, median, variance
◮ No order: exchangeable
◮ Histogram, probability distribution
◮ Statistical modelling: expected value, variance, mode, median, higher order moments, entropy
◮ Parametric, semi-parametric and non-parametric modelling
◮ Parameter estimation: Method of Moments (MM), Maximum Likelihood (ML), Bayesian
◮ Model selection: AIC, BIC, ...

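The basic descriptive quantities listed above can be sketched in a few lines. This is a minimal illustration (not from the talk; it assumes NumPy and a synthetic Gaussian sample) of the location, spread and empirical-distribution estimates for an exchangeable 1D data set:

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(loc=0.0, scale=1.0, size=200)  # exchangeable 1D sample

mean = x.mean()                 # estimate of the expected value
median = np.median(x)           # robust location estimate
var = x.var(ddof=1)             # unbiased variance estimate
counts, edges = np.histogram(x, bins=25)  # empirical distribution
```

With 200 samples the estimated mean and variance are close to the generating values (0 and 1), which is the kind of empirical check shown on the next slides.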

1D data (Gaussian)
[Figure: a Gaussian sample of 200 points, its histogram, and the fitted pdf; empirical mean E = 0.12323, variance V = 0.9727.]

1D data (Gamma)
[Figure: a Gamma sample of 200 points, its histogram, and the fitted pdf; empirical mean E = 1.0291, variance V = 1.0623.]

1D signals: Time series

◮ 1D signal / time series: x_i = f(t_i)
◮ In general not exchangeable
◮ Time representation f(t), Fourier transform and Fourier representation F(ν), autocorrelation function R(τ), power spectral density S(ν)
◮ Stationary and non-stationary
◮ STFT, time-frequency, time-scale, wavelets, ...
◮ Smoothing, noise removal, filtering
◮ Periodic signals, estimation of the period, Fourier series
◮ Modeling:
  ◮ Sum of sinusoids model and parameter estimation
  ◮ Moving average (MA) model
  ◮ Autoregressive (AR) model

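The representations above (F(ν), R(τ), S(ν)) can be computed directly. A small sketch (assuming NumPy; the circadian-like signal with one cycle per day is made up for illustration):

```python
import numpy as np

fs = 24.0                        # samples per "day" (hourly sampling)
t = np.arange(0, 4, 1/fs)        # 4 days of synthetic data
f0 = 1.0                         # one cycle per day
rng = np.random.default_rng(1)
g = np.sin(2*np.pi*f0*t) + 0.2*rng.standard_normal(t.size)

# Fourier representation F(nu) and power spectral density S(nu)
F = np.fft.rfft(g)
nu = np.fft.rfftfreq(g.size, d=1/fs)
S = np.abs(F)**2 / g.size

# Autocorrelation R(tau), normalized so that R(0) = 1
gc = g - g.mean()
R = np.correlate(gc, gc, mode='full')[g.size - 1:]
R = R / R[0]

nu_peak = nu[np.argmax(S[1:]) + 1]   # dominant frequency, skipping the DC bin
```

The spectral peak lands at 1 cycle/day, and R(τ) is itself periodic with the same period but much smoother, as stated later for the autocorrelation of periodic signals.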

1D signal f(t) and its FT F(ν)
[Figure: the signal f(t) vs time (hours); the modulus of its FT |F(ν)|² vs frequency (1/day); the autocorrelation function R(τ); the spectral density S(ν).]

Multi-component, multi-variate, multi-dimensional data

◮ bi-component: {(x_i, y_i), i = 1, · · · , M}: 2M elements
◮ bi-variate: {x_i, i = 1, · · · , M}, {y_j, j = 1, · · · , N}: M + N elements
◮ bi-dimensional: images x_{i,j}, i = 1, · · · , M, j = 1, · · · , N: M × N elements


Bi-component or bi-variate data

◮ 2D distribution: joint probability distribution p(x, y)
◮ Conditionals p(x|y), p(y|x)
◮ Marginal distributions p(x), p(y)
◮ Expected values E(X), E(Y), variances V(X), V(Y), covariances, higher order moments, entropy
◮ Independence tests
◮ Copula, ...


Bivariate data

◮ Joint, marginal and conditional probability density functions


Images: Space, Fourier and Wavelets representations


Sparse images (Fourier and Wavelets domains)
[Figure: an image, its Fourier coefficients and its wavelet coefficients (bands 1-3, 4-6, 7-9), with the corresponding histograms: image hist., Fourier coeff. hist., wavelet coeff. hist.]

Multi-variate data and signals

◮ Multi-variate data:
  ◮ Scatterplots, correlation coefficients (Pearson and Spearman)
  ◮ Multivariate probability distributions, means, variances, covariances
  ◮ Joint, marginal and conditional probability density functions
◮ Multi-variate signals:
  ◮ Multi-variate time series
  ◮ Auto- and inter-correlation functions
  ◮ Auto- and inter- power spectral densities
◮ Need for advanced visualization, dimensionality reduction, factor analysis, modelling, parameter estimation, classification, ... and knowledge extraction


Temperature and activity time series before, during and after treatment
[Figure: temperature (30-38) and activity (0-800) recorded over 48-hour windows, before (BOD: before), during (BOD: during) and after (BOD: after) treatment.]

Simple questions for 1D time series

◮ Main question: has something changed during and after some medical action?
◮ In this study: effects of the circadian cycle on cancer cells.

1. Are there any periodic components in these signals? Yes/No (detection)? Confidence?
2. If yes, how many?
3. What are those components (periods p_i or frequencies ν_i, amplitudes a_i)?

◮ When questions 1 and 2 are answered, the problem becomes easier: parameter estimation
◮ Trying to answer all three questions at the same time: semi- or non-parametric modelling
◮ Biologists always need uncertainties −→ Bayesian inference


Simple analysis tools may not be successful even in very simple cases
[Figure: signal vs time (hours) and estimated spectrum vs period (hours) for three cases: 1 sinusoid; 2 sinusoids + noise; 3 sinusoids + noise.]

Classical methods: spectral estimation S(ω)

◮ Fast Fourier Transform (FFT): g(t) −→ FFT −→ f(ω) −→ S(ω) = |f(ω)|²
  ◮ Advantages: well known and understood, fast
  ◮ Drawbacks: linear in frequency ν, but not equidistant in period:
    ν = [0, · · · , N − 1] −→ p = [∞, 1, · · · , 1/(N − 1)]
◮ Autocorrelation function γ(τ):
  ◮ If g(t) is periodic, then γ(τ) is also periodic, but much smoother
  ◮ γ(0) = 1, γ(τ) ≤ γ(0), ∀τ
◮ Power spectral density: γ(τ) −→ FFT −→ S(ω)
◮ Autoregressive (AR), Moving Average (MA) and ARMA models
◮ Non-stationary GARCH models
◮ Sum of sinusoidal components

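The FFT drawback noted above is easy to demonstrate numerically: the frequency bins are equidistant, but the corresponding periods p = 1/ν are not. A minimal sketch (assuming NumPy; the sampling setup is made up for illustration):

```python
import numpy as np

N = 96                                  # number of samples (e.g. hourly over 4 days)
dt = 1.0                                # sampling step in hours
nu = np.fft.rfftfreq(N, d=dt)           # uniform frequency grid: 0, 1/(N dt), 2/(N dt), ...
periods = np.empty_like(nu)
periods[0] = np.inf                     # nu = 0 corresponds to an infinite period
periods[1:] = 1.0 / nu[1:]              # p = 1/nu, in hours

dnu = np.diff(nu)                       # constant spacing in frequency
dp = np.diff(periods[1:])               # spacing in period shrinks toward short periods
```

The period grid is extremely coarse near the long (circadian-scale) periods of interest, which motivates working directly on a period grid, as done later in the inverse-problems formulation.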

Parametric, Semi- and Non-Parametric models

◮ Parametric:
    g(t) = \sum_{k=1}^{K} a_k \sin(2\pi\nu_k t + \phi_k) + \epsilon(t), \quad \theta = \{a_k, \phi_k, \nu_k\}
    g(t) = \sum_{k=1}^{K} a_k \cos(2\pi\nu_k t) + b_k \sin(2\pi\nu_k t) + \epsilon(t), \quad \theta = \{a_k, b_k, \nu_k\}
    g(t) = \sum_{k=1}^{K} c_k \exp[j 2\pi\nu_k t] + \epsilon(t), \quad \theta = \{c_k, \nu_k\}, \quad t = 0, \cdots, T
◮ Semi-parametric: \nu_k = k\nu_0, \nu_0 = 1/T, K = T −→ DFT
◮ Non-parametric: \nu_k fixed in a given interval with a given precision, so K is fixed but can be as large as necessary:
    g(t) = \sum_{k=1}^{K} c_k \exp[j 2\pi\nu_k t] + \epsilon(t), \quad \theta = \{c_k\} \quad (linear model)

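As a numerical check on the semi-parametric case above (a sketch, not from the talk; it assumes NumPy and an arbitrary synthetic signal): with ν_k = k/T the columns of the exponential design matrix are orthogonal, so the least-squares coefficients are exactly the normalized DFT.

```python
import numpy as np

T = 64                                   # number of samples; nu_k = k/T, K = T
t = np.arange(T)
rng = np.random.default_rng(2)
g = rng.standard_normal(T)               # any signal

# Design matrix of the semi-parametric model: H[t, k] = exp(+j 2 pi k t / T)
H = np.exp(2j*np.pi*np.outer(t, np.arange(T))/T)

# Because the columns are orthogonal (H^H H = T I), the least-squares
# coefficients are c = H^H g / T, i.e. the normalized DFT of g.
c = H.conj().T @ g / T
c_fft = np.fft.fft(g) / T
```

This is why the semi-parametric model "is" the DFT; the non-parametric model below drops the ν_k = k/T constraint and leads to a genuine linear inverse problem.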

Can we propose a unifying approach for all these problems?

My answer is yes:
◮ Identify what you are looking for: the unknowns f
◮ Identify what the data are: g
◮ Consider the errors (modeling and measurement): \epsilon
◮ Write the forward model relating them: g = Hf + \epsilon
◮ Write the expression of the likelihood p(g|f)
◮ Translate your prior knowledge on the unknowns into p(f)
◮ Use the Bayes rule:
    p(f|g) = \frac{p(g|f)\, p(f)}{p(g)} \propto p(g|f)\, p(f)
◮ Infer on f using the posterior p(f|g):
  ◮ Maximum A Posteriori (MAP): \hat{f} = \arg\max_f p(f|g)
  ◮ Posterior Mean (PM): \hat{f} = \int f\, p(f|g)\, df

Estimating Periodic Components: Inverse Problems Approach

g(t) = \sum_{k=1}^{K} c_k \exp[j 2\pi\nu_k t] + \epsilon(t), \quad \theta = \{c_k\} \quad (linear model)

Slight change of notation: use periods p_n in place of frequencies \nu_k, and f_n in place of c_k:

g(t) = \sum_{n=1}^{N} f_n \exp[j 2\pi t / p_n] + \epsilon(t), \quad t = m\Delta t, \; m = 1, \cdots, M

Defining the vectors g = [g_1, \cdots, g_M]', \epsilon = [\epsilon_1, \cdots, \epsilon_M]', f = [f_1, \cdots, f_N]' and the matrix H with H_{m,n} = \exp[j 2\pi m\Delta t / p_n], we obtain

g = Hf + \epsilon

The objective is to infer f.

Inverse Problems Approach

g = Hf + \epsilon

Bayesian approach:
◮ Assign the likelihood: p(g|f)
◮ Assign the prior law: p(f)
◮ Use the Bayes rule: p(f|g) \propto p(g|f)\, p(f)
◮ MAP: \hat{f} = \arg\max_f p(f|g) = \arg\min_f J(f)
◮ Assuming Gaussian noise and a Gaussian prior:
    p(f|g) = \mathcal{N}(f|\hat{f}, \hat{\Sigma}) \quad with \quad \hat{\Sigma} = (H'H + \lambda I)^{-1} \quad and \quad \hat{f} = \arg\min_f J(f),
    J(f) = \|g - Hf\|^2 + \lambda\|f\|^2
◮ Other priors (generalized Gaussian, Student-t or Cauchy):
    J(f) = \|g - Hf\|^2 + \lambda\,\Omega(f)

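The Gaussian-prior MAP estimate has the closed form f̂ = (HᴴH + λI)⁻¹ Hᴴg. A sketch of this on a period grid (assuming NumPy; the grid, sample sizes and noise level are made up for illustration, with one true 24-hour component placed exactly on the grid):

```python
import numpy as np

rng = np.random.default_rng(3)
M, dt = 96, 1.0                              # 96 hourly samples
m = np.arange(1, M + 1)
periods = np.arange(8.0, 33.0)               # candidate periods, 8..32 hours
N = periods.size

H = np.exp(2j*np.pi*np.outer(m*dt, 1.0/periods))   # H[m, n] = exp(j 2 pi m dt / p_n)

f_true = np.zeros(N, dtype=complex)
f_true[periods == 24.0] = 1.0                # a single 24 h component
g = H @ f_true + 0.05*(rng.standard_normal(M) + 1j*rng.standard_normal(M))

lam = 1.0
A = H.conj().T @ H + lam*np.eye(N)           # (H^H H + lambda I)
f_map = np.linalg.solve(A, H.conj().T @ g)   # Gaussian-prior MAP estimate

p_hat = periods[np.argmax(np.abs(f_map))]    # dominant estimated period
```

Note the columns of H for nearby periods are strongly correlated, so the quadratic prior only regularizes; the sparsity priors of the next slides sharpen the estimate further.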

Bayesian estimation with priors enforcing sparsity

◮ Sparsity: for any periodic signal, the spectrum is a set of Diracs
◮ Biological signals related to clock genes: a few independent oscillators
◮ The spectrum has a few non-zero elements in any given interval:
    g(m\Delta t) = \sum_{n=1}^{N} f_n \exp[j 2\pi m\Delta t / p_n] + \epsilon(m\Delta t), \quad m = 1, \cdots, M
◮ g = Hf + \epsilon with f sparse
◮ The question is now: how to translate sparsity? Two solutions: L1 regularization and Bayesian sparsity-enforcing priors.
◮ Three main options in the Bayesian approach: generalized Gaussian, Student-t, mixture models

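The first of the two solutions above, L1 regularization, can be sketched with the classical ISTA iteration (a generic illustration, not the method of the talk; it assumes NumPy and a synthetic sparse problem with a random real-valued H):

```python
import numpy as np

rng = np.random.default_rng(4)
M, N = 60, 100                               # underdetermined: fewer data than unknowns
H = rng.standard_normal((M, N)) / np.sqrt(M)
f_true = np.zeros(N)
f_true[[7, 42, 77]] = [3.0, -2.0, 4.0]       # sparse spectrum: a few non-zeros
g = H @ f_true + 0.01*rng.standard_normal(M)

# ISTA for  min_f  0.5 ||g - H f||^2 + lam ||f||_1
lam = 0.05
L = np.linalg.norm(H, 2)**2                  # Lipschitz constant of the gradient
f = np.zeros(N)
for _ in range(500):
    r = f - (H.T @ (H @ f - g)) / L          # gradient step on the quadratic term
    f = np.sign(r) * np.maximum(np.abs(r) - lam/L, 0.0)  # soft threshold

support = np.flatnonzero(np.abs(f) > 0.5)    # recovered non-zero locations
```

Even with M < N, the L1 penalty recovers the three-element support; the Bayesian sparsity priors of the next slide achieve the same effect while also delivering uncertainties.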

Bayesian estimation with priors enforcing sparsity

◮ g = Hf + \epsilon with f sparse
◮ To translate this information, use the heavy-tailed Student-t prior with its hierarchical structure and hidden variables:
    \mathcal{St}(f_j|\nu) \propto \exp\left[-\frac{\nu+1}{2}\log\left(1 + f_j^2/\nu\right)\right]
◮ Infinite Gaussian Scale Mixture (IGSM) property:
    \mathcal{St}(f_j|\nu) = \int_0^\infty \mathcal{N}(f_j|0, 1/z_j)\,\mathcal{G}(z_j|\alpha, \beta)\,dz_j, \quad with \; \alpha = \beta = \nu/2
◮ Hierarchical prior model:
    p(f_j|z_j) = \mathcal{N}(f_j|0, 1/z_j), \quad p(z_j) = \mathcal{G}(z_j|\alpha, \beta)
    p(f|z) = \prod_j p(f_j|z_j), \quad p(z) = \prod_j p(z_j), \quad p(g|f) = \mathcal{N}(g|Hf, \sigma_\epsilon^2 I)
    \longrightarrow \; p(f, z|g) \propto p(g|f)\,p(f|z)\,p(z)

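A simple alternating scheme in the spirit of the hierarchical model above (a rough sketch, not the VBA algorithm of the talk; it assumes NumPy, a real-valued H and hand-picked sizes): given z, the conditional p(f|z, g) is Gaussian with known mean and covariance; given f, each hidden precision z_j is updated by its posterior Gamma expectation.

```python
import numpy as np

rng = np.random.default_rng(5)
M, N = 80, 40
H = rng.standard_normal((M, N)) / np.sqrt(M)
f_true = np.zeros(N)
f_true[[3, 20]] = [4.0, -3.0]                # sparse ground truth
sigma2 = 0.01**2
g = H @ f_true + 0.01*rng.standard_normal(M)

alpha = beta = 0.5                  # nu/2 with nu = 1 (heavy-tailed prior)
z = np.ones(N)                      # hidden precision variables
for _ in range(50):
    # p(f | z, g) is Gaussian: posterior covariance and mean
    Sigma = np.linalg.inv(H.T @ H / sigma2 + np.diag(z))
    f = Sigma @ H.T @ g / sigma2
    # posterior expectation of each z_j: Gamma(alpha + 1/2, beta + E[f_j^2]/2)
    z = (alpha + 0.5) / (beta + 0.5*(f**2 + np.diag(Sigma)))
```

Large precisions z_j shrink the small coefficients toward zero while the few truly active components keep small precisions and stay large, which is how the Student-t prior enforces sparsity.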

Results on simulated and real activity data
[Figure: CT 502 B3 activity data (DDBef, DDDur, DDAft): for the data BEFORE, DURING and AFTER treatment, the signal vs number of days, the amplitudes estimated by the proposed method (VBA) vs period, and the FFT amplitudes vs period.]

Dimension reduction, PCA, Factor Analysis, ICA

◮ M variables g(t) are observed. They are redundant. Can we express them with N \le M factors f? How many factors (principal components, independent components) can describe the observed data?

    g_i(t) = \sum_{j=1}^{N} a_{ij} f_j(t) + \epsilon_i(t), \qquad g(t) = A f(t) + \epsilon(t)
    A: (M \times N) loading matrix, N \le M; \quad f(t): factors, sources

◮ How to find both A and the factors f(t)?
◮ Bayesian methods:
    (\hat{A}, \hat{f}) = \arg\max_{(A,f)} p(A, f|g) = \arg\min_{(A,f)} \{-\ln p(g|A, f) - \ln p(A) - \ln p(f)\}

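The redundancy argument above can be illustrated with plain PCA via the SVD (a generic sketch, not the Bayesian method of the slide; it assumes NumPy and synthetic data generated from 2 hidden factors driving 13 variables, mirroring the 13-gene example that follows):

```python
import numpy as np

rng = np.random.default_rng(6)
M, N, T = 13, 2, 200                  # 13 observed variables, 2 hidden factors
A_true = rng.standard_normal((M, N))  # loading matrix
f_t = rng.standard_normal((N, T))     # factor time series
g_t = A_true @ f_t + 0.05*rng.standard_normal((M, T))

# PCA via SVD of the centered data: principal directions = left singular vectors
Gc = g_t - g_t.mean(axis=1, keepdims=True)
U, s, Vt = np.linalg.svd(Gc, full_matrices=False)
var_explained = s**2 / np.sum(s**2)
```

Almost all the variance of the 13 observed variables is captured by the first two components, which is the empirical signature that N = 2 factors suffice.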

How to determine the number of factors

◮ When N is given: p(A, f|g) \propto p(g|A, f)\, p(A)\, p(f)
◮ Different choices for p(A) and p(f), and different methods to estimate both A and f: JMAP, EM, Variational Bayesian Approximation
◮ When N is not known:
  ◮ Model selection
  ◮ Bayesian or maximum likelihood methods
  ◮ To determine the number of factors, we run the analysis with different numbers of factors N and use two criteria: the negative log-likelihood −ln p(g|A, N) of the observations, and DFE, the degrees-of-freedom error ((N − M)² − (N + M))/2, related to the AIC and BIC model selection criteria.


Factor Analysis: 13 variables, 2 factors
[Figure: loadings of the genes (P53, Bax, Mdm2, Bcl2, Ccnb2, Ccna2, Wee1, DBP, UGT, Top1, CE2, Bmal1, Per2, Rev) on the two components, for the time series and for the FT amplitudes.]

Factor Analysis: Time series, number of factors
[Figure: gene loadings for different numbers of factors N = 1 to 7, and the two criteria (−log L and DFE) as functions of the number of factors.]

Sparse PCA

◮ In classical PCA, FA and ICA, one seeks principal (uncorrelated or independent) components.
◮ In Sparse PCA or FA, one seeks the loading matrix A with the sparsest components.
◮ This can be imposed via the prior p(A). It leads to the selection of the fewest variables.

[Figure: gene loadings obtained by PCA and by Sparse PCA (SPCA), for the time series and for the FT amplitudes.]
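One simple way to obtain a sparse loading vector (a generic sketch, not the prior-based method of the slide; it assumes NumPy and synthetic data where only the first 5 of 14 "genes" load on a shared hidden factor) is a soft-thresholded power iteration on the sample covariance:

```python
import numpy as np

rng = np.random.default_rng(7)
M, T = 14, 200
f1 = np.sin(2*np.pi*np.arange(T)/24)           # a shared hidden factor
G = 0.1*rng.standard_normal((M, T))
G[:5] += np.outer([3, 2.5, 2, 1.5, 1], f1)     # only 5 of 14 variables load on it

C = np.cov(G)                                  # sample covariance

# Sparse first loading vector via soft-thresholded power iteration
a = rng.standard_normal(M)
a /= np.linalg.norm(a)
thr = 0.1                                      # relative threshold (a tuning choice)
for _ in range(100):
    a = C @ a                                  # power-iteration step
    a = np.sign(a) * np.maximum(np.abs(a) - thr*np.abs(a).max(), 0.0)  # sparsify
    a /= np.linalg.norm(a)

support = np.flatnonzero(np.abs(a) > 1e-8)     # selected variables
```

The recovered loading vector is supported only on the variables that actually drive the factor, which is exactly the "fewest variables" selection effect described above.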

Discriminant Analysis

◮ When we have data and classes, the question to answer is: what are the most discriminant factors?
◮ There are many variants:
  ◮ Linear Discriminant Analysis (LDA)
  ◮ Quadratic Discriminant Analysis (QDA)
  ◮ Exponential Discriminant Analysis (EDA)
  ◮ Regularized LDA (RLDA), ...
◮ One can also ask for the sparsest linear discriminant factors (SLDA)
◮ Deterministic point of view (geometrical distances)
◮ Probabilistic point of view (mixture densities)
◮ Mixture-of-Gaussians models: each class is modelled by a Gaussian pdf

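The deterministic (geometrical) point of view for two classes is Fisher's classical LDA direction w = S_w⁻¹(μ₁ − μ₀). A minimal sketch (assuming NumPy and two synthetic Gaussian classes separated along the first axis):

```python
import numpy as np

rng = np.random.default_rng(8)
n, d = 100, 5
X0 = rng.standard_normal((n, d))                    # class 0
X1 = rng.standard_normal((n, d)) + np.array([3.0, 0, 0, 0, 0])  # class 1

# Fisher's linear discriminant: w = Sw^{-1} (mu1 - mu0)
mu0, mu1 = X0.mean(0), X1.mean(0)
Sw = np.cov(X0.T)*(n-1) + np.cov(X1.T)*(n-1)        # within-class scatter
w = np.linalg.solve(Sw, mu1 - mu0)
c = 0.5*(mu0 + mu1) @ w                             # midpoint decision threshold

frac0_above = (X0 @ w > c).mean()                   # class-0 points misclassified
frac1_above = (X1 @ w > c).mean()                   # class-1 points correctly classified
```

The same projection idea, with extra penalties on w, gives the regularized and sparse variants (RLDA, SLDA) listed above.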

Discriminant Analysis: Time series, Colon
[Figure: the gene time series projected on the two most discriminant factors, colored by class (1, 2, 3).]

Sparse Discriminant Analysis: Time series, colon
What are the sparsest discriminant factors?
[Figure: sparse discriminant factors and the projections of the three classes (1, 2, 3).]

LDA and SLDA study on time series: 1: before, 2: during, 3: after
[Figure: LDA-Time and SLDA-Time projections of the three classes.]

2

Dependency graphs

◮ The main objective here is to show the dependencies between variables
◮ Three different measures can be used: Pearson ρ, Spearman ρ_s and Kendall τ
◮ In this study we used ρ_s
◮ A table of pairwise ρ_s values is computed and displayed in different forms: Hinton diagram, adjacency table and graphical network representation

[Figure: Hinton diagram, adjacency table and network representation of the pairwise Spearman correlations between the genes.]
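Spearman's ρ_s is simply the Pearson correlation of the rank-transformed data, which is why it captures any monotone (not just linear) dependency. A minimal sketch (assuming NumPy; the three synthetic variables are made up for illustration, with no tie handling since the data are continuous):

```python
import numpy as np

rng = np.random.default_rng(9)
T = 200
x = rng.standard_normal(T)
y = np.exp(x) + 0.01*rng.standard_normal(T)    # monotone (non-linear) link to x
z = rng.standard_normal(T)                      # independent of x

def ranks(v):
    # rank transform (continuous data, so no ties to handle)
    r = np.empty_like(v)
    r[np.argsort(v)] = np.arange(1, v.size + 1)
    return r

def spearman(u, v):
    # Spearman rho_s = Pearson correlation of the ranks
    return np.corrcoef(ranks(u), ranks(v))[0, 1]

rho_xy = spearman(x, y)        # near 1: the link is monotone
rho_xz = spearman(x, z)        # near 0: independent variables

# pairwise rho_s table, as used for the Hinton / adjacency / network displays
data = np.vstack([x, y, z])
table = np.array([[spearman(a, b) for b in data] for a in data])
```

Thresholding such a table (in absolute value) gives the adjacency structure drawn as a dependency network.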

Graph of Dependencies: Colon, Class 1
[Figure: Hinton diagram, adjacency table and network representation of the pairwise correlations, for the time series (top) and for the FT amplitudes (bottom).]

Classification tools

◮ Supervised classification:
  ◮ K nearest neighbors methods
  ◮ Needs training data sets
  ◮ One must be careful to measure the performance of the classification on a different set of data (test set)
◮ Unsupervised classification:
  ◮ Mixture models
  ◮ Expectation-Maximization (EM) methods
  ◮ Bayesian versions of EM
  ◮ Variational Bayesian Approximation (VBA)

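The supervised case above can be sketched with a bare K-nearest-neighbors classifier, evaluated on a held-out test set as the slide insists (a generic illustration assuming NumPy and two synthetic, well-separated 2D classes):

```python
import numpy as np

rng = np.random.default_rng(10)
n, d, k = 60, 2, 5
X_train = np.vstack([rng.standard_normal((n, d)),
                     rng.standard_normal((n, d)) + 3.0])  # two separated classes
y_train = np.array([0]*n + [1]*n)

# separate test set: never measure performance on the training data
X_test = np.vstack([rng.standard_normal((20, d)),
                    rng.standard_normal((20, d)) + 3.0])
y_test = np.array([0]*20 + [1]*20)

def knn_predict(x, k=5):
    # majority vote among the k nearest training points
    dist = np.linalg.norm(X_train - x, axis=1)
    nearest = y_train[np.argsort(dist)[:k]]
    return np.bincount(nearest).argmax()

y_pred = np.array([knn_predict(x, k) for x in X_test])
accuracy = (y_pred == y_test).mean()
```

For the unsupervised case, the labels y_train are unknown and mixture models with EM (or its Bayesian/VBA versions) take over.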

Classification tools
[Figure: activity time series vs time (hours) grouped into unsupervised classes; the sample indices of each class are listed in the legends.]

Input-Output modeling using training data and test data

◮ Linear models: g_k = A f_k + \epsilon_k, \; k = 1, \cdots, K
◮ Bayesian framework, MAP estimation with hyperparameter estimation:
    p(A|\{g_k, f_k\}) \propto \prod_k p(g_k|A, f_k)\, p(A)
◮ Gaussian priors for \epsilon_k and for A, and the MAP solution:
    \hat{A} = \arg\max_A p(A|\{g_k, f_k\}) = \left(\sum_k g_k f_k'\right)\left(\sum_k f_k f_k' + \lambda I\right)^{-1}
◮ Other priors to enforce sparsity or block-sparsity of the prediction matrix A
◮ See the poster of Mircea Dumitru et al. for weight-loss prediction from the expressions of the two genes Bmal1 and Rev-erb-alpha

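The closed-form MAP solution above is a multivariate ridge regression. A minimal sketch (assuming NumPy; the dimensions, noise level and λ are made up for illustration), including evaluation on held-out test inputs as the slide title requires:

```python
import numpy as np

rng = np.random.default_rng(11)
K, dim_f, dim_g = 200, 4, 3
A_true = rng.standard_normal((dim_g, dim_f))
F = rng.standard_normal((dim_f, K))                    # training inputs f_k (columns)
G = A_true @ F + 0.01*rng.standard_normal((dim_g, K))  # training outputs g_k

# MAP / ridge solution: A_hat = (sum g_k f_k') (sum f_k f_k' + lam I)^{-1}
lam = 0.1
A_hat = (G @ F.T) @ np.linalg.inv(F @ F.T + lam*np.eye(dim_f))

# evaluate the learned predictor on held-out test data
F_test = rng.standard_normal((dim_f, 50))
G_test = A_true @ F_test
rel_err = np.linalg.norm(A_hat @ F_test - G_test) / np.linalg.norm(G_test)
```

Replacing the quadratic penalty λ||A||² with sparsity or block-sparsity penalties gives the structured variants mentioned above.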

Conclusions

◮ A lot to do to answer the questions of biologists
◮ Forward modeling and Bayesian inference are natural tools to answer these questions
◮ Very often the questions are ill-posed inverse problems which need prior knowledge
◮ Appropriate translation of prior knowledge into prior laws is very important
◮ Careful computational algorithms have to be developed
◮ Careful presentation and interpretation of the inference results are very important
◮ Constant dialogue between "biologists" and "data and signal processors" is of great importance.
