Unmixing hyperspectral images using Markov random fields

Olivier Eches, Nicolas Dobigeon and Jean-Yves Tourneret
Université de Toulouse - IRIT/INP-ENSEEIHT, Toulouse, FRANCE

http://www.enseeiht.fr/~eches

MaxEnt 2010, Chamonix, France


Hyperspectral imagery

What is a hyperspectral image? The same scene is observed in different spectral bands, giving a 3-dimensional image: length, width and wavelength.

Figure: Example of a hyperspectral cube (Big Island, Hawaii, USA)


Single pixel: each pixel is represented by a vector of hundreds of measurements, one per spectral band.

Applications:
- mineral exploration,
- agriculture: soil quality, crop forecasting, forest monitoring,
- environment: pollution detection, climate change detection,
- military: target detection (minefields, vehicles, ...), cartography.

Unmixing: a crucial step in hyperspectral image analysis

Spectral mixing: a measured pixel is a mixture of pure spectra (endmembers) weighted by their corresponding fractions (abundances). Common assumption: the linear mixing model. If the pure materials are spatially disjoint within the pixel, the measured spectrum is a linear combination of the corresponding pure spectra.

Figure: Linear observation model


Standard mixing model

Linear mixing model (LMM). For a given pixel p:

$$ y_p = \sum_{r=1}^{R} m_r \, a_{r,p} + n_p, $$

where
- y_p = [y_{1,p}, ..., y_{L,p}]^T is the observed pixel p in L spectral bands,
- R is the number of pure materials (endmembers),
- m_r = [m_{r,1}, ..., m_{r,L}]^T is the spectrum of the rth endmember,
- a_{r,p} is the fraction (abundance) of the rth endmember in the pth pixel,
- n_p = [n_{1,p}, ..., n_{L,p}]^T is the additive noise in the pth observed pixel (assumed white Gaussian).

Constraints on the abundance vectors a_p = [a_{1,p}, ..., a_{R,p}]^T:

$$ a_{r,p} \geq 0, \ \forall r = 1, \ldots, R, \qquad \sum_{r=1}^{R} a_{r,p} = 1. \tag{1} $$
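As a concrete illustration, a minimal NumPy sketch of generating one pixel under the LMM; the endmember matrix M below is random placeholder data, not a real spectral library.

```python
import numpy as np

rng = np.random.default_rng(0)

L, R = 413, 3                       # number of bands, number of endmembers
M = rng.uniform(0.0, 1.0, (L, R))   # placeholder endmember spectra (column r is m_r)
a_p = np.array([0.4, 0.2, 0.4])     # abundances: nonnegative and summing to one

s2_p = 1e-3                                # noise variance s_p^2
n_p = rng.normal(0.0, np.sqrt(s2_p), L)    # white Gaussian noise n_p

y_p = M @ a_p + n_p                 # observed pixel: y_p = sum_r m_r a_{r,p} + n_p
```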

Spectral unmixing

Example under the Linear Mixing Model (LMM), y_p = Σ_{r=1}^R m_r a_{r,p} + n_p:
- L = 825 bands (0.4 µm → 2.5 µm),
- R = 3 endmembers: green grass (solid line), galvanized steel metal (dashed line), bare red brick (dotted line),
- a_p = [0.4, 0.2, 0.4]^T, SNR ≈ 20 dB.

Spectral unmixing problem: estimation of {m_1, ..., m_R} and a_p.

Unmixing steps:
1. Endmember extraction: estimation of R and m_1, ..., m_R (Vertex Component Analysis, N-FINDR, Pixel Purity Index, ...).
2. Inversion: estimation of the corresponding abundances a_p = [a_{1,p}, ..., a_{R,p}]^T (LS, ML and Bayesian approaches; see the constrained-LS sketch below).

Most inversion strategies ignore the possible interactions between the pixels.
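A common baseline for the inversion step is the fully constrained LS estimate; the sketch below approximates it with a sum-to-one-augmented NNLS, one standard way to implement FCLS but not necessarily the exact algorithm of Heinz and Chang.

```python
import numpy as np
from scipy.optimize import nnls

def fcls(M, y, delta=1e3):
    """Approximate fully constrained LS abundances: nonnegativity via NNLS,
    sum-to-one enforced by a heavily weighted extra equation (weight delta)."""
    L, R = M.shape
    M_aug = np.vstack([delta * np.ones((1, R)), M])  # extra row: delta * 1^T a = delta
    y_aug = np.concatenate([[delta], y])
    a_hat, _ = nnls(M_aug, y_aug)
    return a_hat
```

With M and y_p as in the LMM sketch above, a_hat = fcls(M, y_p) returns nonnegative abundances summing approximately to one.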



Problem addressed in this work: estimation of a_p under the positivity and additivity constraints.

Main contributions:
+ exploiting spatial correlations in a new Bayesian inversion procedure,
+ using Markov random fields (MRFs) to model spatial interactions,
+ spatial correlations ⇒ image classification/segmentation.

Outline
1. Introducing spatial structures
2. Hierarchical Bayesian model: likelihood, prior distributions, joint posterior distribution
3. Simulations: synthetic data, real data
4. Conclusions

Introducing spatial structures


Image partitioning

Defining homogeneous regions: an image of P pixels is divided into K regions (classes); in each region, the pixels approximately share the same composition.

A hidden variable is introduced: the label vector z = [z_1, ..., z_P]^T, where z_p = k ⇔ p ∈ I_k, k ∈ {1, ..., K}. For instance, with K = 3:
∀p ∈ I_1: z_p = 1, E[a_p] = µ_1, Covar(a_p) = Γ_1,
∀p ∈ I_2: z_p = 2, E[a_p] = µ_2, Covar(a_p) = Γ_2,
∀p ∈ I_3: z_p = 3, E[a_p] = µ_3, Covar(a_p) = Γ_3.


Figure: Scatter plot of dual-band data for an image partitioned into 3 classes.


Abundance reparametrization

Logistic coefficients t_p = [t_{1,p}, ..., t_{R,p}]^T are introduced [A. Gelman et al., J. Amer. Stat. Assoc., 1996], with

$$ a_{r,p} = \frac{\exp(t_{r,p})}{\sum_{r'=1}^{R} \exp(t_{r',p})}. \tag{2} $$

⇒ ensures the positivity and sum-to-one constraints,
⇒ each class k is fully characterized by E[t_p | z_p = k] = ψ_k and Covar[t_p | z_p = k] = Σ_k.
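A minimal sketch of the reparametrization (2); the softmax mapping satisfies both constraints by construction:

```python
import numpy as np

def logistic_to_abundance(t_p):
    """Map logistic coefficients t_p to abundances a_p via Eq. (2).
    Subtracting the max before exponentiating avoids overflow and
    leaves the ratio unchanged."""
    e = np.exp(t_p - np.max(t_p))
    return e / e.sum()

t_p = np.array([1.2, -0.5, 0.3])    # any unconstrained vector
a_p = logistic_to_abundance(t_p)
assert np.all(a_p >= 0) and np.isclose(a_p.sum(), 1.0)
```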


Hierarchical Bayesian model


Likelihood

Unknown parameter vector: Θ = {T, z, s}, with
- T = [t_1, ..., t_P] the logistic coefficient matrix,
- z = [z_1, ..., z_P]^T the label vector,
- s = [s_1^2, ..., s_P^2]^T the noise variance vector.


The LMM and the Gaussian property of the noise vector yield

$$ f(y_p \mid t_p, s_p^2) = \left( \frac{1}{2\pi s_p^2} \right)^{L/2} \exp\left( -\frac{\lVert y_p - M a_p(t_p) \rVert^2}{2 s_p^2} \right), \tag{3} $$

where ||x|| = √(x^T x) is the standard ℓ2 norm and a_p(t_p) makes explicit the dependence of the abundance vector a_p on the logistic coefficient vector t_p. Assuming independence between the noise vectors n_p:

$$ f(Y \mid T, s) = \prod_{p=1}^{P} f(y_p \mid t_p, s_p^2). \tag{4} $$
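The log of (4) is cheap to evaluate; a vectorized sketch, where the array shapes (Y of size L×P, T of size R×P, s2 of length P) are storage assumptions, not from the talk:

```python
import numpy as np

def log_likelihood(Y, T, s2, M):
    """log f(Y | T, s) from Eqs. (3)-(4): independent Gaussian pixels."""
    L = Y.shape[0]
    E = np.exp(T - T.max(axis=0, keepdims=True))   # columnwise softmax
    A = E / E.sum(axis=0, keepdims=True)           # abundances a_p(t_p), one column per pixel
    sq = np.sum((Y - M @ A) ** 2, axis=0)          # ||y_p - M a_p(t_p)||^2
    return np.sum(-0.5 * L * np.log(2.0 * np.pi * s2) - sq / (2.0 * s2))
```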



Prior distributions


Label prior. Which prior for the labels?

Independent prior distributions, f(z) = ∏_{p=1}^P f(z_p), would introduce no correlations between the pixels. Markov random fields are used instead.

Potts-Markov random field:

$$ f(z) = \frac{1}{G(\beta)} \exp\left( \beta \sum_{p=1}^{P} \sum_{p' \in \mathcal{V}(p)} \delta(z_p - z_{p'}) \right), $$

where
- V(p) is the 1-order neighborhood (4 pixels),
- G(β) is the normalizing constant (partition function),
- β is the granularity coefficient (known in this study),
- δ(x) = 1 if x = 0, and 0 otherwise.

+ Naturally introduces spatial correlations between the labels of neighboring pixels.
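Up to the intractable partition function G(β), this log-prior is straightforward to evaluate on a 2-D label map; a sketch in which each neighbor pair is counted once (the double sum above counts each pair twice, which only rescales β):

```python
import numpy as np

def potts_log_prior(z, beta):
    """Unnormalized log f(z) for a Potts field with a 1-order neighborhood:
    beta times the number of equal horizontal/vertical neighbor pairs."""
    same_h = np.sum(z[:, 1:] == z[:, :-1])   # horizontal agreements
    same_v = np.sum(z[1:, :] == z[:-1, :])   # vertical agreements
    return beta * float(same_h + same_v)
```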


Markov random fields

Influence of β on the label prior: synthetic images with K = 3 classes generated using Potts-Markov random fields with different values of β and a 1-order neighborhood structure.

Figure: From left to right: β = 0.8, 1.4, 2.

+ small values of β: many small regions,
+ large values of β: few large regions.
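Label maps of this kind can be reproduced with a single-site Gibbs sampler; a sketch, with no claim that the talk's figures were generated this way:

```python
import numpy as np

def sample_potts(shape, K, beta, sweeps=200, seed=0):
    """Approximate draw from a K-class Potts field by Gibbs sweeps."""
    rng = np.random.default_rng(seed)
    H, W = shape
    z = rng.integers(K, size=(H, W))
    for _ in range(sweeps):
        for i in range(H):
            for j in range(W):
                # labels of the 4-connected neighbors inside the image
                nbrs = [z[u, v] for u, v in ((i-1, j), (i+1, j), (i, j-1), (i, j+1))
                        if 0 <= u < H and 0 <= v < W]
                logp = np.array([beta * sum(n == k for n in nbrs) for k in range(K)])
                p = np.exp(logp - logp.max())
                z[i, j] = rng.choice(K, p=p / p.sum())
    return z

# smoother maps with fewer, larger regions as beta increases
maps = {b: sample_potts((25, 25), K=3, beta=b) for b in (0.8, 1.4, 2.0)}
```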


Logistic coefficient prior distribution. For each class k (k = 1, ..., K) and p ∈ I_k:

$$ t_p \mid z_p = k, \psi_k, \Sigma_k \sim \mathcal{N}(\psi_k, \Sigma_k) $$

⇒ two hyperparameters (fixed or estimated) characterize the kth class: ψ_k and Σ_k.

For the whole set of pixels, the distribution of T = [t_1, ..., t_P] is

$$ f(T \mid \Psi, \Sigma) = \prod_{k=1}^{K} \prod_{p \in I_k} f(t_p \mid z_p = k, \psi_k, \Sigma_k) $$

with Ψ = [ψ_1, ..., ψ_K] and Σ = {Σ_1, ..., Σ_K}.
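A minimal sketch of drawing t_p given its label; the numerical values of ψ_k and Σ_k below are hypothetical placeholders:

```python
import numpy as np

rng = np.random.default_rng(1)
R, K = 3, 3
Psi = rng.normal(0.0, 1.0, (R, K))            # hypothetical class means psi_k
Sigma = [0.1 * np.eye(R) for _ in range(K)]   # hypothetical class covariances Sigma_k

def sample_t(z_p):
    """Draw t_p | z_p = k ~ N(psi_k, Sigma_k) (classes 0-indexed here)."""
    return rng.multivariate_normal(Psi[:, z_p], Sigma[z_p])

t_p = sample_t(2)   # logistic coefficients for a pixel in the third class
```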


Noise variance and hyperparameter prior distributions

Noise variance prior:

$$ s_p^2 \mid \delta \sim \mathcal{E}(\delta), \qquad f(s \mid \delta) = \prod_{p=1}^{P} f(s_p^2 \mid \delta), \tag{5} $$

where δ is an adjustable hyperparameter.

Hyperparameter priors (hierarchical model). Let Ω = {δ, Ψ, Σ} denote the hyperparameter vector:

$$ f(\Omega) = f(\delta) \prod_{k=1}^{K} f(\psi_k) f(\Sigma_k), $$

with Ψ = [ψ_1, ..., ψ_K] (means of the logistic coefficient vectors) and Σ = {Σ_1, ..., Σ_K} (covariance matrices of the logistic coefficient vectors).


Joint posterior distribution of Θ and Ω

Joint posterior of the parameter vector Θ and hyperparameter vector Ω:

$$ f(\Theta, \Omega \mid Y) \propto f(Y \mid \Theta) \, f(\Theta \mid \Omega) \, f(\Omega). \tag{6} $$

Straightforward computations lead to a closed-form expression (7) that factorizes into the Gaussian likelihood terms ∏_{p=1}^P (s_p²)^{-L/2} exp(−||y_p − M a_p(t_p)||² / (2 s_p²)), the class-conditional priors on the logistic coefficients together with their hyperpriors, and the Potts term exp(Σ_{p=1}^P Σ_{p'∈V(p)} β δ(z_p − z_{p'})), where n_k = card(I_k) enters the class-dependent exponents (a combinatorial problem).


Computing estimates of Θ

The joint posterior distribution is too complex to derive closed-form Bayesian estimates of Θ. Samples Θ^(t), asymptotically distributed according to f(Θ, Ω | Y), are therefore simulated using MCMC methods, yielding the MMSE and MAP estimators

$$ \hat{\Theta}_{\mathrm{MMSE}} \approx \frac{1}{N_{\mathrm{MC}} - N_{\mathrm{bi}}} \sum_{t=N_{\mathrm{bi}}+1}^{N_{\mathrm{MC}}} \Theta^{(t)}, \tag{8} $$

$$ \hat{\Theta}_{\mathrm{MAP}} \approx \arg\max_{\Theta^{(t)}} f(\Theta^{(t)} \mid Y), \tag{9} $$

where N_MC is the total number of MCMC iterations and N_bi the number of burn-in iterations.
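Given a stored chain, both estimators reduce to a few lines; a sketch assuming the draws and their log-posterior values are stacked in arrays:

```python
import numpy as np

def mmse_map_estimates(samples, log_posts, n_bi):
    """Eqs. (8)-(9): samples is (N_MC, dim) MCMC draws of a parameter,
    log_posts is (N_MC,) values of log f(Theta^(t) | Y), n_bi the burn-in length."""
    theta_mmse = samples[n_bi:].mean(axis=0)    # average of post-burn-in draws
    theta_map = samples[np.argmax(log_posts)]   # draw maximizing the posterior
    return theta_mmse, theta_map
```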


Simulations


Synthetic data

Simulation parameters:
- 25 × 25 synthetic image composed of K = 3 classes,
- label map generated using a Potts-Markov random field with β = 1.1 (Figure: synthetic label map),
- each pixel mixed from R = 3 endmembers whose spectra are construction concrete, green grass and micaceous loam (L = 413 bands) (Figure: endmember spectra),
- noise level SNR ≈ 15 dB,
- abundance maps generated from truncated Gaussian distributions, with different means and covariance matrices in each class (Figure: synthetic abundance maps).


Simulation results, label map: marginal MAP label estimates.

Figure: Left: original labels. Right: estimated labels.


Simulation results, abundance maps: MMSE logistic coefficient estimates, conditioned on the MAP label estimates.

Figure: Top: actual abundances. Bottom: estimated abundances.


Simulation results, abundance means:

Table: Actual and estimated abundance means and variances for each class.

Class 1: µ1 = E[ap | I1]         actual [0.6, 0.3, 0.1]^T    estimated [0.58, 0.29, 0.13]^T
         Var[ap | I1] (×10⁻³)    actual [5, 5, 5]^T          estimated [4.5, 4.3, 5.5]^T
Class 2: µ2 = E[ap | I2]         actual [0.3, 0.5, 0.2]^T    estimated [0.29, 0.49, 0.2]^T
         Var[ap | I2] (×10⁻³)    actual [5, 5, 5]^T          estimated [4.5, 4.7, 5.3]^T
Class 3: µ3 = E[ap | I3]         actual [0.3, 0.2, 0.5]^T    estimated [0.31, 0.19, 0.49]^T
         Var[ap | I3] (×10⁻³)    actual [5, 5, 5]^T          estimated [7, 4.2, 11.7]^T

Example of results for the 2nd class:
Figure: Histograms of the abundance means µ2.


Real data

Simulation parameters:
- real hyperspectral image of 50 × 50 pixels, extracted from a larger image acquired in 1997 by AVIRIS (Moffett Field, CA, USA),
- data set reduced from 224 to 189 bands (water absorption bands removed),
- N-FINDR used to extract the R = 3 endmember spectra associated with water, soil and vegetation,
- K = 4 classes considered.

Figure: Moffett Field data acquired by AVIRIS in 1997 (left) and the region of interest shown in true colors (right).


Abundance maps: comparison with other results, namely those of the FCLS algorithm [D. C. Heinz, C.-I Chang, IEEE Trans. on Geosci. and Remote Sensing, 2001].


Label map

Figure: β = 1.1.


Conclusions


“Spatial” unmixing algorithm:
- exploitation of spatial correlations using MRFs within a Bayesian framework,
- hidden labels introduced to identify classes defined by a homogeneous composition of macroscopic materials,
- appropriate reparametrization of the abundance vectors.

Results:
- good estimation of the abundance coefficients,
- classification map obtained from the estimates of the underlying labels,
- price to pay: high computational cost.

Perspectives:
- estimation of the granularity coefficient β,
- other models for spatial correlations (discriminative random fields, ...),
- replacing MCMC by other computational methods: variational methods, (constrained?) Hamiltonian MCMC, ...


Simulation results: comparisons

Synthetic data: comparisons with other algorithms, using the per-endmember mean square error

$$ \mathrm{MSE}_r^2 = \frac{1}{P} \sum_{p=1}^{P} \left( \hat{a}_{r,p} - a_{r,p} \right)^2. $$
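A sketch of this metric, assuming the estimated and actual abundances are stored as R × P matrices:

```python
import numpy as np

def abundance_mse(A_hat, A_true):
    """MSE_r^2 = (1/P) * sum_p (a_hat_{r,p} - a_{r,p})^2, one value per endmember r."""
    return np.mean((A_hat - A_true) ** 2, axis=1)
```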

Table: MSEs for the abundance coefficients.

          FCLS [a]      Bayesian independent [b]    Spatial
MSE²₁     0.0019        0.0016                      0.001
MSE²₂     4.3 × 10⁻⁴    4.1 × 10⁻⁴                  3.1 × 10⁻⁴
MSE²₃     0.0014        0.0013                      8.6 × 10⁻⁴

[a] D. C. Heinz, C.-I Chang, IEEE Trans. on Geosci. and Remote Sensing, 2001.
[b] N. Dobigeon et al., IEEE Trans. on Signal Proc., 2008.


Real data: label maps for two values of the granularity coefficient.

Figure: β = 1.1.

Figure: β = 2.
