Unmixing hyperspectral images using Markov random fields
Olivier Eches, Nicolas Dobigeon and Jean-Yves Tourneret
Université de Toulouse - IRIT/INP-ENSEEIHT, Toulouse, FRANCE
http://www.enseeiht.fr/~eches
MaxEnt 2010, Chamonix, France
Hyperspectral imagery
What is a hyperspectral image? The same scene observed in different spectral bands: a 3-dimensional image with length, width and wavelength dimensions.
Example of a hyperspectral cube (Big Island, Hawaii, USA)
Single pixel: each pixel is represented by a vector of hundreds of measurements.
Applications: mineral exploration; agriculture (soil quality, crop forecasting, forest monitoring); environment (pollution detection, climate-change detection); military (target detection of minefields, vehicles, ..., and cartography).
Unmixing: a crucial step in hyperspectral image analysis
Spectral mixing Measured pixel: mixture of pure spectra (endmembers) characterized by their corresponding fractions (abundances). Common assumption: Linear mixing model If the pure materials are spatially disjoint in the pixel, the measured spectrum is the linear combination of the corresponding pure spectra.
Figure: Linear observation model
Standard mixing model
Linear mixing model (LMM)
For a given pixel p:

y_p = Σ_{r=1}^R m_r a_{r,p} + n_p,   (1)

where
y_p = [y_{1,p}, ..., y_{L,p}]^T is the observed pixel p in L bands,
R is the number of pure materials or endmembers,
m_r = [m_{r,1}, ..., m_{r,L}]^T is the spectrum of the rth endmember,
a_{r,p} is the fraction or abundance of the rth endmember in the pth pixel,
n_p = [n_{1,p}, ..., n_{L,p}]^T is the additive noise in the pth observed pixel (assumed white Gaussian).

Constraints on the abundance vectors a_p = [a_{1,p}, ..., a_{R,p}]^T:
a_{r,p} ≥ 0, ∀r = 1, ..., R, and Σ_{r=1}^R a_{r,p} = 1.
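As a numerical aside, the LMM in Eq. (1) and its constraints can be sketched in a few lines of NumPy. The sizes below are illustrative stand-ins (the talk's example uses L = 825 and R = 3), and the random matrix M is a placeholder for real endmember spectra:

```python
import numpy as np

rng = np.random.default_rng(0)

L, R = 50, 3                              # illustrative sizes (stand-ins)
M = rng.uniform(size=(L, R))              # columns: endmember spectra m_1..m_R

def simulate_pixel(M, a, sigma):
    """Linear mixing model y_p = sum_r m_r a_{r,p} + n_p, white Gaussian noise."""
    n = sigma * rng.standard_normal(M.shape[0])
    return M @ a + n

a = np.array([0.4, 0.2, 0.4])             # abundance vector on the simplex
assert np.all(a >= 0) and np.isclose(a.sum(), 1.0)  # positivity + sum-to-one

y = simulate_pixel(M, a, sigma=0.01)      # one observed pixel in L bands
```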
Spectral unmixing
Linear Mixing Model (LMM):
y_p = Σ_{r=1}^R m_r a_{r,p} + n_p

L = 825 bands (0.4 µm → 2.5 µm), R = 3: green grass (solid line), galvanized steel metal (dashed line), bare red brick (dotted line), a_p = [0.4, 0.2, 0.4]^T, SNR ≈ 20 dB.

Spectral unmixing problem
Estimation of {m_1, ..., m_R} and a_p.
Unmixing steps
1. Endmember extraction step: estimation of R and m_1, ..., m_R (Vertex Component Analysis, N-FINDR, Pixel Purity Index, ...).
2. Inversion step: estimation of the corresponding abundances a_p = [a_{1,p}, ..., a_{R,p}]^T (LS, ML and Bayesian approaches).

Most inversion strategies ignore the possible interactions between the pixels.
Problem addressed in this work: estimation of a_p under the positivity and additivity constraints.
Main contributions
+ Exploiting spatial correlations in a new Bayesian inversion procedure
+ Using Markov random fields (MRFs) to model spatial interactions
+ Spatial correlations ⇒ image classification/segmentation
Outline
1. Introducing spatial structures
2. Hierarchical Bayesian model (likelihood, prior distributions, joint posterior distribution)
3. Simulations (synthetic data, real data)
4. Conclusions
Introducing spatial structures
Image partitioning
Defining homogeneous regions
Image of P pixels divided into K regions or classes. In each region, the pixels approximately share the same composition.
Hidden variables: label vector z = [z_1, ..., z_P]^T, where z_p = k ⇔ p ∈ I_k, k ∈ {1, ..., K}.

∀p ∈ I_1: z_p = 1, E[a_p] = µ_1, Cov(a_p) = Γ_1,
∀p ∈ I_2: z_p = 2, E[a_p] = µ_2, Cov(a_p) = Γ_2,
∀p ∈ I_3: z_p = 3, E[a_p] = µ_3, Cov(a_p) = Γ_3.
Example
Figure: Scatter-plot of dual-band data of an image partitioned into 3 classes.
Abundance reparametrization
Introducing logistic coefficients^a t_p = [t_{1,p}, ..., t_{R,p}]^T, where:

a_{r,p} = exp(t_{r,p}) / Σ_{r'=1}^R exp(t_{r',p}).   (2)

⇒ Ensures the positivity and sum-to-one constraints.
⇒ Each class k is fully characterized by: E[t_p | z_p = k] = ψ_k, Cov[t_p | z_p = k] = Σ_k.

a [A. Gelman et al., J. Amer. Stat. Assoc., 1996]
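A minimal numerical check of this logistic (softmax) reparametrization, in plain NumPy rather than the authors' code:

```python
import numpy as np

def logistic_abundances(t):
    """Map unconstrained logistic coefficients t = [t_1, ..., t_R] to abundances
    a_r = exp(t_r) / sum_{r'} exp(t_{r'}); max-subtraction for numerical stability."""
    e = np.exp(t - np.max(t))
    return e / e.sum()

a = logistic_abundances(np.array([0.3, -1.2, 2.0]))
# a is strictly positive and sums to one for any real-valued t
```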
Hierarchical Bayesian model
Likelihood
Unknown parameter vector
Θ = {T, z, s}, with
T = [t_1, ..., t_P] the logistic coefficient matrix,
z = [z_1, ..., z_P]^T the label vector,
s = [s_1^2, ..., s_P^2]^T the noise variance vector.
The LMM and the Gaussian property of the noise vector yield

f(y_p | t_p, s_p^2) = (1 / (2π s_p^2))^{L/2} exp( − ||y_p − M a_p(t_p)||^2 / (2 s_p^2) ),   (3)

where ||x|| = √(x^T x) is the standard ℓ2 norm, and a_p(t_p) makes explicit the dependence of the abundance vector a_p on the logistic coefficient vector t_p. Assuming independence between the different noise vectors n_p:

f(Y | T, s) = Π_{p=1}^P f(y_p | t_p, s_p^2).   (4)
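Eqs. (3)-(4) translate directly into a log-likelihood over the whole image. The sketch below assumes the array shapes indicated in the docstring; it is an illustration, not the authors' implementation:

```python
import numpy as np

def log_likelihood(Y, M, T, s2):
    """Log of Eq. (4): product of the Gaussian likelihoods (3) over the P pixels.
    Y: (L, P) observations, M: (L, R) endmembers,
    T: (R, P) logistic coefficients, s2: (P,) noise variances."""
    L = Y.shape[0]
    E = np.exp(T - T.max(axis=0))          # column-wise softmax -> abundances
    A = E / E.sum(axis=0)                  # a_p(t_p): each column on the simplex
    sq = ((Y - M @ A) ** 2).sum(axis=0)    # ||y_p - M a_p(t_p)||^2 per pixel
    return float(np.sum(-0.5 * L * np.log(2 * np.pi * s2) - sq / (2 * s2)))
```

For a perfect fit with unit noise variances the value reduces to −(L P / 2) log 2π, a quick sanity check on the normalization.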
Prior distributions
Label prior
Which prior for the labels? Independent prior distributions,

f(z) = Π_{p=1}^P f(z_p),

⇒ no correlations between the pixels!

Markov random fields: Potts-Markov random field

f(z) = (1/G(β)) exp( β Σ_{p=1}^P Σ_{p'∈V(p)} δ(z_p − z_{p'}) ),

where
V(p) is the 1-order neighborhood (4 pixels),
G(β) is the normalizing constant (or partition function),
β is the granularity coefficient (known in this study),
δ(x) = 1 if x = 0, and 0 otherwise.
+ Naturally introduces spatial correlations between labels of neighboring pixels.
Markov random fields
Influence of β on the label prior
Generation of synthetic images with K = 3 using Potts-Markov random fields with different values of β and a 1-order neighborhood structure.
Figure: From left to right: β = 0.8, 1.4, 2.
+ small values of β: many small regions,
+ large values of β: few large regions.
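For intuition, label maps like these can be generated with a simple Gibbs sweep over the Potts field. This is an illustrative sketch (small grid, few sweeps), not the sampler used in the talk:

```python
import numpy as np

def sample_potts(K, beta, shape=(32, 32), n_sweeps=20, seed=0):
    """Gibbs sampler for a K-state Potts MRF on a grid with the 1-order
    (4-pixel) neighborhood: p(z_p = k | rest) ∝ exp(beta * #neighbors in state k)."""
    rng = np.random.default_rng(seed)
    H, W = shape
    z = rng.integers(K, size=shape)
    for _ in range(n_sweeps):
        for i in range(H):
            for j in range(W):
                counts = np.zeros(K)
                for di, dj in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                    ni, nj = i + di, j + dj
                    if 0 <= ni < H and 0 <= nj < W:
                        counts[z[ni, nj]] += 1.0
                p = np.exp(beta * counts)          # conditional over the K states
                z[i, j] = rng.choice(K, p=p / p.sum())
    return z

z = sample_potts(K=3, beta=1.4, shape=(16, 16), n_sweeps=5)
# larger beta -> neighboring labels agree more often -> fewer, larger regions
```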
Logistic coefficient prior distribution
For each class k (k = 1, ..., K) and p ∈ I_k:

t_p | z_p = k, ψ_k, Σ_k ∼ N(ψ_k, Σ_k),

⇒ two hyperparameters (fixed or estimated) characterize the kth class: ψ_k and Σ_k. For the whole set of pixels, the distribution of T = [t_1, ..., t_P] is

f(T | Ψ, Σ) = Π_{k=1}^K Π_{p∈I_k} f(t_p | z_p = k, ψ_k, Σ_k),

with Ψ = [ψ_1, ..., ψ_K] and Σ = {Σ_1, ..., Σ_K}.
Noise variance and hyperparameter prior distributions
Noise variance prior
s_p^2 | δ ∼ E(δ), where δ is an adjustable hyperparameter, and

f(s | δ) = Π_{p=1}^P f(s_p^2 | δ).   (5)

Hyperparameter priors (hierarchical model)
Let Ω = {δ, Ψ, Σ} be the hyperparameter vector:

f(Ω) = f(δ) Π_{k=1}^K f(ψ_k) f(Σ_k),

with Ψ = [ψ_1, ..., ψ_K] (means of the logistic coefficient vectors) and Σ = {Σ_1, ..., Σ_K} (covariance matrices of the logistic coefficient vectors).
Joint posterior distribution
Joint posterior distribution of Θ and Ω
Joint posterior of the parameter vector Θ and hyperparameter vector Ω:

f(Θ, Ω | Y) ∝ f(Y | Θ) f(Θ | Ω) f(Ω).   (6)

Straightforward computations lead to

f(Θ, Ω | Y) ∝ Π_{p=1}^P (1/s_p^2)^{L/2} exp( − ||y_p − M a_p(t_p)||^2 / (2 s_p^2) )
  × Π_{r,k} (1/σ_{r,k})^{n_k+1} exp( − ψ_{r,k}^2 / (2υ^2) − (2γ + Σ_{p∈I_k} (t_{r,p} − ψ_{r,k})^2) / (2 σ_{r,k}^2) )
  × (1/υ^2)^{RK/2} δ^{P−1} exp( − δ Σ_{p=1}^P s_p^2 )
  × exp( β Σ_{p=1}^P Σ_{p'∈V(p)} δ(z_p − z_{p'}) ),   (7)

where n_k = card(I_k) (combinatorial problem).
Computing estimates of Θ
Joint posterior distribution too complex to obtain Bayesian estimates of Θ directly.
Simulation of samples Θ^(t) asymptotically distributed according to f(Θ, Ω | Y) using MCMC methods.
MMSE and MAP estimators:

Θ̂_MMSE ≈ (1 / (N_MC − N_bi)) Σ_{t=1}^{N_MC − N_bi} Θ^(t),   (8)

Θ̂_MAP ≈ arg max_{Θ^(t)} f(Θ^(t) | Y).   (9)
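Given a stored chain, Eqs. (8)-(9) amount to a burn-in average and an arg-max over the sampled values. A generic sketch, where `samples` and `log_post` are assumed arrays holding the chain and the log-posterior values:

```python
import numpy as np

def mmse_map_estimates(samples, log_post, n_bi):
    """samples: (N_mc, d) MCMC draws Theta^(t); log_post: (N_mc,) values of
    log f(Theta^(t)|Y); n_bi: burn-in iterations discarded for the MMSE average."""
    theta_mmse = samples[n_bi:].mean(axis=0)    # Eq. (8): empirical mean
    theta_map = samples[np.argmax(log_post)]    # Eq. (9): best visited sample
    return theta_mmse, theta_map
```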
Simulations
Synthetic data
Simulation parameters
25 × 25 synthetic image composed of K = 3 classes,
label map generated using a Potts-Markov random field with β = 1.1 (Figure: synthetic label map),
each pixel mixed from R = 3 endmembers whose spectra (construction concrete, green grass, micaceous loam) are measured in L = 413 bands (Figure: endmember spectra),
noise level SNR ≈ 15 dB,
abundance maps generated from truncated Gaussian distributions with different means and covariance matrices in each class (Figure: synthetic abundance maps).
Simulation results: label map
Marginal MAP label estimates
Figure: Left: original labels. Right: estimated labels.
Simulation results: abundance maps
MMSE logistic coefficient estimates (conditioned on the MAP label estimates)
Figure: Top: actual abundances. Bottom: estimated abundances.
Simulation results: abundance means
Table: Actual and estimated abundance mean and variance for each class.

                                  Actual               Estimated
Class 1  µ1 = E[ap|I1]            [0.6, 0.3, 0.1]^T    [0.58, 0.29, 0.13]^T
         Var[ap|I1] (×10^-3)      [5, 5, 5]^T          [4.5, 4.3, 5.5]^T
Class 2  µ2 = E[ap|I2]            [0.3, 0.5, 0.2]^T    [0.29, 0.49, 0.2]^T
         Var[ap|I2] (×10^-3)      [5, 5, 5]^T          [4.5, 4.7, 5.3]^T
Class 3  µ3 = E[ap|I3]            [0.3, 0.2, 0.5]^T    [0.31, 0.19, 0.49]^T
         Var[ap|I3] (×10^-3)      [5, 5, 5]^T          [7, 4.2, 11.7]^T

Example of results with the 2nd class
Figure: Histograms of the abundance means µ2
Real data
Simulation parameters
Real hyperspectral image of 50 × 50 pixels extracted from a larger image acquired in 1997 by AVIRIS (Moffett Field, CA, USA),
data set reduced from 224 to 189 bands (water absorption bands removed),
N-FINDR used to extract the R = 3 endmember spectra associated with water, soil and vegetation,
K = 4 classes have been considered.
Real hyperspectral data: Moffett Field acquired by AVIRIS in 1997 (left) and the region of interest shown in true colors (right).
Abundance maps
Comparison with other results
Results from the FCLS algorithm^a.
a [D. C. Heinz and C.-I Chang, IEEE Trans. Geosci. and Remote Sensing, 2001]
Label map
Figure: β = 1.1.
Conclusions
Conclusions
"Spatial" unmixing algorithm:
exploitation of spatial correlations using MRFs within a Bayesian framework,
hidden labels introduced to identify several classes defined by a homogeneous composition of macroscopic materials,
appropriate reparametrization of the abundance vectors.

Results:
good estimation of the abundance coefficients,
classification map obtained thanks to the estimates of the underlying labels,
price to pay: high computational cost.

Perspectives:
estimation of the granularity coefficient β,
other models for spatial correlations (discriminative random fields, ...),
replacing MCMC methods by other computational methods: variational methods, (constrained?) Hamiltonian MCMC, ...
Simulation results: comparisons
Synthetic data: comparisons with other algorithms

MSE_r^2 = (1/P) Σ_{p=1}^P (â_{r,p} − a_{r,p})^2
Table: MSEs for abundance coefficients.
         FCLS^a        Bayesian independent^b   Spatial
MSE1^2   0.0019        0.0016                   0.001
MSE2^2   4.3 × 10^-4   4.1 × 10^-4              3.1 × 10^-4
MSE3^2   0.0014        0.0013                   8.6 × 10^-4

a [D. C. Heinz and C.-I Chang, IEEE Trans. Geosci. and Remote Sensing, 2001]
b [N. Dobigeon et al., IEEE Trans. Signal Processing, 2008]
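The per-endmember MSE criterion above is a one-line reduction over the abundance maps. Shapes are assumed to be (R, P), with one row per endmember:

```python
import numpy as np

def abundance_mse(a_hat, a_true):
    """MSE_r^2 = (1/P) sum_p (a_hat_{r,p} - a_{r,p})^2, one value per endmember r."""
    return ((np.asarray(a_hat) - np.asarray(a_true)) ** 2).mean(axis=1)
```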
Real data
Label map
Figure: β = 1.1.
Figure: β = 2.