Parsimonious Gaussian process models for the spectral-spatial classification of high dimensional remote sensing images STATLEARN 2016

M. Fauvel (1), C. Bouveyron (2), and S. Girard (3)

(1) UMR 1201 DYNAFOR, INRA & Institut National Polytechnique de Toulouse
(2) Laboratoire MAP5, UMR CNRS 8145, Université Paris Descartes & Sorbonne Paris Cité
(3) Equipe MISTIS, INRIA Grenoble Rhône-Alpes & LJK

Outline

Introduction
  - High dimensional remote sensing images
  - Classification of hyperspectral/hypertemporal images
  - Parametric kernel method for spectral-spatial classification
Parsimonious Gaussian process models
  - Gaussian process in the feature space
  - Parsimonious Gaussian process
  - Model inference
  - Link with existing models
Experimental results
  - Hyperspectral data
  - Hypertemporal data
Conclusions and perspectives

M. Fauvel, DYNAFOR - INRA


High dimensional remote sensing images

Nature of remote sensing images

A remote sensing image is a sampling of a spatial, spectral and temporal process.

[Figure: reflectance (0 to 1) as a function of wavelength, 500-900 nm]


Hyperspectral

High number of spectral measurements and one temporal measurement: x ∈ R^180.

[Figure: reflectance spectrum x(λ) over wavelengths 450-950 nm]

Hypertemporal image

High number of temporal measurements for few spectral measurements: x ∈ R^{4×46}.

[Figure: numerical counts of the four bands x_b(t), x_g(t), x_r(t), x_ir(t) as a function of the day of the year]

Some numbers

Hyperspectral sensors

  Instrument   Range (nm)   # Bands   Spatial resolution (m)
  AVIRIS       400-2500     224       1-4
  ROSIS-03     400-900      115       1
  Hyspec       400-2500     427       1
  HyMAP        400-2500     126       5
  CASI         380-1050     288       1-2

Hypertemporal missions

  Mission      Revisit time (days)   # Bands   Spatial resolution (m)
  Sentinel-2   5                     13        10-20-60
  MODIS        1                     7         500
  Landsat-8    16                    8         30

Thematic applications

  - Land cover/use at the national scale
  - Monitoring of ecosystems & biodiversity
  - Disaster management

Classification of hyperspectral/hypertemporal images

Image classification in high dimensional space

High number of measurements but a limited number of training samples: d ≈ n.
Curse of dimensionality: statistical, geometrical and computational issues.
Conventional methods fail [Jimenez and Landgrebe, 1998].
Pixelwise classification is not adapted [Fauvel et al., 2013]: it is invariant to pixel locations and sensitive to mixed pixels.

[Figure: spectral classification of the original image and of an image with randomly permuted pixel locations gives the same results]

Spatial information must therefore be incorporated into the classification process, at the cost of additional complexity.


Spatial-spectral classification in remote sensing

Extract spatial features using image processing filters:
  - Morphological profile, adaptive neighborhood [Fauvel et al., 2013]
  - Local statistical moments [Camps-Valls et al., 2006]
  - Wavelets [Mercier and Girard-Ardhuin, 2006]
  - Texture, Gabor filters, ...

Then combine the spectral and the spatial information:
  - Stacked vector of all variables, or of a few extracted variables (PCA, ...), with a kernel classifier
  - Fusion of separate classifiers
  - Composite kernels [Camps-Valls et al., 2006]

+ Easy to implement.
+ Easy to plug in different spatial features.
- Additional variables: statistical and computational issues.
- Most image processing tools are defined for scalar-valued (R^1) pixels.

[Figure: original image and extracted spatial features: top-hat, local standard deviation, Gabor features]

Model the spatial dependencies

Markov random field: a local neighborhood is used in the decision p(y_i = c | x_i, N_i).

[Figure: first-order and second-order neighborhoods of a pixel y_i]

+ Allows a fine modeling of inter-pixel dependencies.
- Requires estimating the spectral energy term (through a GMM).

Mean kernel [Gurram and Kwon, 2013]:

  K(x_i, x_j) = γ^{-2} Σ_{m∈N_i, n∈N_j} k(x_m, x_n)

+ Good performance in terms of classification accuracy.
- Rough modeling of the inter-pixel dependencies.
- Computing time.
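The mean kernel above can be sketched numerically: average all pairwise base-kernel values between two neighborhoods (a minimal NumPy illustration with a Gaussian base kernel; the function names are ours, not from the cited work):

```python
import numpy as np

def rbf(a, b, gamma=1.0):
    # Gaussian base kernel k(x_m, x_n) between two pixel vectors
    return np.exp(-gamma * np.sum((a - b) ** 2))

def mean_kernel(Ni, Nj, gamma=1.0):
    """Mean kernel between two neighborhoods Ni, Nj (arrays of shape (m, d)):
    the normalized sum of all pairwise base-kernel values."""
    K = np.array([[rbf(a, b, gamma) for b in Nj] for a in Ni])
    return K.sum() / (len(Ni) * len(Nj))

# Example: two 3x3 neighborhoods of 4-band pixels
rng = np.random.default_rng(0)
Ni, Nj = rng.random((9, 4)), rng.random((9, 4))
print(mean_kernel(Ni, Nj))
```

Averaging over neighborhoods is what makes the kernel smooth over space, and also what makes it expensive: each kernel evaluation costs |N_i|·|N_j| base-kernel evaluations.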

Parametric kernel method for spectral-spatial classification

SVM & MRF

Maximum a posteriori: max_Y P(Y|X).
When Y is an MRF: P(Y|X) ∝ exp(−U(Y|X)), where U(Y|X) = Σ_{i=1}^{n} U(y_i | x_i, N_i), with

  U(y_i | x_i, N_i) = Ω(x_i, y_i) + ρ E(y_i, N_i)

Spectral term: −log[p(x_i | y_i)]
  - SVM outputs [Farag et al., 2005, Tarabalka et al., 2010, Moser and Serpico, 2013]
  - Kernel-probabilistic model [Dundar and Landgrebe, 2004]

Spatial term
  - Potts model: E(y_i, N_i) = Σ_{j∈N_i} [1 − δ(y_i, y_j)]
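The Potts spatial term is simply a count of disagreeing neighbors, as this small sketch shows (illustrative helper, not the authors' code):

```python
import numpy as np

def potts_energy(yi, neighbor_labels):
    """Potts spatial energy: sum over j in N_i of [1 - delta(y_i, y_j)],
    i.e. the number of neighbors whose label disagrees with y_i."""
    return int(np.sum(np.asarray(neighbor_labels) != yi))

# First-order (4-connected) neighborhood of a pixel labeled class 2
print(potts_energy(2, [2, 2, 1, 3]))  # -> 2
```

The weight ρ in U(y_i | x_i, N_i) then tunes how strongly this disagreement count penalizes spatially irregular labelings.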


Parsimonious Gaussian process models

Gaussian process in the feature space

Kernel induced feature space

[Figure: feature map φ from R^2 into the feature space F]

From Mercer's theorem: k(x_i, x_j) = ⟨φ(x_i), φ(x_j)⟩_F.

Gaussian kernel: k(x_i, x_j) = exp(−γ ||x_i − x_j||²_{R^d}), with d_F = +∞.

Building a probabilistic model in F is not directly possible: work with subspace models.

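The Mercer relation can be checked numerically for the Gaussian kernel: the Gram matrix is symmetric positive semi-definite, and each point has unit self-similarity in F (illustrative sketch; `gaussian_gram` is our helper name):

```python
import numpy as np

def gaussian_gram(X, gamma=1.0):
    """Gram matrix of the Gaussian kernel: K[i, j] = exp(-gamma * ||x_i - x_j||^2)."""
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * X @ X.T  # pairwise squared distances
    return np.exp(-gamma * np.maximum(d2, 0.0))

rng = np.random.default_rng(1)
X = rng.random((5, 3))
K = gaussian_gram(X, gamma=0.5)
print(np.allclose(np.diag(K), 1.0))  # k(x, x) = 1 for every x
```

Since k(x, x) = 1 for all x, every φ(x) lies on the unit sphere of the infinite-dimensional space F, which is why a probabilistic model there must go through the kernel evaluations rather than through explicit coordinates.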

Gaussian process

Let us assume that φ(x), conditionally on y = c, is a Gaussian process with mean µ_c and covariance function Σ_c. The random vector [φ(x)_1, ..., φ(x)_r] ∈ R^r is then, conditionally on y = c, a multivariate normal vector. Gaussian mixture model (quadratic discriminant) decision rule:

  D_c(φ(x_i)) = Σ_{j=1}^{r} [ ⟨φ(x_i) − µ_c, q_cj⟩² / λ_cj + ln(λ_cj) ] − 2 ln(π_c)

Gaussian process (continued)

With r_c = min(n_c, r), the decision rule splits as:

  D_c(φ(x_i)) = Σ_{j=1}^{r_c} [ ⟨φ(x_i) − µ_c, q_cj⟩² / λ_cj + ln(λ_cj) ]
              + Σ_{j=r_c+1}^{r} [ ⟨φ(x_i) − µ_c, q_cj⟩² / λ_cj + ln(λ_cj) ] − 2 ln(π_c)

We propose to enforce parsimony in the model to make it computable.

Parsimonious Gaussian process

Definitions

Definition (parsimonious Gaussian process with common noise). A pGP is a Gaussian process φ(x) for which, conditionally on y = c, the eigendecomposition of its covariance operator Σ_c is such that:

  A1. There exists a dimension r < +∞ such that λ_cj = 0 for j ≥ r, for all c = 1, ..., C.
  A2. There exists a dimension p_c < min(r, n_c) such that λ_cj = λ for p_c < j < r, for all c = 1, ..., C.

A1 is motivated by the quick decay of the eigenvalues of Gaussian kernels [Braun et al., 2008]. A2 expresses that the data of each class live in a specific subspace of size p_c. This model is referred to in the following as pGP_0.
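Assumption A2 can be illustrated numerically: keep the p_c leading eigenvalues free and collapse the remaining ones into a single common noise level λ (here taken as their mean, a simple illustrative choice; the helper is hypothetical, not the authors' code):

```python
import numpy as np

def parsimonious_eigenvalues(eigvals, pc):
    """Assumption A2: the pc largest eigenvalues stay free; the remaining
    ones are replaced by a single common noise level lambda (their mean here)."""
    ev = np.sort(np.asarray(eigvals, dtype=float))[::-1]
    return ev[:pc], ev[pc:].mean()

signal, noise = parsimonious_eigenvalues([5.0, 2.0, 0.3, 0.2, 0.1], pc=2)
print(signal, noise)  # the two leading eigenvalues, then the common noise level
```

Only p_c + 1 variance parameters per class remain, instead of one per dimension of F.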

[Figure: two class subspaces F1 and F2, with class-specific variances λ11, λ12 and λ21, λ22 inside the subspaces and a common noise level λ outside]

Figure: visual illustration of the parsimonious model. The dimension of Fc is common to both classes, each class has its specific variance inside Fc, and they share a common noise level.

pGP models: list of sub-models

  Model    Variance inside F_c                  q_cj   p_c

  Variance outside F_c: common
  pGP_0    Free                                 Free   Free
  pGP_1    Free                                 Free   Common
  pGP_2    Common within groups                 Free   Free
  pGP_3    Common within groups                 Free   Common
  pGP_4    Common between groups                Free   Common
  pGP_5    Common within and between groups     Free   Free
  pGP_6    Common within and between groups     Free   Common

  Variance outside F_c: free
  npGP_0   Free                                 Free   Free
  npGP_1   Free                                 Free   Common
  npGP_2   Common within groups                 Free   Free
  npGP_3   Common within groups                 Free   Common
  npGP_4   Common between groups                Free   Common

Decision rule for pGP_0

Proposition. For pGP_0, the decision rule can be written:

  D_c(φ(x_i)) = Σ_{j=1}^{p_c} [(λ − λ_cj)/(λ λ_cj)] ⟨φ(x_i) − µ_c, q_cj⟩² + ||φ(x_i) − µ_c||²/λ
              + Σ_{j=1}^{p_c} ln(λ_cj) + (p_M − p_c) ln(λ) − 2 ln(π_c) + γ,

where γ is a constant term that does not depend on the class index c.

µ_c is the mean function of the GP conditionally on y = c. (λ_cj, q_cj) are the first eigenvalues/eigenfunctions of the covariance function of the GP conditionally on y = c. p_c is the size of F_c. Proofs are given in [Bouveyron et al., 2014].

Model inference

Estimation of the parameters

Centered Gaussian kernel function according to class c:

  k̄_c(x_i, x_j) = k(x_i, x_j) + (1/n_c²) Σ_{l,l'=1; y_l=y_l'=c} k(x_l, x_l') − (1/n_c) Σ_{l=1; y_l=c} [k(x_i, x_l) + k(x_j, x_l)]

and K_c of size n_c × n_c:

  (K_c)_{l,l'} = k̄_c(x_l, x_l') / n_c

λ̂_cj is the j-th largest eigenvalue of K_c, and β_cj is its associated normalized eigenvector.

  λ̂ = [ Σ_{c=1}^{C} π̂_c (trace(K_c) − Σ_{j=1}^{p̂_c} λ̂_cj) ] / [ Σ_{c=1}^{C} π̂_c (r_c − p̂_c) ]

π̂_c = n_c/n. p̂_c is chosen from the percentage of cumulative variance.
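The centered class Gram matrix K_c above can be obtained by double-centering the class block of the full kernel matrix (a sketch with NumPy; the function name is ours):

```python
import numpy as np

def class_centered_gram(K_full, labels, c):
    """K_c: the class-c block of the kernel matrix, double-centered
    (which realizes k_bar_c for all pairs of class-c samples) and
    divided by n_c."""
    idx = np.where(labels == c)[0]
    Kc = K_full[np.ix_(idx, idx)]
    nc = len(idx)
    J = np.full((nc, nc), 1.0 / nc)
    return (Kc - J @ Kc - Kc @ J + J @ Kc @ J) / nc

rng = np.random.default_rng(2)
X = rng.random((20, 3))
labels = np.array([0] * 10 + [1] * 10)
sq = np.sum(X ** 2, axis=1)
K = np.exp(-(sq[:, None] + sq[None, :] - 2.0 * X @ X.T))  # Gaussian kernel, gamma = 1
Kc = class_centered_gram(K, labels, 0)
eigvals = np.linalg.eigvalsh(Kc)[::-1]  # lambda_hat_cj, largest first
```

Everything needed for the decision rule (eigenvalues λ̂_cj, eigenvectors β_cj) then comes out of one n_c × n_c eigendecomposition per class.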

Computable decision rule

Proposition. The decision rule can be computed as:

  D_c(φ(x_i)) = (1/n_c) Σ_{j=1}^{p̂_c} [(λ̂ − λ̂_cj)/(λ̂ λ̂_cj²)] ( Σ_{l; y_l=c} β_cjl k̄_c(x_i, x_l) )²
              + k̄_c(x_i, x_i)/λ̂ + Σ_{j=1}^{p̂_c} ln(λ̂_cj) + (p̂_M − p̂_c) ln(λ̂) − 2 ln(π̂_c)

This uses the property that the eigenfunctions of the covariance function are linear combinations of the φ(x_i) − µ_c, together with ⟨φ(x_i) − µ_c, φ(x_j) − µ_c⟩ = k̄_c(x_i, x_j).

Estimation of the hyperparameters

The parsimonious Gaussian process model and two hyperparameters have to be fitted:
  - the model (n)pGP_{0...6},
  - the kernel parameter γ,
  - the size p_c of the signal subspace.

This is done by k-fold cross-validation. For a given kernel hyperparameter value, all the other p_c values can be tested at no extra cost, since the model parameters are derived from the eigendecomposition of K_c.
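The "no extra cost" point can be sketched: once the eigendecomposition of K_c is available for a given γ, every candidate subspace size p_c is read off the cumulative variance (illustrative helper and thresholds, not the authors' code):

```python
import numpy as np

def candidate_dimensions(eigvals, thresholds=(0.90, 0.95, 0.99)):
    """Read off the subspace size p_c for several cumulative-variance
    thresholds from a single set of eigenvalues."""
    ev = np.sort(np.asarray(eigvals, dtype=float))[::-1]
    ratio = np.cumsum(ev) / ev.sum()
    # smallest p such that the first p eigenvalues reach the threshold
    return {t: int(np.searchsorted(ratio, t)) + 1 for t in thresholds}

print(candidate_dimensions([4.0, 3.0, 2.0, 0.5, 0.25, 0.25]))
# -> {0.9: 3, 0.95: 4, 0.99: 6}
```

Only γ requires recomputing the eigendecomposition; sweeping p_c is a lookup.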

Link with existing models

Existing models

[Dundar and Landgrebe, 2004]: equal covariance matrix assumption and ridge regularization. Complexity: O(n³). Similar to pGP_4 with equal eigenvectors.
[Pekalska and Haasdonk, 2009]: ridge regularization, per class. Complexity: O(n_c³).
[Xu et al., 2009]: the last n_c − p − 1 eigenvalues are set equal to λ_cp. Complexity: O(n_c³). Similar to npGP_1.

[Figure: eigenvalues λ_ci as a function of the index (log scale, 10⁻⁷ to 10¹) for ridge regularization, pGP, and Z. Xu et al.]

- These existing models all use eigenvectors associated with (very) small eigenvalues.
+ pGP models use only the eigenvectors associated with the largest eigenvalues.

Experimental results

Protocol [Fauvel et al., 2015]

For each class, 50 training pixels were randomly selected from the samples; the remaining pixels were used for validation to compute the correct classification rate. The experiment was repeated 20 times. Variables were scaled between 0 and 1.

Competing methods:
  - SVM
  - RF
  - Kernel-DA [Dundar and Landgrebe, 2004]

Hyperparameters were learned by 5-fold cross-validation. MRF optimization: Metropolis-Hastings.

Hyperspectral data

Data sets

  University of Pavia: uint16, 103 spectral bands, 9 classes, 42,776 referenced pixels.
  Kennedy Space Center: uint16, 224 spectral bands, 13 classes, 4,561 referenced pixels.
  Heves: uint16, 252 spectral bands, 16 classes, 360,953 pixels.

Samples

[Figure: four histograms of sample counts (up to 12,000) over digital numbers 0-250 for the hyperspectral data sets]

Classification accuracy and processing time

            Kappa coefficient            Processing time (s)
  Method    University  KSC    Heves     University  KSC  Heves
  pGP_0     0.768       0.920  0.664     18          31   148
  pGP_1     0.793       0.922  0.671     18          33   151
  pGP_2     0.617       0.844  0.588     18          31   148
  pGP_3     0.603       0.842  0.594     19          33   152
  pGP_4     0.661       0.870  0.595     19          34   152
  pGP_5     0.567       0.820  0.582     18          32   148
  pGP_6     0.610       0.845  0.583     19          34   152
  npGP_0    0.730       0.911  0.640     17          31   148
  npGP_1    0.792       0.921  0.677     18          33   151
  npGP_2    0.599       0.838  0.573     18          31   148
  npGP_3    0.578       0.817  0.585     19          33   152
  npGP_4    0.578       0.817  0.585     19          33   152
  KDC       0.786       0.924  0.666     98          253  695
  RF        0.646       0.853  0.585     3           3    18
  SVM       0.799       0.928  0.658     10          28   171

pGP + MRF

[Figure: classification maps obtained with the pGP + MRF model on the hyperspectral data]

Hypertemporal data

Data set

Formosat-2 SITS: uint8, 4 spectral bands, 43 dates in 2006, 13 woody classes.

[Figure: the same scene observed at six dates t1-t6]

Samples

[Figure: four histograms of sample counts (up to 500) over values 0-160 for the hypertemporal data set]

Classification accuracy and processing time

  Method    Kappa coefficient   Processing time (s)
  pGP_0     0.950               13
  pGP_1     0.955               9
  pGP_2     0.887               13
  pGP_3     0.887               9
  pGP_4     0.932               9
  pGP_5     0.846               13
  pGP_6     0.891               8
  npGP_0    0.941               12
  npGP_1    0.943               8
  npGP_2    0.883               13
  npGP_3    0.871               9
  npGP_4    0.871               9
  KDA       0.942               69
  RF        0.896               1
  SVM       0.944               10

pGP + MRF

[Figure: classification maps obtained with the pGP + MRF model on the hypertemporal data]



Introduction

Parsimonious Gaussian process models

Experimental results

Conclusions and perspectives

A family of parsimonious Gaussian process models has been presented:
- Good performance w.r.t. SVM and KDA.
- Faster computation than the previous KDA.
- (n)pGP1 performs best.
- MRF extension.

Code: https://github.com/mfauvel/PGPDA

Extensions:
- Non-numerical data [Bouveyron et al., 2014]
- Binary data [Sylla et al., 2015]
- Unsupervised learning [Bouveyron et al., 2014]



References I

[Bouveyron et al., 2014] Bouveyron, C., Fauvel, M., and Girard, S. (2014). Kernel discriminant analysis and clustering with parsimonious Gaussian process models. Statistics and Computing, pages 1-20.
[Braun et al., 2008] Braun, M. L., Buhmann, J. M., and Müller, K.-R. (2008). On relevant dimensions in kernel feature spaces. Journal of Machine Learning Research, 9:1875-1908.
[Camps-Valls et al., 2006] Camps-Valls, G., Gomez-Chova, L., Muñoz-Marí, J., Vila-Francés, J., and Calpe-Maravilla, J. (2006). Composite kernels for hyperspectral image classification. IEEE Geoscience and Remote Sensing Letters, 3(1):93-97.
[Dundar and Landgrebe, 2004] Dundar, M. and Landgrebe, D. A. (2004). Toward an optimal supervised classifier for the analysis of hyperspectral data. IEEE Transactions on Geoscience and Remote Sensing, 42(1):271-277.
[Farag et al., 2005] Farag, A., Mohamed, R., and El-Baz, A. (2005). A unified framework for MAP estimation in remote sensing image segmentation. IEEE Transactions on Geoscience and Remote Sensing, 43(7):1617-1634.


References II

[Fauvel et al., 2015] Fauvel, M., Bouveyron, C., and Girard, S. (2015). Parsimonious Gaussian process models for the classification of hyperspectral remote sensing images. IEEE Geoscience and Remote Sensing Letters, 12(12):2423-2427.
[Fauvel et al., 2013] Fauvel, M., Tarabalka, Y., Benediktsson, J. A., Chanussot, J., and Tilton, J. (2013). Advances in spectral-spatial classification of hyperspectral images. Proceedings of the IEEE, 101(3):652-675.
[Gurram and Kwon, 2013] Gurram, P. and Kwon, H. (2013). Contextual SVM using Hilbert space embedding for hyperspectral classification. IEEE Geoscience and Remote Sensing Letters, 10(5):1031-1035.
[Jimenez and Landgrebe, 1998] Jimenez, L. and Landgrebe, D. (1998). Supervised classification in high-dimensional space: geometrical, statistical, and asymptotical properties of multivariate data. IEEE Transactions on Systems, Man, and Cybernetics, Part C, 28(1):39-54.
[Mercier and Girard-Ardhuin, 2006] Mercier, G. and Girard-Ardhuin, F. (2006). Partially supervised oil-slick detection by SAR imagery using kernel expansion. IEEE Transactions on Geoscience and Remote Sensing, 44(10):2839-2846.


References III

[Moser and Serpico, 2013] Moser, G. and Serpico, S. (2013). Combining support vector machines and Markov random fields in an integrated framework for contextual image classification. IEEE Transactions on Geoscience and Remote Sensing, 51(5):2734-2752.
[Pekalska and Haasdonk, 2009] Pekalska, E. and Haasdonk, B. (2009). Kernel discriminant analysis for positive definite and indefinite kernels. IEEE Transactions on Pattern Analysis and Machine Intelligence, 31(6):1017-1032.
[Sylla et al., 2015] Sylla, S. N., Girard, S., Diongue, A. K., Diallo, A., and Sokhna, C. (2015). A classification method for binary predictors combining similarity measures and mixture models. Dependence Modeling, 3:240-255.
[Tarabalka et al., 2010] Tarabalka, Y., Fauvel, M., Chanussot, J., and Benediktsson, J. (2010). SVM- and MRF-based method for accurate classification of hyperspectral images. IEEE Geoscience and Remote Sensing Letters, 7(4):736-740.
[Xu et al., 2009] Xu, Z., Huang, K., Zhu, J., King, I., and Lyu, M. R. (2009). A novel kernel-based maximum a posteriori classification method. Neural Networks, 22(7):977-987.
