Concentration inequalities for order statistics
Using the entropy method and Rényi’s representation

S. Boucheron and M. Thomas

LPMA, Université Paris-Diderot

Probability and Harmonic Analysis, Angers, September 2012

S. Boucheron & M. Thomas (LPMA)

Concentration & order statistics

Angers 2012

1 / 30

Motivation

Concentration, asymptotics for order statistics

Background: order statistics

Sample: $X_1, \ldots, X_n \sim_{\text{i.i.d.}} F$.
Order statistics: $X_{1,n} \ge \ldots \ge X_{n,n}$, the non-increasing rearrangement of $X_1, \ldots, X_n$.
If $n$ is clear from context, $X_{1,n}, \ldots, X_{n,n}$ are denoted by $X_{(1)}, \ldots, X_{(n)}$.
$X_{(1)}$: sample maximum
$X_{(n/2)}$: sample median
...

Extreme value theory and classical statistics:
Asymptotic distributions
Convergence of moments
...

Goal: derive simple, non-asymptotic variance/tail bounds for order statistics


Background: concentration

Concentration of measure phenomenon: any function of many independent random variables that does not depend too much on any of them is concentrated around its mean value.

A new (non-asymptotic) look at independence.

Example: Gaussian concentration (Bonami, Beckner, Nelson, Gross, Borell, Ehrhard, Bobkov, Ledoux, ...)

Let $X = (X_1, \ldots, X_n)$ be a standard Gaussian vector.
Poincaré’s inequality: $\operatorname{Var} f(X) \le E\|\nabla f\|^2$
Gross logarithmic Sobolev inequality: $\operatorname{Ent}(f(X)^2) \le 2 E\|\nabla f\|^2$
Cirelson’s inequality: $P\{f(X) \ge E f(X) + t\} \le \exp(-t^2/(2L^2))$ if $\|\nabla f\| \le L$

Product spaces: Talagrand’s inequalities.

Order statistics are not (usually) sums of independent random variables.


Off the shelf

Off-the-shelf concentration inequalities and order statistics

f (X1 , . . . , Xn ) = max(X1 , . . . , Xn ): a simple function of many independent random variables that does not depend too much on any of them.

Scenario: the $X_i$ are standard Gaussian.
Almost surely, $\|\nabla f\| = 1$.
Poincaré’s inequality $\Rightarrow \operatorname{Var}(f(X_1, \ldots, X_n)) \le 1$.
Extreme value theory asserts: $\operatorname{Var}(\max(X_1, \ldots, X_n)) = O(1/\log n)$.

We do not understand (clearly) in which way the maximum is a smooth function of the sample.
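The gap between the Poincaré bound (at most 1) and the extreme value rate (of order 1/log n) can be seen in a small Monte Carlo experiment. The sketch below is illustrative only: the function name, sample sizes, trial count and seed are assumptions, not part of the talk. The estimated variance of the maximum should stay below 1 and shrink slowly as n grows.

```python
import random
import statistics

def variance_of_max(n, trials=1000, seed=0):
    """Monte Carlo estimate of Var(max(X_1, ..., X_n)) for standard Gaussian X_i."""
    rng = random.Random(seed)
    maxima = [max(rng.gauss(0.0, 1.0) for _ in range(n)) for _ in range(trials)]
    return statistics.variance(maxima)

if __name__ == "__main__":
    # Illustrative sample sizes; the Poincare bound is 1 for every n.
    for n in (5, 50, 500):
        print(n, round(variance_of_max(n), 3))
```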


Order statistics

Central, intermediate and extreme order statistics

$X_1, \ldots, X_n \sim_{\text{i.i.d.}} F$.
Order statistics: $X_{1,n} \ge \ldots \ge X_{n,n}$, the non-increasing rearrangement of $X_1, \ldots, X_n$.
If $n$ is clear from context, $X_{1,n}, \ldots, X_{n,n}$ are denoted by $X_{(1)}, \ldots, X_{(n)}$.

$(X_{k,n})$ is a sequence of
extreme order statistics if $k$ is fixed and $n \to \infty$;
central order statistics if $k/n \to p \in (0, 1)$ while $n \to \infty$;
intermediate order statistics if $k/n \to 0$ and $k \to \infty$.

Different asymptotics
Central and intermediate order statistics (often): Gaussian
Extreme order statistics (sometimes): Generalized Extreme Value

Variance bounds

Order statistics and spacings

Variance bounds, order statistics and spacings

A connection: the variance (and more generally the higher moments) of the $k$th order statistic can be upper-bounded by moments of the $k$th spacing $X_{(k)} - X_{(k+1)}$.

Lemma (Jackknife bounds)
\[ \operatorname{Var}[X_{(k)}] \le k\, E\left[ (X_{(k)} - X_{(k+1)})^2 \right]. \]

Convention: $\Delta_k = X_{(k)} - X_{(k+1)}$.
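The jackknife bound can be checked by simulation. The sketch below estimates both sides, $\operatorname{Var}[X_{(k)}]$ and $k\,E[\Delta_k^2]$, by Monte Carlo. The function names, the choice n = 20, k = 3, the trial count and the seed are illustrative assumptions. For a standard exponential sample the two sides are known in closed form (the sum of $1/i^2$ for $i \ge k$, and $2/k$), which gives a convenient sanity check.

```python
import random
import statistics

def jackknife_check(sampler, n, k, trials=4000, seed=1):
    """Return Monte Carlo estimates of (Var[X_(k)], k * E[Delta_k^2]); needs 1 <= k <= n - 1."""
    rng = random.Random(seed)
    order_stats, sq_spacings = [], []
    for _ in range(trials):
        # Non-increasing rearrangement: xs[0] = X_(1) >= ... >= xs[n-1] = X_(n)
        xs = sorted((sampler(rng) for _ in range(n)), reverse=True)
        order_stats.append(xs[k - 1])                 # X_(k)
        sq_spacings.append((xs[k - 1] - xs[k]) ** 2)  # Delta_k^2
    return statistics.variance(order_stats), k * statistics.fmean(sq_spacings)

def gaussian(rng):
    return rng.gauss(0.0, 1.0)

def exponential(rng):
    return rng.expovariate(1.0)

if __name__ == "__main__":
    for name, sampler in (("gaussian", gaussian), ("exponential", exponential)):
        var_est, bound_est = jackknife_check(sampler, n=20, k=3)
        print(name, "variance:", round(var_est, 3), "jackknife bound:", round(bound_est, 3))
```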


Proof (i)

Theorem (Efron-Stein inequalities, 1981)
Let $f : \mathbb{R}^n \to \mathbb{R}$ be measurable, and let $Z = f(X_1, \ldots, X_n)$. Let $Z_i = f_i(X_1, \ldots, X_{i-1}, X_{i+1}, \ldots, X_n)$ where $f_i : \mathbb{R}^{n-1} \to \mathbb{R}$ is an arbitrary measurable function. Suppose $Z$ is square-integrable. Then
\[ \operatorname{Var}[Z] \le E\left[ \sum_{i=1}^n (Z - Z_i)^2 \right]. \]

Efron-Stein inequalities provide a key ingredient in the derivation of Poincaré’s inequality.

$\sum_{i=1}^n (Z - Z_i)^2$ is a jackknife estimate of variance.


Proof (ii)

Take $Z = X_{(k)}$ and define $Z_i$ as the rank $k$ statistic of the subsample $X_1, \ldots, X_{i-1}, X_{i+1}, \ldots, X_n$:
\[ Z_i = \begin{cases} X_{(k+1)} & \text{if } X_i \ge X_{(k)}, \\ Z & \text{otherwise.} \end{cases} \]

Jackknife estimate of the variance of $X_{(k)}$:
\[ \sum_{i=1}^n (Z - Z_i)^2 = \sum_{i : X_i \ge X_{(k)}} (X_{(k)} - X_{(k+1)})^2 = k \Delta_k^2. \]
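The identity between the leave-one-out sum and $k\Delta_k^2$ can be verified directly on a small sample. In the sketch below the helper names and the sample values are hypothetical, and ties are assumed absent.

```python
def rank_k(values, k):
    """k-th largest value (k = 1 is the maximum)."""
    return sorted(values, reverse=True)[k - 1]

def efron_stein_sum(sample, k):
    """sum_i (Z - Z_i)^2 with Z = X_(k) and Z_i the rank-k statistic of the sample without X_i."""
    z = rank_k(sample, k)
    total = 0.0
    for i in range(len(sample)):
        leave_one_out = sample[:i] + sample[i + 1:]
        total += (z - rank_k(leave_one_out, k)) ** 2
    return total

def k_delta_squared(sample, k):
    """k * (X_(k) - X_(k+1))^2; requires 1 <= k <= len(sample) - 1."""
    xs = sorted(sample, reverse=True)
    return k * (xs[k - 1] - xs[k]) ** 2

if __name__ == "__main__":
    sample = [0.3, 2.5, -1.0, 4.0, 1.2]  # arbitrary values without ties
    for k in range(1, len(sample)):
        print(k, efron_stein_sum(sample, k), k_delta_squared(sample, k))
```

Removing any of the $k$ largest points shifts the rank $k$ statistic down to $X_{(k+1)}$, while removing a smaller point leaves it unchanged, which is why exactly $k$ terms of the sum are non-zero.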


Asymptotics for extremes

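Asymptotic assessment for extreme order statistics

Definition (Quantile function)
\[ F^{\leftarrow}(p) = \inf\{x : F(x) \ge p\} \]

Definition (MDA($\gamma$), $\gamma \in \mathbb{R}$)
$F \in \mathrm{MDA}(\gamma)$ if there exists a function $a : \mathbb{R}_+ \to \mathbb{R}_+$ such that
\[ P\left\{ \frac{\max(X_1, \ldots, X_n) - F^{\leftarrow}(1 - 1/n)}{a(n)} \le x \right\} \to \exp\left( -(1 + \gamma x)^{-1/\gamma} \right), \]
with three regimes according to the sign of the extreme value index: $\gamma > 0$, $\gamma = 0$, $\gamma < 0$.

[...]

[...] letting $X_{n+1,n} = 0$,
\[ X_{k,n} = \sum_{i=k}^n (X_{i,n} - X_{i+1,n}) \]
where the spacings $\Delta_i = X_{i,n} - X_{i+1,n}$, $i = 1, \ldots, n$, form an independent family of random variables and $i \times (X_{i,n} - X_{i+1,n}) \sim F$.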
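A quick simulation illustrates this representation, assuming (as Rényi’s representation requires) that the sample is drawn from the standard exponential distribution. The sketch below computes the normalized spacings $i \times (X_{i,n} - X_{i+1,n})$ and averages each of them over many samples; every average should be close to 1, the mean of the standard exponential. Function names, n, the trial count and the seed are illustrative assumptions.

```python
import random

def normalized_spacings(sample):
    """Return [i * (X_(i) - X_(i+1)) for i = 1..n], with the convention X_(n+1) = 0."""
    xs = sorted(sample, reverse=True) + [0.0]
    return [(i + 1) * (xs[i] - xs[i + 1]) for i in range(len(sample))]

def mean_normalized_spacings(n, trials=4000, seed=2):
    """Monte Carlo mean of each normalized spacing for a standard exponential sample."""
    rng = random.Random(seed)
    sums = [0.0] * n
    for _ in range(trials):
        sample = [rng.expovariate(1.0) for _ in range(n)]
        for i, spacing in enumerate(normalized_spacings(sample)):
            sums[i] += spacing
    return [s / trials for s in sums]

if __name__ == "__main__":
    print([round(m, 2) for m in mean_normalized_spacings(5)])
```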


Rényi’s representation

Quantile transformations

Quantile transformation

Definition (Quantile function (bis))
\[ F^{\leftarrow}(p) = \inf\{x : F(x) \ge p\}, \quad p \in (0, 1) \]
\[ U(t) = F^{\leftarrow}(1 - 1/t), \quad t \in (1, \infty) \]

Representation for order statistics
If $Y_{(1)}, \ldots, Y_{(n)}$ are the order statistics of an exponential sample, then $\left( F^{\leftarrow}(1 - \exp(-Y_{(i)})) \right)_{i=1,\ldots,n}$ is distributed as the order statistics of a sample drawn according to $F$.

Representation for order statistics
If $Y_{(1)}, \ldots, Y_{(n)}$ are the order statistics of an exponential sample, then $U(e^{Y_{(1)}}) \ge U(e^{Y_{(2)}}) \ge \ldots \ge U(e^{Y_{(n)}})$ is distributed as the order statistics of a sample drawn according to $F$.
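As an illustration of the quantile transformation, the sketch below builds Gaussian order statistics from exponential ones, taking $F$ to be the standard Gaussian distribution (an assumption made for the example). It uses `statistics.NormalDist().inv_cdf` from the Python standard library as $F^{\leftarrow}$; the function name and seed are hypothetical.

```python
import math
import random
from statistics import NormalDist

def gaussian_order_statistics(n, seed=3):
    """Order statistics of a standard Gaussian sample, built from an exponential sample."""
    rng = random.Random(seed)
    # Y_(1) >= ... >= Y_(n): order statistics of a standard exponential sample
    ys = sorted((rng.expovariate(1.0) for _ in range(n)), reverse=True)
    quantile = NormalDist().inv_cdf  # plays the role of F^{<-}
    # 1 - exp(-y) is computed as -expm1(-y); F^{<-}(1 - exp(-y)) equals U(exp(y))
    return [quantile(-math.expm1(-y)) for y in ys]

if __name__ == "__main__":
    print([round(x, 3) for x in gaussian_order_statistics(5)])
```

Because $F^{\leftarrow}$ is non-decreasing, the ordering of the exponential order statistics carries over to the transformed values.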


Hazard rates

Hazard rate, spacings and order statistics

Definition (Hazard rate)
The hazard rate of a differentiable distribution function $F$ is $F'/\overline{F} = F'/(1 - F)$.

Lemma
The distribution function $F$ has non-decreasing hazard rate iff $U \circ \exp$ is concave.

Lemma
If the distribution function $F$ has non-decreasing hazard rate, then $X_{(k+1)}$ and $\Delta_k = X_{(k)} - X_{(k+1)}$ are negatively associated.

Negative association: for increasing functions $f$, $g$,
\[ E\left[ f(X_{(k+1)})\, g(\Delta_k) \right] \le E\left[ f(X_{(k+1)}) \right] E\left[ g(\Delta_k) \right]. \]


Gaussian hazard rate

$U(t) = \Phi^{\leftarrow}(1 - 1/t)$ for $t > 1$.

[Figure: plots of $U(\exp(x))$ (Gaussian distribution) and $U(2\exp(x))$ (absolute value of a Gaussian random variable), for $x$ from 0 to 10.]

Gaussian and absolute value of Gaussian have non-decreasing hazard rate. The absolute value of a Gaussian random variable has both non-decreasing hazard rate and bounded inverse hazard rate.
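Both claims can be checked numerically. The sketch below evaluates the standard Gaussian hazard rate $\varphi(x)/(1 - \Phi(x))$ on a grid (the grid and the function name are illustrative assumptions) and tests that it is increasing. For the absolute value of a Gaussian, the density and the survival function are both doubled on $x \ge 0$, so the hazard rate is the same function restricted to $x \ge 0$, and its inverse is at most $1/h(0) = \sqrt{\pi/2}$.

```python
import math

def gaussian_hazard(x):
    """Hazard rate phi(x) / (1 - Phi(x)) of the standard Gaussian distribution."""
    density = math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)
    survival = 0.5 * math.erfc(x / math.sqrt(2.0))  # 1 - Phi(x)
    return density / survival

if __name__ == "__main__":
    grid = [i / 10 for i in range(-50, 81)]  # illustrative grid on [-5, 8]
    hazards = [gaussian_hazard(x) for x in grid]
    print("increasing on grid:", all(a < b for a, b in zip(hazards, hazards[1:])))
    # Inverse hazard rate of |Gaussian| at 0, its largest value on [0, infinity)
    print("1/h(0):", 1.0 / gaussian_hazard(0.0), "sqrt(pi/2):", math.sqrt(math.pi / 2))
```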


Taking advantage of increasing hazard rate

Lemma
If $F$ has non-decreasing hazard rate $h$, then for $1 \le k \le n/2$, with $V_k = k\Delta_k^2$,
\[ \operatorname{Var}[X_{(k)}] \le E V_k \le \frac{2}{k}\, E\left[ \left( \frac{1}{h(X_{(k+1)})} \right)^2 \right]. \]

Lemma
Let $n \ge 3$, and let $X_{(1)} \ge \ldots \ge X_{(n)}$ be the order statistics of absolute values of a standard Gaussian sample. For $1 \le k \le n/2$,
\[ \operatorname{Var}[X_{(k)}] \le \frac{1}{k \log 2} \cdot \frac{8}{\log\frac{2n}{k} - \log\left(1 + \frac{4}{k} \log\log\frac{2n}{k}\right)}. \]
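To get a feel for the size of the Gaussian bound, the sketch below evaluates it and compares it with a simulated variance. It reads the bound as $\frac{1}{k\log 2}\cdot 8 / \big(\log\frac{2n}{k} - \log(1 + \frac{4}{k}\log\log\frac{2n}{k})\big)$, which is a reconstruction of the slide’s formula; the function names, the values of n and k, the trial count and the seed are illustrative assumptions.

```python
import math
import random
import statistics

def variance_bound(n, k):
    """Upper bound on Var[X_(k)] for absolute values of n standard Gaussians (as read above)."""
    if n < 3 or not 1 <= k <= n / 2:
        raise ValueError("requires n >= 3 and 1 <= k <= n/2")
    log_ratio = math.log(2 * n / k)
    denominator = log_ratio - math.log(1 + (4 / k) * math.log(log_ratio))
    return 8.0 / (k * math.log(2) * denominator)

def simulated_variance(n, k, trials=2000, seed=4):
    """Monte Carlo estimate of Var[X_(k)] for absolute values of standard Gaussians."""
    rng = random.Random(seed)
    values = []
    for _ in range(trials):
        xs = sorted((abs(rng.gauss(0.0, 1.0)) for _ in range(n)), reverse=True)
        values.append(xs[k - 1])
    return statistics.variance(values)

if __name__ == "__main__":
    for n, k in ((50, 1), (50, 5), (50, 25)):
        print(n, k, round(simulated_variance(n, k), 4), round(variance_bound(n, k), 4))
```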


Exponential Efron-Stein inequalities

Modified logarithmic Sobolev inequalities

Goal

Context: if $F$ has increasing hazard rate (more concentrated than exponential), extreme and intermediate order statistics have exponential moments.

Target: establish exponential Efron-Stein inequalities for order statistics, and derive Bernstein-like deviation inequalities for order statistics.


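Modified logarithmic Sobolev inequalities

Theorem (Modified logarithmic Sobolev inequality. L. Wu, P. Massart, 2000)
Let $\tau(x) = e^x - x - 1$. Then for any $\lambda \in \mathbb{R}$,
\begin{align*}
\operatorname{Ent}\left[e^{\lambda Z}\right] &= E\left[e^{\lambda Z} \log e^{\lambda Z}\right] - E\left[e^{\lambda Z}\right] \log E\left[e^{\lambda Z}\right] \\
&= \lambda E\left[Z e^{\lambda Z}\right] - E\left[e^{\lambda Z}\right] \log E\left[e^{\lambda Z}\right] \\
&\le E\left[ \sum_{i=1}^n e^{\lambda Z}\, \tau(-\lambda(Z - Z_i)) \right].
\end{align*}

Remark
Logarithmic Sobolev inequalities and Efron-Stein inequalities are derived in a similar way: the proofs rely on variational representations of variance and entropy.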


Entropy, order statistics and spacings

Application to order statistics

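Notation: $\psi(x) = e^x \tau(-x) = 1 + (x - 1)e^x$.

Lemma
For all $\lambda \in \mathbb{R}$,
\begin{align*}
\operatorname{Ent}\left[e^{\lambda X_{(k)}}\right] &\le k\, E\left[ e^{\lambda X_{(k+1)}}\, \psi(\lambda(X_{(k)} - X_{(k+1)})) \right] \\
&= k\, E\left[ e^{\lambda X_{(k+1)}}\, \psi(\lambda \Delta_k) \right].
\end{align*}

The proof parallels the variance bounds derived from Efron-Stein inequalities.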
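The step from the modified logarithmic Sobolev inequality to this lemma can be written out as follows. This is a sketch of the omitted computation, assuming the same choice of $Z = X_{(k)}$ and $Z_i$ as in the jackknife argument for the variance, so that $Z - Z_i = \Delta_k$ for the $k$ indices with $X_i \ge X_{(k)}$ and $Z - Z_i = 0$ otherwise.

```latex
% tau(0) = 0, so only the k indices with X_i >= X_(k) contribute.
\begin{align*}
\sum_{i=1}^n e^{\lambda Z}\,\tau(-\lambda(Z - Z_i))
  &= k\, e^{\lambda X_{(k)}}\,\tau(-\lambda \Delta_k) \\
  &= k\, e^{\lambda X_{(k+1)}}\, e^{\lambda \Delta_k}\,\tau(-\lambda \Delta_k)
     && \text{since } X_{(k)} = X_{(k+1)} + \Delta_k \\
  &= k\, e^{\lambda X_{(k+1)}}\,\psi(\lambda \Delta_k)
     && \text{since } \psi(x) = e^{x}\tau(-x).
\end{align*}
```

Taking expectations on both sides gives the right-hand side of the lemma.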


Bernstein bounds, sub-Gamma distributions

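Sub-gamma on the right tail with variance factor $v$ and scale parameter $c$:
\[ \log E e^{\lambda(X - EX)} \le \frac{\lambda^2 v}{2(1 - c\lambda)} \quad \text{for every } \lambda \text{ such that } 0 < \lambda < 1/c. \]

Bernstein’s inequality: for $t > 0$,
\[ P\left\{ X \ge EX + \sqrt{2vt} + ct \right\} \le \exp(-t). \]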
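A concrete instance may help: a standard exponential random variable is sub-gamma on the right tail with $v = 1$ and $c = 1$, since its centered log-moment generating function is $-\log(1 - \lambda) - \lambda$. The sketch below (function names are hypothetical) compares this with the sub-gamma bound and checks the resulting Bernstein tail bound against the exact exponential tail.

```python
import math

def sub_gamma_cgf_bound(lam, v, c):
    """Sub-gamma bound lam^2 v / (2 (1 - c lam)), valid for 0 < lam < 1/c."""
    return lam * lam * v / (2.0 * (1.0 - c * lam))

def bernstein_threshold(t, v, c):
    """Deviation sqrt(2 v t) + c t that is exceeded with probability at most exp(-t)."""
    return math.sqrt(2.0 * v * t) + c * t

def centered_exponential_cgf(lam):
    """log E exp(lam (X - 1)) for X standard exponential, 0 <= lam < 1."""
    return -math.log1p(-lam) - lam

def exponential_tail(x):
    """P{X >= x} for X standard exponential, x >= 0."""
    return math.exp(-x)

if __name__ == "__main__":
    for lam in (0.1, 0.5, 0.9):
        print(lam, centered_exponential_cgf(lam), sub_gamma_cgf_bound(lam, 1.0, 1.0))
    for t in (0.1, 1.0, 5.0):
        # EX = 1, so the event is X >= 1 + sqrt(2 t) + t
        print(t, exponential_tail(1.0 + bernstein_threshold(t, 1.0, 1.0)), math.exp(-t))
```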


Exponential Efron-Stein inequality for order statistics

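$V_k = k \Delta_k^2$: the Efron-Stein estimate of the variance of $X_{(k)}$.

Theorem
If $F$ has non-decreasing hazard rate $h$, then for $\lambda \ge 0$ and $1 \le k \le n/2$,
\begin{align*}
\log E e^{\lambda(X_{(k)} - E X_{(k)})} &\le \lambda \frac{k}{2}\, E\left[ \Delta_k \left( e^{\lambda \Delta_k} - 1 \right) \right] \\
&= \lambda \frac{k}{2}\, E\left[ \sqrt{V_k/k} \left( e^{\lambda \sqrt{V_k/k}} - 1 \right) \right].
\end{align*}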


Assessment

Does not follow from the exponential Efron-Stein inequality of B., Lugosi and Massart (Ann. Probab. 2003),
\[ \log E e^{\lambda(X_{(k)} - E X_{(k)})} \le \frac{\lambda\theta}{1 - \lambda\theta} \log E e^{\lambda V_k/\theta} \quad \text{for } \theta > 0,\ 0 \le \lambda < 1/\theta, \]
as $V_k$ may not have exponential moments!

Sharp (up to constants) for exponential samples.

Works for central, intermediate and extreme order statistics alike.
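For a standard exponential sample both sides of the bound $\log E e^{\lambda(X_{(k)} - EX_{(k)})} \le \lambda \frac{k}{2} E[\Delta_k(e^{\lambda\Delta_k} - 1)]$ have closed forms, which makes the comparison easy to carry out. By Rényi’s representation, $X_{(k)}$ is a sum of independent terms $E_i/i$, $i = k, \ldots, n$, with $E_i$ standard exponential, and $\Delta_k$ is distributed as $E_k/k$. The sketch below (function names and the choice n = 20 are illustrative assumptions) evaluates both sides for $0 < \lambda < k$.

```python
import math

def centered_cgf_exponential_order_stat(lam, n, k):
    """log E exp(lam (X_(k) - E X_(k))) for a standard exponential sample, 0 <= lam < k."""
    # X_(k) = sum_{i=k}^n E_i / i, so the centered log-mgf is a sum over i
    return sum(-math.log1p(-lam / i) - lam / i for i in range(k, n + 1))

def exponential_efron_stein_bound(lam, k):
    """lam (k/2) E[Delta_k (exp(lam Delta_k) - 1)] with Delta_k distributed as E/k, 0 <= lam < k."""
    # E[Delta_k exp(lam Delta_k)] = (1/k) / (1 - lam/k)^2 and E[Delta_k] = 1/k
    return 0.5 * lam * (1.0 / (1.0 - lam / k) ** 2 - 1.0)

if __name__ == "__main__":
    n = 20
    for k in (1, 5, 10):
        for fraction in (0.1, 0.5, 0.9):
            lam = fraction * k
            lhs = centered_cgf_exponential_order_stat(lam, n, k)
            rhs = exponential_efron_stein_bound(lam, k)
            print(k, lam, round(lhs, 4), round(rhs, 4))
```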


Hazard rates, association, Herbst’s arguments

Proof (i)

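$\psi(x) = 1 + (x - 1)e^x$ is non-decreasing over $\mathbb{R}_+$, and $X_{(k+1)}$ and $\Delta_k$ are negatively associated. Hence, for $\lambda \ge 0$,
\begin{align*}
\operatorname{Ent}\left[e^{\lambda X_{(k)}}\right] &\le k\, E\left[ e^{\lambda X_{(k+1)}}\, \psi(\lambda \Delta_k) \right] \\
&\le k\, E\left[ e^{\lambda X_{(k+1)}} \right] \times E\left[ \psi(\lambda \Delta_k) \right] \\
&\le k\, E\left[ e^{\lambda X_{(k)}} \right] \times E\left[ \psi(\lambda \Delta_k) \right].
\end{align*}

Multiplying both sides by $\exp(-\lambda E X_{(k)})$ leads to
\[ \operatorname{Ent}\left[ e^{\lambda(X_{(k)} - E X_{(k)})} \right] \le k\, E\left[ e^{\lambda(X_{(k)} - E X_{(k)})} \right] \times E\left[ \psi(\lambda \Delta_k) \right]. \]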


Proof (ii) Herbst’s argument

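Let $G(\lambda) = E e^{\lambda \Delta_k}$. Obviously, $G(0) = 1$, and as $\Delta_k \ge 0$, $G$ and its derivatives are increasing on $[0, \infty)$,
\[ E\left[ \psi(\lambda \Delta_k) \right] = 1 - G(\lambda) + \lambda G'(\lambda) = \int_0^\lambda s\, G''(s)\, ds \le G''(\lambda) \frac{\lambda^2}{2}. \]

Hence, for $\lambda \ge 0$,
\[ \frac{\operatorname{Ent}\left[ e^{\lambda(X_{(k)} - E X_{(k)})} \right]}{\lambda^2\, E\left[ e^{\lambda(X_{(k)} - E X_{(k)})} \right]} = \frac{d}{d\lambda} \left( \frac{1}{\lambda} \log E e^{\lambda(X_{(k)} - E X_{(k)})} \right) \le \frac{k}{2} \frac{dG'}{d\lambda}. \]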
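The equality used in Herbst’s argument can be spelled out. The derivation below is a sketch, writing $W = X_{(k)} - E X_{(k)}$ and $M(\lambda) = E e^{\lambda W}$ (both symbols are introduced here for convenience and are not in the slides).

```latex
% Ent[e^{\lambda W}] = \lambda E[W e^{\lambda W}] - M(\lambda)\log M(\lambda)
%                    = \lambda M'(\lambda) - M(\lambda)\log M(\lambda).
\begin{align*}
\frac{d}{d\lambda}\left( \frac{1}{\lambda}\log M(\lambda) \right)
  &= \frac{M'(\lambda)}{\lambda M(\lambda)} - \frac{\log M(\lambda)}{\lambda^2} \\
  &= \frac{\lambda M'(\lambda) - M(\lambda)\log M(\lambda)}{\lambda^2 M(\lambda)}
   = \frac{\operatorname{Ent}\left[e^{\lambda W}\right]}{\lambda^2\, E\left[e^{\lambda W}\right]}.
\end{align*}
% Combining with Ent[e^{\lambda W}] \le k\,E[e^{\lambda W}]\,E[\psi(\lambda\Delta_k)]
% and E[\psi(\lambda\Delta_k)] \le G''(\lambda)\lambda^2/2 gives the bound (k/2)\,G''(\lambda).
```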


Proof (iii) solving the differential inequality

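Integrating both sides, using the fact that
\[ \lim_{\lambda \to 0} \frac{1}{\lambda} \log E e^{\lambda(X_{(k)} - E X_{(k)})} = 0, \]
leads to
\begin{align*}
\frac{1}{\lambda} \log E e^{\lambda(X_{(k)} - E X_{(k)})} &\le \frac{k}{2} \left( G'(\lambda) - G'(0) \right) \\
&= \frac{k}{2}\, E\left[ \Delta_k \left( e^{\lambda \Delta_k} - 1 \right) \right].
\end{align*}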


Gaussian order statistics

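Maxima of Gaussians

Lemma
For $n$ such that the solution $v_n$ of the equation
\[ 16/x + \log(1 + 2/x + 4\log(4/x)) = \log(2n) \]
is smaller than 1, for all $0 \le \lambda < 1/\sqrt{v_n}$,
\[ \log E e^{\lambda(X_{(1)} - E X_{(1)})} \le \frac{v_n \lambda^2}{2(1 - \sqrt{v_n}\,\lambda)}. \]
For all $t > 0$,
\[ P\left\{ X_{(1)} - E X_{(1)} > \sqrt{v_n}\,(t + \sqrt{2t}) \right\} \le e^{-t}. \]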
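The variance factor $v_n$ is only defined implicitly, so a numerical solver is handy. The sketch below solves $16/x + \log(1 + 2/x + 4\log(4/x)) = \log(2n)$ by bisection on $(0, 1)$, using the fact that the left-hand side is decreasing in $x$ there. Function names, the tolerance and the example value of n are illustrative assumptions. Since the left-hand side equals about 18.1 at $x = 1$, the condition $v_n < 1$ only holds for very large $n$.

```python
import math

def equation_lhs(x):
    """Left-hand side 16/x + log(1 + 2/x + 4 log(4/x)), decreasing on (0, 1]."""
    return 16.0 / x + math.log(1.0 + 2.0 / x + 4.0 * math.log(4.0 / x))

def solve_vn(n, tol=1e-12):
    """Solve equation_lhs(x) = log(2n) on (0, 1); return None if the solution is not below 1."""
    target = math.log(2 * n)
    if equation_lhs(1.0) >= target:
        return None  # the solution would be >= 1, so the lemma's condition fails
    lo, hi = 1e-6, 1.0  # equation_lhs(lo) > target > equation_lhs(hi)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if equation_lhs(mid) > target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def deviation_threshold(v, t):
    """sqrt(v) (t + sqrt(2 t)), exceeded with probability at most exp(-t)."""
    return math.sqrt(v) * (t + math.sqrt(2.0 * t))

if __name__ == "__main__":
    n = 10 ** 9  # illustrative; small n does not satisfy v_n < 1
    v = solve_vn(n)
    print("v_n:", v, "threshold at t = 1:", deviation_threshold(v, 1.0))
```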


Median of Gaussians ...

The same approach works for extreme, intermediate and central order statistics.

Lemma
Let $v_n = 8/(n \log 2)$. For all $0 \le \lambda < n/(2\sqrt{v_n})$,
\[ \log E e^{\lambda(X_{(n/2)} - E X_{(n/2)})} \le \frac{v_n \lambda^2}{2\left(1 - 2\lambda\sqrt{v_n/n}\right)}. \]
For all $t > 0$,
\[ P\left\{ X_{(n/2)} - E X_{(n/2)} > \sqrt{2 v_n t} + 2\sqrt{v_n/n}\; t \right\} \le e^{-t}. \]


Caveat

Assessment

Rényi’s representation : order statistics are functions of sums of independent random variables (spacings of exponential samples).

If the function is concave, concavity may be used twice.

What about plugging tail bounds for order statistics of exponential samples?


Ad hoc arguments

What can be obtained from Rényi’s representation and exponential inequalities for sums of Gamma-distributed random variables?

Lemma
Let $X_{(1)}$ be the maximum of the absolute values of $n$ independent standard Gaussian random variables, and let $\widetilde{U}(s) = \Phi^{\leftarrow}(1 - 1/(2s))$ for $s \ge 1$. For $t > 0$,
\[ P\left\{ X_{(1)} - E X_{(1)} \ge t/(3\widetilde{U}(n)) + \sqrt{t}/\widetilde{U}(n) + \delta_n \right\} \le \exp(-t), \]
where $\delta_n > 0$ and $\lim_n (\widetilde{U}(n))^3 \delta_n = \frac{\pi^2}{12}$.


References

S. Boucheron and M. Thomas. Concentration inequalities for order statistics. http://arxiv.org/abs/1207.7209
P. Massart. Concentration inequalities and model selection. Lecture Notes in Mathematics 1896. Springer, 2006.
L. de Haan and A. Ferreira. Extreme value theory. Springer, 2006.
