Entropy-based Goodness-of-fit Test for Positive Stable Law

Mahdi Teimouri∗,†, Saeid Rezakhah∗ and Adel Mohammadpour∗

∗ Department of Mathematics and Computer Science, Amirkabir University of Technology, 424, Hafez Ave., Tehran 15914, Iran
† Department of Statistics, Gonbad Kavous University, Gonbad Kavous, Iran

Abstract. A goodness-of-fit test for the positive stable law is proposed. To this end, the Kullback-Leibler distance measure, a basic tool in entropy theory, is considered. A simulation study is performed to compare the performance of the proposed method with the one suggested by Shkol'nik [6]. The results reveal that the approach introduced here performs better, in the sense of the empirical power of the test statistic, than Shkol'nik's [6] treatment in two cases: (1) both the exponent and scale parameters of the positive stable law are unknown and the sample size is small; (2) the exponent parameter is known. An application to real data illustrates the superior performance of the proposed approach over the old method.

Keywords: Characterization; Goodness-of-fit test; Kullback-Leibler distance; positive stable law
PACS: 02.50.-r

INTRODUCTION

The class of positive stable distributions has received much interest in fields such as economics, finance and insurance. This is primarily due to the fact that observations in such fields exhibit heavy-tailed behavior. For the theory and applications of stable distributions, we refer the reader to [8]. By definition, the characteristic function of a positive stable random variable takes the form

\varphi_S(t) = E\exp(itS) = \exp\Big\{ -|\sigma t|^{\alpha}\, \exp\big[-i\,\tfrac{\pi\alpha}{2}\,\mathrm{sign}(t)\big] \Big\},   (1)

where the parameters 0 < α < 1 and σ ∈ (0, ∞) represent the tail index and the scale, respectively. We write S(α, σ) to denote a positive stable random variable. In general, positive stable distributions do not admit a closed-form expression for their probability density function. The only member of the positive stable family with a closed-form probability density function (pdf) is the Lévy law, obtained for α = 0.5. The concept of entropy, which has its origin in Shannon [2], is a fundamental tool to measure the uncertainty associated with a distribution. Let X be a non-negative continuous random variable with pdf f(·). The Shannon entropy is defined by

H(X) = -\int_{0}^{\infty} f(x)\,\log f(x)\,dx.   (2)

This concept is widely used in information theory, statistical mechanics and statistics. For comprehensive accounts of the theory and applications of Shannon's entropy, we refer the reader to [7]. A fundamental tool associated with H(·) in (2) is the Kullback-Leibler divergence measure, which can be used for goodness-of-fit testing. To test H0: f(x) = f0(x; θ) for some θ ∈ Θ versus H1: f(x) ≠ f0(x; θ), the Kullback-Leibler [5] measure is introduced as

K(f, f_0) = \int_{0}^{\infty} f(x)\,\log\frac{f(x)}{f_0(x;\theta)}\,dx = \int_{0}^{\infty} f(x)\,\log f(x)\,dx - \int_{0}^{\infty} f(x)\,\log f_0(x;\theta)\,dx = I_1 + I_2.   (3)
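As a quick numerical illustration of (3) (an added example, not part of the original text), take f to be the standard exponential density and f0 the exponential density with rate 2; then K(f, f0) = 1 − log 2 ≈ 0.307, which the following R lines reproduce by numerical integration.

    # Kullback-Leibler distance of Exp(1) from Exp(2); closed form: 1 - log(2) = 0.3069
    f  <- function(x) dexp(x, rate = 1)
    f0 <- function(x) dexp(x, rate = 2)
    integrate(function(x) f(x) * log(f(x) / f0(x)), 0, Inf)$value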

To construct the test statistic Kmn, we first estimate I1 through the m-spacing method developed by Song [3]:

\hat{I}_1 = \frac{1}{n}\sum_{i=1}^{n} \log\Big[\frac{n}{2m}\big(x_{(i+m)} - x_{(i-m)}\big)\Big],   (4)

where m is the spacing order and x_{(i)} denotes the ith order statistic of the random sample; here we define x_{(j)} = x_{(1)} if j < 1 and x_{(j)} = x_{(n)} if j > n. Also, an estimator of I2 is given by

\hat{I}_2 = \frac{1}{n}\sum_{i=1}^{n} \log f_0(x_i; \hat{\Theta}),   (5)

where \hat{\Theta} is the maximum likelihood estimate of the parameter vector Θ based on the observations x1, x2, ..., xn. Thus, K_{mn} = -\hat{I}_1 - \hat{I}_2. We reject the null hypothesis at significance level γ if Kmn > Cmn, where the critical value Cmn is the (1 − γ) × 100 percentile of the distribution of Kmn under H0. We note that if the distribution under H0 belongs to a location-scale family, then Cmn is location and scale invariant, so Cmn can be tabulated in a simple form that depends only on m, n and the significance level γ.
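A minimal R sketch of the statistic defined by (4) and (5) might look as follows; the function name Kmn.stat and the argument logf0 (a user-supplied function returning log f0(x; Θ̂) at the fitted parameters) are illustrative and not taken from the paper.

    # m-spacing estimate of I1 (eq. 4), plug-in estimate of I2 (eq. 5), and Kmn = -I1.hat - I2.hat
    Kmn.stat <- function(x, m, logf0) {
      n  <- length(x)
      xs <- sort(x)
      hi <- xs[pmin(1:n + m, n)]        # x_(i+m), with x_(j) = x_(n) for j > n
      lo <- xs[pmax(1:n - m, 1)]        # x_(i-m), with x_(j) = x_(1) for j < 1
      I1.hat <- mean(log(n / (2 * m) * (hi - lo)))
      I2.hat <- mean(logf0(x))
      -I1.hat - I2.hat
    }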

TESTING WHEN BOTH EXPONENT AND SCALE ARE UNKNOWN

Let X1, X2, ..., Xn be iid S(α, σ). Using the Mellin transform, see [8], one can see that

\frac{E}{X} \stackrel{d}{=} \frac{W(\alpha)}{\sigma},   (6)

where E and W(α) are, respectively, standard exponential and Weibull random variables, the latter with pdf f(w) = αw^{α−1} exp(−w^α), w > 0. It is known that log X − log E ∼ EV(log σ, 1/α), where EV(µ, λ) denotes an extreme value random variable with cdf F_{EV}(y) = exp{−exp{−(y − µ)/λ}} for y ∈ (−∞, ∞). Thus, using a log-transformation, the original random sample X1, X2, ..., Xn can be converted to an independent sample from the EV family. Pérez-Rodríguez et al. [4] have discussed the goodness-of-fit test for the EV family. Since EV is a location-scale family, we use the critical values Cmn tabulated by these authors when the maximum likelihood estimates of the parameters under H0 are used (see [4], page 846, Table 1). An illustrative R sketch of this procedure is given after the summary of Fig. 1 below.

Shkol'nik [6] proposed an exponent- and scale-invariant goodness-of-fit test for positive stable distributions. His method is also based on a log-transformation of representation (6). There, as the author claimed, the distribution of the proposed test statistic is asymptotically normal. In contrast, our method works even when the sample size is very small. Comparisons based on the 15 alternatives proposed by Shkol'nik [6] (with unit scale and zero location where these exist; for the Pareto law the support is (1, ∞)) reveal that our method is more powerful in rejecting a false H0. The empirical powers of our method and Shkol'nik's method, based on N = 15000 samples of size n = 5, 10, 15, ..., 35, 40, are depicted in Fig. 1. To compute the power of the test statistic Kmn, we used the simple algorithm proposed in [4]. For samples of size larger than n = 40, both approaches show approximately the same empirical power. We note that the results of Shkol'nik's method are not reliable for n < 30, since the test statistic does not then follow the Gaussian law. The following results are concluded from Fig. 1.
• The proposed method's power increases with the sample size n. Except for the Pareto alternatives, which correspond to cases (d)-(f), the power of Shkol'nik's method also increases with the sample size n.
• Our method exhibits significantly greater power than Shkol'nik's method.
• Both approaches show lower performance when Pareto alternatives are used.
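The sketch referred to above follows; it is one possible R implementation, not the authors' code. The sample is log-transformed via (6), the EV model is fitted by maximum likelihood with optim, and Kmn is computed with the Kmn.stat sketch from the Introduction. The names gumbel.mle and test.positive.stable are illustrative, and the critical value Cmn must still be read from the tables of [4].

    # Log-transform a positive stable sample to the EV family (eq. 6) and compute Kmn
    gumbel.mle <- function(y) {                 # ML fit of F(y) = exp(-exp(-(y - mu)/lambda))
      nll <- function(p) {
        mu <- p[1]; lambda <- exp(p[2])         # exp() keeps the scale positive
        z  <- (y - mu) / lambda
        sum(log(lambda) + z + exp(-z))          # minus the Gumbel log-likelihood
      }
      est    <- optim(c(mean(y), log(sd(y))), nll)$par
      mu     <- est[1]
      lambda <- exp(est[2])
      list(mu = mu, lambda = lambda,
           logf0 = function(t) { z <- (t - mu) / lambda; -log(lambda) - z - exp(-z) })
    }
    test.positive.stable <- function(x, m) {
      y   <- log(x) - log(rexp(length(x)))      # under H0, y follows the EV law
      fit <- gumbel.mle(y)
      Kmn.stat(y, m, fit$logf0)                 # reject H0 when this exceeds Cmn of [4]
    }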





GOODNESS-OF-FIT TEST WHEN EXPONENT IS KNOWN

Consider the situation in which α is known. The methodology used here is similar to that of the previous section. In principle, the critical values of Kmn would have to be tabulated for every α ∈ (0, 1), which is not only time consuming but also requires many pages of tables. To overcome this obstacle, we seek a way to reduce the tabulation space as much as possible. To this end, let S(α', σ) and S(α, 1) be independent. The Laplace transform of a positive stable random variable is

E\exp\{-tS(\alpha, \sigma)\} = e^{-\sigma^{\alpha} t^{\alpha}},   (7)

where t > 0. Setting X = S(α, 1), Y = S(α', σ) and Z = S(αα', σ^{1/α}), one can see that

E\exp\{-tXY^{1/\alpha}\} = \int_{0}^{\infty} E\exp\{-t y^{1/\alpha} X\}\, f_Y(y)\,dy = \int_{0}^{\infty} \exp\{-t^{\alpha} y\}\, f_Y(y)\,dy = \exp\{-\sigma^{\alpha'} t^{\alpha\alpha'}\} = E\exp\{-tZ\}.   (8)

So,

S(\alpha\alpha', \sigma^{1/\alpha}) \stackrel{d}{=} S(\alpha, 1)\,\big[S(\alpha', \sigma)\big]^{1/\alpha},   (9)

FIGURE 1. Empirical power of our method (solid line) and Shkol'nik's method (broken line) for a sample of size n. The alternatives used are: Weibull(0.5) (a), Weibull(1) (b), Weibull(2) (c), Pareto(0.5) (d), Pareto(1) (e), Pareto(2) (f), gamma(0.5) (g), gamma(1.5) (h), gamma(2) (i), F(1,1) (j), F(2,2) (k), lognormal(0,1) (l), half-normal (m), half-Cauchy (n), half-logistic (o).

TABLE 1. Critical values Cmn for the test statistic Kmn

                              Significance level γ
         γ = 0.01        γ = 0.025       γ = 0.05        γ = 0.10
  n      m    Cmn        m    Cmn        m    Cmn        m    Cmn
  5      2    1.8181     2    1.6124     2    1.4412     2    1.2599
  10     4    1.0457     3    0.9037     3    0.8250     3    0.7423
  15     4    0.7404     3    0.6563     3    0.5962     3    0.5373
  20     4    0.5800     4    0.5190     3    0.4776     3    0.4300
  25     4    0.4819     4    0.4341     4    0.4006     4    0.3605
  30     5    0.4230     4    0.3779     4    0.3477     4    0.3144
  35     4    0.3724     4    0.3361     4    0.3094     4    0.2789
  40     5    0.3356     5    0.3032     5    0.2777     5    0.2529
  45     5    0.3034     5    0.2759     5    0.2533     5    0.2312
  50     6    0.2772     6    0.2523     5    0.2332     5    0.2136
  60     6    0.2424     6    0.2224     6    0.2037     6    0.1853
  70     7    0.2128     7    0.1949     6    0.1810     6    0.1644
  80     7    0.1915     7    0.1735     7    0.1618     7    0.1485
  90     7    0.1769     7    0.1606     7    0.1497     7    0.1358
  100    7    0.1621     8    0.1475     8    0.1371     8    0.1250
  110    8    0.1501     8    0.1378     8    0.1265     8    0.1162
  120    9    0.1392     9    0.1282     9    0.1203     9    0.1077
  130    10   0.1312     10   0.1201     10   0.1122     10   0.1025
  140    9    0.1238     11   0.1139     11   0.1061     11   0.0965
  150    10   0.1182     11   0.1087     11   0.0997     11   0.0907
  160    10   0.1118     11   0.1023     10   0.0957     12   0.0866
  170    12   0.1057     10   0.0991     12   0.0907     12   0.0823
  180    12   0.1040     12   0.0931     12   0.0871     12   0.0790
  190    11   0.0981     13   0.0907     13   0.0832     13   0.0753
  200    12   0.0946     12   0.0877     13   0.0797     13   0.0725

In (9), α ∈ (0, 1) ∪ (1, 2] and αα' < 1. Upon (9), for 0.05 ≤ α < 1, we have

S(0.1, \sigma^{\alpha/0.1}) \stackrel{d}{=} S(0.1/\alpha, 1)\,\big[S(\alpha, \sigma)\big]^{\alpha/0.1}.   (10)

This means that, given that α is known, a vector of n observations from S(α, σ) can be converted to a vector of observations from S(0.1, σ^{α/0.1}). The problem is thus reduced to a goodness-of-fit test for a positive stable distribution with exponent α = 0.1. We then apply the methodology of the previous section, with the advantage that the critical values Cmn need to be tabulated only for α = 0.1. Table 1 presents the critical values computed by simulation from N = 15000 samples of size n. The method of selecting the parameter m is similar to that suggested in [? ].
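As an illustration (not part of the paper), the reduction (10) can be sketched in R. Positive stable variates are drawn here with Kanter's representation, a standard generator not described in the text, assuming 0.1 < α < 1 so that 0.1/α lies in (0, 1); the function names are illustrative.

    # Draw from S(alpha, 1), i.e. with Laplace transform exp(-t^alpha), via Kanter's formula
    rposstable <- function(n, alpha) {
      u <- runif(n, 0, pi)
      e <- rexp(n)
      (sin(alpha * u) / sin(u)^(1 / alpha)) * (sin((1 - alpha) * u) / e)^((1 - alpha) / alpha)
    }
    # Convert a sample x from S(alpha, sigma), alpha known, to a sample from S(0.1, sigma^(alpha/0.1))
    reduce.to.0.1 <- function(x, alpha) {
      rposstable(length(x), 0.1 / alpha) * x^(alpha / 0.1)
    }

The same generator can also be used to tabulate Cmn for α = 0.1 by simulating N samples under H0 and taking the empirical (1 − γ) quantile of Kmn.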

Asymptotic behavior of test statistic

As noted in [4], the EV family satisfies the required conditions of Theorem 1 of [3]. In other words, if m/log n → ∞ and m(log n)^{2/3}/n^{1/3} → 0 as n → ∞, then, under H0, the statistic

S_{mn} = \sqrt{6mn}\,\big(K_{mn} - \log(2m) - 0.5772156 + R_{2m-1}\big)

is asymptotically a standard normal variate, where R_m = \sum_{i=1}^{m} 1/i. The exact and asymptotic critical values of the test statistic Kmn are presented in Table 2 for samples of large size, n = 100, 120, 140, ..., 220, 240. For sufficiently large n, the limiting distribution of Kmn is assumed to be normal, see [4]. Hence, the critical values calculated in the limiting case, λmn, are compared with those obtained from the empirical distribution of Kmn. Comparisons show that the differences at the four selected significance levels γ = 0.01, 0.025, 0.05, 0.10 are negligible. As claimed in the previous subsection, since α is known, we are free to consider the goodness-of-fit problem for α = 0.1. In the following, we compare the empirical powers of the Frosini (Bn), Cramér-von Mises (Wn^2), Kolmogorov-Smirnov (Dn) and Kmn tests. The powers of the first three tests were calculated in Shkol'nik [6] for γ = 0.1 when n = 30 observations come from S(0.25, σ); readers are referred to Shkol'nik [6] and the references therein for details of these tests. Simulation results, based on N = 15000 samples of size n = 30 with m = 4, show that Kmn detects almost surely any violation from the stable law with exponent α = 0.25 (see Table 3). Further studies indicate that Kmn also shows sufficiently large power for rejecting some other heavy-tailed alternatives at the above-mentioned significance levels.
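For reference, the standardization Smn displayed at the beginning of this subsection can be transcribed directly in R; this is a sketch of the displayed formula only, where 0.5772156 is Euler's constant and K stands for a computed value of Kmn.

    # Asymptotic standardization of Kmn; R_{2m-1} is the harmonic number sum_{i=1}^{2m-1} 1/i
    Smn.stat <- function(K, m, n) {
      R <- sum(1 / (1:(2 * m - 1)))
      sqrt(6 * m * n) * (K - log(2 * m) - 0.5772156 + R)
    }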

TABLE 2. Critical values λmn (obtained from the normal approximation) and Cmn for the test statistic Kmn

                                        Significance level γ
         γ = 0.01              γ = 0.025             γ = 0.05              γ = 0.10
  n      m    λmn     Cmn      m    λmn     Cmn      m    λmn     Cmn      m    λmn     Cmn
  100    7    0.1294  0.1621   8    0.1233  0.1475   8    0.1188  0.1371   8    0.1135  0.1250
  120    9    0.1149  0.1392   9    0.1103  0.1282   9    0.1064  0.1203   9    0.1019  0.1077
  140    9    0.1044  0.1238   11   0.1017  0.1139   11   0.0984  0.1061   11   0.0946  0.0965
  160    10   0.0962  0.1118   11   0.0930  0.1023   10   0.0893  0.0957   12   0.0878  0.0866
  180    12   0.0902  0.1040   12   0.0870  0.0931   12   0.0842  0.0871   12   0.0810  0.0790
  200    12   0.0843  0.0946   12   0.0812  0.0877   13   0.0795  0.0797   13   0.0766  0.0725
  220    14   0.0806  0.0866   14   0.0779  0.0807   14   0.0755  0.0741   16   0.0756  0.0662
  240    15   0.0768  0.0814   15   0.0743  0.0749   15   0.0722  0.0690   17   0.0723  0.0620

TABLE 3. Empirical power of the tests Bn, Wn^2, Dn and Kmn for H0: X ∼ S(0.25, σ) (significance level 0.1, sample size n = 30)

  Alternative          Bn       Wn^2     Dn       Kmn
  W(0.25)              0.116    0.117    0.104    1.000
  W(1)                 1.000    1.000    1.000    1.000
  W(2)                 1.000    1.000    1.000    1.000
  P(0.25)              0.372    0.363    0.333    1.000
  P(1)                 1.000    1.000    1.000    1.000
  P(2)                 1.000    1.000    1.000    1.000
  Γ(0.50)              0.976    0.972    0.952    1.000
  Γ(1.50)              1.000    1.000    1.000    1.000
  Γ(2)                 1.000    1.000    1.000    1.000
  F(1,1)               0.537    0.527    0.471    1.000
  F(2,2)               0.999    0.998    0.996    1.000
  half-normal(0,1)     1.000    1.000    1.000    1.000
  half-Cauchy(0,1)     1.000    1.000    1.000    1.000
  half-Laplace(0,1)    1.000    1.000    1.000    1.000
  lognormal(0,1)       1.000    1.000    1.000    1.000
  Test size            0.092    0.086    0.096    0.1001

Application

Here, we illustrate the applicability of the proposed goodness-of-fit test using real data. We use the Danish fire insurance (DFI) data set for the years 1980 to 1990. These data were collected at Copenhagen Reinsurance and analyzed by McNeil [1]. The data set is strongly skewed to the right, with positive support and a long tail; insurance data are usually fitted by heavy-tailed distributions. To this end, we first generated a vector of size n = 2492 from the standard exponential distribution. Then we fitted the EV distribution to the log-ratio whose numerator and denominator are the exponential sample and the DFI data set, respectively. A density plot compares the fitted pdf of the EV model with the empirical histogram of the transformed DFI data. As seen from Fig. 2, the EV pdf captures the general pattern of the empirical histogram well. Some descriptive statistics of the transformed DFI data are given in Table 4. From the fitted EV distribution, the parameters are estimated as α̂ = 0.8532 and σ̂ = 0.5585 via the maximum likelihood approach. We obtained 0.0042 for the test statistic Kmn. A total of N = 15000 samples of size n = 2492 with m = 35 were used to calculate Cmn. For several significance levels, Cmn as well as λmn obtained from the normal approximation are presented in Table 5. Comparing the test statistic with the critical values in either case, at any of these significance levels we conclude that the positive stable law can be used to model the DFI data set.
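A hedged sketch of these steps, reusing the Kmn.stat and gumbel.mle sketches given earlier (dfi is an illustrative name for the raw DFI losses, not an object defined in the paper):

    # Log-ratio with the exponential sample in the numerator, as described above
    y   <- log(rexp(2492)) - log(dfi)
    fit <- gumbel.mle(y)                   # ML fit of the EV model (sketch above)
    Kmn.stat(y, m = 35, fit$logf0)         # compare with Cmn and lambda_mn in Table 5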


FIGURE 2. Fitted pdf of the EV distribution for the DFI data.

TABLE 4. Descriptive statistics for the DFI data set

  Statistic   Mean      St. Dev.   Min.      Q1        Median    Q3        Max.
  Value       -1.2351   1.5033     -9.7864   -2.0793   -1.0073   -0.1563   2.0856

We also tested the DFI data set with Shkol'nik's [6] method. Unfortunately, the result was not in favor of the positive stable law as the proposed model for the DFI data for all significance levels. We obtained 1.891 for Shkol'nik's test statistic, which yields a P-value of 0.0586, while the P-value for our method is 0.4983.

TABLE 5. Critical values Cmn and λmn

                        Significance level γ
  Critical value    0.01      0.025     0.05      0.10
  Cmn               0.0137    0.0123    0.0112    0.0100
  λmn               0.0196    0.0190    0.0186    0.0181

CONCLUSION

A goodness-of-fit test for the positive stable distribution is introduced based on the Kullback-Leibler distance measure. In the usual case where both parameters of the positive stable law are unknown, comparisons based on simulation studies between the old method suggested by Shkol'nik [6] and the new approach introduced here reveal that the new method is more powerful than the old one for samples of small size. Asymptotically, as the sample size gets large, both approaches have approximately equal power. When the exponent parameter is known, we avoid tabulating critical values of the test statistic for every level of the exponent parameter α ∈ (0, 1); instead, the problem is solved by tabulating critical values just for α = 0.1 and a given significance level. As seen, for a significance level γ = 0.10, our test statistic almost surely rejects H0 under all 15 selected alternatives, at size 0.1001, while the old method shows low power when the alternatives come from W(0.25), P(0.25) and F(1, 1). Another advantage of our method over the old one is that our methodology is scale invariant; in other words, whether known or unknown, the scale parameter has no effect on the structure of the proposed test statistic. The applicability of the entropy-based goodness-of-fit test, illustrated through a real data set, can be considered another attractive feature in comparison with the old method. We note that all programs have been implemented using code written in the R software.

REFERENCES

1. A. McNeil, ASTIN Bulletin, 27, 117–137 (1997).
2. C. E. Shannon, Bell System Technical Journal, 27, 379–423, 623–656 (1948).
3. K. Song, IEEE Transactions on Information Theory, 48, 1103–1117 (2002).
4. P. Pérez-Rodríguez, H. Vaquera-Huerta, and J. A. Villaseñor-Alva, Communications in Statistics - Theory and Methods, 38, 842–855 (2009).
5. S. Kullback, and R. A. Leibler, Annals of Mathematical Statistics, 22, 79–86 (1951).
6. S. M. Shkol'nik, Journal of Mathematical Sciences, 78(1), 109–114 (1996).
7. T. Cover, and J. Thomas, Elements of Information Theory, Wiley, New York, 2006.
8. V. M. Zolotarev, One-Dimensional Stable Distributions, American Mathematical Society, Providence, R. I., 1986.