Riding the Temp
Michael Moreno ISFA - Universite Lyon 1 43, bd du 11 novembre 1918 69622 - Villeurbanne CEDEX in association with Speedwell Weather Derivatives♣ 19, St Mary At Hill London EC3R 8EE 44. (0)207.929.79.79 http://www.weatherderivs.com/
Abstract : Two processes have been extensively discussed for simulating temperature in order to price weather options. Past studies concluded that these kinds of processes fit the temperature data correctly. However, no direct comparison of the two processes has been published yet. After explaining the two processes, we look at their goodness of fit for Marseille and Paris-Orly. Going further than past studies, the third part of this article examines whether such processes can actually be used to simulate the temperature.
Keywords : Option, Temperature, Weather.
♣ Speedwell Derivatives Limited is regulated by the SFA.
Temperature based contracts are currently the most traded in the weather market. As an example, the EnronOnline weather trading system [6] offers no fewer than 99 weather contracts that can be traded in real time. All are based on Heating Degree-Days (HDD) or Cooling Degree-Days (CDD). HDD and CDD are cumulative differences between a reference temperature (65°F) and the average of the daily high and low. The current dominance of HDD and CDD contracts arose from the protection against weather needed by energy companies. Nevertheless, irrespective of the required protection (see Life Outside the energy industry [5]), it is worthwhile constructing a simulation of the temperature process instead of statistically estimating prices using HDD data. Indeed, many problems can be encountered with straightforward estimation of HDD or CDD distributions. The first is that, irrespective of the extent to which the average temperature on a given day is below the reference, such a day contributes zero to the cumulative CDD index. But clearly the nature of the risk in the underlying temperature process is different depending on whether that daily average was, say, 64.5 or 0. Secondly, using the CDD or HDD series directly is likely to mean that only 30 or 40 points are available, one index value per year of history. Moreover, suppose that only 10 years of data are available, or that there is a strong case that only the last ten years are representative of the likely future distribution of the index: it may then be hard to estimate the distribution with a good fit. More problems come to light when trying to value a contract that is currently within its reference period. A statistical approach will inevitably lead to an estimation based on conditional probability, and a temperature simulation is likely to be a superior method. As a consequence, banks, insurance companies and energy companies have expended much effort on simulating the temperature process.
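The degree-day definition above can be made concrete with a minimal sketch. The function name `degree_days` and the sample temperatures are illustrative assumptions, not taken from any contract specification; only the 65°F reference and the use of the average of the daily high and low come from the text.

```python
def degree_days(highs, lows, reference=65.0):
    """Return (HDD, CDD) accumulated over a period, in degree-days (deg F)."""
    hdd = 0.0
    cdd = 0.0
    for high, low in zip(highs, lows):
        daily_avg = (high + low) / 2.0           # average of the daily high and low
        hdd += max(reference - daily_avg, 0.0)   # only days colder than the reference count
        cdd += max(daily_avg - reference, 0.0)   # only days warmer than the reference count
    return hdd, cdd


# Hypothetical three-day period (deg F): daily averages are 60, 50 and 75.
highs = [70.0, 60.0, 80.0]
lows = [50.0, 40.0, 70.0]
hdd, cdd = degree_days(highs, lows)
print(hdd, cdd)
```

A day averaging 64.5 and a day averaging 0 both add nothing to the CDD total, which is the loss of information described above.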
While temperature is often measured continuously, the values used to calculate HDD or CDD are daily averages of the temperature. These values are therefore discrete, and a continuous process is not needed to model and forecast the daily average temperature. Instead of starting with a continuous mean-reverting model, as did Dischel [3] or Dornier & Queruel (DQ hereafter) [4], and then discretising it, discrete processes can be used directly. The main purpose of this article is to compare the two discrete processes for modelling temperature that have been published in previous papers. The first is the well-known mean-reverting process, which we build directly in discrete time. The second is the one introduced by Carmona [2]. We finally conclude on their goodness of fit.
I. Presentation of the two processes
A. The mean reverting process
The daily average temperature $T_i$ can be viewed as a time series process. Figure 1 visualises this.
Figure 1: Building the first property of $T_i$.
The temperature $T_{i+1}$ at time $i+1$ can be built starting from the following assumptions:
- It depends on the temperature $T_i$ at time $i$.
- It depends on the increase or decrease of the average temperature between dates $i$ and $i+1$.
So the "deterministic process" can be written as:
$$T_{i+1} = T_i + (\Theta_{i+1} - \Theta_i) \qquad (1)$$
where $\Theta_i$ is the mean of the temperature at date $i$. This "process" is completely deterministic. In order to create some uncertainty, a noise $\varphi_i$ is added for each date $i$ (see figure 2).
Figure 2: Adding the uncertainty to $T_{i+1}$.
So we rewrite the process as:
$$T_{i+1} = T_i + (\Theta_{i+1} - \Theta_i) + \varphi_i \qquad (2)$$
No conditions are imposed on the noise at this stage.
But one of the best-known properties of temperature is that it is mean reverting towards the global average temperature. Expressed otherwise, the noise $\varphi_i$ is conditional on the temperatures already observed and should be written as:
$$\varphi_i = \alpha(\Theta_i - T_i) + \varepsilon_i \qquad (3)$$
where $\alpha$ is a constant and $\varepsilon_i$ is a "complementary" noise.
To end this first part of the construction of the process, we now have:
$$T_{i+1} = T_i + (\Theta_{i+1} - \Theta_i) + \alpha(\Theta_i - T_i) + \varepsilon_i \qquad (4)$$
This process generalises the discretisation of the extensively described continuous process:
$$dT_t = \alpha(\Theta_t - T_t)\,dt + \gamma_t\,dW_t$$
where $W_t$ is a Wiener process and $\gamma_t$ a deterministic function of time $t$. Indeed, if we restrict $\varepsilon_i$ to the form
$$\varepsilon_i = \gamma_i\,\mathrm{GWN}_i$$
with $\mathrm{GWN}_i$ a Gaussian white noise, then we obtain the corrected process developed by DQ, which was generated from the discretisation of the continuous process. DQ asserted that, without loss of generality, one could suppose $\gamma_i$ constant. Unfortunately, this last hypothesis is not always suited to representing the real temperature process.
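As an illustration, the recursion (4) with a constant $\gamma$ and Gaussian white noise can be simulated in a few lines. This is a minimal sketch under stated assumptions: `seasonal_mean` is a hypothetical sinusoidal stand-in for $\Theta_i$ (the article estimates $\Theta_i$ from data and deliberately avoids a sinusoid), and the values of `alpha` and `gamma` are arbitrary rather than estimated.

```python
import math
import random


def seasonal_mean(i, level=12.0, amplitude=8.0):
    # Hypothetical placeholder for Theta_i (deg C); not the empirical mean used in the article.
    return level + amplitude * math.sin(2.0 * math.pi * (i - 110) / 365.0)


def simulate_mean_reverting(n_days, alpha, gamma, t0, seed=0):
    """Simulate equation (4) with epsilon_i = gamma * GWN_i and constant gamma."""
    rng = random.Random(seed)
    temps = [t0]
    for i in range(n_days - 1):
        theta_i = seasonal_mean(i)
        theta_next = seasonal_mean(i + 1)
        eps = gamma * rng.gauss(0.0, 1.0)                      # Gaussian white noise term
        t_next = temps[-1] + (theta_next - theta_i) + alpha * (theta_i - temps[-1]) + eps
        temps.append(t_next)
    return temps


# Arbitrary illustrative parameters, starting the path on the mean curve.
path = simulate_mean_reverting(n_days=365, alpha=0.25, gamma=2.0, t0=seasonal_mean(0))
print(len(path), round(min(path), 1), round(max(path), 1))
```

Setting `gamma=0.0` removes the noise, and a path started on the mean curve then follows $\Theta_i$ exactly, which is the deterministic process (1).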
B. The Autoregressive Process
The process presented above is based on a difference process. This can be seen by rewriting equation (2) as:
$$(T_{i+1} - T_i) - (\Theta_{i+1} - \Theta_i) = \varphi_i \qquad (5)$$
Carmona proposed instead to simulate the process using the following assumption:
$$T_{i+1} = \Theta_{i+1} + AR(p) \qquad (6)$$
The memory of the temperature process appears more "clearly" through the use of the autoregressive process of order $p$, AR(p). If we now rearrange equations (5) and (6), we obtain:
$$T_{i+1} - \Theta_{i+1} = (T_i - \Theta_i) + \varphi_i \qquad (5')$$
and
$$T_{i+1} - \Theta_{i+1} = AR(p) \qquad (6')$$
Thus, if both processes fit the data, the AR(p) process should be as follows:
$$AR(p) = (T_i - \Theta_i) + \varphi_i$$
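The link between the two processes can be made explicit with a short worked step. The notation $X_i = T_i - \Theta_i$ for the deviation from the mean is introduced here for illustration only. Starting from equation (4) and moving $\Theta_{i+1}$ to the left-hand side:

```latex
\begin{align*}
T_{i+1} - \Theta_{i+1} &= (T_i - \Theta_i) + \alpha(\Theta_i - T_i) + \varepsilon_i \\
X_{i+1} &= (1 - \alpha)\,X_i + \varepsilon_i
\end{align*}
```

Under this reading, the mean-reverting process is a first-order autoregression on the deviation from the mean with coefficient $1 - \alpha$, while the specification (6) allows $p$ lags of that deviation.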
II. Estimations and numerical comparisons of the processes
A. Estimating the mean-reverting process
Suppose that the temperature process is given by (4). The estimation of the noise $\varepsilon_i$ is described below. The data used are Paris-Orly daily average temperatures. $\Theta_i$ was calculated from detrended temperatures using 30 years of data, then evaluated with a partial average. A sinusoid was deliberately avoided: its slope is too regular. It does not fit the asymmetric evolution of temperature in summer compared with winter, nor does the increase in springtime temperatures usually mirror the decrease in autumn temperatures. For the moment, supposing $\gamma$ constant and without specifying any distribution for the white noise, the least squares method was used to obtain the value of $\alpha$. The residuals were then calculated. For Orly, the resulting residuals $\varepsilon_i$ are presented in figures 3, 4 & 5.
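The least squares step can be sketched as follows. The article does not give the exact regression specification, so this sketch assumes a regression without intercept of $(T_{i+1} - T_i) - (\Theta_{i+1} - \Theta_i)$ on $(\Theta_i - T_i)$, which follows directly from equation (4). The function name and the synthetic data are illustrative.

```python
def estimate_alpha(temps, means):
    """Least squares estimate of alpha in equation (4), with the residuals epsilon_i."""
    n = len(temps) - 1
    xs = [means[i] - temps[i] for i in range(n)]                               # Theta_i - T_i
    ys = [(temps[i + 1] - temps[i]) - (means[i + 1] - means[i]) for i in range(n)]
    sxx = sum(x * x for x in xs)
    alpha = sum(x * y for x, y in zip(xs, ys)) / sxx                           # no-intercept OLS slope
    residuals = [y - alpha * x for x, y in zip(xs, ys)]
    return alpha, residuals


# Synthetic noise-free check: data generated from (4) with alpha = 0.3 and epsilon_i = 0.
means = [10.0 + 0.1 * i for i in range(50)]
temps = [15.0]
for i in range(49):
    temps.append(temps[i] + (means[i + 1] - means[i]) + 0.3 * (means[i] - temps[i]))
alpha_hat, eps_hat = estimate_alpha(temps, means)
print(alpha_hat, max(abs(e) for e in eps_hat))
```

On real data the residuals returned here play the role of $\varepsilon_i$ in the diagnostics that follow.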
Figure 3: Paris-Orly residuals against time.

Autocorrelation function - Orly residuals
Lag   Corr.   Std. err.   Q       p
1     +.072   .0096       56.23   .0000
2     -.088   .0096       140.2   0.000
3     -.049   .0096       166.7   0.000
4     -.015   .0096       169.0   0.000
5     +.004   .0096       169.2   0.000
6     +.009   .0096       170.1   0.000
7     +.008   .0095       170.8   0.000
8     +.003   .0095       170.9   0.000
9     +.007   .0095       171.5   0.000
10    +.022   .0095       176.7   0.000
11    +.003   .0095       176.8   0.000
12    +.001   .0095       176.8   0.000
13    +.008   .0095       177.6   0.000
14    +.015   .0095       180.1   0.000
15    +.006   .0095       180.5   .0000
Figure 4: Autocorrelation function of the Paris-Orly residuals.

Figure 5: Overall distribution of the Paris-Orly residuals (frequency against bin upper values, from -13.000 to 12.000). Normality tests: Kolmogorov-Smirnov d = .0154697, p < .05; Chi-square = 81.56924, df = 15 (adjusted), p = .0000000.
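The correlation and Q columns of the autocorrelation table can be reproduced from a residual series with a short sketch. The article does not name the Q statistic; the tabulated values look consistent with the Ljung-Box form, so that form is assumed here. The function names are illustrative.

```python
def autocorrelations(series, max_lag):
    """Sample autocorrelations r_1..r_max_lag of a series."""
    n = len(series)
    mean = sum(series) / n
    dev = [x - mean for x in series]
    c0 = sum(d * d for d in dev)
    return [sum(dev[t] * dev[t + k] for t in range(n - k)) / c0
            for k in range(1, max_lag + 1)]


def ljung_box_q(series, max_lag):
    """Cumulative Ljung-Box statistics Q_1..Q_max_lag (assumed form of the Q column)."""
    n = len(series)
    q_values = []
    running = 0.0
    for k, r in enumerate(autocorrelations(series, max_lag), start=1):
        running += r * r / (n - k)
        q_values.append(n * (n + 2) * running)
    return q_values


# Hypothetical, strongly autocorrelated series used only to exercise the functions.
demo = [1.0, -1.0] * 50
print(autocorrelations(demo, 2), ljung_box_q(demo, 2))
```

Under the null hypothesis of no autocorrelation, each Q value is compared with a chi-square distribution whose degrees of freedom equal the lag; that final step is left out of the sketch to keep it free of external libraries.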
One could conclude from this point that the process fits the data quite well. However, given the bias visible in figure 5, the Gaussian white noise could be replaced by another noise with a slightly asymmetric distribution (the Pearson distributions could be used for this). Repeating the above analysis with Marseille is a priori also conclusive (see figure 6).

Figure 6: Normal adjustment of the Marseille residuals (frequency against bin upper values, from -7.0 to 7.0). Normality tests: K-S d = .0171673, p < .01; Lilliefors p < .01; Chi-square = 86.98254, df = 27 (adjusted), p = .0000000.
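The Kolmogorov-Smirnov distance d quoted with Figures 5 and 6 measures the largest gap between the empirical distribution of the residuals and a fitted normal distribution. The sketch below computes that distance with the mean and standard deviation estimated from the sample, which is the setting of the Lilliefors variant. It is an illustration only, and it does not reproduce the p-values, which require tabulated critical values.

```python
import math


def normal_cdf(x, mu, sigma):
    """Cumulative distribution function of the normal law N(mu, sigma^2)."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))


def ks_statistic_normal(sample):
    """Kolmogorov-Smirnov distance between a sample and the normal law fitted to it."""
    n = len(sample)
    mu = sum(sample) / n
    sigma = math.sqrt(sum((x - mu) ** 2 for x in sample) / (n - 1))
    d = 0.0
    for rank, x in enumerate(sorted(sample), start=1):
        f = normal_cdf(x, mu, sigma)
        # The empirical CDF jumps from (rank - 1) / n to rank / n at x.
        d = max(d, rank / n - f, f - (rank - 1) / n)
    return d


# Tiny hypothetical sample, for illustration only.
print(ks_statistic_normal([-1.0, 0.0, 1.0]))
```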
But those tests are not complete. If we look at the distribution of the residuals through time, do we still reach the same conclusion about the goodness of fit? Figure 7 (resp. 8) shows the distribution of the volatility of the residuals of Orly (resp. Marseille).

Figure 7: Distribution of the volatility of the Paris residuals through time (frequency against bin upper values, from 1.20 to 3.20). K-S d = .0338575, p < .01; Lilliefors p < .01; Chi-square = 446.2599, df = 47, p = 0.000000.
Figure 8: Distribution of the volatility of the Marseille residuals through time (frequency against bin upper values, from 0.500 to 3.500). K-S d = .0498675, p < .01; Lilliefors p < .01; Chi-square = 798.7875, df = 41, p = 0.000000.
Given the non-normal distribution of the volatility of the residuals for Marseille, further analysis is required. An investigation into possible seasonal effects is now carried out. Figure 9 shows the evolution of the 180-day moving average of the volatility of the residuals, calculated on a 30-day basis, for both towns over 3650 days.

Figure 9: 180-day moving average of the 30-day volatility of the residuals for both towns (horizontal axis in days).
This last figure shows that the volatility of the residuals has a seasonal component that should have been included earlier. So presuming that $\gamma$ is constant through time is incorrect. The volatility of temperature during the summer is different from that during winter: analysis of both towns shows that the volatility is higher in winter than in summer. In other words, the risk differs between winter and summer because the distribution of the residuals is not homogeneous. The process must therefore be readjusted. The function of time $\gamma(t)$ is chosen arbitrarily to be a sinusoid of the form $1 + \sin^2(\theta t + \beta)$. The two curves above are then transformed into the following ones:
Figure 10: Moving average of the volatility of the residuals for both towns after adjustment by $\gamma(t)$ (horizontal axis in days).
The curves show that including this new function for $\gamma_t$ has considerably reduced the effect of the seasonal component, although the volatility still possesses some seasonality. Using other sinusoidal functions for the volatility did not lead to significantly better fits.
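The volatility diagnostic and the seasonal adjustment can be sketched as below. Several points are assumptions, since the article does not spell them out: the "30 days basis" is read as a rolling 30-day standard deviation, the adjustment is read as dividing each residual by $\gamma(t)$, and the values of `theta` and `beta` are hypothetical (chosen so that $\gamma$ has a one-year period and peaks at day 0).

```python
import math


def rolling_std(values, window):
    """Sample standard deviation over each run of `window` consecutive values."""
    out = []
    for end in range(window, len(values) + 1):
        chunk = values[end - window:end]
        m = sum(chunk) / window
        out.append(math.sqrt(sum((v - m) ** 2 for v in chunk) / (window - 1)))
    return out


def moving_average(values, window):
    return [sum(values[end - window:end]) / window
            for end in range(window, len(values) + 1)]


def gamma(t, theta=math.pi / 365.0, beta=math.pi / 2.0):
    # 1 + sin^2(theta * t + beta); theta and beta are hypothetical, not fitted values.
    return 1.0 + math.sin(theta * t + beta) ** 2


def rescale(residuals, theta=math.pi / 365.0, beta=math.pi / 2.0):
    """Divide each residual by gamma(t) (assumed form of the adjustment)."""
    return [e / gamma(t, theta, beta) for t, e in enumerate(residuals)]


# Synthetic residuals whose volatility follows gamma(t) exactly.
residuals = [gamma(t) * (1.0 if t % 2 == 0 else -1.0) for t in range(730)]
raw_curve = moving_average(rolling_std(residuals, 30), 180)
flat_curve = moving_average(rolling_std(rescale(residuals), 30), 180)
print(max(raw_curve) - min(raw_curve), max(flat_curve) - min(flat_curve))
```

On this synthetic series the adjusted curve is flat by construction. On real residuals some seasonality remains, as noted above.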
B. Estimating the AR process
We have seen that the temperature process appears adequately represented by a mean-reverting process with a white noise that is not perfectly Gaussian and with a seasonal volatility. We now turn to estimating the simple AR process in order to compare the results. We first present the autocorrelation function of the difference $T_{i+1} - \Theta_{i+1}$.

Autocorrelation function - Orly
Lag   Corr.   Std. err.   Q       p
1     +.719   .0096       5658.   0.000
2     +.455   .0096       7925.   0.000
3     +.332   .0096       9134.   0.000
4     +.254   .0096       9839.   0.000
5     +.191   .0096       102E2   0.000
6     +.149   .0096       105E2   0.000
7     +.131   .0095       107E2   0.000
8     +.124   .0095       108E2   0.000
9     +.121   .0095       110E2   0.000
10    +.112   .0095       111E2   0.000
11    +.100   .0095       112E2   0.000
12    +.084   .0095       113E2   0.000
13    +.070   .0095       114E2   0.000
14    +.064   .0095       114E2   0.000
15    +.059   .0095       115E2   0.000

Figure 11: Autocorrelation function - Orly.
Using the partial autocorrelation function together with the information from this first study, we chose order 6 for the AR process representing the temperature. After estimating the process, we find that the newly calculated residuals are not correlated (Figure 12) and are normally distributed (Figure 13).
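The article does not state how the AR coefficients were estimated. As one possible route, the sketch below solves the Yule-Walker equations with the Levinson-Durbin recursion, a substituted technique that has the advantage of returning the partial autocorrelations used to choose the order along the way. The function names are illustrative.

```python
def autocovariances(series, max_lag):
    """Sample autocovariances c_0..c_max_lag of a series (divisor n)."""
    n = len(series)
    mean = sum(series) / n
    dev = [x - mean for x in series]
    return [sum(dev[t] * dev[t + k] for t in range(n - k)) / n
            for k in range(max_lag + 1)]


def levinson_durbin(acov, order):
    """Return (AR coefficients phi_1..phi_p, partial autocorrelations, innovation variance)."""
    phi = []
    pacf = []
    var = acov[0]
    for k in range(1, order + 1):
        # Reflection coefficient, which is also the partial autocorrelation at lag k.
        acc = acov[k] - sum(phi[j] * acov[k - 1 - j] for j in range(k - 1))
        kappa = acc / var
        phi = [phi[j] - kappa * phi[k - 2 - j] for j in range(k - 1)] + [kappa]
        pacf.append(kappa)
        var *= 1.0 - kappa * kappa
    return phi, pacf, var


# Theoretical autocovariances of an AR(2) with coefficients 0.5 and 0.2 (unit variance).
coeffs, partials, innovation_var = levinson_durbin([1.0, 0.625, 0.5125], 2)
print(coeffs, partials, innovation_var)
```

In practice one would pass `autocovariances(x, p)` for the series of deviations $T_i - \Theta_i$ and pick the order $p$ beyond which the partial autocorrelations become negligible.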
Autocorrelation of the resulting residuals - Orly, ARIMA (6,0,0) process
Lag   Corr.   Std. err.   Q       p
1     -.000   .0096       .00     .9673
2     -.001   .0096       .01     .9966
3     -.000   .0096       .01     .9998
4     -.004   .0096       .16     .9970
5     +.001   .0096       .17     .9994
6     -.021   .0096       4.83    .5661
7     -.004   .0095       4.99    .6613
8     +.007   .0095       5.54    .6988
9     +.025   .0095       12.37   .1935
10    +.019   .0095       16.49   .0864
11    +.021   .0095       21.47   .0288
12    +.009   .0095       22.45   .0328
13    +.001   .0095       22.46   .0487
14    +.012   .0095       24.02   .0456
15    -.003   .0095       24.12   .0631
Figure 12: Autocorrelation of the resulting residuals - Orly.

Figure 13: Testing the normality of the distribution - Orly.
Running the same estimation procedure on the Marseille data, we obtain similar results with an AR process of order 7 (Figures 14, 15 & 16).
Autocorrelation function - Marseille
Lag   Corr.   Std. err.   Q       p
1     +.787   .0096       6792.   0.000
2     +.579   .0096       105E2   0.000
3     +.442   .0096       126E2   0.000
4     +.352   .0096       140E2   0.000
5     +.286   .0096       149E2   0.000
6     +.235   .0096       155E2   0.000
7     +.198   .0095       159E2   0.000
8     +.167   .0095       162E2   0.000
9     +.147   .0095       164E2   0.000
10    +.132   .0095       166E2   0.000
11    +.113   .0095       168E2   0.000
12    +.094   .0095       169E2   0.000
13    +.080   .0095       169E2   0.000
14    +.067   .0095       170E2   0.000
15    +.052   .0095       170E2   0.000
Figure 14: Autocorrelation function - Marseille.

Autocorrelation of the resulting residuals - Marseille, ARIMA (7,0,0) process
Lag   Corr.   Std. err.   Q       p
1     -.000   .0096       .00     .9939
2     -.000   .0096       .00     .9995
3     -.000   .0096       .00     1.000
4     -.000   .0096       .00     1.000
5     -.001   .0096       .01     1.000
6     -.001   .0096       .02     1.000
7     -.002   .0095       .08     1.000
8     -.010   .0095       1.08    .9977
9     -.003   .0095       1.20    .9988
10    +.020   .0095       5.49    .8560
11    +.008   .0095       6.12    .8650
12    -.003   .0095       6.22    .9045
13    +.007   .0095       6.71    .9162
14    +.009   .0095       7.59    .9096
15    +.001   .0095       7.59    .9390
Figure 15: Autocorrelation of the resulting residuals - Marseille.

Figure 16: Testing the normality of the distribution of the residuals - Marseille.
With these processes, the residuals are normally distributed and the goodness of fit seems impressive, even better than with the mean-reverting process.
III. Are these processes really good?
Past studies have suggested that these processes fit the data well. It can be demonstrated with basic statistics that these processes may not be as good as previous studies have asserted. Indeed, if they were good, then the conditional distributions of the residuals given different periods of the year should be the same. The four following graphics show that this is not the case:

[Four histograms of the Orly residuals (frequency against bins): January, April, July and November distributions.]
Clearly the distribution of the residuals is not constant through the year. Apart from the first two moments, the moments differ from one period to another. Hence the assumptions previously made to fit the process are not fully respected. The problem does not come from the volatility of the residuals. It is deeper: the final residuals are independent, but they are not identically distributed. Whatever the town and the process estimated, non-homogeneous distributions were found.
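A simple way to check this on a residual series is to compare moments month by month. The sketch below is illustrative: the function names and the toy data are assumptions, and skewness and kurtosis are computed as standardised third and fourth central moments.

```python
import math


def moments(sample):
    """Mean, standard deviation, skewness and kurtosis (population formulas)."""
    n = len(sample)
    mean = sum(sample) / n
    m2 = sum((x - mean) ** 2 for x in sample) / n
    m3 = sum((x - mean) ** 3 for x in sample) / n
    m4 = sum((x - mean) ** 4 for x in sample) / n
    return {"mean": mean, "std": math.sqrt(m2),
            "skewness": m3 / m2 ** 1.5, "kurtosis": m4 / m2 ** 2}


def moments_by_month(residuals, months):
    """Group residuals by calendar month (1..12) and compute the moments of each group."""
    groups = {}
    for eps, month in zip(residuals, months):
        groups.setdefault(month, []).append(eps)
    return {month: moments(values) for month, values in sorted(groups.items())}


# Toy data: a symmetric January sample and a right-skewed July sample.
toy_residuals = [-1.0, 1.0, -1.0, 1.0, 0.0, 0.0, 0.0, 4.0]
toy_months = [1, 1, 1, 1, 7, 7, 7, 7]
for month, stats in moments_by_month(toy_residuals, toy_months).items():
    print(month, stats)
```

If the residuals were identically distributed, the skewness and kurtosis would be roughly the same in every month, up to sampling error.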
Conclusion
Many papers suppose that the temperature process is either a mean-reverting process or an autoregressive process with i.i.d. noise. We have shown that the goodness of fit appears satisfactory, but that these basic processes should not be used to simulate the temperature because the distribution of the noise is not homogeneous through time. Other classes of processes should be employed.
References
[1] Brockwell P. & Davis R., "Time Series: Theory and Methods", Springer Series in Statistics, Second Edition, 1991.
[2] Carmona R., "Calibrating Degree Days Options", talk given at the Third Seminar on Stochastic Analysis, Random Fields and Applications, Ascona (Switzerland), September 23, 1999.
[3] Dischel B., "Weather Derivatives", http://www.adtrading.com/adt37/.
[4] Dornier F. & Queruel M., "Caution to the Wind", Weather Risk, August 2000.
[5] Foster D. & Gautam J., "Come rain, come shine", Weather Risk, August 2000.
[6] Foster K., "Optimistic Forecast", Weather Risk, August 2000.
[7] Hamilton J., "Time Series Analysis", Princeton University Press, 1994.
[8] Storch H. & Zwiers F., "Statistical Analysis in Climate Research", Press Syndicate of the University of Cambridge, 1999.