Trading with (or against) the Trend
Daniel Herlemont
March 11, 2013
Contents
1 Introduction
2 To do
3 Expected results
  3.1 Identifying the option profile and trading impact parts
  3.2 Distribution of PL
    3.2.1 volatility of PL
    3.2.2 Option profile: a straddle
    3.2.3 example with actual data
4 Annex
  4.1 The wealth equation
  4.2 Strategies in a prop trading context
  4.3 About the exponential moving average
    4.3.1 Algorithm for the EMA in continuous time
    4.3.2 The variance of EMA
  4.4 Probability of success in case of normal returns
5 References
1
Introduction
This practical work aims to characterize and back-test trend-following (or mean-reverting) strategies on a single asset while controlling for the maximum drawdown.
Following section 5.2 of the Lyxor white paper, we consider the simple example of a moving average of past daily returns. We choose a moving average with exponential weights:
$$\hat\mu_t = \frac{1}{\tau} \sum_{i=0}^{n} e^{-i\,\delta t/\tau}\, \frac{S_{t-i\delta t} - S_{t-(i+1)\delta t}}{S_{t-(i+1)\delta t}}$$
Such an estimator has interesting properties. It depends on a single parameter $\tau$, which represents the average duration of estimation. Due to exponential weighting, recent returns have a larger impact than older ones. It does not suffer from threshold effects, that is, changes of regime caused by a past observation leaving the averaging window. Moreover, this estimator has theoretical foundations, as it can be interpreted as the Kalman filter for an unobservable trend. Lastly, it produces simple formulas for the performance of the related strategy. In particular, one can derive the dynamics of the moving average as a function of the asset returns:
$$\hat\mu_{t+\delta t} = \left(1 - \frac{\delta t}{\tau}\right)\hat\mu_t + \frac{1}{\tau}\,\frac{S_{t+\delta t} - S_t}{S_t} \qquad (1)$$
or, in a continuous time framework:
$$d\hat\mu_t = -\frac{1}{\tau}\,\hat\mu_t\,dt + \frac{1}{\tau}\,\frac{dS_t}{S_t} \qquad (2)$$
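The recursion (1) can be sketched in R as follows. This is a minimal illustration under stated assumptions: the function name `ema.trend` is hypothetical, and starting the estimator at the first annualised return `r[1]/dt` is an assumed initial condition. The explicit loop is compared with the same recursion written with `stats::filter`.

```r
# Sketch of recursion (1): mu[t] = (1 - dt/tau) * mu[t-1] + r[t] / tau.
# ema.trend is a hypothetical helper name; starting the recursion at r[1]/dt
# is an assumption about the initial condition.
ema.trend <- function(r, tau, dt = 1/252) {
  lambda <- dt / tau
  mu <- numeric(length(r))
  prev <- r[1] / dt                 # assumed initial value of the estimator
  for (t in seq_along(r)) {
    mu[t] <- (1 - lambda) * prev + r[t] / tau
    prev <- mu[t]
  }
  mu
}

set.seed(1)
r <- rnorm(500, mean = 0, sd = 0.15 / sqrt(252))   # simulated daily returns
mu.loop <- ema.trend(r, tau = 0.04)

# Same recursion written with the recursive linear filter from package stats
lambda <- (1/252) / 0.04
mu.filter <- as.numeric(stats::filter(r * lambda, filter = 1 - lambda,
                                      method = "recursive", init = r[1])) / (1/252)
print(all.equal(mu.loop, mu.filter))
```

The loop makes the single-parameter structure explicit: each new return enters with weight 1/tau and the previous estimate decays by the factor 1 - dt/tau.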
In the following, we suppose that the investor considers that the best estimate of the returns between times $t$ and $t + dt$ is $\hat\mu_t$. Based on this assumption, the investor can simply apply the optimal Markowitz/Merton/Kelly strategy. Here, we will consider that the risk-free rate is 0. In this case, the exposure to the risky asset will be:
$$\pi_t = m\,\frac{\hat\mu_t}{\sigma^2}$$
where $\hat\mu_t$ is the trend estimator, $\sigma$ is the volatility (possibly estimated) of the underlying, and $m$ is a risk tolerance parameter. This exposure is proportional to the trend estimator value $\hat\mu_t$ and inversely proportional to the risk. Note that it is not capped and is almost never 0. The wealth $X_t$ follows
$$\frac{dX_t}{X_t} = \pi_t\,\frac{dS_t}{S_t} = m\,\frac{\hat\mu_t}{\sigma^2}\,\frac{dS_t}{S_t} \qquad (3)$$
Now by inverting equation 2, we can show that the wealth is
$$\ln\frac{X_T}{X_0} = m\,\frac{\tau}{2\sigma^2}\left(\hat\mu_T^2 - \hat\mu_0^2\right) + m\int_0^T \left[\frac{\hat\mu_t^2}{\sigma^2}\left(1 - \frac{m}{2}\right) - \frac{1}{2\tau}\right] dt \qquad (4)$$
The first term is the option profile; the second one is the trading impact (stochastic part). See the annex for the demonstration (including the mean reversion case).
$\tau$ is the smoothing factor. It can be converted into the window of a classical arithmetic moving average through the usual conversion $dt/\tau = \lambda = 2/\text{days}$; with $dt = 1/252$, this gives $\text{days} \approx 500\,\tau$. For example, $\tau = 1/2$ is equivalent to a 1-year moving average, and $\tau = 0.01$ is roughly the same as a 5-day moving average.
It is possible to estimate the recent trend with different methods. For example, we can consider a simple moving average on prices (or log prices):
$$\hat\mu_t\,dt = \frac{1}{s}\sum_{i=1}^{s}\log\frac{S_{t-(i-1)dt}}{S_{t-i\,dt}} = \frac{1}{s}\left(\log S_t - \log S_{t-s\,dt}\right)$$
with $s$ the window size. This trend estimate consists in comparing the prices $s$ time periods apart. Note that the intermediate values do not influence the result.
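The conversion between $\tau$ and a window in days, and the telescoping of the simple moving average, can be checked with a short R sketch. The helper names `tau.to.days` and `days.to.tau` are hypothetical, and the price path is simulated for illustration only.

```r
# Hypothetical helpers for the conversion dt/tau = lambda = 2/days.
tau.to.days <- function(tau, dt = 1/252) 2 * tau / dt
days.to.tau <- function(days, dt = 1/252) days * dt / 2

print(tau.to.days(c(0.5, 0.04, 0.01)))   # roughly 252, 20 and 5 days

# The simple moving average of log returns telescopes to a difference
# of two log prices, so intermediate prices drop out.
set.seed(2)
S <- 100 * exp(cumsum(rnorm(50, sd = 0.01)))   # simulated price path
s <- 20                                        # window size
n <- length(S)
avg.logret <- mean(diff(log(S[(n - s):n])))    # (1/s) * sum of s log returns
two.points <- (log(S[n]) - log(S[n - s])) / s
print(all.equal(avg.logret, two.points))
```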
2
To do
Represent $X_t$ as a function of $S_t$ (using simulations). Comments?
Consider the mean-reverting strategy, where the exposure is $\pi_t = -m\,\hat\mu_t/\sigma^2$.
Backtest the strategies (trend following and mean reverting) on some underlyings (e.g. CAC40, S&P, NASDAQ, ...).
Include transaction costs (ranging from 1 to 3 bps).
Try to manage under a maximum drawdown guarantee $d_{max}$ by scaling the exposure by a factor $(d_{max} - d_t)/(1 - d_t)$, where $d_t$ is the current drawdown at date $t$.
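The drawdown scaling of the last item can be sketched as a small backtest loop. This is an illustration, not a reference implementation: the name `backtest.dd`, the floor of the factor at 0, and all parameter values are assumptions, and the returns are simulated.

```r
# Sketch of the drawdown control: the exposure is multiplied by
# (dmax - d_t) / (1 - d_t), with d_t the current drawdown of the wealth.
# backtest.dd is a hypothetical name; flooring the factor at 0 is an assumption.
backtest.dd <- function(pit, r, dmax = 0.10) {
  n <- length(r)
  X <- numeric(n)
  X[1] <- 1
  peak <- 1
  for (t in 1:(n - 1)) {
    d <- 1 - X[t] / peak                       # current drawdown
    k <- max(0, (dmax - d) / (1 - d))          # scaling factor
    X[t + 1] <- X[t] * (1 + k * pit[t] * r[t + 1])
    peak <- max(peak, X[t + 1])
  }
  X
}

set.seed(3)
sigma <- 0.15; tau <- 0.04; dt <- 1/252; m <- 1/20
r <- rnorm(1000, mean = 0, sd = sigma * sqrt(dt))
lambda <- dt / tau
mut <- as.numeric(stats::filter(r * lambda, filter = 1 - lambda,
                                method = "recursive", init = r[1])) / dt
pit <- m * mut / sigma^2                       # trend-following exposure
X <- backtest.dd(pit, r, dmax = 0.10)
mdd <- max(1 - X / cummax(X))
print(mdd)
```

The factor equals the cushion X - (1 - dmax) * max(X) divided by X, so in this discrete sketch the drawdown stays below dmax as long as each single-period leveraged return pit[t] * r[t+1] stays above -100%.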
3
Expected results
3.1
Identifying the option profile and trading impact parts
Example with a trend following strategy
> set.seed(1)
> sigma=0.15
> mu=sigma^2/2
> N=1500
> r=rnorm(N,mean=mu/252,sd=sigma/sqrt(252))
> tau=0.04 # 20 days
> dt=1/252
> lambda=dt/tau
> mut=filter(r*lambda, filter=1-lambda, method="rec",init=r[1])/dt
> m=1/20
> pit=m*mut/sigma^2
> # pi(t)*r(t+1)
> rX=pit[-length(pit)]*r[-1]
> r=r[-1] #remove first value (initialisation of moving average)
> b=N-250
> r=tail(r,-b)
> rX=tail(rX,length(r))
> mut=tail(mut,length(r))
> pit=tail(pit,length(r))
> S=cumprod(1+r)
> X=cumprod(1+rX)
> #identification of the option profile
> option.profile=m*tau/(2*sigma^2)*(mut^2-mut[1]^2)
> #plot(dates,option.profile)
> trading.impact=m*cumsum(((mut/sigma)^2*(1-m/2)-0.5/tau)*dt)
[Figure: four panels for the simulated trend following strategy. (1) log(S), log(X), option profile, trading impact and OP+TI against days; (2) (mut/sigma)^2 − 0.5/tau against index; (3) exposure pit against index; (4) trend estimate mut against index.]
3.2
Distribution of PL
3.2.1
volatility of PL
The quadratic variation of $dX/X$ is
$$\langle dX/X \rangle = m^2\,\frac{\hat\mu_t^2}{\sigma^4}\,\langle dS/S \rangle = m^2\,\frac{\hat\mu_t^2}{\sigma^2}\,dt$$
The realized variance is
$$\int_0^T \langle dX/X \rangle = m^2 \int_0^T \frac{\hat\mu_t^2}{\sigma^2}\,dt$$
The expected value of the variance is $m^2 T/(2\tau)$ (see annex). Hence the average volatility is $m/\sqrt{2\tau}$.
3.2.2
Option profile: a straddle
The mean reverting strategy exhibits a typical straddle payoff.
[Figure: scatter plot of the wealth X against the underlying price S.]
The probability of success is 0.546. However, the mean of the PL is still 0: this is due to a negative skewness (losses are less frequent but more severe than gains). skewness = 0.228
[Figure: histogram of Xs (frequency against Xs).]
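A scatter of this kind and the associated statistics can be produced along the following lines. This is only a sketch: the helper name `sim.mr.terminal`, the parameter values, the driftless simulated returns and the reading of "probability of success" as the share of paths ending above the initial wealth are all assumptions, so the numbers will not reproduce those quoted in the text.

```r
# Sketch: terminal wealth of the mean-reverting strategy against the terminal
# price over many simulated driftless paths. All parameter values and the
# helper name sim.mr.terminal are assumptions made for illustration.
sim.mr.terminal <- function(npaths = 500, n = 250, burn = 250,
                            sigma = 0.15, tau = 0.04, m = 1/20, dt = 1/252) {
  lambda <- dt / tau
  out <- matrix(NA_real_, npaths, 2, dimnames = list(NULL, c("S", "X")))
  for (p in seq_len(npaths)) {
    r <- rnorm(n + burn, mean = 0, sd = sigma * sqrt(dt))
    mut <- as.numeric(stats::filter(r * lambda, filter = 1 - lambda,
                                    method = "recursive", init = r[1])) / dt
    pit <- -m * mut / sigma^2                  # mean-reverting exposure
    rX <- pit[-length(pit)] * r[-1]            # pi(t) * r(t+1)
    rS <- r[-1]
    keep <- tail(seq_along(rX), n)             # drop the burn-in period
    out[p, ] <- c(prod(1 + rS[keep]), prod(1 + rX[keep]))
  }
  out
}

set.seed(4)
res <- sim.mr.terminal()
Xs <- res[, "X"]
z <- (Xs - mean(Xs)) / sd(Xs)
res.stats <- c(p.success = mean(Xs > 1), mean.PL = mean(Xs - 1),
               skewness = mean(z^3))
print(round(res.stats, 3))
# plot(res[, "S"], res[, "X"], xlab = "S", ylab = "X")   # payoff scatter
# hist(Xs)                                               # distribution of Xs
```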
3.2.3
example with actual data
> code="ALU.PA"
> q=read.csv(paste("C:/yats/data/yahoo/csv/",code,".csv",sep=""))
> dates=rev(as.Date(q$Date))
> S=rev(q$Adj)
> r=diff(log(S))
> epsilon=1
> m=1/100
> tau=0.02
> dt=1/252
> lambda=dt/tau
> mut=filter(r*lambda, filter=1-lambda, method="rec",init=r[1])/dt
> s.lambda=0.1
> r2=r^2
> # note: the weight is s.lambda but the decay is 1-lambda (kept as in the original run);
> # a decay of 1-s.lambda would give weights that sum to one
> sigmat=sqrt(filter(r2*s.lambda, filter=1-lambda, method="rec",init=r2[1])/dt)
> pit=epsilon*m*mut/sigmat^2
> rX=pit[-length(pit)]*r[-1]
> a=100
> rX=tail(rX,-a)
> n=length(rX)
> X=cumprod(1+rX)
> mut=tail(mut,n)
> sigmat=tail(sigmat,n)
> pit=tail(pit,n)
> S=tail(S,n)
> dates=tail(dates,n)
> #identification of the option profile
> option.profile=m*tau/(2*sigmat^2)*(mut^2-mut[1]^2)
> #plot(dates,option.profile)
> trading.impact=m*cumsum(((mut/sigmat)^2*(1-m/2)-0.5/tau)*dt)
[Figure: ALU.PA, strategy TF, m=0.01, tau=0.02. Panels: price S against dates; total return (log) against date; log(S), log(X), option profile, trading impact and OP+TI against days; exposure pi(t) against dates; sharpe(t)^2=(mut/sigmat)^2 against dates; volatility sigmat against dates; trend estimation mu(t) against dates.]
> PL.sigma.year=sd(rX)*sqrt(252)
> PL.mu.year=mean(rX)*252
> (sharpe=PL.mu.year/PL.sigma.year)
[1] 0.853
> (mdd=max(1-X/cummax(X)))
[1] 0.158
> rX0=(rX-mean(rX))/sd(rX)
> (skewness=mean(rX0^3))
[1] 0.768
> (kurtosis=mean(rX0^4))
[1] 19.5
> # note: the square root below covers the whole product;
> # the usual t-statistic is sqrt(length(rX))*mean(rX)/sd(rX)
> (tstat=sqrt(length(rX)*mean(rX)/sd(rX)))
[1] 11
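The same statistics can be gathered in one function. The sketch below uses a hypothetical helper `perf.summary` and simulated returns rather than the ALU.PA data; its t-statistic follows the usual definition sqrt(n) * mean / sd, which is an assumption about what was intended.

```r
# Hypothetical helper collecting the statistics printed above from a vector
# of daily strategy returns rX. The t-statistic uses the usual definition
# sqrt(n) * mean / sd.
perf.summary <- function(rX, periods = 252) {
  X <- cumprod(1 + rX)
  z <- (rX - mean(rX)) / sd(rX)
  c(sharpe   = mean(rX) * periods / (sd(rX) * sqrt(periods)),
    mdd      = max(1 - X / cummax(X)),
    skewness = mean(z^3),
    kurtosis = mean(z^4),
    tstat    = sqrt(length(rX)) * mean(rX) / sd(rX))
}

set.seed(5)
rX.sim <- rnorm(2000, mean = 0.0005, sd = 0.01)   # simulated returns, not market data
out <- perf.summary(rX.sim)
print(round(out, 3))
```

With this definition the t-statistic equals the annual Sharpe ratio multiplied by the square root of the number of years in the sample.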
4
Annex
4.1
The wealth equation
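The exposure is
$$\pi_t = \xi m\,\frac{\hat\mu_t}{\sigma^2}$$
We will consider both the case of trend following, with $\xi = +1$, and mean reverting, with $\xi = -1$. The dynamic of the wealth is now
$$\frac{dX_t}{X_t} = \xi m\,\frac{\hat\mu_t}{\sigma^2}\,\frac{dS_t}{S_t}$$
To show (4), we apply Itô's lemma to $d\log X_t$:
$$d\log X_t = \frac{dX_t}{X_t} - \frac{1}{2X_t^2}\langle dX \rangle = \frac{dX_t}{X_t} - \frac{1}{2}\pi_t^2\sigma^2\,dt = \xi m\,\frac{\hat\mu_t}{\sigma^2}\,\frac{dS_t}{S_t} - \frac{1}{2}m^2\,\frac{\hat\mu_t^2}{\sigma^2}\,dt$$
where $\langle\,\cdot\,\rangle$ denotes the quadratic variation. By inverting (2), we get $dS/S$ as a function of $\hat\mu_t$:
$$\frac{dS_t}{S_t} = \tau\,d\hat\mu_t + \hat\mu_t\,dt$$
Then
$$d\log X_t = \xi m\,\frac{\hat\mu_t}{\sigma^2}\left(\tau\,d\hat\mu_t + \hat\mu_t\,dt\right) - \frac{1}{2}m^2\,\frac{\hat\mu_t^2}{\sigma^2}\,dt = \xi m\,\frac{\tau}{\sigma^2}\,\hat\mu_t\,d\hat\mu_t + m\,\frac{\hat\mu_t^2}{\sigma^2}\left(\xi - \frac{1}{2}m\right)dt$$
Note that
$$d\hat\mu_t^2 = 2\hat\mu_t\,d\hat\mu_t + \langle \hat\mu_t \rangle$$
and $\langle \hat\mu_t \rangle = \sigma^2/\tau^2\,dt$. Hence
$$\hat\mu_t\,d\hat\mu_t = \frac{1}{2}\left(d\hat\mu_t^2 - \frac{\sigma^2}{\tau^2}\,dt\right)$$
In the end we get
$$d\log X_t = \xi m\,\frac{1}{2}\,\frac{\tau}{\sigma^2}\,d\hat\mu_t^2 + m\left[\frac{\hat\mu_t^2}{\sigma^2}\left(\xi - \frac{1}{2}m\right) - \xi\,\frac{1}{2\tau}\right]dt \qquad (5)$$
or
$$\log\frac{X_T}{X_0} = \xi m\,\frac{1}{2}\,\frac{\tau}{\sigma^2}\left(\hat\mu_T^2 - \hat\mu_0^2\right) + m\int_0^T\left[\frac{\hat\mu_t^2}{\sigma^2}\left(\xi - \frac{1}{2}m\right) - \xi\,\frac{1}{2\tau}\right]dt \qquad (6)$$
For trend following, $\xi = 1$ and
$$\log\frac{X_T^{TF}}{X_0^{TF}} = m\,\frac{1}{2}\,\frac{\tau}{\sigma^2}\left(\hat\mu_T^2 - \hat\mu_0^2\right) + m\int_0^T\left[\frac{\hat\mu_t^2}{\sigma^2}\left(1 - \frac{1}{2}m\right) - \frac{1}{2\tau}\right]dt \qquad (7)$$
For mean reversion, $\xi = -1$ and
$$\log\frac{X_T^{MR}}{X_0^{MR}} = m\,\frac{1}{2}\,\frac{\tau}{\sigma^2}\left(\hat\mu_0^2 - \hat\mu_T^2\right) + m\int_0^T\left[\frac{1}{2\tau} - \frac{\hat\mu_t^2}{\sigma^2}\left(1 + \frac{1}{2}m\right)\right]dt \qquad (8)$$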
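The algebra of this section has an exact discrete counterpart that can be checked numerically. The sketch below is an illustration under stated assumptions: `decompose.pl` is a hypothetical name, returns are simulated, and the identity concerns the sum of the simple strategy returns $\pi_t r_{t+1}$ (without the Itô term $-\frac{1}{2}\pi_t^2\sigma^2 dt$), with the realised squared change of the trend estimate used in place of $\sigma^2 dt/\tau^2$.

```r
# Sketch: exact discrete counterpart of the decomposition, for xi = +1 or -1.
# With mu[t+1] = (1 - dt/tau) * mu[t] + r[t+1] / tau one has
#   r[t+1] = tau * (mu[t+1] - mu[t]) + mu[t] * dt
# so the sum of strategy returns splits into an option profile and a
# trading impact in which the realised (mu[t+1] - mu[t])^2 replaces
# sigma^2 * dt / tau^2. decompose.pl is a hypothetical name.
decompose.pl <- function(r, xi = 1, m = 1/20, sigma = 0.15,
                         tau = 0.04, dt = 1/252) {
  lambda <- dt / tau
  mu <- as.numeric(stats::filter(r * lambda, filter = 1 - lambda,
                                 method = "recursive", init = r[1])) / dt
  n <- length(r)
  pit <- xi * m * mu / sigma^2
  rX <- pit[-n] * r[-1]                        # pi(t) * r(t+1)
  dmu <- diff(mu)
  option.profile <- xi * m * tau / (2 * sigma^2) * (mu[n]^2 - mu[1]^2)
  trading.impact <- xi * m * sum(mu[-n]^2 / sigma^2 * dt -
                                 tau * dmu^2 / (2 * sigma^2))
  c(sum.rX = sum(rX), option.profile = option.profile,
    trading.impact = trading.impact)
}

set.seed(6)
r <- rnorm(1000, mean = 0, sd = 0.15 * sqrt(1/252))
tf <- decompose.pl(r, xi = 1)
mr <- decompose.pl(r, xi = -1)
print(rbind(trend.following = tf, mean.reverting = mr))
```

In each row, the first column equals the sum of the other two up to rounding, and the mean-reverting row is the mirror image of the trend-following one.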
4.2
Strategies in a prop trading context
In a prop trading context, traders may not even know the AUM (assets under management); rather, traders are given limits in terms of VaR or volatility. Suppose that the yearly target vol is $\sigma_0$ in euros. For example, $\sigma_0$ = 3 M€ (189 k€ daily vol) is roughly the same as a managed fund of 10 M€ with a 30% volatility of returns. However, managing at a target vol in euros is not the same as managing the vol of returns: $\sigma_{euro} = \sigma_r V$. Keeping a constant vol in euros means that $d\sigma_r/\sigma_r = -dV/V$; implicitly, the volatility of returns should be increased when losing money ... is it a good idea?
Note that the volatility budget is not mandatory on a day-by-day basis. In other words, it is an expected volatility target, $E[\sigma_t] = \sigma_0$. Constraints on the vol of the vol should be added to keep the vol not too far from the expected value.
In a prop trading setting, we will suppose that the increments follow a standard Brownian motion:
$$dS_t = (\cdot)\,dt + \sigma\sqrt{dt}\,B_t$$
We still suppose that the underlying has constant volatility; if it is not the case, we can build a synthetic contract with a constant volatility. Here, the moving average is defined with the increments (not the returns):
$$\hat\mu_t = \frac{1}{\tau}\int_{-\infty}^{t} e^{-(t-s)/\tau}\,dS_s$$
so that $d\mu = -\mu\,dt/\tau + dS_t/\tau$, or $dS = \mu\,dt + \tau\,d\mu$.
Let $dX_t = \theta_t\,a\,dS_t$, with $\theta_t$ the number of contracts on the underlying future and $a$ the multiplier. We want the position to be proportional to the moving average, with the same sign for trend following and the opposite sign for mean reversion. Hence $\theta_t = \xi k \hat\mu_t$.
The instantaneous variance of the PL is $\sigma_X^2(t) = k^2\hat\mu_t^2 a^2\sigma^2$. Its expected value should be
$$\sigma_0^2 = E[\sigma_X^2] = k^2\,\frac{\sigma^4}{2\tau}\,a^2$$
Hence $k = \sigma_0\sqrt{2\tau}/(a\sigma^2)$ and the strategy is
$$dX_t = \xi\sigma_0\sqrt{2\tau}\,\frac{\hat\mu_t}{\sigma^2}\,dS_t$$
and the number of contracts is
$$\theta_t = \xi\sigma_0\sqrt{2\tau}\,\frac{\hat\mu_t}{a\sigma^2}$$
For example, the daily vol of increments of the Eurostoxx 50 is about $\sigma\sqrt{1/252} = 50$ pts, and the multiplier is $a = 10$ (1 point = 10 euros).
Some comments: the position held is proportional to $\hat\mu_t/\sigma^2$ (we recover the well-known results of Merton/Kelly) and to $\sqrt{2\tau}$.
The variation of the position $d\theta_t$ has the same sign as $\xi\,d\hat\mu_t$, that is, the same sign as $\xi(dS - \mu_t\,dt)$. For example, for $\xi = -1$ the variation of the position has the same sign as $\mu_t\,dt - dS$. Suppose that $\mu_t$ is positive; then $\theta_t < 0$ (short). If $dS/dt > 0$ is less than the moving average, then the variation of the position is positive: we have to buy a rising underlying ... and we are not mean reverting. To be mean reverting, the variation of the underlying should be large enough.
Using the same method as for the multiplicative strategy, we get the cumulated PL
$$X_T - X_0 = \frac{\xi\sigma_0\tau^{3/2}}{\sqrt{2}\,\sigma^2}\left(\mu_T^2 - \mu_0^2\right) + \xi\sigma_0\sqrt{2\tau}\int_0^T\left[\frac{\mu_t^2}{\sigma^2} - \frac{1}{2\tau}\right]dt$$
If we suppose that $S$ is a Brownian motion, then for a mean reversion strategy the probability of $dX_t > 0$ is the same as the probability that a normal variable lies between ±1, that is 68%. However, the expected PL is still 0!
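The position sizing rule of this section can be illustrated with the Eurostoxx 50 figures quoted in the text. This is a sketch only: `n.contracts` is a hypothetical name, the value of `tau` is an assumption, and the trend estimate is set to one standard deviation, $\sigma/\sqrt{2\tau}$, purely for illustration.

```r
# Sketch of the position sizing theta_t = xi * sigma0 * sqrt(2*tau) * mu_t / (a * sigma^2).
# n.contracts is a hypothetical name. sigma is the annualised volatility of
# price increments (in points), a the contract multiplier, sigma0 the yearly
# volatility target in euros.
n.contracts <- function(mu.hat, sigma0, tau, sigma, a, xi = 1) {
  xi * sigma0 * sqrt(2 * tau) * mu.hat / (a * sigma^2)
}

sigma0 <- 3e6                    # yearly target vol in euros (example from the text)
sigma  <- 50 * sqrt(252)         # 50 points of daily vol, annualised
a      <- 10                     # 1 point = 10 euros
tau    <- 0.04                   # assumed smoothing parameter (about 20 days)

# Position when the trend estimate sits one standard deviation above zero,
# using E[mu^2] = sigma^2 / (2 * tau) for a driftless underlying.
mu.1sd <- sigma / sqrt(2 * tau)
theta  <- n.contracts(mu.1sd, sigma0, tau, sigma, a)
print(theta)                           # equals sigma0 / (a * sigma)
print(theta * a * sigma / sqrt(252))   # daily PL vol in euros at that position
```

At that level of the trend estimate the daily PL volatility equals the yearly target divided by sqrt(252), which is the 189 k€ figure mentioned in the text.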
4.3
About the exponential moving average
4.3.1
Algorithm for the EMA in continuous time
$$\hat\mu_t = \frac{1}{\tau}\int_{-\infty}^{t} e^{-(t-s)/\tau}\,dS_s$$
that is discretized as
$$\mu(t_n) = \frac{1}{\tau}\sum_{i=0}^{\infty} e^{-\frac{t_n - t_{n-i}}{\tau}}\left(S(t_{n-i}) - S(t_{n-(i+1)})\right)$$
$$\mu(t_{n+1}) = e^{-\frac{t_{n+1} - t_n}{\tau}}\,\mu(t_n) + \frac{1}{\tau}\left(S(t_{n+1}) - S(t_n)\right)$$
It is not the same as the moving average on the price, for example
$$\nu_t = \frac{1}{\tau}\int_{-\infty}^{t} e^{-(t-s)/\tau}\,S_s\,ds$$
$$\nu(t_{n+1}) = e^{-\frac{t_{n+1} - t_n}{\tau}}\,\nu(t_n) + \frac{t_{n+1} - t_n}{\tau}\,S(t_{n+1})$$
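The recursion for $\mu(t_{n+1})$ handles irregular observation times directly, which the following R sketch illustrates. The function name `ema.increments`, the starting value of 0 and the sample data are assumptions made for the example.

```r
# Sketch of the EMA of increments on a possibly irregular time grid:
#   mu(t[n+1]) = exp(-(t[n+1] - t[n]) / tau) * mu(t[n]) + (S(t[n+1]) - S(t[n])) / tau
# ema.increments is a hypothetical name; starting from mu = 0 is an assumption.
ema.increments <- function(times, S, tau, mu0 = 0) {
  stopifnot(length(times) == length(S), all(diff(times) > 0))
  mu <- numeric(length(S))
  mu[1] <- mu0
  for (n in seq_len(length(S) - 1)) {
    decay <- exp(-(times[n + 1] - times[n]) / tau)
    mu[n + 1] <- decay * mu[n] + (S[n + 1] - S[n]) / tau
  }
  mu
}

times <- c(0, 1, 2, 4.5)          # irregular observation times
S     <- c(100, 101, 103, 102)    # observed prices
mu    <- ema.increments(times, S, tau = 2)
print(mu)
```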
4.3.2
The variance of EMA
Consider an EMA $\mu_{t+1} = (1 - \lambda)\mu_t + \lambda r_t$. Taking the unconditional variance of both sides, we get the expected variance of $\mu_t$ as $\sigma^2/(2/\lambda - 1)$. So, in terms of variance, an exponential moving average with smoothing factor $\lambda$ is the same as an arithmetic moving average of $s = 2/\lambda - 1$ time periods.
If we consider the continuous time setting of equation 2, taking the quadratic variation on both sides, we get
$$d\sigma^2(\mu_t) = -\frac{2\,dt}{\tau}\,\sigma^2(\mu_t) + \frac{1}{\tau^2}\,\sigma^2\,dt$$
due to the fact that $\mu_t$ is not correlated with the increment of $S_t$. Integrating and taking the expectation over a period $T$, we get
$$E\left[\frac{1}{T}\int_0^T \sigma^2(\mu_t)\,dt\right] = \frac{\sigma^2}{2\tau}$$
This is the same as the classical EMA $\mu_{t+1} = (1 - \lambda)\mu_t + \lambda r_t$ sampled at the $dt$ time frame, for $\lambda \approx dt/\tau \approx 1 - e^{-dt/\tau}$.
As shown by this simple simulation, the expected value of $\hat\mu_t^2/\sigma^2$ is $1/(2\tau) + \mu^2/\sigma^2$.
> sigma=0.3
> mu=1
> dt=1/252
> g=function(tau) {
+   r=rnorm(1e5, mean=mu*dt, sd=sigma*sqrt(dt))
+   lambda=dt/tau
+   mut=filter(r*lambda, filter=1-lambda, method="rec", init=r[1])/dt
+   mean(mut^2/sigma^2)
+ }
> ts=seq(0.01,0.6,len=10)
> y=sapply(ts,g)
> plot(ts,y,type='l')
> lines(ts,1/(2*ts)+mu^2/sigma^2,col='bl
[Figure: simulated mean of mut^2/sigma^2 (y) against ts, with the curve 1/(2*ts)+mu^2/sigma^2 overlaid.]
4.4
Probability of success in case of normal returns
In case the returns are normally distributed, it is possible to relate the Sharpe ratio to the probability of success.
Suppose we have a daily strategy with normally distributed returns $r \sim N(\mu, \sigma^2)$. What is the winning probability, that is, $P[r > 0]$?
$$P[r > 0] = 1 - P[r < 0] = 1 - P\left[\frac{r - \mu}{\sigma} < -\frac{\mu}{\sigma}\right] = 1 - P[z < -s] = N(s)$$
with $s = \mu/\sigma$ the daily Sharpe ratio and $N$ the normal cumulative distribution function.
[Figure: probability of winning days (probability against annual Sharpe).]
Note that the relation is almost linear and can be approximated by a Taylor expansion of the normal cumulative distribution $N(x)$ near 0:
$$p \approx 0.5 + \frac{s}{\sqrt{2\pi}\,\sqrt{252}}$$
where $s$ now denotes the annual Sharpe ratio. For example, if we want to achieve an annual Sharpe ratio of 1, then the daily Sharpe is $1/\sqrt{252} \approx 1/16$ and the winning probability is $p = 0.525$. That is quite small! This probability is even smaller at intraday frequencies. Conversely, with a daily success rate of $p$, the implied annual Sharpe is $s = \sqrt{252}\,N^{-1}(p)$.
Therefore, a very good Sharpe can be achieved with a winning probability only slightly above 1/2, and we can try to find such a strategy. However, the converse is not true: we can have bad strategies with a high probability of winning. This is the case when we face big losses, that is, strong negative skewness of returns. An approximation of this probability can be found via the Edgeworth approximation of the distribution function
$$P(X < x) = F(x) \approx \Phi(x) - \phi(x)\,\frac{\gamma}{6}\,(x^2 - 1) \qquad (9)$$
with $\gamma$ the skewness. For $x = 0$,
$$P(r > 0) = 1 - F(0) \qquad (10)$$
$$\approx \frac{1}{2} - \frac{1}{\sqrt{2\pi}}\,\frac{\gamma}{6} \qquad (11)$$
$$= \frac{1}{2} - 0.0665\,\gamma \qquad (12)$$
As expected, the probability of a positive excess return differs from 1/2 if the distribution is skewed. It is greater than 1/2 if returns are negatively skewed.
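The relations of this section can be tabulated with a few lines of R. The function names below are hypothetical, and a year of 252 trading days is assumed throughout; `p.win.skew` applies the first-order Edgeworth correction at x = 0 to a standardised return.

```r
# Sketch relating the annual Sharpe ratio to the probability of a winning day
# under normal returns, plus the first-order skewness correction.
# Function names are hypothetical; 252 trading days per year is assumed.
p.win <- function(sharpe.annual, days = 252) pnorm(sharpe.annual / sqrt(days))
p.win.linear <- function(sharpe.annual, days = 252) {
  0.5 + sharpe.annual / (sqrt(2 * pi) * sqrt(days))
}
implied.sharpe <- function(p, days = 252) sqrt(days) * qnorm(p)
# Edgeworth correction at x = 0 for a standardised return with skewness gamma
p.win.skew <- function(gamma) 0.5 - gamma / (6 * sqrt(2 * pi))

sharpe <- c(0.5, 1, 2, 4)
print(round(cbind(sharpe, exact = p.win(sharpe), linear = p.win.linear(sharpe)), 4))
print(implied.sharpe(0.525))
print(p.win.skew(c(-1, 0, 1)))
```

The first table shows how close the linear approximation stays to the exact normal probability over the range of annual Sharpe ratios plotted in the figure.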
5
References