Workshop at Statistical Methods in Finance, 2016

Random Walk Hypothesis

Random Walk Model

Example 1: Random Walk with Fixed Moves

Suppose the price of a stock moves up by 5 paisa with probability 0.5 or down by 5 paisa with probability 0.5 every second.
If the current price of the stock is Re 1/-, what will the price be after 21600 seconds?
The model is \(P_t=P_{t-1}\pm M_t\), where \(M_t=5~a.s.\)

set.seed(321)
n<-21600
P<-rep(NA,n)
P[1]<-100 ## Current price 100 paisa or Re 1/-
for(sec in 2:n){
  toss<-sample(c("H","T"),1,replace = TRUE,prob = c(0.5,0.5))
  if(toss=="H")P[sec]<-P[sec-1]+5
  if(toss=="T")P[sec]<-P[sec-1]-5
}
plot(ts(P))
abline(h=0,lwd=2,col="red")

Notice that the simulated price of the stock goes negative.
This model is a reasonable first candidate for modelling stock price movement.
However, it cannot respect the limited liability feature of the stock market, under which a share price cannot fall below zero.
Check what happens if the price moves up with probability 0.51 or down with probability 0.49 every second, with the same step size!
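As a starting point for this exercise, here is a minimal sketch that reuses the 5 paisa step from the code above and draws all moves at once instead of looping. The helper name `simulate_fixed_walk` and its arguments are illustrative assumptions, not part of the original workshop code. With up-probability \(p\), the expected move per second is \(5(2p-1)\) paisa, so \(p=0.51\) adds an expected drift of 0.1 paisa per second.

```r
# Hypothetical helper: fixed-step random walk with up-probability p_up.
simulate_fixed_walk <- function(n, p_up, step = 5, start = 100) {
  # +step with probability p_up, -step otherwise, for seconds 2..n
  moves <- sample(c(step, -step), n - 1, replace = TRUE,
                  prob = c(p_up, 1 - p_up))
  c(start, start + cumsum(moves))
}

set.seed(321)
n <- 21600
P_fair   <- simulate_fixed_walk(n, p_up = 0.50)
P_biased <- simulate_fixed_walk(n, p_up = 0.51)

par(mfrow = c(1, 2))
plot(ts(P_fair), main = "p = 0.50")
abline(h = 0, lwd = 2, col = "red")
plot(ts(P_biased), main = "p = 0.51")
abline(h = 0, lwd = 2, col = "red")
```

Comparing the two panels should show how even a small bias in the toss dominates the path over a long horizon.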

Example 2: Random Walk with Random Moves

Suppose the price of a stock moves up or down with probability 0.5 each, every second.
The size of the movement follows \(Poisson(\lambda=5)\).
If the current price of the stock is Re 1/-, what will the price be after 21600 seconds?
The model: \[ P_t=P_{t-1}\pm M_t, \] where \(M_t \sim Poisson(\lambda=5)\).

set.seed(321)
n<-21600
P<-M<-rep(NA,n)
P[1]<-100 ## Current price 100 paisa or Re 1/-
for(sec in 2:n){
  toss<-sample(c("H","T"),1,replace = TRUE,prob = c(0.5,0.5))
  M[sec]<-rpois(1,lambda = 5)
  if(toss=="H")P[sec]<-P[sec-1]+ M[sec]
  if(toss=="T")P[sec]<-P[sec-1]- M[sec]
}
par(mfrow=c(1,2))
plot(ts(P))
abline(h=0,lwd=2,col="red")
plot(ts(M))
abline(h=0,lwd=2,col="red")

Example 3: Random Walk with Random Return

The simple return of an asset is the change in price relative to the previous price. \[ \begin{eqnarray} R_t&=&\frac{P_t-P_{t-1}}{P_{t-1}}\\ R_t P_{t-1}&=&P_t-P_{t-1}\\ P_t&=&P_{t-1}(1+R_t) \end{eqnarray} \]
Suppose \(R_t\sim N(\mu=0,\sigma=0.01)\) every second.
If the current price of the stock is Re 1/-, what will the price be after 21600 seconds?

set.seed(321)
n<-21600
P<-rep(NA,n)
P[1]<-100 ## Current price 100 paisa or Re 1/-

rt<-rnorm(n,mean=0,sd=0.01)
for(sec in 2:n) P[sec]<-P[sec-1]*(1+rt[sec])

par(mfrow=c(1,2))
plot(ts(P))
abline(h=0,lwd=2,col="red")
plot(ts(rt))
abline(h=0,lwd=2,col="red")

Check what happens if you choose mean = 0.01!
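A minimal sketch for this exercise, assuming the same setup as above but with the price path computed by `cumprod` instead of a loop. The variable names `rt_drift` and `P_drift` are illustrative. With independent returns of mean 0.01 per second, the expected price grows by a factor of \(1.01^{n-1}\) over the horizon, so the path is easier to read on a log scale.

```r
set.seed(321)
n <- 21600
rt_drift <- rnorm(n, mean = 0.01, sd = 0.01)

# P[1] = 100 and P[t] = P[t-1] * (1 + R_t) for t >= 2
P_drift <- 100 * cumprod(c(1, 1 + rt_drift[-1]))

# Log scale, because the price grows geometrically
plot(ts(log(P_drift)), ylab = "log price")
```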

Example 4: Random Walk with Random log-Return

The log-return of an asset captures the continuous compounding effect of the price movement: \[ \begin{eqnarray*} r_t &=& \log(P_t) - \log(P_{t-1})\\ &=& \log\Big(\frac{P_t}{P_{t-1}}\Big)\\ P_t &=& P_{t-1}e^{r_t} \end{eqnarray*} \]
Suppose \(r_t\sim N(\mu=0,\sigma=0.01)\) every second.
If the current price of the stock is Re 1/-, what will the price be after 21600 seconds?

set.seed(321)
n<-21600
P<-rep(NA,n)
P[1]<-100 ## Current price 100 paisa or Re 1/-

rt<-rnorm(n,mean=0,sd=0.01)
for(sec in 2:n) P[sec]<-P[sec-1]*exp(rt[sec])

par(mfrow=c(1,2))
plot(ts(P))
abline(h=0,lwd=2,col="red")
plot(ts(rt))
abline(h=0,lwd=2,col="red")

Arbitrage

In Example 1, if the fixed move takes place with unequal probabilities, the price has a systematic drift, which gives one of the two parties (buyer or seller) an arbitrage opportunity.
Similarly, in Examples 2, 3 and 4, a mean \(\mu \neq 0\) means there is an arbitrage opportunity.

Random Walk Model

Suppose \(r_1,r_2,...\) are i.i.d. with mean \(\mu\) and standard deviation \(\sigma\).
Let \(P_0\) be an arbitrary starting point and \[ P_t=P_0+r_1+r_2+...+r_t,~~t\geq 1. \]
The process \(P_0 , P_1,P_2,...\) is known as a random walk, and \(r_1,r_2,...\) are the corresponding steps of that random walk.
The conditional expectation and variance of \(P_t\) given \(P_0\) are \(E(P_t|P_0)=P_0+\mu t\) and \(Var(P_t|P_0)=\sigma^2 t\).
The parameter \(\mu\) is the drift and sets the overall trend of the random walk.
The parameter \(\sigma\) is the volatility and controls how much the walk fluctuates around \(P_0+\mu t\).
Since the standard deviation of \(P_t\) given \(P_0\) is \(\sigma\sqrt{t}\), the range of variability of the process increases with \(t\).
This means that at \(t = 0\) you know very little about where the random walk will be in the remote future relative to its current spot value.
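The two moment formulas can be checked by simulation. The sketch below is illustrative only: the values of `P0`, `mu`, `sigma`, `t_end` and `n_sim` are assumptions chosen for the example, and normal steps are used although the formulas hold for any i.i.d. steps with finite variance.

```r
set.seed(321)
P0    <- 100
mu    <- 0.1    # drift per step (assumed value)
sigma <- 1      # volatility per step (assumed value)
t_end <- 50
n_sim <- 20000

# Each row holds the t_end steps of one simulated walk
steps <- matrix(rnorm(n_sim * t_end, mean = mu, sd = sigma), nrow = n_sim)
P_end <- P0 + rowSums(steps)

c(sample_mean = mean(P_end), theory_mean = P0 + mu * t_end)
c(sample_var  = var(P_end),  theory_var  = sigma^2 * t_end)
```

Increasing `t_end` while keeping everything else fixed should widen the spread of `P_end` in proportion to \(\sqrt{t}\).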

Modeling with log-return

Suppose \(R_1,R_2,...\) are the simple returns from a single investment.
A common working assumption is that the returns are mutually independent and identically distributed (i.i.d.) random variables with mean \(\mu\) and variance \(\sigma^2\).
However, this is not a good model from a theoretical perspective.
With a distribution such as the normal, \(R_t\) can take any real value, so \(P_t\) can become negative, which means the model implies unlimited loss.
But under the limited liability structure of companies, the loss is limited.
That is, \(R_t \geq -1\): the maximum loss that an investor can experience is the total investment.
For the log-return, \[ 1+R_t=\exp(r_t)\geq 0, \] which implies \(R_t\geq -1\).
This satisfies the condition of limited liability, i.e., the maximum possible loss is the total investment. In addition, \[ \begin{eqnarray*} 1+R_t(k)&=&(1+R_t)(1+R_{t-1})...(1+R_{t-k+1}),\\ &=&\exp(r_t)\exp(r_{t-1})...\exp(r_{t-k+1}),\\ &=&\exp(r_t+r_{t-1}+...+r_{t-k+1}). \end{eqnarray*} \]
So the sum of the \(k\) one-period log-returns, once exponentiated, yields the \(k\)-period gross return. Now note that \[ \frac{P_t}{P_{t-k}}=1+R_t(k)=\exp(r_t+r_{t-1}+...+r_{t-k+1}). \] For \(k=t\), this can be written as \[ P_t=P_0\exp(r_t+...+r_1). \]
Therefore, if the log-returns are assumed to be i.i.d. random variables, the price of the stock or market index, denoted by the process \(P=\{P_t:t \geq 0\}\), is the exponential of a random walk, also known as a geometric random walk.
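The identities above can be verified numerically. The following sketch, with illustrative variable names and an assumed \(\sigma=0.01\), compares the recursive form \(P_t=P_{t-1}e^{r_t}\) with the closed form \(P_t=P_0\exp(r_1+...+r_t)\) and checks the \(k\)-period gross return.

```r
set.seed(321)
n  <- 1000
rt <- rnorm(n, mean = 0, sd = 0.01)   # log-returns r_1, ..., r_n
P0 <- 100

# Recursive form: P_t = P_{t-1} * exp(r_t)
P_loop <- numeric(n)
prev <- P0
for (t in 1:n) {
  P_loop[t] <- prev * exp(rt[t])
  prev <- P_loop[t]
}

# Closed form: P_t = P_0 * exp(r_1 + ... + r_t)
P_closed <- P0 * exp(cumsum(rt))
all.equal(P_loop, P_closed)

# k-period gross return: product of (1 + R) versus exp(sum of log-returns)
k <- 5
last_k  <- rt[(n - k + 1):n]
gross_k <- prod(1 + (exp(last_k) - 1))
all.equal(gross_k, exp(sum(last_k)))

# Limited liability: the geometric random walk never reaches zero
min(P_closed) > 0
```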

Stationary Process

In probability, a stochastic process is called stationary if its joint probability distribution does not change when shifted in time.
Consequently, parameters such as the mean, volatility and correlation do not change over time.
The process \(P=\{P_t: t\geq 0\}\) is strongly stationary if, for all \(h\) and \(n\), \[ F_P(p_{t_1+h},p_{t_2+h},...,p_{t_n+h})=F_P(p_{t_1},p_{t_2},...,p_{t_n}), \] that is, since \(h\) does not affect \(F_P\), \(F_P\) is not a function of time.
In other words, the families \(\{P(t_1),...,P(t_n)\}\) and \(\{P(t_1+h),...,P(t_n+h)\}\) have the same joint distribution for all \(t_1,t_2,...,t_n\) and \(h>0\).
A weak form of stationarity is defined based on the parameters of the process.
A wide-sense stationary (WSS) process only requires that the mean and autocovariance do not vary with time.
A continuous time random process \(P_t\) is WSS if it has the following restrictions on its mean function, \[ \begin{eqnarray} \mathbb{E}(P_t)=\mu(t)=\mu(t+h)=\mu~~\forall t,h \in \mathbb{R},\label{RFR_eq_wss_mean_fun} \end{eqnarray} \] and autocovariance function \[ \begin{eqnarray} \mathbb{C}ov_P(t_1,t_2)=\mathbb{E}[\{P(t_1)-\mu(t_1)\}\{P(t_2)-\mu(t_2)\}]=\mathbb{C}ov_P(t_1-t_2).\label{RFR_eq_wss_cov_fun} \end{eqnarray} \]
The mean function \(\mathbb{E}(P_t)\) must be constant over time.
The covariance function depends only on the difference between \(t_{1}\) and \(t_{2}\).
The distribution of the log-return must be stationary.

Simulate from stationary process

mu<-0
s<-1
n<-100
r<-rnorm(n,mu,s)
plot(ts(r))

P<-r
P[1]<-1
P<-cumsum(P)
plot(ts(P))

Price may be non-stationary. However, its steps could be a stationary process.
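One way to see this is that differencing the simulated price recovers its steps exactly. The sketch below repeats the construction above with a fixed seed (the seed and the name `steps_back` are assumptions added for reproducibility and illustration).

```r
set.seed(321)
n <- 100
r <- rnorm(n, mean = 0, sd = 1)

# Same construction as above: start at 1, then add the steps r[2], ..., r[n]
P <- cumsum(c(1, r[-1]))

# First differences of the price give back the stationary steps
steps_back <- diff(P)
all.equal(steps_back, r[-1])

par(mfrow = c(1, 2))
plot(ts(P), main = "Price (levels)")
plot(ts(steps_back), main = "Steps (first differences)")
```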

Unit Root of Random Walk

The random walk is said to have a unit root. To understand what this means, consider the AR(1) model (i.e., the Auto-Regressive model with lag 1), \[ p_t=\phi p_{t-1}+r_t \] where \(\phi=1\) and \(p_t=\log(P_t)\) is the price of the asset on the log scale.
The generic AR(1) model can be expanded as \[ \begin{eqnarray*} p_t&=&\phi p_{t-1}+r_t\\ &=& \phi (\phi p_{t-2} + r_{t-1}) + r_t\\ &=& \phi^2 p_{t-2} + \phi r_{t-1} + r_t \\ &=& \phi^2 (\phi p_{t-3} + r_{t-2}) + \phi r_{t-1} + r_t\\ &=& \phi^3 p_{t-3} + \phi^2 r_{t-2} + \phi r_{t-1} + r_t\\ &\vdots&\\ &=& \phi^k p_{t-k} + \phi^{k-1} r_{t-(k-1)}+... + \phi r_{t-1} + r_t\\ &=& \phi^k p_{t-k} + \sum_{i=0}^{k-1}\phi^{i} r_{t-i}. \end{eqnarray*} \]
If \(\phi = 1\), the process is non-stationary, because \(\sum_{i=0}^{k-1}\phi^{i} r_{t-i}=\sum_{i=0}^{k-1} r_{t-i}\) accumulates past shocks without any decay, so its variance grows with \(k\). Hence a random walk is a non-stationary process.
However, \(|\phi|<1\), i.e., \(-1<\phi<1\), implies that the process is stationary.
If \(\phi=0\), the process is stationary and \(p_t\) and \(p_{t-1}\) are independent \(\forall t\).
If the series \(\{p_t: t \geq 0\}\) is a random walk (i.e., \(\phi=1\)), the incremental steps (i.e., log-returns) form an independent and stationary process, which you can write as \[ r_t=\phi_1 r_{t-1}+\epsilon_t \] where \(\phi_1=0\) and \(\epsilon_t\) is white noise with mean \(\mu\) and variance \(\sigma^2\).
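To make the role of \(\phi\) concrete, the following sketch simulates the AR(1) recursion for three values of \(\phi\) using the same shocks. The helper `simulate_ar1` is a hypothetical name introduced here, and unit-variance normal shocks are an assumption. For \(|\phi|<1\) and white-noise shocks with variance \(\sigma^2\), the stationary variance is \(\sigma^2/(1-\phi^2)\), whereas for \(\phi=1\) the variance keeps growing with \(t\).

```r
# Hypothetical helper: p_t = phi * p_{t-1} + r_t, starting from p0
simulate_ar1 <- function(shocks, phi, p0 = 0) {
  p <- numeric(length(shocks))
  prev <- p0
  for (t in seq_along(shocks)) {
    p[t] <- phi * prev + shocks[t]
    prev <- p[t]
  }
  p
}

set.seed(321)
shocks <- rnorm(5000)   # white noise with sigma = 1 (assumed)

p_unit <- simulate_ar1(shocks, phi = 1)     # random walk
p_half <- simulate_ar1(shocks, phi = 0.5)   # stationary AR(1)
p_zero <- simulate_ar1(shocks, phi = 0)     # the shocks themselves

par(mfrow = c(1, 3))
plot(ts(p_unit), main = "phi = 1")
plot(ts(p_half), main = "phi = 0.5")
plot(ts(p_zero), main = "phi = 0")

# Sample variance next to sigma^2 / (1 - phi^2) for phi = 0.5
c(sample = var(p_half), theory = 1 / (1 - 0.5^2))
```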
In order to check whether the price of a stock follows a geometric random walk, you have to check the following three things.
1. First, check if \(\{p_t\}\) is a non-stationary process, i.e., in \[ p_t=\phi p_{t-1}+r_t; \] check if \(\phi=1\) or \(|\phi|<1\).
2. Second, check if the log-returns are a stationary process, i.e., in \[ r_t=\phi_1 r_{t-1}+\epsilon_t; \] check if \(\phi_1=1\) or \(|\phi_1|<1\).
3. The second check only tells you whether the log-returns are stationary; it does not tell you whether \(\phi_1=0\). In addition, \(\phi_1=0\) only concerns the dependence between consecutive returns; it says nothing about dependence at longer lags, so it does not establish the mutual independence of \(r_t\). So you should check whether the serial correlations of \(r_t\) are 0. That is, check if \(\rho_1=\rho_2=...=\rho_H=0\), where \(\rho_h=corr(r_t,r_{t+h})\) is the lag \(h\) auto-correlation.
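The lag \(h\) auto-correlation used in the third check can be estimated as sketched below. The helper `sample_autocorr` is a hypothetical name; it implements the standard sample estimator, which is also what R's `acf()` reports, and the input here is simulated i.i.d. data rather than real returns.

```r
# Hypothetical helper: sample autocorrelation of x at lag h
sample_autocorr <- function(x, h) {
  n  <- length(x)
  xc <- x - mean(x)
  sum(xc[1:(n - h)] * xc[(1 + h):n]) / sum(xc^2)
}

set.seed(321)
r_sim <- rnorm(1000, mean = 0, sd = 0.01)   # simulated i.i.d. log-returns

H <- 10
rho_hat <- sapply(1:H, function(h) sample_autocorr(r_sim, h))

# Same values as the built-in estimator (lag 0 dropped)
rho_acf <- acf(r_sim, lag.max = H, plot = FALSE)$acf[-1]
all.equal(rho_hat, rho_acf)

# Approximate 95% band for an uncorrelated series
band <- 1.96 / sqrt(length(r_sim))
round(rho_hat, 3)
band
```

For an uncorrelated series most of the estimates should fall inside the band; a formal joint test of all \(H\) lags is a separate step.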

Dickey-Fuller test for Stationarity in a Time Series

The Dickey-Fuller test (1979) takes as its null hypothesis \[ H_0: p_t=p_{t-1}+r_t \] i.e., that the series is a random walk without drift, where \(r_t\) is white noise with mean 0 and variance \(\sigma^2\).
The alternative hypothesis is \[ H_1: p_t=\mu + \phi p_{t-1} + r_t \] where \(\mu\) and \(\phi\) are constants with \(|\phi|<1\).
Under \(H_1\), the process is a stationary AR(1) with mean \(\frac{\mu}{1-\phi}\).
You can run the test using the adf.test function in the tseries package, which implements the augmented version of the test.
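The mean under \(H_1\) follows from taking expectations on both sides of the AR(1) equation. A short derivation, assuming the process is stationary so that \(\mathbb{E}(p_t)=\mathbb{E}(p_{t-1})=m\), and using \(\mathbb{E}(r_t)=0\): \[ \begin{eqnarray*} m &=& \mu + \phi m\\ m(1-\phi) &=& \mu\\ m &=& \frac{\mu}{1-\phi}. \end{eqnarray*} \] For example, \(\mu=1\) and \(\phi=0.5\) give a long-run mean of 2. The division requires \(\phi\neq 1\), which is why no such mean exists under the unit-root null.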

library(tseries)
head(EuStockMarkets)

##          DAX    SMI    CAC   FTSE
## [1,] 1628.75 1678.1 1772.8 2443.6
## [2,] 1613.63 1688.5 1750.5 2460.2
## [3,] 1606.51 1678.6 1718.0 2448.2
## [4,] 1621.04 1684.1 1708.1 2470.4
## [5,] 1618.16 1686.6 1723.1 2484.7
## [6,] 1610.61 1671.6 1714.3 2466.8

FTSE<-EuStockMarkets[,"FTSE"]
## Plot FTSE
n<-length(FTSE)
plot(ts(FTSE))
grid(col = "red")

## Compute the log-return of the FTSE
log_return<-diff(log(FTSE))*100
n<-length(log_return)
## The time-series plot of the log-return 
plot(ts(log_return))
grid(col="red")

## Dickey-Fuller test for unit-root

## Step 1: Check if log-Price is non-stationary
adf.test(log(FTSE))

## 
##  Augmented Dickey-Fuller Test
## 
## data:  log(FTSE)
## Dickey-Fuller = -2.4976, Lag order = 12, p-value = 0.3677
## alternative hypothesis: stationary

## Step 2: Check if log-return is stationary
adf.test(log_return)

## Warning in adf.test(log_return): p-value smaller than printed p-value

## 
##  Augmented Dickey-Fuller Test
## 
## data:  log_return
## Dickey-Fuller = -11.162, Lag order = 12, p-value = 0.01
## alternative hypothesis: stationary
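For intuition about what the test statistic measures, here is a sketch of the simplest (non-augmented) Dickey-Fuller regression, \(\Delta p_t = \alpha + \beta p_{t-1} + e_t\) with \(\beta=\phi-1\), fitted with lm. This is not how adf.test computes its result: that function also includes lagged differences and a linear trend, so the numbers will differ from the output above. The helper name `df_tstat` is hypothetical, and the resulting t-ratio has to be compared with Dickey-Fuller critical values rather than the usual t table.

```r
# Hypothetical helper: t-ratio of beta in  diff(p)_t = alpha + beta * p_{t-1} + e_t
df_tstat <- function(p) {
  p     <- as.numeric(p)
  dp    <- diff(p)              # p_t - p_{t-1}
  p_lag <- p[-length(p)]        # p_{t-1}
  fit   <- lm(dp ~ p_lag)
  summary(fit)$coefficients["p_lag", "t value"]
}

FTSE <- EuStockMarkets[, "FTSE"]
df_tstat(log(FTSE))          # log-price: beta near 0 is expected under a unit root
df_tstat(diff(log(FTSE)))    # log-return: strongly negative if stationary
```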

Ljung-Box test for independence in a Time Series

The null hypothesis is that \(\rho_h = 0\) for \(h=1,...,H\). That is, \[ H_0: \rho_1=\rho_2=...=\rho_H=0, \]
\[ H_1: \rho_h \neq 0 \text{ for at least one } h \in \{1,...,H\}. \]
The test statistic for the Ljung-Box test is \[ Q=n(n+2)\sum_{h=1}^H\frac{\hat{\rho}_h^2}{n-h}, \] where \(n\) is the sample size and \(\hat{\rho}_h\) is the sample autocorrelation at lag \(h\).
Under \(H_0\), \(Q\) asymptotically follows a chi-square distribution with \(H\) degrees of freedom, \(\chi^2_{(H)}\).
The Ljung-Box test can be run in R using the Box.test function in the stats package.

## Step 3: Check if the log-returns are uncorrelated

Box.test(log_return,lag=10,type = "Ljung-Box")

## 
##  Box-Ljung test
## 
## data:  log_return
## X-squared = 29.815, df = 10, p-value = 0.0009183
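As a cross-check on the formula for \(Q\), the statistic can be computed by hand from the sample autocorrelations and compared with Box.test. This is a sketch with illustrative variable names; it rebuilds the FTSE log-returns so that it runs on its own. In the output printed above the p-value is below 0.001, so for this series the hypothesis of zero autocorrelation up to lag 10 is rejected at the usual levels.

```r
x <- as.numeric(diff(log(EuStockMarkets[, "FTSE"]))) * 100
n <- length(x)
H <- 10

# Sample autocorrelations at lags 1..H (lag 0 dropped)
rho_hat <- acf(x, lag.max = H, plot = FALSE)$acf[-1]

# Q = n (n + 2) * sum_{h=1}^{H} rho_hat_h^2 / (n - h)
Q <- n * (n + 2) * sum(rho_hat^2 / (n - 1:H))
p_value <- pchisq(Q, df = H, lower.tail = FALSE)

bt <- Box.test(x, lag = H, type = "Ljung-Box")
c(Q_by_hand = Q, Q_Box.test = unname(bt$statistic))
c(p_by_hand = p_value, p_Box.test = bt$p.value)
```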