Example 1: Random Walk with Fixed Moves
set.seed(321)
n<-21600
P<-rep(NA,n)
P[1]<-100 ## Current price 100 paisa or Re 1/-
for(sec in 2:n){
toss<-sample(c("H","T"),1,replace = TRUE,prob = c(0.5,0.5))
if(toss=="H")P[sec]<-P[sec-1]+5
if(toss=="T")P[sec]<-P[sec-1]-5
}
plot(ts(P))
abline(h=0,lwd=2,col="red")
Example 2: Random Walk with Random Moves
set.seed(321)
n<-21600
P<-M<-rep(NA,n)
P[1]<-100 ## Current price 100 paisa or Re 1/-
for(sec in 2:n){
toss<-sample(c("H","T"),1,replace = TRUE,prob = c(0.5,0.5))
M[sec]<-rpois(1,lambda = 5)
if(toss=="H")P[sec]<-P[sec-1]+ M[sec]
if(toss=="T")P[sec]<-P[sec-1]- M[sec]
}
par(mfrow=c(1,2))
plot(ts(P))
abline(h=0,lwd=2,col="red")
plot(ts(M))
abline(h=0,lwd=2,col="red")
Example 3: Random Walk with Random Return
set.seed(321)
n<-21600
P<-rep(NA,n)
P[1]<-100 ## Current price 100 paisa or Re 1/-
rt<-rnorm(n,mean=0,sd=0.01)
for(sec in 2:n) P[sec]<-P[sec-1]*(1+rt[sec])
par(mfrow=c(1,2))
plot(ts(P))
abline(h=0,lwd=2,col="red")
plot(ts(rt))
abline(h=0,lwd=2,col="red")
Example 4: Random Walk with Random log-Return
set.seed(321)
n<-21600
P<-rep(NA,n)
P[1]<-100 ## Current price 100 paisa or Re 1/-
rt<-rnorm(n,mean=0,sd=0.01)
for(sec in 2:n) P[sec]<-P[sec-1]*exp(rt[sec])
par(mfrow=c(1,2))
plot(ts(P))
abline(h=0,lwd=2,col="red")
plot(ts(rt))
abline(h=0,lwd=2,col="red")
Arbitrage
Random Walk Model
Let \(P_0\) be an arbitrary starting point and \[ P_t=P_0+r_1+r_2+...+r_t,~~t\geq 1. \]
The process \(P_0, P_1, P_2, \ldots\) is known as a random walk, and \(r_1, r_2, \ldots\) are the corresponding steps of that random walk.
If the steps are i.i.d. with mean \(\mu\) and variance \(\sigma^2\), the conditional expectation and variance of \(P_t\) given \(P_0\) are \(E(P_t|P_0)=P_0+\mu t\) and \(Var(P_t|P_0)=\sigma^2 t\).
The parameter \(\mu\) is the drift and sets the overall trend of the random walk.
The parameter \(\sigma\) is the volatility and controls how much the walk fluctuates around \(P_0+\mu t\).
Since the standard deviation of \(P_t\) given \(P_0\) is \(\sigma\sqrt{t}\), the range of variability in the process increases as \(t\) increases.
This means that at \(t = 0\) you know very little about where the random walk will be in the remote future, compared with what you know about its current spot value.
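You can check the two moment formulas with a small Monte Carlo experiment. The sketch below assumes normally distributed steps; the seed, the parameter values, and the number of simulated paths are illustrative choices, not part of the model.

```r
set.seed(123)            # arbitrary seed, for reproducibility only
P0      <- 100           # starting point P_0
mu      <- 0.1           # drift per step (illustrative value)
sigma   <- 1             # volatility per step (illustrative value)
t_end   <- 100           # horizon t
n_paths <- 5000          # number of simulated random walks

## Each row holds the steps r_1,...,r_t of one random walk
steps <- matrix(rnorm(n_paths * t_end, mean = mu, sd = sigma), nrow = n_paths)
P_t   <- P0 + rowSums(steps)       # P_t = P_0 + r_1 + ... + r_t

sim_mean  <- mean(P_t)             # Monte Carlo estimate of E(P_t | P_0)
sim_var   <- var(P_t)              # Monte Carlo estimate of Var(P_t | P_0)
theo_mean <- P0 + mu * t_end       # P_0 + mu * t
theo_var  <- sigma^2 * t_end       # sigma^2 * t

print(c(sim_mean = sim_mean, theo_mean = theo_mean))
print(c(sim_var = sim_var, theo_var = theo_var))
```

The simulated mean and variance should be close to the theoretical values, and the gap shrinks as n_paths grows.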
Suppose \(R_1,R_2,\ldots\) are the simple returns from a single investment.
A common working assumption is that the returns are mutually independent and identically distributed (i.i.d.) random variables with mean \(\mu\) and variance \(\sigma^2\).
However, this is not a good model from a theoretical perspective.
The model allows \(P_t\) to take any value on the real line, which implies the possibility of unlimited loss.
But under the limited liability structure typical of companies, the loss is limited.
That is, \(R_t \geq -1\): the maximum loss that an investor can experience is the total investment.
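To see the problem concretely, suppose the simple returns are normally distributed (an assumption made here only for illustration). Then the event \(R_t < -1\) always has positive probability. The sketch below uses a deliberately large volatility so that the violation is visible in a simulation.

```r
set.seed(123)          # arbitrary seed
mu    <- 0             # mean simple return (illustrative)
sigma <- 0.5           # deliberately large volatility (illustrative)

## Theoretical probability of losing more than the total investment
prob_below <- pnorm(-1, mean = mu, sd = sigma)

## Share of simulated simple returns that violate R_t >= -1
R           <- rnorm(1e5, mean = mu, sd = sigma)
share_below <- mean(R < -1)

print(c(theoretical = prob_below, simulated = share_below))
```

With a realistic per-second volatility the probability is tiny, but it is never exactly zero under a normal model for simple returns.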
You see, for the log return, \[ 1+R_t=\exp(r_t)\geq 0, \] which implies \(R_t\geq -1\).
This satisfies the condition of limited liability, i.e., possible maximum loss is the total investment. In addition, \[ \begin{eqnarray*} 1+R_t(k)&=&(1+R_t)(1+R_{t-1})...(1+R_{t-k+1}),\\ &=&\exp(r_t)\exp(r_{t-1})...\exp(r_{t-k+1}),\\ &=&\exp(r_t+r_{t-1}+...+r_{t-k+1}). \end{eqnarray*} \]
So the exponential of the sum of the \(k\) one-period log-returns yields the \(k\)-period gross return. Now note that \[ \frac{P_t}{P_{t-k}}=1+R_t(k)=\exp(r_t+r_{t-1}+...+r_{t-k+1}), \] which for \(k=t\) can be expressed as \[ P_t=P_0\exp(r_t+...+r_1). \]
Therefore, if the log returns are assumed to be i.i.d. random variables, then the price of the stock or market index, denoted by the process \(P=\{P_t:t \geq 0\}\), is the exponential of a random walk, also known as a geometric random walk.
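mu<-0 ## drift of the steps
s<-1 ## volatility (standard deviation) of the steps
n<-100
r<-rnorm(n,mu,s) ## i.i.d. steps of the random walk
plot(ts(r))
P<-r
P[1]<-1 ## starting point P_0 = 1
P<-cumsum(P) ## random walk: starting point plus the cumulative sum of the steps
plot(ts(P))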
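The geometric random walk can be simulated the same way, by exponentiating the cumulative sum of the log-returns. The sketch below is an illustration under assumed parameter values. It checks that the vectorized form agrees with a loop of the kind used in Example 4, and that the \(k\)-period gross return equals the exponential of the summed log-returns. The indices t_idx and k are arbitrary.

```r
set.seed(321)
n  <- 1000                             # number of time points (illustrative)
P0 <- 100                              # starting price
rt <- rnorm(n, mean = 0, sd = 0.01)    # log-returns; rt[1] is not used

## Loop version: P_t = P_{t-1} * exp(r_t)
P_loop    <- numeric(n)
P_loop[1] <- P0
for (i in 2:n) P_loop[i] <- P_loop[i - 1] * exp(rt[i])

## Vectorized version: P_t = P_0 * exp(r_2 + ... + r_t)
P_vec <- P0 * exp(cumsum(c(0, rt[-1])))

## k-period gross return from prices and from log-returns
t_idx <- 500
k     <- 20
gross_k         <- P_vec[t_idx] / P_vec[t_idx - k]
gross_from_logs <- exp(sum(rt[(t_idx - k + 1):t_idx]))

print(all.equal(P_loop, P_vec))
print(c(gross_k = gross_k, gross_from_logs = gross_from_logs))
```

Because every factor exp(rt[i]) is positive, the simulated price can never reach zero or become negative.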
The random walk is said to have a unit root. To understand what this means, consider the AR(1) model (i.e., the auto-regressive model with lag 1), \[ p_t=\phi p_{t-1}+r_t \] where \(\phi=1\) and \(p_t=\log(P_t)\) is the price of the asset on the log scale.
The generic AR(1) model can be expanded as \[ \begin{eqnarray*} p_t&=&\phi p_{t-1}+r_t\\ &=& \phi (\phi p_{t-2} + r_{t-1}) + r_t\\ &=& \phi^2 p_{t-2} + \phi r_{t-1} + r_t \\ &=& \phi^2 (\phi p_{t-3} + r_{t-2}) + \phi r_{t-1} + r_t\\ &=& \phi^3 p_{t-3} + \phi^2 r_{t-2} + \phi r_{t-1} + r_t\\ &\vdots&\\ &=& \phi^k p_{t-k} + \phi^{k-1} r_{t-(k-1)}+... + \phi r_{t-1} + r_t\\ &=& \phi^k p_{t-k} + \sum_{i=0}^{k-1}\phi^{i} r_{t-i}. \end{eqnarray*} \]
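You can verify the \(k\)-step expansion numerically. The sketch below simulates an AR(1) path recursively and compares \(p_t\) with \(\phi^k p_{t-k}\) plus the weighted sum of the last \(k\) shocks, where the shock \(r_{t-i}\) carries weight \(\phi^i\). The values of phi, n, t_idx and k are arbitrary illustrative choices.

```r
set.seed(1)
phi <- 0.8                  # AR(1) coefficient (illustrative)
n   <- 50
r   <- rnorm(n)             # shocks r_1,...,r_n

## Recursive simulation: p_t = phi * p_{t-1} + r_t, starting from p_1 = r_1
p    <- numeric(n)
p[1] <- r[1]
for (i in 2:n) p[i] <- phi * p[i - 1] + r[i]

## k-step expansion: phi^k * p_{t-k} + sum_{i=0}^{k-1} phi^i * r_{t-i}
t_idx <- 50
k     <- 10
i_seq <- 0:(k - 1)
expansion <- phi^k * p[t_idx - k] + sum(phi^i_seq * r[t_idx - i_seq])

print(c(recursive = p[t_idx], expansion = expansion))
```

When \(|\phi|<1\) the term \(\phi^k p_{t-k}\) dies out as \(k\) grows, whereas for \(\phi=1\) every past shock keeps its full weight.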
If \(\phi=0\), then \(p_t=r_t\), so the process is stationary and \(p_t\) and \(p_{t-1}\) are independent \(\forall t\).
When the series \(\{p_t: t \geq 0\}\) is a random walk (i.e., \(\phi=1\)), the incremental steps (i.e., log-returns) form an independent and stationary process, which you can write as \[ r_t=\phi_1 r_{t-1}+\epsilon_t \] where \(\phi_1=0\) and \(\epsilon_t\) is white noise with mean \(\mu\) and variance \(\sigma^2\).
The Dickey-Fuller test (1979) tests the null hypothesis \[ H_0: p_t=p_{t-1}+r_t \] i.e., that the series is a random walk without drift, where \(r_t\) is white noise with mean 0 and variance \(\sigma^2\).
The alternative hypothesis is \[ H_1: p_t=\mu + \phi p_{t-1} + r_t \] where \(\mu\) and \(\phi\) are constants with \(|\phi|<1\).
Under \(H_1\), the process is a stationary AR(1) with mean \(\frac{\mu}{1-\phi}\).
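The stationary mean \(\frac{\mu}{1-\phi}\) can be checked by simulation. The sketch below assumes normal white noise and illustrative values for \(\mu\), \(\phi\) and the sample size. It starts the path at the stationary mean so that no burn-in period is needed.

```r
set.seed(42)
mu    <- 1                 # intercept (illustrative)
phi   <- 0.5               # AR(1) coefficient with |phi| < 1 (illustrative)
sigma <- 1                 # standard deviation of the white noise
n     <- 20000

r <- rnorm(n, mean = 0, sd = sigma)

## Simulate p_t = mu + phi * p_{t-1} + r_t
p    <- numeric(n)
p[1] <- mu / (1 - phi)     # start at the stationary mean
for (i in 2:n) p[i] <- mu + phi * p[i - 1] + r[i]

theo_mean   <- mu / (1 - phi)
sample_mean <- mean(p)

print(c(theo_mean = theo_mean, sample_mean = sample_mean))
```

If you set phi to 1 in this sketch, the formula breaks down (division by zero) and the simulated path drifts without settling around any fixed level.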
You can implement the Dickey-Fuller test using the adf.test function in the tseries package.
library(tseries)
head(EuStockMarkets)
## DAX SMI CAC FTSE
## [1,] 1628.75 1678.1 1772.8 2443.6
## [2,] 1613.63 1688.5 1750.5 2460.2
## [3,] 1606.51 1678.6 1718.0 2448.2
## [4,] 1621.04 1684.1 1708.1 2470.4
## [5,] 1618.16 1686.6 1723.1 2484.7
## [6,] 1610.61 1671.6 1714.3 2466.8
FTSE<-EuStockMarkets[,"FTSE"]
## Plot FTSE
n<-length(FTSE)
plot(ts(FTSE))
grid(col = "red")
## Compute the log-return of the FTSE
log_return<-diff(log(FTSE))*100
n<-length(log_return)
## The time-series plot of the log-return
plot(ts(log_return))
grid(col="red")
## Dickey-Fuller test for unit-root
## Step 1: Check if log-Price is non-stationary
adf.test(log(FTSE))
##
## Augmented Dickey-Fuller Test
##
## data: log(FTSE)
## Dickey-Fuller = -2.4976, Lag order = 12, p-value = 0.3677
## alternative hypothesis: stationary
## Step 2: Check if log-return is non-stationary
adf.test(log_return)
## Warning in adf.test(log_return): p-value smaller than printed p-value
##
## Augmented Dickey-Fuller Test
##
## data: log_return
## Dickey-Fuller = -11.162, Lag order = 12, p-value = 0.01
## alternative hypothesis: stationary
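The regression behind the Dickey-Fuller statistic can be sketched in base R. Subtracting \(p_{t-1}\) from both sides of \(H_1\) gives \(\Delta p_t=\mu+(\phi-1)p_{t-1}+r_t\), so testing \(\phi=1\) amounts to testing whether the coefficient on the lagged level is zero. This is a simplified illustration on simulated data, not the adf.test implementation, which also includes lagged differences (the "augmented" part) and, according to its documentation, a linear trend. The helper name df_stat is hypothetical.

```r
set.seed(1)
n <- 1000
r <- rnorm(n)          # white-noise steps
p <- cumsum(r)         # simulated random walk (phi = 1)

## Hypothetical helper: t-statistic of the lagged level in
## the regression  diff(x)_t = mu + (phi - 1) * x_{t-1} + error
df_stat <- function(x) {
  dx   <- diff(x)                 # x_t - x_{t-1}
  xlag <- x[-length(x)]           # x_{t-1}
  fit  <- lm(dx ~ xlag)           # intercept plus lagged level
  coef(summary(fit))["xlag", "t value"]
}

stat_level <- df_stat(p)          # random walk: unit root present
stat_diff  <- df_stat(diff(p))    # its differences: stationary

print(c(level = stat_level, difference = stat_diff))
```

The statistic for the differenced series is strongly negative, while the one for the level is not. Note that under \(H_0\) the statistic does not follow a Student t distribution, so it has to be compared with Dickey-Fuller critical values, which is what adf.test does when it reports a p-value.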
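To check whether the log-returns are uncorrelated, the Ljung-Box test considers the null hypothesis that the autocorrelations \(\rho_h\) at the first \(H\) lags are all zero, \[ H_0: \rho_1=\rho_2=\cdots=\rho_H=0, \] against the alternative \[ H_1: \text{At least one inequality.} \]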
The test statistic for the Ljung-Box test is \[ Q=n(n+2)\sum_{h=1}^H\frac{\hat{\rho}_h^2}{n-h}, \] where \(n\) is the sample size, \(\hat{\rho}_h\) is the sample autocorrelation at lag \(h\), and \(H\) is the number of lags being tested.
You can show that under \(H_0\), \(Q\) asymptotically follows a chi-square distribution with \(H\) degrees of freedom, \(\chi^2_{(H)}\).
The Ljung-Box test can be done in R using the Box.test function available in the stats package.
## Step 3: Check if the log-returns are uncorrelated
Box.test(log_return,lag=10,type = "Ljung-Box")
##
## Box-Ljung test
##
## data: log_return
## X-squared = 29.815, df = 10, p-value = 0.0009183
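The statistic reported by Box.test can be reproduced by hand from the formula for \(Q\). The sketch below recomputes the FTSE log-returns so that it is self-contained, takes the sample autocorrelations from acf, and compares the result with Box.test. The variable names are illustrative.

```r
## Recompute the FTSE log-returns (EuStockMarkets ships with base R)
FTSE       <- EuStockMarkets[, "FTSE"]
log_return <- diff(log(FTSE)) * 100

H <- 10                                   # number of lags tested
n <- length(log_return)                   # sample size

## Sample autocorrelations at lags 1,...,H (drop lag 0)
rho_hat <- as.numeric(acf(log_return, lag.max = H, plot = FALSE)$acf)[-1]

## Ljung-Box statistic and its chi-square p-value with H degrees of freedom
Q       <- n * (n + 2) * sum(rho_hat^2 / (n - seq_len(H)))
p_value <- pchisq(Q, df = H, lower.tail = FALSE)

## Reference values from Box.test
bt <- Box.test(log_return, lag = H, type = "Ljung-Box")

print(c(Q_by_hand = Q, Q_Box.test = unname(bt$statistic)))
print(c(p_by_hand = p_value, p_Box.test = bt$p.value))
```

A small p-value leads you to reject \(H_0\), i.e., it suggests that at least one of the first \(H\) autocorrelations of the log-returns is non-zero.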