This lecture is about the random walk. The objectives are the following. We will get familiar with the random walk model. We will simulate a random walk in R. We will obtain the correlogram of a random walk, and we will see the difference operator in action. The model is the following. Xt is equal to Xt-1 plus Zt. So here's how you can interpret this. Xt can be the location of your particle at this moment. And Xt-1 will be the location of that particle one step before. In other words, it can be the location of the particle one minute ago or one day ago. And Zt is just some residual, some white noise. The random walk works as follows. Wherever you are, you add a little bit of noise to it, and now you're at the next step. You add a little bit of noise again, and now you're at the next step. There's another interpretation where you can think of Xt as the price of a stock today, Xt-1 as the price of the stock yesterday, and Zt as the random noise. So the stock changes from yesterday's price by adding some random noise to it. And this random noise, which is white noise or residual, is just a normal random variable with some expectation mu and some variance sigma squared. We can assume that maybe we are starting at point 0. At time 0, we are at 0. Which means that at time 1, X1 would be X0, which is 0, plus Z1, so X1 is actually Z1. At the next time step we have X2, which has to be X1 plus Z2. But X1 is Z1, so it becomes Z1 plus Z2. That's how you go. So X3 will be X2, which is Z1 plus Z2, plus additional noise, which is Z3. So as you go along in this random walk, you accumulate the noises. So at step t, Xt is basically the sum of all the noises up to time t. If you look at the expectation of Xt, well, it is the expectation of a sum. And the expectation of a sum is the same thing as the sum of the expectations. And since all of the Zs have the same mean mu, you will get mu t. So the expectation is mu t. Whenever mu is not 0, the expectation of this stochastic process changes with time.
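The accumulation of noise and the mean mu t described above can be checked numerically. What follows is only a minimal sketch in R, not something shown in the lecture: the noise mean of 0.5, the 50 time steps, the 5000 repeated walks, the seed, and all variable names are assumptions chosen for illustration.

```r
set.seed(42)            # assumed seed, only for reproducibility
mu <- 0.5               # assumed noise mean (illustrative value)
n_steps <- 50           # time steps t = 1, ..., 50

# One walk built with the recursion X_t = X_{t-1} + Z_t, starting from X_0 = 0
z <- rnorm(n_steps, mean = mu, sd = 1)
x <- numeric(n_steps)
x[1] <- z[1]            # X_1 = X_0 + Z_1 = Z_1
for (i in 2:n_steps) {
  x[i] <- x[i - 1] + z[i]
}

# The recursion accumulates the noise: X_t = Z_1 + ... + Z_t
same_as_cumsum <- isTRUE(all.equal(x, cumsum(z)))

# Average of X_t at t = n_steps over many independent walks,
# to be compared with the theoretical value mu * t
n_walks <- 5000
final_values <- replicate(n_walks, sum(rnorm(n_steps, mean = mu, sd = 1)))
mean_at_end <- mean(final_values)
theory_mean <- mu * n_steps
```

If you print same_as_cumsum, mean_at_end, and theory_mean, the first confirms that the walk is just the running sum of the noise, and the other two let you compare the simulated average with mu t.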
It is definitely not a stationary process. And the variance of Xt is the variance of the sum, which I wrote as the sum of the variances. This is only true if the random variables are independent. So we assume in our model that the noises are independent of each other, which means that the variance of the sum is the sum of the variances, which gives us sigma squared t. So there is a systematic change in mean whenever mu is not 0, and there is a systematic change in variance. This is definitely a non-stationary time series, or a non-stationary stochastic process. So let's do a simulation. In this simulation, we're going to start from X1, not X0. X1 is 0, and our noise has a standard normal distribution. We're going to simulate it using a for loop, we're going to plot it, and we're going to look at the autocorrelation function of the simulation. So I'm going to say x=NULL, okay. And x[1], my starting point, is actually 0. Then later on, I have to start from my previous step and add some noise to it. So I'm going to use a for loop. The syntax is for, then parentheses. Inside we put the index i, then in, starting from 2 until, let's say, 1,000. We are going to generate 1,000 data points. Then I'm going to open a curly bracket and close the curly bracket. Everything inside the brackets will be inside the loop, basically. And I'm going to say x[i]=x[i-1]+ some noise, and the noise we assume to have a standard normal distribution. So I'm going to say rnorm(1), just 1 data point, because I want to add one noise term at each step. And then we have generated our data set. If I say print(x), we see that we have a thousand data points. But it does not have a time series structure on it. So let me just go ahead and clear the console. So x is a data set, but it doesn't have a time series structure on it. So I'm going to define random_walk, and I'm going to say ts. And ts is going to basically transform the data set into a time series. I'm going to put x in it.
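The simulation steps just described can be written out roughly as follows. This is a sketch of what is being typed in the lecture; the set.seed call is an assumption added only so the result is reproducible, since the lecture does not fix a seed.

```r
set.seed(1)                      # assumed seed; not part of the lecture

x <- NULL                        # start with an empty object
x[1] <- 0                        # starting point X_1 = 0
for (i in 2:1000) {
  # previous position plus one draw of standard normal noise
  x[i] <- x[i - 1] + rnorm(1)
}

random_walk <- ts(x)             # give the data a time series structure
```

Here x on its own is just a numeric vector, while random_walk carries the time index that plot and acf will use.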
And then I have random_walk, which is basically a time series. Now let's plot this. Let's plot random_walk. Let's put a title on it. The title would be A random walk. For the y label, let's put nothing. And for the x label, let's say this is basically days. And let's put some color into it. We can make it blue, and we can increase the width of our line to 2. And once I do that, we obtain the following. A random walk. This is a very, very typical time plot for a random walk. Now, the random walk, we just said, is not a stationary time series. It would not make sense to actually find the acf of it, because we define the acf for stationary time series. But let's just do it because we can. Let's try to find the acf in R. If I say acf(random_walk), we obtain the following plot. As you see, there's a high correlation, even 30 lags back, which again shows that there is a high correlation in this data set and there is no stationarity. Now let's deviate from the topic, random walk, and say there is a trend, definitely a trend here. It goes up and down. Can we remove that trend? It turns out that yes, we can. Look at what we have here. Xt is Xt-1 + Zt. I'm going to take this Xt-1 to the left hand side. So basically we have Xt - Xt-1 = Zt. Let's define Xt - Xt-1 as delta. Well, this is not exactly delta, this is a difference operator. So let's call it delta Xt. So this is our difference operator. So the difference operator applied to Xt gives delta Xt. This is a new time series, which is equal to Zt. And remember, Zt is random noise. Zt is a purely random process. Which means that my differenced data, delta Xt, is a purely random process, which is a stationary time series, a stationary stochastic process. So it means that if we have a simulation of a random walk, and we take the difference and look at it, the difference is going to be stationary. Let's confirm that using R. What we are going to do is use the difference operator.
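The plotting and acf steps described above can be sketched in R as follows. The simulation is repeated so that the block runs on its own; the seed and the name acf_result are assumptions for illustration, while the plot arguments follow what is said in the lecture (title, empty y label, days on the x axis, blue line of width 2).

```r
set.seed(1)                      # assumed seed; not part of the lecture

# Simulate the random walk as in the lecture
x <- NULL
x[1] <- 0
for (i in 2:1000) {
  x[i] <- x[i - 1] + rnorm(1)
}
random_walk <- ts(x)

# Time plot with a title, no y label, days on the x axis, blue line of width 2
plot(random_walk, main = "A random walk", ylab = "", xlab = "Days",
     col = "blue", lwd = 2)

# Sample autocorrelation function; for a random walk it decays very slowly
acf_result <- acf(random_walk)
```

The object returned by acf stores the sample autocorrelations in acf_result$acf, with lag 0 first, so you can also inspect the numbers behind the correlogram.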
I write diff and put random_walk in it. That will give me the difference at lag 1. So it's going to give me x2 minus x1, x3 minus x2, x4 minus x3, and so on, up to x1000 minus x999. So this will actually give me 999 data points. We are missing one point at the beginning, but that's all right. Once I do that, I have another time series, which is just the differences. For example, I can try to just plot it. Let me not put any title on it. By plotting this, I should get a purely random process, because we saw that by just taking Xt-1 to the left hand side, the difference is Zt. It is purely [INAUDIBLE] process. Let's plot this, and you get what looks like white noise. Now we can also look at the acf of the difference. So I write acf of the difference of the random walk. And when I look at it, I get an acf which I have seen before. This is the acf of the purely random process we generated a few lectures back. So we just applied the difference operator to remove the trend. There was some kind of trend going on, and we removed the trend. By just applying the difference operator, we get the plot of the ACF of the differenced time series. So what have we learned? We have learned the random walk model. We learned how to simulate a random walk in R. And we learned how to get a stationary time series. In fact, the purely random process [INAUDIBLE] Random Walk using difference operator.
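The differencing step from this lecture can be sketched in R as follows. The simulation is repeated so that the block runs on its own; the seed and the names random_walk_diff and acf_diff are assumptions for illustration, not names used in the lecture.

```r
set.seed(1)                      # assumed seed; not part of the lecture

# Simulate the random walk as in the lecture
x <- NULL
x[1] <- 0
for (i in 2:1000) {
  x[i] <- x[i - 1] + rnorm(1)
}
random_walk <- ts(x)

# Lag-1 differences: x2 - x1, x3 - x2, ..., x1000 - x999 (999 values)
random_walk_diff <- diff(random_walk)

# Time plot of the differenced series, without a title
plot(random_walk_diff)

# ACF of the differenced series; it should resemble the ACF of white noise
acf_diff <- acf(random_walk_diff)
```

Because the differences are exactly the noise terms that were added inside the loop, the differenced series is the purely random process itself, and its sample autocorrelations beyond lag 0 should stay close to 0.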