In this lecture,
we'll talk about the autocovariance function. The objectives are the following. We'll recall random variables from
our introductory statistics and probability class, and we'll recall
the covariance of two random variables. We will give a new
definition of a time series: we'll characterize a time series as
a realization of a stochastic process. We'll talk about stochastic processes
as well, and we'll define the autocovariance function. So what's a random variable? A random variable is a function that goes
from the sample space to the real numbers. The sample space is the set of all possible
outcomes of the experiment, and if we map each possible
outcome of the experiment to a number on the real line,
we get a random variable. For people who are familiar
with measure theory, a random variable is basically
a measurable function. But for us, we'll look at it
in a slightly different way. We're going to look at it as a machine. Basically, it's a machine that
produces random numbers. Now, once it produces a lot of numbers,
those numbers together are a dataset. If we start with data like this, we can
say they're all coming from this machine, this random variable X. If I know
the properties of the random variable,
for example its distribution, I can say something
meaningful about my dataset. So here we have a random variable
on the right-hand side, but we have a dataset on the left-hand
side: 45, 36, 27, it's a dataset.
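To make the machine picture concrete, here is a minimal sketch in Python. The normal distribution, its mean of 40 and standard deviation of 10, and the helper name `draw_dataset` are illustrative assumptions and do not come from the lecture; only the idea that a random variable produces numbers which together form a dataset is taken from the discussion above.

```python
import random
import statistics

def draw_dataset(n, mean=40.0, sd=10.0, seed=0):
    """Draw n realizations of a hypothetical random variable X ~ Normal(mean, sd)."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    return [rng.gauss(mean, sd) for _ in range(n)]

# The "machine" produces many numbers; together they are a dataset.
dataset = draw_dataset(10_000)

# Knowing the distribution of X tells us what to expect from the dataset:
# the sample mean and standard deviation should land close to the assumed 40 and 10.
sample_mean = statistics.fmean(dataset)
sample_sd = statistics.stdev(dataset)
print(round(sample_mean, 1), round(sample_sd, 1))
```

Each number in `dataset` is one realization of X, and the summary statistics illustrate how properties of the random variable carry over to the data it generates.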
from this one variable x, we're more than left with x, and
mathematically we work on x, and then we inverse something meaningful
about the dataset using the proper. From your probability and
statistics class, you already know that random variables
might be discrete or continuous. A discrete random variable
takes countably many possible values on the real line. For example, on the left-hand side, X is
a discrete random variable; the possible outcomes of X are 20, 30, 57, and so
forth, so basically they're countably many. But on the right-hand side, we have
a continuous random variable Y, and it might take any
point in between, let's say, 10 and 60. Now, before we do the experiment,
everything is random, right? Take flipping a coin:
you have randomness. It can be heads or tails. But once we flip the coin, the result of
the experiment is known, and the randomness is gone. So the same thing happens here, right? Once we do the experiment,
let's say X becomes 20; the discrete random variable X becomes
20, which means the randomness is gone now,
and we have an exact value for it: it's 20. We call that 20 a realization
of the random variable X. Same thing for Y. Y is a continuous random variable, but say we do the experiment. The randomness is gone,
and now we have a value for it, let's say 30.29. Then we say 30.29 is a realization
of the random variable Y. If we have two random variables,
X and Y, we learned this notion called covariance in our probability
class; it measures the linear dependence between
two random variables, right? We are talking about this abstractly. If you have two datasets, covariance will tell us something about the linear
dependence of the pair of datasets. But right now, we model each of our
datasets with a random variable, X and Y. Abstractly, we define
the covariance of X and Y using the expectation: take X minus its
expectation, times Y minus its expectation,
and put them together inside an expectation, Cov(X, Y) = E[(X - E[X])(Y - E[Y])]. That defines the covariance. And let me just mention that the covariance
of X and Y is the covariance of Y and X, so it's symmetric. We talked about random variables, but if you just put a lot of random variables
together and arrange them in a sequence, for example,
X1 at time one,
X2 at time two, X3 at time three, then you have a sequence
of random variables. We call it a stochastic process. Each one of these random variables
might have its own distribution, its own expectation,
and its own variance. But the way to think about
a stochastic process is to contrast it with a deterministic process. Among deterministic processes, take, for example, the solution of
an ordinary differential equation. You start at some point, and the solution
of the [INAUDIBLE] will tell you the exact trajectory, so you know exactly where
you're going to be at the next time step,
and the next time step, and so forth. A stochastic process is
basically the opposite of that. At every step you have some randomness. You don't know exactly
where you're going to be,
but there is some distribution of X at that time step.
So we get a stochastic process. Now, we are ready to define a time
series in a slightly different way. Let me remind you of our first definition. What was a time series? A time series is any dataset
collected at different times. But now we say, wait a minute,
maybe there is some stochastic process going on in the background,
which is X1, X2, X3, and so forth, and the realization of X1
is my first data point in the time series, the realization of X2 is my second
data point in the time series, and so on. So 30, 29, 57, and so on, this time series
that I start with and am trying to analyze, is actually a realization of the
stochastic process going on in the background. So if I know the stochastic process, if I know X1, X2, X3, and how it changes,
then I can say something meaningful about my time series. But
realize that from X1, X2, X3, and so forth, the stochastic process might
come with an ensemble of realizations;
I mean, it might produce a whole ensemble of time series. But I only have one time series. Having only one time series,
basically one point at each time, we would like to say something
meaningful about the stochastic process. The autocovariance function is defined,
basically, by just taking the covariance of different
elements in our sequence, in our stochastic process. If you take Xt and Xs, where s and
t might be different time points, and take the covariance of them,
we get gamma(s, t), the covariance between them. And if we take
gamma(t, t), the covariance of Xt with itself, we of course get
the variance at that time step. Now we are ready to actually define
our autocovariance function, which we call gamma. For us, gamma will only depend on the time difference between
these random variables. In other words, if you look at,
for example, the random variables Xt and Xt+k, it doesn't matter what t is. The time difference is k, and the time
difference decides the nature, decides the fate of our autocovariance. And the reason is the following. We assume we're working with
stationary time series. Remember, in a stationary time series,
we said the properties of one part of the time series
are the same as the properties of
the other parts of the time series.
So in this case, whether you look at
X1 and Xk+1 or at X10 and X10+k, these are different
parts of the time series, but in each case we only
have k steps in between. The properties of these sections of
the time series must be the same. So the covariance of Xk+1 with
X1 is the same as the covariance of X10+k with X10, and we call that gamma(k). So gamma is our autocovariance function. Gamma(k) is going to be called
the autocovariance coefficient. But we usually do not have
the stochastic process, right? We only have a time series, just
a realization of the stochastic process. So we're going to use that to
approximate gamma(k) with Ck, an estimate computed from the data, which we will also call
the autocovariance coefficient. So what have we learned in this lecture? We have learned the definition
of a stochastic process, which is a collection of random variables. And we learned how to characterize
a time series in a slightly different way,
by realizing that it is actually
a realization of a stochastic process. And we learned how to define the
autocovariance function of a time series.
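
The estimate Ck mentioned above can be sketched in Python. The lecture does not give a formula for Ck here, so the one used below is an assumption: a common textbook estimator that sums (xt - mean)(xt+k - mean) over all pairs of points k steps apart and divides by the series length N (some texts divide by N - k instead). The function name `sample_autocovariance` and the simulated noise series are also hypothetical, chosen only for illustration.

```python
import random

def sample_autocovariance(x, k):
    """Estimate gamma(k) from one realization x, using the 1/N convention (an assumption)."""
    n = len(x)
    if not 0 <= k < n:
        raise ValueError("lag k must satisfy 0 <= k < len(x)")
    mean = sum(x) / n
    # Sum products of deviations for every pair of points that are k steps apart.
    return sum((x[t] - mean) * (x[t + k] - mean) for t in range(n - k)) / n

# One realization of a stationary process: independent Gaussian noise with variance 1.
rng = random.Random(1)
series = [rng.gauss(0.0, 1.0) for _ in range(5000)]

c0 = sample_autocovariance(series, 0)  # lag 0: estimates the variance
c1 = sample_autocovariance(series, 1)  # lag 1: should be near 0 for independent noise
print(c0, c1)
```

At lag 0 the estimate reduces to the 1/N sample variance, which mirrors the fact that the covariance of Xt with itself is the variance. Only one time series is used, in line with the point that we usually have a single realization and not the whole ensemble.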