1
00:00:11,050 --> 00:00:16,750
In this lecture, we are going to discuss another integral part of the ARIMA model, the MA model.

2
00:00:17,380 --> 00:00:20,410
As you know, MA stands for moving average.

3
00:00:20,770 --> 00:00:26,020
However, it's important not to conflate this with the simple moving average and the exponentially weighted

4
00:00:26,020 --> 00:00:26,770
moving average

5
00:00:26,920 --> 00:00:33,160
we discussed earlier in this course. As you will see shortly, this model is very different from those.

6
00:00:33,550 --> 00:00:36,640
So let's get right to what the moving average model looks like.

7
00:00:41,610 --> 00:00:47,160
The moving average model is similar to the autoregressive model in that it is a linear function of

8
00:00:47,160 --> 00:00:47,700
something.

9
00:00:48,360 --> 00:00:50,160
But what is it a linear function of?

10
00:00:51,360 --> 00:00:55,050
In fact, it is a linear function of past error terms.

11
00:00:55,500 --> 00:00:59,940
This should definitely strike you as odd. In pretty much all of machine learning,

12
00:01:00,210 --> 00:01:03,150
we typically create models which depend on input data.

13
00:01:03,570 --> 00:01:07,050
In this moving average model, there is no input data to be seen.

14
00:01:08,010 --> 00:01:11,340
Instead, the time series depends only on errors.

15
00:01:11,880 --> 00:01:17,190
Of course, in order to know these errors, we must have compared the previous predictions with the

16
00:01:17,190 --> 00:01:19,260
previous values of the time series.

17
00:01:20,760 --> 00:01:26,790
Note that our abbreviated form for a moving average model with order q is MA(q).

18
00:01:27,240 --> 00:01:34,200
This means that the output y depends on q past error terms in addition to the latest error term epsilon

19
00:01:34,200 --> 00:01:34,940
sub t.
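Written out, the MA(q) model just described might look like the following. This is a sketch of the standard form: the symbol theta for the model weights is an assumption here, since the lecture does not name the weights aloud, while c is the bias term and epsilon the error terms as above.

```latex
y_t = c + \epsilon_t + \theta_1 \epsilon_{t-1} + \theta_2 \epsilon_{t-2} + \dots + \theta_q \epsilon_{t-q}
```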

20
00:01:39,820 --> 00:01:44,350
Here's one way to think of the moving average model that might help you rationalize its name.

21
00:01:45,340 --> 00:01:48,580
Consider what the expected value of y of t should be.

22
00:01:49,750 --> 00:01:54,460
As you know, we treat errors as normals with mean zero and variance sigma squared.

23
00:01:55,030 --> 00:02:01,210
Well, if we take the expected value of y of t, we simply get c, since the expected value of each

24
00:02:01,210 --> 00:02:02,680
of the errors is zero.

25
00:02:03,370 --> 00:02:10,210
Hence we can think of the bias term c as the average value and then each of the errors as fluctuations

26
00:02:10,210 --> 00:02:13,210
that make y of t go up or down around c.
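The expected value argument can be written as a short derivation. It assumes the standard MA(q) form with weights named theta (a symbol not spoken in the lecture) and uses only linearity of expectation plus the fact that every error has mean zero.

```latex
\begin{aligned}
E[y_t] &= E\left[c + \epsilon_t + \sum_{i=1}^{q} \theta_i \epsilon_{t-i}\right] \\
       &= c + E[\epsilon_t] + \sum_{i=1}^{q} \theta_i E[\epsilon_{t-i}] \\
       &= c + 0 + \sum_{i=1}^{q} \theta_i \cdot 0 = c
\end{aligned}
```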

27
00:02:18,260 --> 00:02:22,460
Another way to think about the moving average model is in terms of simulation.

28
00:02:23,490 --> 00:02:28,830
In fact, simulating a moving average process is somewhat easier than an autoregressive process.

29
00:02:29,400 --> 00:02:32,430
Let's suppose that we want to simulate an MA(2) process.

30
00:02:32,640 --> 00:02:38,010
So the output y depends on two past error terms in addition to the current error term.

31
00:02:39,360 --> 00:02:42,360
First, we need to generate Epsilon one and Epsilon two.

32
00:02:42,630 --> 00:02:47,040
These are just samples from some normal with mean zero and variance sigma squared.

33
00:02:47,580 --> 00:02:53,730
Then we generate epsilon three, then we calculate y three according to our formula, which depends

34
00:02:53,730 --> 00:02:58,620
on Epsilon one, two and three, as well as the model weights, which we assume have been given.

35
00:02:59,700 --> 00:03:05,310
Next we generate Epsilon four, then we calculate y four, which depends on Epsilon two, three and

36
00:03:05,310 --> 00:03:05,850
four.

37
00:03:06,420 --> 00:03:11,070
Then we generate epsilon five and y five, epsilon six and y six, and so on.

38
00:03:11,760 --> 00:03:17,520
So as you see, this process really amounts to nothing but generating samples from the normal and adding

39
00:03:17,520 --> 00:03:18,210
them together.

40
00:03:18,900 --> 00:03:24,780
Therefore, you would use this model if you think that nature actually behaves this way to generate

41
00:03:24,780 --> 00:03:26,460
the data which you are observing.
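The simulation steps above can be sketched in Python with NumPy. This is a minimal illustration, not code from the course: the function name simulate_ma, its parameters, and the example weights 0.6 and 0.3 are all hypothetical, chosen only to show the idea of drawing normal errors and combining them with given weights.

```python
import numpy as np

def simulate_ma(c, thetas, sigma, n, seed=None):
    """Simulate n values of an MA(q) process, where q = len(thetas).

    Hypothetical helper: thetas[0] weights the most recent past error,
    thetas[1] the one before it, and so on.
    """
    rng = np.random.default_rng(seed)
    q = len(thetas)
    # Draw n + q errors up front so the first output already has
    # q past errors available (epsilon one and two for MA(2)).
    eps = rng.normal(loc=0.0, scale=sigma, size=n + q)
    y = np.empty(n)
    for i in range(n):
        t = i + q                   # index of the current error term
        past = eps[t - q:t][::-1]   # eps[t-1], eps[t-2], ..., eps[t-q]
        y[i] = c + eps[t] + np.dot(thetas, past)
    return y, eps

# Example: an MA(2) process with assumed weights and bias c = 10.
y, eps = simulate_ma(c=10.0, thetas=[0.6, 0.3], sigma=1.0, n=10_000, seed=0)
print(y[:5])
print(y.mean())  # should land near the bias term c
```

Note that the weights are passed in rather than learned, matching the assumption in the lecture that the model weights have been given.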
