1
00:00:11,090 --> 00:00:16,910
In this lecture, we are going to discuss how to choose the final hyperparameter of the ARIMA model,

2
00:00:17,120 --> 00:00:19,730
which is p, the order of the AR component.

3
00:00:20,570 --> 00:00:26,410
This is accomplished by looking at the PACF, which stands for partial autocorrelation function.

4
00:00:27,080 --> 00:00:32,510
In my opinion, the partial autocorrelation function is a bit more difficult to understand than the

5
00:00:32,510 --> 00:00:33,330
ACF.

6
00:00:33,740 --> 00:00:39,650
However, the way that we apply it is the same as how we apply the ACF for the moving average model.

7
00:00:40,160 --> 00:00:47,000
So in this lecture, first we will just simply discuss how to use the PACF, then we will discuss a tiny

8
00:00:47,000 --> 00:00:49,640
bit more about what the PACF actually is,

9
00:00:49,760 --> 00:00:55,160
in case you are interested. You should not feel obligated to understand the second part in order to

10
00:00:55,160 --> 00:00:56,630
move on to the next lecture.

11
00:01:01,650 --> 00:01:08,310
So what is the PACF from a practical perspective? From a practical perspective, it's just a plot

12
00:01:08,310 --> 00:01:13,300
that looks very similar to the ACF, in that it's some kind of autocorrelation function.

13
00:01:14,160 --> 00:01:20,430
The vertical axis is still the value of the function and the horizontal axis still represents the lag.

14
00:01:20,910 --> 00:01:23,490
The way that we use it is exactly the same

15
00:01:23,490 --> 00:01:28,520
as for the moving average: there will be some confidence interval on the PACF plot.

16
00:01:29,250 --> 00:01:36,030
If we see any statistically significant non-zero lags up to some lag p, this would indicate that

17
00:01:36,030 --> 00:01:38,550
we have an autoregressive component of order

18
00:01:38,610 --> 00:01:47,830
p. For example, if the highest non-zero lag is five, then we would choose p equals five. As before,

19
00:01:48,000 --> 00:01:53,100
it's usually the case that the values below five will also be non-zero.
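The procedure just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not something shown in the lecture: the helper names (`sample_acf`, `sample_pacf`, `simulate_ar2`) are hypothetical, the PACF is computed with the Durbin-Levinson recursion (one common way to get it from the ACF), the data is a simulated AR(2) series, and the confidence band is the usual approximate 95% band of plus or minus 1.96 divided by the square root of n.

```python
import math
import random


def sample_acf(y, max_lag):
    """Sample autocorrelations r[0..max_lag] of the series y."""
    n = len(y)
    mean = sum(y) / n
    dev = [v - mean for v in y]
    c0 = sum(d * d for d in dev)
    return [sum(dev[t] * dev[t + k] for t in range(n - k)) / c0
            for k in range(max_lag + 1)]


def sample_pacf(y, max_lag):
    """PACF values phi_kk for k = 0..max_lag (Durbin-Levinson recursion)."""
    r = sample_acf(y, max_lag)
    pacf = [1.0]  # phi_00 is 1 by convention
    prev = []     # prev[j - 1] holds phi_{k-1, j}
    for k in range(1, max_lag + 1):
        num = r[k] - sum(prev[j] * r[k - 1 - j] for j in range(k - 1))
        den = 1.0 - sum(prev[j] * r[j + 1] for j in range(k - 1))
        phi_kk = num / den
        prev = [prev[j] - phi_kk * prev[k - 2 - j] for j in range(k - 1)] + [phi_kk]
        pacf.append(phi_kk)
    return pacf


def simulate_ar2(n, phi1, phi2, seed=0):
    """Simulate y_t = phi1 * y_{t-1} + phi2 * y_{t-2} + Gaussian noise."""
    rng = random.Random(seed)
    y = [0.0, 0.0]
    for _ in range(n + 100):  # the first 100 steps serve as burn-in
        y.append(phi1 * y[-1] + phi2 * y[-2] + rng.gauss(0.0, 1.0))
    return y[-n:]


y = simulate_ar2(2000, 0.5, 0.3)
max_lag = 10
pacf = sample_pacf(y, max_lag)

# Approximate 95% band: values inside it are treated as "zero".
bound = 1.96 / math.sqrt(len(y))
significant = [k for k in range(1, max_lag + 1) if abs(pacf[k]) > bound]
p = max(significant) if significant else 0

print("significant lags:", significant)
print("candidate p:", p)
```

Since each lag has roughly a 5% chance of crossing the band by accident, the value of p found this way is best treated as a candidate to check rather than a final answer.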

20
00:01:58,090 --> 00:02:04,420
All right, so now that you know how the PACF works, what is the PACF really? We know that it results in

21
00:02:04,420 --> 00:02:10,390
a graph that looks like the ACF, but it is clearly not the same since it has a different name and a

22
00:02:10,390 --> 00:02:11,360
different purpose.

23
00:02:12,430 --> 00:02:15,210
There are two ways of thinking about the PACF.

24
00:02:15,910 --> 00:02:20,230
The first way is to simply look at what it is in terms of its definition.

25
00:02:20,860 --> 00:02:25,150
The second way to look at the PACF is to consider how it's calculated.

26
00:02:25,780 --> 00:02:29,200
Looking at both of these perspectives should help you understand the PACF.

27
00:02:31,500 --> 00:02:38,280
So what is the partial autocorrelation function? The partial autocorrelation function at lag tau is

28
00:02:38,280 --> 00:02:45,600
defined as the correlation between y at time t and y at time t plus tau, conditioned on all the y's

29
00:02:45,750 --> 00:02:47,600
in between t and t plus tau.

30
00:02:48,060 --> 00:02:53,730
So that's y at t plus one, y at t plus two, all the way up to y at t plus tau minus one.

31
00:02:54,540 --> 00:03:00,660
So one way to think about this is it's like a conditional autocorrelation, whereas the regular

32
00:03:00,660 --> 00:03:02,320
autocorrelation is unconditional,

33
00:03:02,400 --> 00:03:04,650
in other words, not conditioned on anything.

34
00:03:09,740 --> 00:03:16,880
The second way to think of the PACF is how we calculate it. We usually denote the coefficients using

35
00:03:16,880 --> 00:03:23,530
the Greek letter phi and we use double indices, so we'll have phi zero zero, phi one one, and so on.

36
00:03:24,320 --> 00:03:27,680
In general, these will be represented by phi tau tau.

37
00:03:28,310 --> 00:03:31,410
Note that phi zero zero will just be one as usual.

38
00:03:31,940 --> 00:03:39,890
This is because this is the PACF between y at time t and y at time t for any t, and therefore there are

39
00:03:39,890 --> 00:03:42,470
no in-between values and it's just the same as the ACF.

40
00:03:44,000 --> 00:03:50,450
Furthermore, the same thing happens for phi one one because again there are no values in between y

41
00:03:50,450 --> 00:03:52,840
at time t and y at time t plus one.

42
00:03:53,510 --> 00:03:57,140
Therefore this just reduces to the autocorrelation at lag one.

43
00:03:58,010 --> 00:04:03,200
What about for any other tau? In this case we need to do a few autoregressions.

44
00:04:08,070 --> 00:04:15,450
Specifically, suppose that y hat at time t plus tau is regressed on all of the y's in between t

45
00:04:15,450 --> 00:04:16,560
and t plus tau.

46
00:04:17,160 --> 00:04:19,920
Note that this is just our usual autoregressive model.

47
00:04:21,060 --> 00:04:28,900
Next, let y hat at time t be another regression, again based on all the y's in between t and t plus tau.

48
00:04:29,640 --> 00:04:32,740
This one is weird because it's like a backwards autoregression.

49
00:04:33,090 --> 00:04:36,110
We are using future values to predict a past value.

50
00:04:36,720 --> 00:04:43,290
So both y hat at time t plus tau and y hat at time t are regressed on the same set of variables, the

51
00:04:43,290 --> 00:04:44,460
ones in between.

52
00:04:49,340 --> 00:04:57,320
Next, we have the big reveal: phi tau tau is just the correlation between y at time t plus tau minus

53
00:04:57,320 --> 00:05:05,270
y hat at time t plus tau and y at time t minus y hat at time t, as defined on the previous slide.

54
00:05:05,960 --> 00:05:14,690
That is to say, phi tau tau, or the PACF for lag tau, is the correlation between any noise after subtracting

55
00:05:14,690 --> 00:05:18,050
off the parts that could be predicted by the regressions.

56
00:05:18,500 --> 00:05:22,660
You may have to repeat this to yourself a few times, but it will make perfect sense.

57
00:05:23,210 --> 00:05:29,720
The partial autocorrelation function is just the correlation conditioned on the in between variables.

58
00:05:30,110 --> 00:05:36,740
Therefore, it's the correlation between whatever we can't explain using those in between variables.

59
00:05:37,070 --> 00:05:42,760
And of course that makes perfect sense for choosing the order p in an autoregressive model.

60
00:05:43,640 --> 00:05:49,880
If the PACF is non-zero, that means there is a significant correlation that cannot be explained

61
00:05:50,120 --> 00:05:56,150
by the in-between variables and therefore we should include this correlation in our final autoregressive

62
00:05:56,150 --> 00:05:56,700
model.
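The two-regression recipe described above can be checked numerically for the smallest interesting case, tau equals two, where the only in-between value is y at time t plus one. The sketch below is an illustration under assumptions that are not from the lecture: the helper names (`ols_residuals`, `corr`) are hypothetical, the data is a simulated AR(2) series with coefficients 0.5 and 0.3, and the closing comparison uses the standard closed form for phi two two in terms of the lag-one and lag-two autocorrelations.

```python
import math
import random


def ols_residuals(x, z):
    """Residuals of z after a least-squares fit z ~ a + b * x."""
    n = len(x)
    mx, mz = sum(x) / n, sum(z) / n
    slope = (sum((xi - mx) * (zi - mz) for xi, zi in zip(x, z))
             / sum((xi - mx) ** 2 for xi in x))
    intercept = mz - slope * mx
    return [zi - (intercept + slope * xi) for xi, zi in zip(x, z)]


def corr(u, v):
    """Pearson correlation between two equally long sequences."""
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    cov = sum((ui - mu) * (vi - mv) for ui, vi in zip(u, v))
    var_u = sum((ui - mu) ** 2 for ui in u)
    var_v = sum((vi - mv) ** 2 for vi in v)
    return cov / math.sqrt(var_u * var_v)


# Simulated AR(2) series (assumed example data), with 100 burn-in steps.
rng = random.Random(1)
series = [0.0, 0.0]
for _ in range(5100):
    series.append(0.5 * series[-1] + 0.3 * series[-2] + rng.gauss(0.0, 1.0))
series = series[-5000:]

past = series[:-2]     # y at time t
middle = series[1:-1]  # y at time t + 1, the only in-between value
future = series[2:]    # y at time t + 2

# Forward regression: predict y(t + 2) from the in-between value.
forward_resid = ols_residuals(middle, future)
# Backward regression: predict y(t) from the same in-between value.
backward_resid = ols_residuals(middle, past)

# phi_22 is the correlation between what the two regressions could not explain.
phi_22 = corr(forward_resid, backward_resid)

# Cross-check against the closed form (r2 - r1^2) / (1 - r1^2).
r1 = corr(series[:-1], series[1:])
r2 = corr(series[:-2], series[2:])
phi_22_formula = (r2 - r1 ** 2) / (1 - r1 ** 2)

print("phi_22 from residuals:", phi_22)
print("phi_22 from formula:  ", phi_22_formula)
```

For an AR(2) process the lag-two partial autocorrelation equals the second AR coefficient, so both printed values should land near 0.3, up to sampling noise.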

63
00:05:57,380 --> 00:06:03,530
So just to put some real numbers to this, suppose that we want to know if y one is helpful in predicting

64
00:06:03,530 --> 00:06:04,370
y five.

65
00:06:04,940 --> 00:06:09,820
Obviously the same question applies to y two and y six, y three and y seven, and so forth.

66
00:06:10,430 --> 00:06:16,040
This is because we assume our signal is stationary and therefore any autocorrelations only depend on

67
00:06:16,040 --> 00:06:17,120
the time difference.

68
00:06:17,540 --> 00:06:19,280
But let's stick to y one and y five.

69
00:06:21,140 --> 00:06:29,120
The in-between values are y two, y three and y four. The PACF will be large if there is a large correlation

70
00:06:29,120 --> 00:06:34,990
between y one and y five that cannot be predicted by y two, y three and y four.

71
00:06:35,600 --> 00:06:41,090
Therefore, we should use y one in our autoregressive model for predicting y five.

72
00:06:41,660 --> 00:06:48,530
That is to say, whenever we see large values in the PACF, it indicates that these lags should be included

73
00:06:48,650 --> 00:06:50,200
in our autoregressive model.
