1
00:00:10,030 --> 00:00:17,620
In this lecture, we are going to check our understanding of the PACF in code. This lecture is going

2
00:00:17,620 --> 00:00:23,320
to walk you through a prepared Colab notebook, although a very good exercise, which I always recommend,

3
00:00:23,590 --> 00:00:29,240
is once you know how this is done, to try and recreate it yourself with as few references as possible.

4
00:00:29,770 --> 00:00:35,440
As always, you can check the lectures "How to Code by Yourself" and "How to Practice" for a more in-depth

5
00:00:35,440 --> 00:00:36,220
discussion.

6
00:00:36,700 --> 00:00:42,040
If there's anything in this lecture you didn't understand or you think I missed a step or didn't explain

7
00:00:42,040 --> 00:00:45,270
why we were doing something, please use the Q&A to inquire.

8
00:00:45,880 --> 00:00:50,890
As usual, you can look at the title of the notebook to determine what notebook we are currently looking

9
00:00:50,890 --> 00:00:51,200
at.

10
00:00:53,180 --> 00:00:59,900
So the first thing we're going to do is import plot_acf and plot_pacf from statsmodels. These are more

11
00:00:59,900 --> 00:01:06,440
useful than just generic ACF functions that only give you back an array, since these will make a plot

12
00:01:06,560 --> 00:01:08,660
and draw the confidence bounds automatically.

13
00:01:12,790 --> 00:01:15,880
Next, we're going to import NumPy and Matplotlib.

14
00:01:19,040 --> 00:01:22,850
Next, we're going to generate IID noise from the standard normal.

15
00:01:26,260 --> 00:01:31,420
Just to build a point of comparison for what we will see later in this lecture, let's plot what this

16
00:01:31,420 --> 00:01:35,650
noise looks like. OK, so I'm pretty sure this is not too surprising.

17
00:01:41,900 --> 00:01:47,930
Next, let's call the function plot_pacf. We'll be looking at autoregressive models first.

18
00:01:48,110 --> 00:01:54,110
And as you know, the relevant plot for that is the PACF. Notice that I'm using

19
00:01:54,110 --> 00:01:56,970
the subplots function to manage the size of the plot.

20
00:01:57,680 --> 00:02:02,160
This returns an axes object, which I then pass into the plot function.

21
00:02:03,290 --> 00:02:03,680
All right.

22
00:02:03,680 --> 00:02:04,820
So what do we see?

23
00:02:05,720 --> 00:02:10,880
Well, as expected, most of the lagged values are very near zero.

24
00:02:11,450 --> 00:02:16,430
The first value is one, since that's just the autocorrelation of each point with itself.

25
00:02:17,600 --> 00:02:23,160
Notice how there are multiple values in which the PACF goes just outside the confidence bounds.

26
00:02:23,690 --> 00:02:27,590
However, you have to remember the definition of the confidence interval.

27
00:02:28,250 --> 00:02:33,470
Basically, this is allowed to happen randomly five percent of the time, even when there is no true

28
00:02:33,470 --> 00:02:34,330
correlation.

29
00:02:35,030 --> 00:02:41,240
Since these values are so close to the threshold, we can intuit that it's probably OK to ignore them.

30
00:02:46,620 --> 00:02:53,260
Next, let's create an AR(1) process and test that again. To generate this process,

31
00:02:53,280 --> 00:02:57,520
I'm going to create a list called X1 with an initial value of zero.

32
00:02:58,200 --> 00:03:04,040
Then I'm going to enter a loop that goes for 1000 iterations. Inside this loop,

33
00:03:04,050 --> 00:03:10,950
I'm going to create the next value by taking 0.5 times the previous value, plus some Gaussian

34
00:03:10,950 --> 00:03:12,960
noise with standard deviation

35
00:03:12,960 --> 00:03:16,500
0.1. We'll call this variable X.

36
00:03:17,900 --> 00:03:20,180
Next, we append X to X1.

37
00:03:21,860 --> 00:03:28,730
When we're outside the loop, we convert it to a NumPy array. In the next block, we plot X1.
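The loop just described might look something like this. It is a minimal sketch rather than the notebook's exact code: the lowercase names, the seed, and the final lag-one correlation check are additions for illustration.

```python
import numpy as np

rng = np.random.default_rng(2)       # seed chosen only for reproducibility
x1 = [0.0]                           # list with an initial value of zero
for _ in range(1000):
    # AR(1): 0.5 times the previous value plus Gaussian noise with std 0.1
    x = 0.5 * x1[-1] + 0.1 * rng.standard_normal()
    x1.append(x)
x1 = np.array(x1)                    # convert to a NumPy array outside the loop

# Rough check (not in the lecture): correlation between each value and the one before it
lag1 = np.corrcoef(x1[:-1], x1[1:])[0, 1]
print(f"sample lag-1 correlation: {lag1:.2f}")
```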

38
00:03:29,330 --> 00:03:35,300
As you can see, it looks pretty stationary and there's no real observable difference between this and

39
00:03:35,300 --> 00:03:36,640
IID noise.

40
00:03:37,460 --> 00:03:41,420
It's not obvious that each value depends on the previous value.

41
00:03:46,610 --> 00:03:55,580
Next, we plot the PACF for X1. As you can see, we have a value at lag one which is far outside

42
00:03:55,580 --> 00:03:56,990
the confidence threshold.

43
00:03:57,590 --> 00:04:04,690
Therefore, if we were to look just at this plot, we would conclude that our time series is an AR(1) process.

44
00:04:05,240 --> 00:04:09,350
And of course, we know that this is true because that is what we just created.

45
00:04:14,250 --> 00:04:18,880
Next, we're going to do another experiment, again for an AR(1) process.

46
00:04:19,380 --> 00:04:24,120
I'm not going to run through the details of this code again, since it's very similar to last time.

47
00:04:24,690 --> 00:04:30,330
The main difference with this code is that the coefficient for the previous X term is now negative 0.5

48
00:04:30,330 --> 00:04:33,310
instead of positive 0.5.

49
00:04:33,840 --> 00:04:39,810
We'll see what effect this has on the PACF. If we look at the plot of the time series,

50
00:04:40,080 --> 00:04:45,260
again, it's not at all obvious that this is any different from just plain IID noise.

51
00:04:51,230 --> 00:04:57,890
All right, so next we look at the PACF. This time we see that there is, again, a non-zero value

52
00:04:57,890 --> 00:05:05,080
at lag one, but this time it's negative, corresponding to the negative coefficient of our AR(1) process.

53
00:05:12,620 --> 00:05:19,550
Next, we're going to generate an AR(2) process, so this time the time series will depend linearly on

54
00:05:19,550 --> 00:05:23,690
two past values instead of just one. To initialize

55
00:05:23,690 --> 00:05:27,750
our list, we'll set the first two values in X2 to be zero.

56
00:05:28,670 --> 00:05:31,490
Inside the loop, we use the following equation.

57
00:05:32,180 --> 00:05:35,570
The coefficient for the lag one value is 0.5.

58
00:05:36,080 --> 00:05:40,290
The second coefficient, for the lag two value, is minus 0.3.

59
00:05:40,910 --> 00:05:45,260
And again, we have Gaussian noise with a standard deviation of 0.1.

60
00:05:50,130 --> 00:05:55,800
From our time series plot, it is again the case that we probably cannot distinguish this from IID

61
00:05:55,800 --> 00:06:00,780
noise, although the variance in the later parts of the time series seems to be quite high.

62
00:06:04,700 --> 00:06:10,850
If we look at the PACF, we can see that there are now two non-zero lags corresponding to the two

63
00:06:10,850 --> 00:06:18,980
coefficients of our AR(2) process. The first non-zero value is positive and the second non-zero value

64
00:06:18,980 --> 00:06:19,760
is negative.

65
00:06:20,330 --> 00:06:26,390
If we were to use this plot to choose the value of p for fitting an autoregressive model, we would

66
00:06:26,390 --> 00:06:27,640
definitely choose two.

67
00:06:27,950 --> 00:06:30,400
So this confirms that this method works.
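One way to see why the PACF picks out the order is to compute it by hand. The sketch below generates the AR(2) series described above and estimates the PACF at lag k as the last coefficient of a least-squares AR(k) fit. This is a swapped-in technique for illustration, not what plot_pacf necessarily does internally, and the helper `pacf_ols` is a hypothetical name.

```python
import numpy as np

def pacf_ols(series, max_lag):
    """Hypothetical helper: PACF at lag k = last coefficient of a least-squares AR(k) fit."""
    series = np.asarray(series, dtype=float)
    series = series - series.mean()
    out = []
    for k in range(1, max_lag + 1):
        # Column j holds the series lagged by j + 1 steps, aligned with the target.
        lagged = np.column_stack(
            [series[k - j - 1 : len(series) - j - 1] for j in range(k)]
        )
        target = series[k:]
        coeffs, *_ = np.linalg.lstsq(lagged, target, rcond=None)
        out.append(coeffs[-1])
    return np.array(out)

rng = np.random.default_rng(3)       # seed chosen only for reproducibility
x2 = [0.0, 0.0]                      # first two values set to zero
for _ in range(1000):
    # AR(2): weights 0.5 and -0.3 on the two previous values, noise std 0.1
    x = 0.5 * x2[-1] - 0.3 * x2[-2] + 0.1 * rng.standard_normal()
    x2.append(x)
x2 = np.array(x2)

estimates = pacf_ols(x2, max_lag=5)  # estimates[0] is lag one, estimates[1] is lag two, ...
for lag, value in enumerate(estimates, start=1):
    print(f"lag {lag}: {value:+.3f}")
```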

68
00:06:35,740 --> 00:06:44,350
Next, we're going to generate an AR(5) process. To initialize X5, we'll create an array of all zeros. Inside

69
00:06:44,350 --> 00:06:51,190
the loop, to generate the next value of X, we will make X depend on three different lagged values rather

70
00:06:51,190 --> 00:06:53,820
than all five. In particular,

71
00:06:53,830 --> 00:07:00,070
these will be the last value at t minus one, the second-last value at t minus two, and the fifth-last

72
00:07:00,070 --> 00:07:01,830
value at t minus five.

73
00:07:02,350 --> 00:07:08,830
The weights for these will be 0.5, minus 0.3, and minus 0.6, respectively.

74
00:07:09,460 --> 00:07:14,680
Again, we'll add Gaussian noise with mean zero and standard deviation of 0.1.

75
00:07:18,420 --> 00:07:23,400
Notice that when we plot the process as a time series, it definitely looks a little different from

76
00:07:23,400 --> 00:07:30,120
IID noise and all of the previous plots. There seems to be more of a pattern and a more wavelike structure

77
00:07:30,210 --> 00:07:31,560
to the time series.

78
00:07:37,140 --> 00:07:44,970
Next, we notice something interesting with this. Even though our time series does not depend on the

79
00:07:44,990 --> 00:07:50,740
lag three and lag four values, the corresponding values in the PACF are still non-zero.

80
00:07:51,510 --> 00:07:56,910
So although we generated the process ourselves and we know that it doesn't depend on the lag three and lag

81
00:07:56,910 --> 00:08:02,070
four values, when we fit an AR(p) model, we will always include those terms.

82
00:08:02,610 --> 00:08:08,400
So this is a sign that just because you have some in-between values that don't affect the next value

83
00:08:08,400 --> 00:08:12,780
in a time series, this does not mean that their PACF values will be zero.

84
00:08:13,650 --> 00:08:14,040
All right.

85
00:08:14,050 --> 00:08:19,200
So since this lecture has been pretty long already, I will see you in the next lecture to look at the

86
00:08:19,200 --> 00:08:21,480
ACF and moving average models.
