1
00:00:11,100 --> 00:00:16,500
So with this library, obtaining the forecast is a bit complicated. There are three arguments we're

2
00:00:16,500 --> 00:00:18,900
going to consider in the forecast function.

3
00:00:20,400 --> 00:00:23,140
The first argument is simple, which is just the horizon.

4
00:00:23,370 --> 00:00:26,130
That is, how many steps would we like to forecast?

5
00:00:26,760 --> 00:00:31,300
The second argument is called reindex, which will be the main point of confusion.

6
00:00:32,430 --> 00:00:36,810
The third argument is start, which seems like it will be the start date of your forecast.

7
00:00:37,140 --> 00:00:39,720
However, as you'll see, this is not quite true.

8
00:00:44,550 --> 00:00:50,550
OK, so let's talk about this reindex argument. In fact, the main reason we have to mention this

9
00:00:50,550 --> 00:00:52,970
is because of a new change in the arch library.

10
00:00:53,550 --> 00:00:59,370
So if you do not set this argument explicitly, you're going to get a warning which says the default

11
00:00:59,370 --> 00:01:03,080
for reindex is True. After September 2021,

12
00:01:03,300 --> 00:01:04,710
the default will change to False.

13
00:01:05,280 --> 00:01:08,520
Since we don't want to get this warning, we're going to set it explicitly.

14
00:01:09,060 --> 00:01:12,930
Of course, in order to set this argument, we have to actually know what it means.

15
00:01:17,590 --> 00:01:24,770
So what does reindex do? According to the documentation, it states whether to reindex the forecasts

16
00:01:24,790 --> 00:01:28,400
to have the same dimension as the series being forecast.

17
00:01:29,050 --> 00:01:33,260
Now, to me, this sounds very cryptic, so let's try to explain it in a simpler way.

18
00:01:34,120 --> 00:01:37,990
Basically, when you think of a forecast, you think of it like in statsmodels.

19
00:01:38,290 --> 00:01:43,390
You have your training time series and then you make a forecast which extends beyond the end of that

20
00:01:43,390 --> 00:01:44,280
time series.

21
00:01:44,830 --> 00:01:46,270
Not so with this library.

22
00:01:46,840 --> 00:01:52,120
With this library, it's possible to make a forecast starting from any point within the time series

23
00:01:52,120 --> 00:01:52,660
as well.

24
00:01:54,040 --> 00:01:59,860
For example, if your time series goes from January 2021 up to January 2022,

25
00:02:00,100 --> 00:02:03,520
it's possible to make a forecast starting at March 1st,

26
00:02:03,520 --> 00:02:04,450
2021.

27
00:02:05,230 --> 00:02:07,270
And note that this is not an in-sample

28
00:02:07,270 --> 00:02:12,190
one-step prediction, as you might expect. It's a legitimate multi-step forecast.

29
00:02:12,830 --> 00:02:14,820
So that's one thing you may not have expected.

30
00:02:15,250 --> 00:02:20,920
Of course, the latest forecast you can make corresponds to the final date in the training time series.

31
00:02:25,600 --> 00:02:31,870
OK, so here's another strange fact about how we can forecast with this library. Not only can we start

32
00:02:31,870 --> 00:02:37,330
the forecast from any date within the time series, we can actually forecast from every date within

33
00:02:37,330 --> 00:02:39,430
the time series at the same time.

34
00:02:40,030 --> 00:02:45,040
So if your data goes from 2021 up to 2022, you can potentially create a

35
00:02:45,040 --> 00:02:47,670
forecast for every day of the whole year.

36
00:02:49,080 --> 00:02:55,470
That brings us back to this reindex argument. Basically, reindex=True means you want

37
00:02:55,470 --> 00:02:59,650
the resulting data frame to have the same length as your training series.

38
00:03:00,180 --> 00:03:05,640
So if your training series went from 2021 up to 2022, your forecast data

39
00:03:05,640 --> 00:03:08,750
frame will have one row for every day of the year.

40
00:03:09,510 --> 00:03:12,220
On the other hand, if you set reindex to False,

41
00:03:12,570 --> 00:03:17,310
this means that the resulting data frame will only contain the dates for which you wanted to make a

42
00:03:17,310 --> 00:03:18,090
forecast.

43
00:03:19,170 --> 00:03:22,720
Now this is a bit abstract, so it helps to simply look at examples.

44
00:03:23,610 --> 00:03:29,340
So suppose we set reindex to True, we set the forecast horizon to five, and we set the start to

45
00:03:29,340 --> 00:03:30,210
March 6.

46
00:03:31,140 --> 00:03:34,780
Also, suppose that our data set only goes from March 1 up to March 10.

47
00:03:35,790 --> 00:03:41,910
In this case, the forecast data frame will have a row for every day from March 1 up to March 10,

48
00:03:42,060 --> 00:03:48,480
the entirety of the input time series, but only March 6 up to March 10 will be filled with actual

49
00:03:48,480 --> 00:03:49,930
numbers for the forecast.

50
00:03:50,310 --> 00:03:52,260
The other rows will be set to NaN.

51
00:03:54,300 --> 00:04:00,060
Now, let's consider what happens if you set reindex to False. In this case, you will only get the

52
00:04:00,060 --> 00:04:03,830
rows from March 6 up to March 10, and all of them will be filled.

53
00:04:04,470 --> 00:04:08,740
So as you can see in this case, the forecast goes from left to right.

54
00:04:09,330 --> 00:04:12,600
So suppose that we wanted to make a forecast starting from March 6.

55
00:04:13,080 --> 00:04:16,500
The first column in this case contains the forecast for March 7.

56
00:04:17,310 --> 00:04:20,700
The second column contains the forecast for March 8th and so on.

57
00:04:21,480 --> 00:04:25,560
Also note the column headings, which are named h.1, h.2, and so forth.
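As a rough illustration of the layout just described, here is a plain-Python sketch that mimics the rows and columns of the forecast table. It does not call the arch library and fits no model: `forecast_layout` is a hypothetical helper, the year 2021 is assumed, and each filled cell holds the date being forecast rather than a variance value.

```python
from datetime import date, timedelta


def forecast_layout(dates, start, horizon, reindex):
    """Mimic the row/column layout described above (no model is fitted)."""
    columns = [f"h.{k}" for k in range(1, horizon + 1)]
    table = {}
    for origin in dates:
        if origin >= start:
            # Column h.k is the forecast for k steps after the origin date.
            table[origin] = {
                col: origin + timedelta(days=k)
                for k, col in enumerate(columns, start=1)
            }
        elif reindex:
            # Rows before `start` are kept but left empty (NaN).
            table[origin] = {col: float("nan") for col in columns}
    return table


dates = [date(2021, 3, d) for d in range(1, 11)]  # March 1 .. March 10
full = forecast_layout(dates, date(2021, 3, 6), horizon=5, reindex=True)
trimmed = forecast_layout(dates, date(2021, 3, 6), horizon=5, reindex=False)

print(len(full), len(trimmed))           # 10 5
print(trimmed[date(2021, 3, 6)]["h.1"])  # 2021-03-07
```

In the arch library itself, the corresponding call would be along the lines of `res.forecast(horizon=5, start=..., reindex=False)` on a fitted result, where `res` is an assumed variable name.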

58
00:04:30,180 --> 00:04:35,550
Now, we've pretty much already explained the third argument, which is start. Essentially, this argument

59
00:04:35,550 --> 00:04:39,480
controls the first date from which we want to generate a forecast.

60
00:04:39,960 --> 00:04:45,280
So this date and every date after it will have a forecast, as per the previous slide.

61
00:04:45,900 --> 00:04:50,790
The only thing left to mention is that if you do not pass in this argument, there will only be a single

62
00:04:50,790 --> 00:04:53,660
forecast for the final date of your training series.

63
00:04:54,030 --> 00:04:59,460
So if your data goes from March 1 to March 10, the forecasts will be assumed to start on March 10,

64
00:05:00,060 --> 00:05:04,680
which means that the first actual forecast data point is a forecast for March 11.
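The default behaviour of start can be sketched the same way. This is plain Python that only mirrors the rule stated above, not the arch library; `forecast_origins` is a hypothetical helper and the year 2021 is assumed.

```python
from datetime import date, timedelta


def forecast_origins(dates, start=None):
    """Return the dates from which forecasts are made (hypothetical helper)."""
    if start is None:
        # No start given: only the final date of the training series is used.
        return [dates[-1]]
    # Otherwise: the start date and every date after it.
    return [d for d in dates if d >= start]


dates = [date(2021, 3, d) for d in range(1, 11)]  # March 1 .. March 10
origins = forecast_origins(dates)
first_target = origins[0] + timedelta(days=1)

print(len(origins), origins[0], first_target)  # 1 2021-03-10 2021-03-11
```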

65
00:05:09,480 --> 00:05:15,690
Now, again, as mentioned earlier, this so-called forecast is not just a single number. When we call

66
00:05:15,690 --> 00:05:20,250
the forecast function, this actually returns an object of type ARCHModelForecast.

67
00:05:20,790 --> 00:05:24,970
From this, we can access certain attributes like the mean, the variance, and so forth.

68
00:05:25,470 --> 00:05:30,840
So when you access variance, this will return a data frame of variance forecasts.

69
00:05:31,440 --> 00:05:34,710
One point of confusion is that there are two attributes for variance.

70
00:05:35,100 --> 00:05:38,820
One is called residual_variance and the other is simply called variance.

71
00:05:39,270 --> 00:05:44,580
These will only be different if you specify a mean model that can change, like an AR process.

72
00:05:45,120 --> 00:05:48,990
So for us, it doesn't matter which one we use since they will be the same.

73
00:05:53,770 --> 00:05:59,350
OK, so the final topic we have to discuss in this lecture is the difference between in-sample and

74
00:05:59,350 --> 00:06:00,670
out-of-sample predictions.

75
00:06:01,300 --> 00:06:06,910
As you recall, we've seen that we can kind of do an in-sample forecast, but this is not a true

76
00:06:06,910 --> 00:06:07,910
in-sample prediction.

77
00:06:08,620 --> 00:06:14,920
In fact, the in-sample prediction is obtained by accessing the attribute conditional_volatility through

78
00:06:14,920 --> 00:06:16,600
the ARCHModelResult object.

79
00:06:17,200 --> 00:06:20,350
As you recall, this is the object returned by calling fit.

80
00:06:21,130 --> 00:06:26,680
Also recall that in finance we normally think of volatility as the square root of variance.

81
00:06:27,490 --> 00:06:32,070
Thus, this is on a different scale than the actual forecast, which is a variance.

82
00:06:32,590 --> 00:06:37,960
So when we do the forecast, we want to take the square root so that the in-sample predictions and the

83
00:06:37,960 --> 00:06:40,060
forecasts can be on the same scale.
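To make the scale point concrete, here is a minimal sketch with made-up numbers. The values are hypothetical and nothing here comes from a fitted model; the variable names are assumptions for illustration only.

```python
import math

# Hypothetical in-sample conditional volatility (already a standard deviation).
in_sample_volatility = [1.8, 2.1, 1.9]

# Hypothetical variance forecasts for h.1 .. h.3 (made-up numbers).
variance_forecast = [4.0, 6.25, 9.0]

# Take the square root so the forecasts are on the volatility scale too.
volatility_forecast = [math.sqrt(v) for v in variance_forecast]

# Both pieces can now be compared or plotted together on one scale.
combined = in_sample_volatility + volatility_forecast

print(volatility_forecast)  # [2.0, 2.5, 3.0]
```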

84
00:06:44,830 --> 00:06:49,930
OK, so let's now summarize what we've learned about the code, since that was probably more complex than

85
00:06:49,930 --> 00:06:53,640
you expected. As we saw, there are three main steps.

86
00:06:54,100 --> 00:06:58,460
The first step is to construct our model using the high-level function arch_model.

87
00:06:59,080 --> 00:07:04,030
This takes in several arguments, such as your training time series, the type of model, the model

88
00:07:04,030 --> 00:07:06,010
orders and the error distribution.

89
00:07:07,000 --> 00:07:08,950
The second step is to fit the model.

90
00:07:09,430 --> 00:07:13,340
After fitting the model, we can also call the summary function to check the result.

91
00:07:14,350 --> 00:07:16,120
The third step is to forecast.

92
00:07:16,600 --> 00:07:23,290
We learned about three arguments for the forecast function: horizon, reindex, and start. The

93
00:07:23,290 --> 00:07:29,200
most confusing of these were reindex and start, which allow us to quote-unquote forecast starting not

94
00:07:29,200 --> 00:07:33,300
only from the end of the time series, but from any point within the time series.

95
00:07:33,940 --> 00:07:39,760
We also learned that what we are actually forecasting is not the time series itself, but its variance.

96
00:07:40,510 --> 00:07:45,030
Furthermore, we learned that the in-sample predictions are the square root of the variance.

97
00:07:45,520 --> 00:07:50,620
Thus we should also take the square root of the forecast to put them both on the same scale.
