1
00:00:11,090 --> 00:00:17,120
So in this lecture, we will be looking at the syntax for ARCH and GARCH with the Python library called

2
00:00:17,120 --> 00:00:22,760
arch. Although there are other libraries in Python that implement GARCH, this seems to be the most

3
00:00:22,760 --> 00:00:23,330
popular.

4
00:00:23,990 --> 00:00:28,790
So this lecture will focus on showing you how to use this library so that when we look at the actual

5
00:00:28,790 --> 00:00:30,840
code, you won't be surprised.

6
00:00:32,180 --> 00:00:38,360
OK, so to begin, unlike most libraries we need on Google Colab, arch does not come preinstalled.

7
00:00:38,840 --> 00:00:42,800
However, as with most Python libraries, it's very easy to install.

8
00:00:43,010 --> 00:00:46,640
Simply use pip install arch as you normally would.

9
00:00:51,340 --> 00:00:53,750
The next step is to consider how we will use arch.

10
00:00:54,730 --> 00:00:58,990
Basically, you'll find that it's much more like statsmodels compared to typical machine learning

11
00:00:58,990 --> 00:01:00,710
libraries like scikit-learn.

12
00:01:01,360 --> 00:01:03,160
So the main steps are as follows.

13
00:01:04,090 --> 00:01:10,090
The first step is to create the model. During this stage you'll pass in the training data along with model

14
00:01:10,090 --> 00:01:12,380
hyperparameters, just like statsmodels.

15
00:01:13,000 --> 00:01:18,040
The next step is to call the fit function, which does not take any data since you already passed it

16
00:01:18,040 --> 00:01:19,540
in during the previous step.

17
00:01:20,320 --> 00:01:23,860
The third step is to use your fitted model to make a forecast.

18
00:01:24,460 --> 00:01:29,690
Now this step will require some elaboration, so that will be one of the main topics of this lecture.

19
00:01:30,520 --> 00:01:35,350
So on the next few slides, we're going to expand on each of these three steps and I'll show you the

20
00:01:35,350 --> 00:01:37,150
exact syntax you'll need.

21
00:01:41,840 --> 00:01:47,660
OK, so the first step is to create your model. For this, we're going to use a high-level function called

22
00:01:47,660 --> 00:01:48,570
arch_model.

23
00:01:49,160 --> 00:01:54,230
Note that unlike other libraries, this is not the actual name of the class, but more like a factory

24
00:01:54,230 --> 00:01:57,360
method that returns some object. At this point,

25
00:01:57,380 --> 00:02:00,870
we'll look at some, but not all, of the possible arguments.

26
00:02:01,760 --> 00:02:03,950
The first argument is the noisy time series

27
00:02:03,950 --> 00:02:07,600
you're trying to model, for example, a series of log returns.

28
00:02:08,870 --> 00:02:13,720
The next argument is for passing in exogenous data, which we will not use in this course.

29
00:02:14,690 --> 00:02:19,550
The next argument is the mean, which is not a number like it sounds like it might be, but rather a

30
00:02:19,550 --> 00:02:20,120
string.

31
00:02:20,810 --> 00:02:26,740
This string specifies the kind of model you want to use for the mean portion of your time series.

32
00:02:27,380 --> 00:02:33,530
As you recall, we can think of every time series as a mean model plus a noise model. GARCH is for the noise

33
00:02:33,530 --> 00:02:33,860
part,

34
00:02:33,860 --> 00:02:37,410
while you can really use any technique we've learned about for the mean part.

35
00:02:38,270 --> 00:02:40,470
Now you might ask, what about ARIMA?

36
00:02:41,210 --> 00:02:47,060
Unfortunately, this library for some reason does not support ARIMA as the mean model, although other

37
00:02:47,060 --> 00:02:51,370
libraries do, for example, rugarch, which is for the R language.

38
00:02:51,860 --> 00:02:56,570
Since we're not using R in this class, I don't have any plans to cover that at this time.

39
00:02:56,960 --> 00:03:01,520
But you can let me know on the Q&A if that's something you would be interested in seeing.

40
00:03:02,360 --> 00:03:08,720
So the closest thing to ARIMA that's included in this library is ARX, which is an AR model with the

41
00:03:08,720 --> 00:03:10,900
option of having exogenous inputs.

42
00:03:11,690 --> 00:03:15,830
Now for us, we will be using the default argument, which is constant.

43
00:03:16,280 --> 00:03:21,290
This makes sense since we will be looking at stock returns, which we've seen have no autoregressive

44
00:03:21,290 --> 00:03:21,890
structure.

45
00:03:22,340 --> 00:03:24,890
But feel free to try other options if you like.

46
00:03:26,760 --> 00:03:31,500
The next argument we're going to look at is vol, which represents the kind of GARCH model you want

47
00:03:31,500 --> 00:03:36,420
to use. The default is GARCH, but the simple ARCH is another option.

48
00:03:37,080 --> 00:03:41,610
Other advanced options are also possible to choose, for example, EGARCH.

49
00:03:43,830 --> 00:03:50,460
The next arguments we're going to discuss are p and q. For this library, p and q are used as we've

50
00:03:50,460 --> 00:03:52,160
defined them in this course.

51
00:03:52,650 --> 00:03:58,050
So as you recall, although some resources reverse p and q, that won't be the case this time.

52
00:03:59,010 --> 00:04:02,580
Also note that the default values for p and q are both one,

53
00:04:02,910 --> 00:04:09,000
since the most common GARCH model is the GARCH(1,1). It just so happens that this model tends to

54
00:04:09,150 --> 00:04:10,980
fit financial returns very well.
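
To make the roles of p and q concrete, here is a minimal sketch of a GARCH(1,1) process written in plain Python. It assumes the convention documented for arch, where p counts lags of the squared shock and q counts lags of the variance itself. The function name and the parameter names omega, alpha and beta are illustrative choices, not something taken from the lecture.

```python
import random


def simulate_garch11(n, omega, alpha, beta, seed=0):
    """Simulate eps_t = sigma_t * z_t, where
    sigma2_t = omega + alpha * eps_{t-1}^2 + beta * sigma2_{t-1}.

    alpha multiplies the single lagged squared shock (the p = 1 term),
    beta multiplies the single lagged variance (the q = 1 term).
    """
    rng = random.Random(seed)
    # Start from the unconditional variance (requires alpha + beta < 1).
    sigma2 = omega / (1.0 - alpha - beta)
    eps, sigma2_path = [], []
    for _ in range(n):
        z = rng.gauss(0.0, 1.0)       # standard normal shock z_t
        e = (sigma2 ** 0.5) * z       # eps_t = sigma_t * z_t
        eps.append(e)
        sigma2_path.append(sigma2)
        # Update the conditional variance for the next time step.
        sigma2 = omega + alpha * e * e + beta * sigma2
    return eps, sigma2_path


# Hypothetical parameter values, chosen only for illustration.
eps, sigma2_path = simulate_garch11(n=1000, omega=0.1, alpha=0.1, beta=0.8)
print(min(sigma2_path), max(sigma2_path))
```

A large shock raises the next variance through alpha, and beta makes that elevated variance decay slowly, which is the volatility clustering seen in financial returns.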

55
00:04:13,010 --> 00:04:16,350
The final argument I want to discuss is the dist argument.

56
00:04:17,000 --> 00:04:21,920
Now, this is going to get a bit advanced, but if you've taken courses with me before, you should

57
00:04:21,920 --> 00:04:22,960
find this familiar.

58
00:04:23,690 --> 00:04:29,930
Basically, as you recall, the way that these models are fit is by maximizing some likelihood. Of

59
00:04:29,930 --> 00:04:31,980
course, in order to have a likelihood function,

60
00:04:32,180 --> 00:04:34,340
you must have some corresponding distribution.

61
00:04:35,120 --> 00:04:36,530
We've seen that for regression,

62
00:04:36,590 --> 00:04:37,970
this is typically the normal.

63
00:04:38,660 --> 00:04:43,410
However, as you may have learned in the past, financial returns have fat tails.

64
00:04:43,910 --> 00:04:49,310
What we mean by this is financial returns can take on more extreme values than would be predicted by

65
00:04:49,310 --> 00:04:51,740
the normal distribution, which is the default.

66
00:04:52,280 --> 00:04:57,710
If you want to convince yourself of this, try plotting a histogram of stock returns against a fitted

67
00:04:57,710 --> 00:04:58,810
normal distribution.

68
00:04:59,060 --> 00:05:01,520
And you should see that there are large deviations.
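
As a rough illustration of what fat tails means, the sketch below counts how often a sample lands more than four standard deviations from its mean. It uses simulated data as a stand-in for real stock returns: a normal sample and a Student's t sample with 4 degrees of freedom. The choice of 4 degrees of freedom, the threshold of four standard deviations and the helper name tail_fraction are all assumptions made for this example.

```python
import math
import random


def tail_fraction(sample, k=4.0):
    """Fraction of points more than k sample standard deviations from the mean."""
    n = len(sample)
    mu = sum(sample) / n
    sd = math.sqrt(sum((x - mu) ** 2 for x in sample) / n)
    return sum(abs(x - mu) > k * sd for x in sample) / n


rng = random.Random(0)
n = 50_000
nu = 4  # degrees of freedom for the t sample (illustrative choice)

normal_sample = [rng.gauss(0.0, 1.0) for _ in range(n)]

# A Student's t draw is a standard normal divided by sqrt(chi-square / nu).
# A chi-square with nu degrees of freedom is a gamma with shape nu/2, scale 2.
t_sample = [
    rng.gauss(0.0, 1.0) / math.sqrt(rng.gammavariate(nu / 2, 2.0) / nu)
    for _ in range(n)
]

print("normal tail fraction:", tail_fraction(normal_sample))
print("t tail fraction:     ", tail_fraction(t_sample))
```

The t sample should show far more extreme observations than the normal sample, which is the same pattern you would look for when comparing real returns with a fitted normal.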

69
00:05:02,240 --> 00:05:07,280
Basically, the short story is, although the normal distribution is often the default choice, it's

70
00:05:07,280 --> 00:05:09,880
not actually a good fit for financial returns.

71
00:05:10,430 --> 00:05:14,530
As an example, we often find that the t distribution is a much better fit.

72
00:05:15,230 --> 00:05:19,310
So this dist argument allows us to specify a different distribution.

73
00:05:24,020 --> 00:05:30,230
OK, so finally, we are at the second step, which is to actually fit the model. This isn't so interesting

74
00:05:30,230 --> 00:05:32,870
because we can call this function without any arguments.

75
00:05:33,410 --> 00:05:39,050
As with statsmodels, this returns a result object, which you can then use to forecast and do other

76
00:05:39,050 --> 00:05:39,770
things.

77
00:05:40,910 --> 00:05:45,240
One thing you can do after you've fitted your model is you can call the summary function.

78
00:05:45,830 --> 00:05:50,610
So if you've ever done regression analysis, what we get back from this should feel pretty familiar.

79
00:05:51,200 --> 00:05:57,000
Basically, you get to see the final log likelihood, the final AIC, the model parameters and so forth.

80
00:05:57,440 --> 00:06:02,720
It also does some statistical tests so you can check whether or not the parameters are statistically

81
00:06:02,720 --> 00:06:03,620
significant.

82
00:06:08,230 --> 00:06:14,500
OK, so the interesting part of how we use GARCH is when we make our predictions. Let's think about

83
00:06:14,500 --> 00:06:15,490
this for a second.

84
00:06:16,000 --> 00:06:21,690
If we want to make a so-called prediction using GARCH, what is it that we are actually predicting?

85
00:06:22,450 --> 00:06:27,640
You might think it's epsilon of t, since that's the time series we're trying to model, but this actually

86
00:06:27,640 --> 00:06:28,680
doesn't make sense.

87
00:06:33,320 --> 00:06:36,530
OK, so why doesn't it make sense to predict epsilon of t?

88
00:06:37,340 --> 00:06:39,050
Well, epsilon of t is noise.

89
00:06:39,470 --> 00:06:44,720
This would be nearly as random as trying to predict a number sampled from the standard normal or a coin

90
00:06:44,720 --> 00:06:45,120
flip.

91
00:06:45,650 --> 00:06:52,070
In fact, it's pretty much just as hard as predicting a coin flip since, as you recall, z of t is a sample

92
00:06:52,070 --> 00:06:55,070
from the standard normal, which is symmetric around zero.

93
00:06:56,090 --> 00:06:59,060
OK, so it doesn't make sense to predict epsilon of t.

94
00:07:03,910 --> 00:07:05,950
So what does it make sense to predict?

95
00:07:06,730 --> 00:07:13,080
Well, it makes sense to predict the statistics of epsilon of t, in particular, its mean and its variance.

96
00:07:13,720 --> 00:07:19,090
Note that the mean is not so interesting since, as you recall, it's typically zero or, in our case,

97
00:07:19,090 --> 00:07:19,660
constant.

98
00:07:20,420 --> 00:07:25,270
What we really care about with GARCH, being a model for volatility, is the variance.

99
00:07:25,750 --> 00:07:32,380
As you recall, this is sigma squared of t. So when it comes time to forecast, this is really what we

100
00:07:32,380 --> 00:07:35,530
want to forecast, either sigma or sigma squared.
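
The point that the noise itself is unpredictable while its variance is informative can be checked with a small simulation. This sketch assumes epsilon of t equals sigma times a standard normal draw, and it holds sigma fixed within two made-up regimes (a calm one with sigma 1 and a turbulent one with sigma 3) purely for illustration.

```python
import random

rng = random.Random(1)
n = 20_000


def draw_eps(sigma):
    """Draw n noise values eps = sigma * z, with z standard normal."""
    return [sigma * rng.gauss(0.0, 1.0) for _ in range(n)]


def mean(xs):
    return sum(xs) / len(xs)


def variance(xs):
    m = mean(xs)
    return sum((x - m) ** 2 for x in xs) / len(xs)


calm = draw_eps(1.0)       # low-volatility regime
turbulent = draw_eps(3.0)  # high-volatility regime

# The sign of eps is about as predictable as a coin flip in either regime.
share_positive = sum(e > 0 for e in calm + turbulent) / (2 * n)

print("share of positive shocks:", share_positive)
print("mean (calm, turbulent):", mean(calm), mean(turbulent))
print("variance (calm, turbulent):", variance(calm), variance(turbulent))
```

The means and the share of positive shocks look the same in both regimes, while the variances differ sharply. That difference is exactly what a volatility forecast is trying to capture.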

101
00:07:40,290 --> 00:07:45,030
Now, since this lecture has been quite long so far, we're going to take a break and continue in the

102
00:07:45,030 --> 00:07:49,050
next lecture. The next lecture will focus on how to forecast.
