1
00:00:11,110 --> 00:00:16,600
OK, so in this lecture, we are going to look at how to apply the code we just learned to a stock price

2
00:00:16,600 --> 00:00:17,500
time series.

3
00:00:18,220 --> 00:00:23,450
Now, one thing I want you to notice about this lecture is that it does not require any new code.

4
00:00:23,860 --> 00:00:25,640
Only the data set has changed.

5
00:00:26,050 --> 00:00:30,300
In fact, you should already have this data set from the previous exercises.

6
00:00:30,730 --> 00:00:36,040
So if you'd like to stop this video and try to do this yourself, please take this opportunity to do

7
00:00:36,040 --> 00:00:36,660
so now.

8
00:00:36,910 --> 00:00:38,350
Otherwise, we'll continue.

9
00:00:39,370 --> 00:00:43,810
Also note that I'm not exaggerating when I say that this did not require any new code.

10
00:00:44,210 --> 00:00:49,840
I simply took the previous script, changed the URL of the data set, and changed the column names.

11
00:00:50,260 --> 00:00:53,670
So if you're ever wondering what I mean by "all data is the same",

12
00:00:53,920 --> 00:00:55,160
this is a great example.

13
00:00:55,990 --> 00:01:01,190
However, you should be aware that this statement does not imply you will get the same results.

14
00:01:01,720 --> 00:01:07,330
In other words, just because you can forecast champagne sales with near-perfect accuracy does not mean

15
00:01:07,330 --> 00:01:09,070
you can do the same with stock prices.

16
00:01:10,240 --> 00:01:13,920
OK, so let's scroll down to the part of the code where we load in our data.

17
00:01:14,890 --> 00:01:20,380
As you recall, this CSV contains multiple stocks, so I'm going to call it df0.

18
00:01:26,420 --> 00:01:31,580
I've arbitrarily chosen IBM for this script, and so we're going to call the data frame of

19
00:01:31,580 --> 00:01:32,970
IBM close prices df.

20
00:01:34,130 --> 00:01:39,260
OK, so all this stuff is the same as before: we take the log, we compute the difference, do a

21
00:01:39,260 --> 00:01:40,850
train-test split, and so forth.
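
The preprocessing steps just described can be sketched roughly as follows. This is a minimal sketch using only the standard library, not the script from the lecture: the price values are made up, and `n_test` is an assumed hold-out size.

```python
import math

# Made-up close prices; in the lecture these would be the IBM close prices.
prices = [100.0, 101.0, 99.5, 102.0, 103.5, 103.0, 104.2, 105.0]

# Take the log, then the first difference. The result is the log returns.
log_prices = [math.log(p) for p in prices]
log_returns = [b - a for a, b in zip(log_prices, log_prices[1:])]

# Chronological train-test split: the test set is the most recent data,
# and nothing is shuffled.
n_test = 2  # assumed hold-out size, not taken from the lecture
train, test = log_returns[:-n_test], log_returns[-n_test:]

print(len(log_returns), len(train), len(test))
```

Differencing costs one observation, so there is one fewer return than there are prices.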

22
00:01:42,140 --> 00:01:47,540
Now, remember, these are stock prices, so we pretty much have to make it stationary unless we want

23
00:01:47,540 --> 00:01:49,490
a model that just predicts the last value.

24
00:01:50,210 --> 00:01:53,060
OK, so let's look at how our linear model performs.

25
00:01:56,710 --> 00:02:02,130
As you can see, the train R-squared is nearly zero and the test R-squared is worse than zero.
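
If a negative R-squared looks odd, here is a small sketch of how it can happen. The function below uses the usual definition, one minus the residual sum of squares over the total sum of squares; the return values and predictions are made up for illustration.

```python
def r2_score(y_true, y_pred):
    """R-squared: 1 - (residual sum of squares / total sum of squares)."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

# Made-up test-set returns.
y_test = [0.01, -0.02, 0.015, -0.005]

# Always predicting the mean of the targets gives an R-squared of zero.
mean_pred = [sum(y_test) / len(y_test)] * len(y_test)

# Predictions that are worse than the mean give a negative R-squared.
bad_pred = [-0.01, 0.02, -0.015, 0.005]

print(r2_score(y_test, mean_pred))
print(r2_score(y_test, bad_pred))
```

So a test R-squared below zero means the model did worse than simply predicting the average return.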

26
00:02:13,470 --> 00:02:16,260
Let's now look at a plot of the one-step forecast.

27
00:02:20,730 --> 00:02:26,640
OK, so as you can see, it simply lags the input time series, which makes sense given what we know.

28
00:02:29,770 --> 00:02:33,370
OK, so let's look at the multi-step linear model forecast.

29
00:02:42,150 --> 00:02:47,610
And so we see that the multi-step forecast pretty much follows a straight line, which makes sense

30
00:02:47,610 --> 00:02:48,690
given what we know.
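
One way to see why the line comes out straight: if the fitted coefficients on past returns are close to zero, every predicted return is close to the same constant, and adding a constant at each step gives a straight line in log price. The sketch below assumes a hypothetical two-lag linear model with made-up coefficients; none of these numbers come from the lecture.

```python
# Hypothetical fitted linear model on log returns:
# next_return = intercept + coefs[0] * r[t] + coefs[1] * r[t-1]
intercept = 0.0005
coefs = [0.02, -0.01]

window = [0.01, -0.008]  # last two observed returns, oldest first (made up)
log_price = 4.6          # last observed log price (made up)

path = []
for _ in range(10):
    # Predict the next return from the two most recent values.
    pred = intercept + coefs[0] * window[-1] + coefs[1] * window[-2]
    # Feed the prediction back in as the newest input.
    window = [window[-1], pred]
    # Undo the differencing by accumulating the predicted return.
    log_price += pred
    path.append(log_price)

# Step sizes along the forecast path settle to a constant slope.
steps = [b - a for a, b in zip(path, path[1:])]
print(steps[-3:])
```

With coefficients this small, the influence of the initial inputs dies out within a few steps and the slope settles at `intercept / (1 - sum(coefs))`.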

31
00:02:52,480 --> 00:02:54,970
So now let's do our multi-output forecast.

32
00:03:04,030 --> 00:03:09,430
Again, the R-squared is only slightly better than zero, which suggests that our data is just noise.

33
00:03:17,120 --> 00:03:21,610
And so we see that the multi-output forecast pretty much follows the same pattern.

34
00:03:26,310 --> 00:03:30,100
So interestingly, when we check the MAPE, it doesn't seem too bad.

35
00:03:30,570 --> 00:03:33,530
This should tell you that these numbers take some interpretation.
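
To illustrate why a low MAPE needs interpretation, here is a sketch that assumes the MAPE is measured on price levels. It scores a naive forecast that just repeats the previous price; the prices are made up.

```python
# Made-up close prices.
prices = [100.0, 101.0, 99.5, 102.0, 103.5, 103.0, 104.2, 105.0]

actual = prices[1:]   # what actually happened
naive = prices[:-1]   # naive forecast: tomorrow's price equals today's

# Mean absolute percentage error of the naive forecast.
mape = sum(abs(a - f) / abs(a) for a, f in zip(actual, naive)) / len(actual)

print(f"naive MAPE: {mape:.2%}")
```

The error is small in percentage terms even though this forecast has no predictive skill, because prices rarely move by more than a few percent from one step to the next.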

36
00:03:36,480 --> 00:03:43,170
OK, so the next step is to try our non-linear models. Perhaps there is some non-linear relationship.

37
00:03:45,110 --> 00:03:47,000
OK, so let's check out the SVR.

38
00:03:49,080 --> 00:03:51,420
So the SVR pretty much does the same thing.

39
00:03:54,880 --> 00:03:56,410
Now let's see the random forest.

40
00:04:05,640 --> 00:04:07,350
Again, pretty much the same thing.

41
00:04:08,250 --> 00:04:11,520
Notice how well the random forest overfits the train set.

42
00:04:18,610 --> 00:04:21,910
OK, now let's have a look at our multi-output forecast.

43
00:04:24,990 --> 00:04:27,530
And we see that we get pretty much the same results.

44
00:04:27,570 --> 00:04:28,630
No surprise.

45
00:04:30,030 --> 00:04:32,190
Now, what is the lesson of this example?

46
00:04:33,060 --> 00:04:37,830
Well, you may have been hopeful that perhaps it was only linear models that would fail at predicting

47
00:04:37,830 --> 00:04:38,870
stock returns.

48
00:04:39,570 --> 00:04:43,730
Maybe you hoped that there were some non-linear patterns waiting to be exploited.

49
00:04:44,160 --> 00:04:47,100
But so far, it turns out that this has not been the case.
