1
00:00:11,060 --> 00:00:16,160
In this video, we will continue looking at our VARMA notebook for predicting temperatures.

2
00:00:17,000 --> 00:00:23,510
To recap, what we've done so far is loaded in our data, which wasn't in the T by D format we expected.

3
00:00:23,690 --> 00:00:29,480
So we did some work to convert it into that format and we plotted the data to see what it looks like.

4
00:00:29,810 --> 00:00:33,070
The next step is to split our data into train and test.

5
00:00:33,080 --> 00:00:37,010
So I've chosen Ntest equals 12, which represents one year.
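
A minimal sketch of this split, assuming monthly data in a T by D DataFrame. The synthetic frame and the column names `CityA` and `CityB` are hypothetical stand-ins, not the notebook's actual data.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the notebook's T x D DataFrame:
# 48 monthly temperature readings for two cities.
months = np.arange(48)
df = pd.DataFrame(
    {
        "CityA": 60 + 10 * np.sin(2 * np.pi * months / 12),
        "CityB": 40 + 20 * np.sin(2 * np.pi * months / 12),
    },
    index=pd.date_range("2000-01-01", periods=48, freq="MS"),
)

Ntest = 12  # hold out the last 12 monthly observations (one year)

# Time series must be split in order: the test set is the final Ntest rows.
train = df.iloc[:-Ntest].copy()
test = df.iloc[-Ntest:].copy()

print(train.shape, test.shape)
```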

6
00:00:41,230 --> 00:00:43,360
The next step is to scale our data.

7
00:00:43,690 --> 00:00:48,820
As you recall, the temperatures are not on the same scale and they're also pretty large Fahrenheit

8
00:00:48,820 --> 00:00:49,840
values.

9
00:00:50,380 --> 00:00:56,260
So you can see that I've used fit_transform on the train set and transform on the test set. The new

10
00:00:56,260 --> 00:00:56,500
scaled

11
00:00:56,500 --> 00:01:01,120
columns are given a new name, which is the old name prefixed with "Scaled".

12
00:01:04,840 --> 00:01:07,210
The next step is to scale the other column.
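
Here is a rough sketch of the scaling step for both columns, assuming scikit-learn's `StandardScaler`. The data and the names `CityA`, `CityB`, and the `Scaled` prefix are assumptions based on the narration, not the notebook's exact code.

```python
import numpy as np
import pandas as pd
from sklearn.preprocessing import StandardScaler

# Hypothetical two-city monthly temperature data on different scales.
months = np.arange(48)
df = pd.DataFrame(
    {
        "CityA": 60 + 10 * np.sin(2 * np.pi * months / 12),
        "CityB": 40 + 20 * np.sin(2 * np.pi * months / 12),
    },
    index=pd.date_range("2000-01-01", periods=48, freq="MS"),
)
Ntest = 12
train = df.iloc[:-Ntest].copy()
test = df.iloc[-Ntest:].copy()

for col in ["CityA", "CityB"]:
    scaler = StandardScaler()
    # Fit on the train set only, so no test-set statistics leak into training.
    train[f"Scaled{col}"] = scaler.fit_transform(train[[col]]).ravel()
    # Reuse the train-set mean and standard deviation on the test set.
    test[f"Scaled{col}"] = scaler.transform(test[[col]]).ravel()

print(train[["ScaledCityA", "ScaledCityB"]].describe())
```

Using one scaler per column keeps each city's mean and standard deviation separate, which is what puts the two series on the same scale.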

13
00:01:11,640 --> 00:01:16,200
The next step is to create train_idx and test_idx, which you've seen before.

14
00:01:16,620 --> 00:01:22,200
As you recall, we use these to index the original data frame for both the train and test sets.

15
00:01:26,260 --> 00:01:30,760
The next step is to also put the scaled data back into the original data frame.

16
00:01:31,240 --> 00:01:34,270
As you recall, train and test are just copies.
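
One way to sketch these two steps, assuming boolean masks over the original DataFrame's index. The column name `Temp` and the manual standardization are hypothetical, used only to keep the example short.

```python
import numpy as np
import pandas as pd

# Hypothetical single-column stand-in for the original DataFrame.
months = np.arange(48)
df = pd.DataFrame(
    {"Temp": 60 + 10 * np.sin(2 * np.pi * months / 12)},
    index=pd.date_range("2000-01-01", periods=48, freq="MS"),
)
Ntest = 12
train = df.iloc[:-Ntest].copy()
test = df.iloc[-Ntest:].copy()

# Boolean masks that select the train and test rows of the original frame.
train_idx = df.index <= train.index[-1]
test_idx = ~train_idx

# Scaled values live on the copies (standardized here with train statistics).
mean, std = train["Temp"].mean(), train["Temp"].std(ddof=0)
train["ScaledTemp"] = (train["Temp"] - mean) / std
test["ScaledTemp"] = (test["Temp"] - mean) / std

# Because train and test are copies, write the scaled values back explicitly.
df.loc[train_idx, "ScaledTemp"] = train["ScaledTemp"]
df.loc[test_idx, "ScaledTemp"] = test["ScaledTemp"]

print(df.tail())
```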

17
00:01:38,940 --> 00:01:41,580
The next step is to plot the scaled columns.

18
00:01:46,960 --> 00:01:50,740
So as you can see, the two time series are now on the same scale.

19
00:01:53,240 --> 00:01:56,540
The next step is to plot the ACF of the first column.

20
00:02:00,930 --> 00:02:05,760
As you recall, we can use these to guide our choice for the number of lags in our model.

21
00:02:06,300 --> 00:02:12,210
Notice how the ACF has a strong seasonal pattern due to the fact that the time series itself has a strong

22
00:02:12,210 --> 00:02:13,350
seasonal pattern.

23
00:02:16,330 --> 00:02:18,550
The next step is to plot the PACF.

24
00:02:22,810 --> 00:02:27,940
So we see a similar pattern, except that the lags closer to zero are more significant.

25
00:02:31,340 --> 00:02:34,700
The next step is to plot the ACF for the second column.

26
00:02:39,130 --> 00:02:41,830
So again, there is a strong seasonal pattern.

27
00:02:44,350 --> 00:02:47,920
The next step is to plot the PACF of the second column.

28
00:02:52,060 --> 00:02:54,310
So again, we see a similar pattern.

29
00:02:57,780 --> 00:03:02,100
The next step is to create a VARMAX object and fit it to the time series.

30
00:03:02,700 --> 00:03:07,860
Note that because this is going to take some time, I've used datetime to measure the duration.

31
00:03:09,060 --> 00:03:13,950
Also notice that I've picked p equals 10 and q equals 10 for no particular reason.

32
00:03:14,310 --> 00:03:16,740
Feel free to try other values if you like.

33
00:03:23,320 --> 00:03:23,650
Okay.

34
00:03:23,680 --> 00:03:28,660
So you'll see that when you use a q other than zero, you'll get this estimation warning saying

35
00:03:28,660 --> 00:03:33,610
that the estimation of parameters has identification issues.

36
00:03:33,640 --> 00:03:36,310
This is because the solution is not unique.

37
00:03:36,670 --> 00:03:41,920
What you'll also notice is that due to the training methods used behind the scenes, fitting a VARMA

38
00:03:41,920 --> 00:03:44,920
model takes much longer than a pure VAR model.

39
00:03:45,820 --> 00:03:48,670
In any case, we'll continue using VARMA for now.

40
00:03:49,990 --> 00:03:52,360
So make note of how long this took to train.

41
00:03:55,200 --> 00:04:01,140
The next step is to call the get_forecast function to get the forecast for Ntest time steps.

42
00:04:04,890 --> 00:04:08,220
The next step is to demonstrate that the results object

43
00:04:08,220 --> 00:04:09,270
we got back from fitting

44
00:04:09,270 --> 00:04:15,090
the model already contains the in-sample predictions in an attribute called fittedvalues.

45
00:04:18,250 --> 00:04:20,830
As you can see, it returns a series.

46
00:04:23,390 --> 00:04:28,190
The next step is to store the predictions in the original joined_part DataFrame.

47
00:04:28,610 --> 00:04:34,640
So we use train_idx for the in-sample predictions, assigning it to be the fitted values. We use

48
00:04:34,640 --> 00:04:40,640
test_idx for the out-of-sample predictions, assigning it to be the predicted mean of the forecast.

49
00:04:44,720 --> 00:04:48,800
The next step is to plot our predictions along with the original time series.

50
00:04:49,040 --> 00:04:53,810
Note that we only plot the last 100 values to keep the plot a reasonable size.

51
00:04:59,080 --> 00:05:03,070
So as you can see, both the train and test predictions look pretty good.

52
00:05:06,170 --> 00:05:09,770
The next step is to perform all the same steps for the other city.

53
00:05:10,010 --> 00:05:13,910
So again, we store the in-sample and out-of-sample predictions.

54
00:05:13,940 --> 00:05:19,640
Note that this uses the same objects we had before, since they store the predictions for both cities.

55
00:05:23,700 --> 00:05:26,070
The next step is to plot our predictions.

56
00:05:30,570 --> 00:05:30,740
Okay.

57
00:05:31,010 --> 00:05:35,000
So as you can see, our predictions for the second city look pretty similar.

58
00:05:38,720 --> 00:05:41,720
The next step is to check the R squared of our model.

59
00:05:42,260 --> 00:05:46,640
Basically, we can just index the data frame where we stored the predictions.

60
00:05:50,070 --> 00:05:50,460
Okay.

61
00:05:50,460 --> 00:05:53,070
So the R squared for Oakland looks pretty good.

62
00:05:55,590 --> 00:05:58,380
The next step is to check the R squared for Stockholm.

63
00:06:02,140 --> 00:06:04,720
So again, the R squared looks pretty good.
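
As a rough sketch of the R squared check, assuming scikit-learn's `r2_score` and a DataFrame that already holds the actual values and the stored predictions. The data, the column names, and the fake predictions (true values shrunk by 10%) are all made up for illustration.

```python
import numpy as np
import pandas as pd
from sklearn.metrics import r2_score

# Hypothetical frame: actual scaled values for one city over 48 months.
t = np.arange(48)
df = pd.DataFrame({"ScaledA": np.sin(2 * np.pi * t / 12)})

# First 36 rows are train, last 12 are test.
train_idx = np.arange(48) < 36
test_idx = ~train_idx

# Stand-in predictions: the true values shrunk by 10% (not real model output).
df.loc[train_idx, "Train Pred A"] = 0.9 * df.loc[train_idx, "ScaledA"]
df.loc[test_idx, "Test Pred A"] = 0.9 * df.loc[test_idx, "ScaledA"]

# Index the frame with the same masks to pair targets with predictions.
train_r2 = r2_score(df.loc[train_idx, "ScaledA"], df.loc[train_idx, "Train Pred A"])
test_r2 = r2_score(df.loc[test_idx, "ScaledA"], df.loc[test_idx, "Test Pred A"])

print("Train R^2:", train_r2)
print("Test R^2:", test_r2)
```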

64
00:06:05,680 --> 00:06:11,350
Now, of course, how good these predictions really are doesn't depend on these values in isolation.

65
00:06:11,650 --> 00:06:15,670
Instead, what we should really do is compare to a baseline.

66
00:06:16,150 --> 00:06:21,940
So the reason I chose to use VARMA first is because after years of getting to know my students online,

67
00:06:21,940 --> 00:06:24,220
I believe this is what they would want to try.

68
00:06:24,610 --> 00:06:28,150
Of course, we've already seen a few disadvantages of VARMA.

69
00:06:28,540 --> 00:06:33,040
One issue is that VARMA processes are generally not identifiable.

70
00:06:33,400 --> 00:06:36,460
Another problem is that they take longer to train.

71
00:06:36,850 --> 00:06:42,220
So in the coming lecture, we'll explore alternative options and also compare their performance.