1
00:00:11,540 --> 00:00:17,030
In this lecture, I want to discuss some other ways you might generate multi-step forecasts.

2
00:00:17,030 --> 00:00:22,100
You may have noticed that the two forecasting methods we used in this section demonstrated some pretty

3
00:00:22,100 --> 00:00:23,770
extreme results.

4
00:00:23,810 --> 00:00:29,960
On one hand, we saw that one-step forecasts can seem artificially good, especially when all they do is

5
00:00:29,960 --> 00:00:32,680
just predict the last value or close to it.

6
00:00:32,750 --> 00:00:38,120
On the other hand, doing a multi-step forecast iteratively using the model's own predictions can lead

7
00:00:38,120 --> 00:00:42,050
to poor results even on simple problems like the sine wave.

8
00:00:42,050 --> 00:00:47,840
Even a nearly perfect sine wave model eventually breaks down due to the buildup of numerical precision

9
00:00:47,840 --> 00:00:48,410
errors.
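
The breakdown described above can be sketched in a few lines. This is a minimal illustration, not the model from the course: it assumes a hypothetical "nearly perfect" model that uses the exact sine wave recurrence x[t] = 2*cos(w)*x[t-1] - x[t-2] with its coefficient off by 1e-4, where that small offset stands in for any small modelling or rounding error. The names `model_coef` and `one_step` are made up for the example.

```python
import numpy as np

# A sampled sine wave obeys the exact recurrence
#   x[t] = 2*cos(w)*x[t-1] - x[t-2]
w = 0.1
series = np.sin(w * np.arange(2000))

true_coef = 2.0 * np.cos(w)
# Assumption: the "nearly perfect" model has a tiny coefficient error.
model_coef = true_coef + 1e-4


def one_step(prev1, prev2, coef=model_coef):
    """Predict the next value from the two values before it."""
    return coef * prev1 - prev2


# One-step forecasts: the model is always fed the TRUE previous values.
one_step_pred = one_step(series[1:-1], series[:-2])
one_step_err = np.abs(one_step_pred - series[2:])

# Iterated multi-step forecast: the model is fed its own predictions.
iterated = np.empty_like(series)
iterated[:2] = series[:2]  # only the first two true values are used
for i in range(2, len(series)):
    iterated[i] = one_step(iterated[i - 1], iterated[i - 2])
iterated_err = np.abs(iterated - series)

print("max one-step error:           ", one_step_err.max())
print("max iterated error, first 100:", iterated_err[:100].max())
print("max iterated error, last 100: ", iterated_err[-100:].max())
```

The one-step error stays tiny because every prediction restarts from true data, while the iterated forecast slowly drifts out of phase with the true series, so its error keeps growing with the horizon.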

10
00:00:53,510 --> 00:00:54,230
At this point,

11
00:00:54,230 --> 00:00:57,260
one issue we have to think about is practicality.

12
00:00:57,260 --> 00:01:01,310
In fact, it's not necessarily wrong to predict only one step ahead.

13
00:01:01,340 --> 00:01:06,740
It's certainly wrong when you want to make a conclusion such as "Wow, look how good my model is at predicting

14
00:01:06,740 --> 00:01:08,060
future values."

15
00:01:08,090 --> 00:01:13,520
Of course, that doesn't make any sense, because you used future values to predict the future values.

16
00:01:13,520 --> 00:01:19,220
In some cases it may make sense to only want a prediction for one time step ahead.

17
00:01:19,220 --> 00:01:22,190
In other cases it may not make sense.

18
00:01:22,220 --> 00:01:27,800
For example, if you want to forecast weekly or monthly sales for the next year, it doesn't make sense

19
00:01:27,800 --> 00:01:30,950
to only do a one step forecast.

20
00:01:30,950 --> 00:01:34,910
On the other hand, maybe you just want to predict the load on your website for tomorrow,

21
00:01:35,030 --> 00:01:38,690
so you know how many Amazon machine instances to spin up.

22
00:01:38,810 --> 00:01:44,060
Either way, the method you use is highly dependent on context and the real-world constraints of your

23
00:01:44,060 --> 00:01:49,750
problem.

24
00:01:49,750 --> 00:01:54,180
It's also important to benchmark your predictions against a baseline model.

25
00:01:54,310 --> 00:01:59,500
The reason you want to do this is: what if your complicated model performs worse than your baseline

26
00:01:59,500 --> 00:02:00,610
model?

27
00:02:00,610 --> 00:02:05,350
If that happens, then you should just use your baseline model.

28
00:02:05,360 --> 00:02:11,180
As you'll see, sometimes you can even prove mathematically that your baseline model is better than your complex model.

29
00:02:11,210 --> 00:02:17,270
One way to establish a baseline is to use what is called a naive forecast. A naive forecast does the

30
00:02:17,270 --> 00:02:18,660
dumbest thing possible.

31
00:02:18,710 --> 00:02:22,410
You guessed it: it just predicts the last value of the time series.

32
00:02:22,460 --> 00:02:25,430
It's a model with zero parameters.

33
00:02:25,580 --> 00:02:30,950
One interesting fact you will learn about if you ever study finance is that stock prices very closely

34
00:02:30,950 --> 00:02:34,360
follow a random walk model. In a random walk model,

35
00:02:34,360 --> 00:02:38,000
a naive forecast is in fact the best forecast you can make.

36
00:02:38,060 --> 00:02:43,550
So by definition, if your time series follows a random walk, it doesn't matter how complex your model

37
00:02:43,550 --> 00:02:43,970
is:

38
00:02:44,060 --> 00:02:46,010
you cannot beat a naive forecast.
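
As a rough illustration of the naive baseline, the sketch below simulates a random walk and compares the naive forecast against a 5-step moving-average forecast. The moving average is just an arbitrary competitor chosen for this example, and the helper names `naive_forecast` and `moving_average_forecast` are hypothetical. "Best" here is meant in the sense of expected squared error.

```python
import numpy as np

rng = np.random.default_rng(0)
# Simulated random walk: x[t] = x[t-1] + noise
walk = np.cumsum(rng.normal(size=5000))


def naive_forecast(series):
    """Predict each value as the one before it (forecasts for series[1:])."""
    return series[:-1]


def moving_average_forecast(series, window=5):
    """Predict each value as the mean of the `window` values before it
    (forecasts for series[window:])."""
    kernel = np.ones(window) / window
    # means[i] is the mean of series[i : i + window]
    means = np.convolve(series, kernel, mode="valid")
    return means[:-1]


window = 5
targets = walk[window:]
# Align both forecasts so they predict the same targets.
naive_pred = naive_forecast(walk)[window - 1:]
ma_pred = moving_average_forecast(walk, window)

naive_mse = np.mean((targets - naive_pred) ** 2)
ma_mse = np.mean((targets - ma_pred) ** 2)
print("naive forecast MSE:  ", naive_mse)
print("moving average MSE:  ", ma_mse)
```

The naive forecast's error is just the latest noise term, whereas the moving average also carries forward part of the earlier noise terms, so on a random walk it should score worse.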

39
00:02:51,200 --> 00:02:56,840
In the time series literature, classic statistical models such as ARIMA use algorithms that predict only

40
00:02:56,840 --> 00:03:02,660
one step ahead and then use the iterative method we discussed previously in order to do multi-step forecasts

41
00:03:02,690 --> 00:03:04,290
into the future.

42
00:03:04,310 --> 00:03:09,470
One interesting question to ask is: are there any models that naturally predict multiple steps into the

43
00:03:09,470 --> 00:03:10,490
future?

44
00:03:10,490 --> 00:03:13,690
The answer is yes, and we can do so quite trivially.

45
00:03:13,850 --> 00:03:18,650
As you know, it's very easy to specify the number of outputs in your neural network model.

46
00:03:18,860 --> 00:03:23,590
You just pass a number into the dense layer for the number of outputs you want.

47
00:03:23,600 --> 00:03:28,100
So what if you want a model that predicts 12 time steps ahead simultaneously?

48
00:03:28,100 --> 00:03:30,140
In fact, that's entirely possible.

49
00:03:30,170 --> 00:03:32,930
Just make a neural network with 12 outputs.

50
00:03:32,930 --> 00:03:37,790
Then, as the target, you'll use the true values for the next 12 time steps.

51
00:03:37,850 --> 00:03:42,530
So whereas in this course the target just had one value, for a multi-head model

52
00:03:42,530 --> 00:03:44,620
you'll have a target with 12 values.

53
00:03:44,840 --> 00:03:49,130
If you're interested in this kind of thing, you might want to give this a try as an exercise.
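
One possible starting point for that exercise is sketched below. To stay self-contained it does not use a neural network library. Instead it assumes a single linear layer with 12 outputs, fit by least squares with NumPy, as a stand-in for a dense layer without an activation. The helper `make_windows` and the choice of 10 input steps are assumptions made for the example.

```python
import numpy as np


def make_windows(series, n_inputs, n_outputs):
    """Build (X, Y) pairs: X holds n_inputs past values and
    Y holds the n_outputs true values that follow them."""
    X, Y = [], []
    for start in range(len(series) - n_inputs - n_outputs + 1):
        X.append(series[start:start + n_inputs])
        Y.append(series[start + n_inputs:start + n_inputs + n_outputs])
    return np.array(X), np.array(Y)


series = np.sin(0.1 * np.arange(500))
n_inputs, n_outputs = 10, 12
X, Y = make_windows(series, n_inputs, n_outputs)
print(X.shape, Y.shape)  # each target row now has 12 values, not 1

# Stand-in for a dense layer with 12 outputs: weights plus a bias term,
# fit in one shot by least squares instead of gradient descent.
X_bias = np.hstack([X, np.ones((len(X), 1))])
W, *_ = np.linalg.lstsq(X_bias, Y, rcond=None)

# Forecast the 12 steps after the end of the series in a single pass,
# without feeding any prediction back into the model.
last_window = np.append(series[-n_inputs:], 1.0)
forecast = last_window @ W
print(forecast.shape)
```

With a real neural network the only changes would be the model itself and the training loop; the windowed targets with 12 values per row stay the same.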

54
00:03:54,300 --> 00:03:59,370
To summarize this lecture, what we wanted to do was clarify the difference between the different kinds

55
00:03:59,370 --> 00:04:00,710
of forecasts.

56
00:04:00,750 --> 00:04:06,210
The big mistake a lot of people make is they pretend they are predicting the future when in actuality

57
00:04:06,240 --> 00:04:08,610
they are only predicting one step ahead.

58
00:04:08,640 --> 00:04:14,480
It's not wrong to predict only one step ahead, but you must be clear about the results you are presenting.

59
00:04:14,520 --> 00:04:19,530
If you really want to forecast multiple steps into the future, then you are not allowed to use data from

60
00:04:19,530 --> 00:04:20,800
the future.

61
00:04:20,820 --> 00:04:26,190
Of course, there are some cases where you actually want to make one-step forecasts only.

62
00:04:26,190 --> 00:04:31,920
This is dependent on the specifics of your company and your data. Only you know that, and you must design

63
00:04:31,920 --> 00:04:34,560
your model accordingly.

64
00:04:34,570 --> 00:04:39,820
We also discussed the importance of establishing a baseline, which in time series analysis is often the

65
00:04:39,820 --> 00:04:46,900
naive forecast, where all we do is predict the previous value of the series. I mentioned that for a random walk

66
00:04:46,900 --> 00:04:53,380
model, which often very closely models stock prices, a naive forecast is in fact the best forecast.

67
00:04:53,380 --> 00:04:58,690
In addition, we saw that aside from building a one-step model and then iteratively applying it to predict

68
00:04:58,690 --> 00:05:05,490
multiple steps into the future, there is an easy modification: we can simply use a neural network with multiple

69
00:05:05,520 --> 00:05:09,540
output nodes, representing the multiple steps in the future that we want to predict.
