1
00:00:00,000 --> 00:00:03,030
A common and very simple
forecasting method

2
00:00:03,030 --> 00:00:05,100
is to calculate a moving average.

3
00:00:05,100 --> 00:00:06,690
The idea here is that

4
00:00:06,690 --> 00:00:09,240
the yellow line is
a plot of the average

5
00:00:09,240 --> 00:00:10,740
of the blue values over

6
00:00:10,740 --> 00:00:13,680
a fixed period called
an averaging window,

7
00:00:13,680 --> 00:00:15,510
for example, 30 days.

8
00:00:15,510 --> 00:00:17,415
Now this nicely eliminates

9
00:00:17,415 --> 00:00:19,110
a lot of the noise and it gives

10
00:00:19,110 --> 00:00:22,710
us a curve roughly emulating
the original series,

11
00:00:22,710 --> 00:00:26,295
but it does not anticipate
trend or seasonality.

12
00:00:26,295 --> 00:00:29,010
Depending on
the current time, i.e.,

13
00:00:29,010 --> 00:00:30,690
the period after which you want

14
00:00:30,690 --> 00:00:32,594
to forecast for the future,

15
00:00:32,594 --> 00:00:36,330
it can actually end up being
worse than a naive forecast.

16
00:00:36,330 --> 00:00:38,129
In this case, for example,

17
00:00:38,129 --> 00:00:42,340
I got a mean absolute
error of about 7.14.
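
As a rough illustration of the idea so far, here is a minimal sketch in plain Python. The function names and the toy series are assumptions made up for this example; they are not the data or code behind the 7.14 figure. On a series with a steady trend, it shows the moving average lagging behind and scoring worse than a naive forecast that simply repeats the last value.

```python
def moving_average_forecast(series, window_size):
    """Forecast each step as the mean of the previous `window_size` values.

    Entry i of the result is the forecast for series[i + window_size].
    """
    forecasts = []
    for t in range(len(series) - window_size):
        window = series[t:t + window_size]
        forecasts.append(sum(window) / window_size)
    return forecasts


def mean_absolute_error(actual, predicted):
    """Average absolute difference between two equal-length sequences."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


# Hypothetical toy series with a steady upward trend (illustrative only).
series = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
window_size = 3

actual = series[window_size:]

# Moving average forecast: lags behind the trend.
forecasts = moving_average_forecast(series, window_size)
mae = mean_absolute_error(actual, forecasts)

# Naive forecast: repeat the previous value.
naive = series[window_size - 1:-1]
naive_mae = mean_absolute_error(actual, naive)

print("moving average MAE:", mae)
print("naive MAE:", naive_mae)
```

Because the window averages over older, lower values, every forecast sits below the actual value, which is why the trend hurts the moving average more than the naive forecast here.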

18
00:00:42,410 --> 00:00:46,580
One method to avoid this
is to remove the trend and

19
00:00:46,580 --> 00:00:48,605
seasonality from the time series

20
00:00:48,605 --> 00:00:50,825
with a technique
called differencing.

21
00:00:50,825 --> 00:00:53,990
So instead of studying
the time series itself,

22
00:00:53,990 --> 00:00:56,630
we study the difference
between the value at time

23
00:00:56,630 --> 00:00:59,750
T and the value at
an earlier period.

24
00:00:59,750 --> 00:01:01,700
Depending on the time
scale of your data,

25
00:01:01,700 --> 00:01:03,350
that period might be a year,

26
00:01:03,350 --> 00:01:05,165
a day, a month or whatever.

27
00:01:05,165 --> 00:01:07,265
Let's look at a year earlier.

28
00:01:07,265 --> 00:01:08,555
So for this data,

29
00:01:08,555 --> 00:01:11,225
subtracting the value at time T minus 365,

30
00:01:11,225 --> 00:01:14,015
we'll get this
difference time series

31
00:01:14,015 --> 00:01:17,255
which has no trend
and no seasonality.

32
00:01:17,255 --> 00:01:20,420
We can then use
a moving average to forecast

33
00:01:20,420 --> 00:01:24,015
this time series which
gives us these forecasts.

34
00:01:24,015 --> 00:01:25,865
But these are just forecasts

35
00:01:25,865 --> 00:01:27,770
for the difference time series,

36
00:01:27,770 --> 00:01:29,860
not the original time series.

37
00:01:29,860 --> 00:01:33,270
To get the final forecasts
for the original time series,

38
00:01:33,270 --> 00:01:38,075
we just need to add back
the value at time T minus 365,

39
00:01:38,075 --> 00:01:40,475
and we'll get these forecasts.
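
The three steps just described (difference, forecast the differences, add the past value back) can be sketched as follows. This is only an illustration under stated assumptions: the helper names are hypothetical, the toy series is noise-free, and the seasonal period is 4 steps rather than 365 so the example stays small.

```python
def difference(series, period):
    """Return series[t] - series[t - period] for every t >= period."""
    return [series[t] - series[t - period] for t in range(period, len(series))]


def moving_average_forecast(series, window_size):
    """Entry i is the mean of series[i:i + window_size]."""
    return [
        sum(series[t:t + window_size]) / window_size
        for t in range(len(series) - window_size)
    ]


# Hypothetical toy data: a pattern repeating every 4 steps plus a linear trend.
period = 4
window_size = 2
pattern = [0.0, 5.0, 10.0, 5.0]
series = [pattern[t % period] + 0.5 * t for t in range(16)]

# Step 1: remove trend and seasonality by differencing.
diff = difference(series, period)

# Step 2: forecast the differenced series with a moving average.
# diff[i] belongs to time i + period, so diff_forecasts[j] is the
# forecast for time j + window_size + period.
diff_forecasts = moving_average_forecast(diff, window_size)

# Step 3: add back the value one period earlier, i.e. series[j + window_size].
past_values = series[window_size:window_size + len(diff_forecasts)]
forecasts = [d + p for d, p in zip(diff_forecasts, past_values)]

# The values the forecasts should be compared against.
actual = series[window_size + period:]
print(forecasts)
print(actual)
```

With no noise in the toy series, the differenced series is constant, so the restored forecasts match the actual values exactly. On real data the match would only be approximate.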

40
00:01:40,475 --> 00:01:42,250
They look much
better, don't they?

41
00:01:42,250 --> 00:01:44,210
If we measure
the mean absolute error

42
00:01:44,210 --> 00:01:45,740
on the validation period,

43
00:01:45,740 --> 00:01:47,480
we get about 5.8.

44
00:01:47,480 --> 00:01:48,980
So it's slightly better than

45
00:01:48,980 --> 00:01:51,995
naive forecasting but
not tremendously better.

46
00:01:51,995 --> 00:01:53,450
You may have noticed that our

47
00:01:53,450 --> 00:01:55,160
moving average removed a lot of

48
00:01:55,160 --> 00:01:58,790
noise but our final forecasts
are still pretty noisy.

49
00:01:58,790 --> 00:02:00,665
Where does that noise come from?

50
00:02:00,665 --> 00:02:03,050
Well, that's coming from
the past values that

51
00:02:03,050 --> 00:02:05,375
we added back into our forecasts.

52
00:02:05,375 --> 00:02:08,750
So we can improve
these forecasts by also removing

53
00:02:08,750 --> 00:02:12,870
the past noise using
a moving average on that.

54
00:02:13,210 --> 00:02:17,105
If we do that, we get
much smoother forecasts.
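
One way to picture this step is to replace the single past value that gets added back with a short average around it. The sketch below is an assumption about how that could be done, not the exact method behind these results: the function names, the centred window, the period of 4 and the toy numbers are all made up for illustration.

```python
def centred_mean(series, centre, half_width):
    """Mean of the values from centre - half_width to centre + half_width."""
    window = series[centre - half_width:centre + half_width + 1]
    return sum(window) / len(window)


def add_back_smoothed(series, diff_forecasts, first_t, period, half_width):
    """Add a smoothed past value to each differenced forecast.

    diff_forecasts[j] is assumed to be the forecast of the difference at
    time first_t + j. The window around (first_t + j - period) only uses
    values before the forecast time as long as half_width < period.
    """
    return [
        d + centred_mean(series, first_t + j - period, half_width)
        for j, d in enumerate(diff_forecasts)
    ]


# Hypothetical noisy history that alternates between 10 and 16.
series = [10.0, 16.0, 10.0, 16.0, 10.0, 16.0, 10.0, 16.0, 10.0, 16.0]
period = 4
first_t = 8
diff_forecasts = [0.0, 0.0]  # pretend the differenced forecasts are flat

# Adding back the raw past values carries their noise into the forecasts.
raw = [d + series[first_t + j - period] for j, d in enumerate(diff_forecasts)]

# Adding back a smoothed past value damps that noise.
smoothed = add_back_smoothed(series, diff_forecasts, first_t, period, half_width=1)

print("raw:", raw)
print("smoothed:", smoothed)
```

The raw forecasts swing across the full range of the noisy history, while the smoothed ones stay within a narrower band, which is the effect described here.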

55
00:02:17,105 --> 00:02:19,730
In fact, this gives us
a mean absolute error over

56
00:02:19,730 --> 00:02:23,195
the validation period
of just about 4.5.

57
00:02:23,195 --> 00:02:26,405
Now that's much better than
all of the previous methods.

58
00:02:26,405 --> 00:02:29,315
In fact, since
the series is generated,

59
00:02:29,315 --> 00:02:31,970
we can compute that
a perfect model will give

60
00:02:31,970 --> 00:02:35,630
a mean absolute error of
about four due to the noise.

61
00:02:35,630 --> 00:02:37,460
Apparently, with this approach,

62
00:02:37,460 --> 00:02:39,575
we're not too far
from the optimal.

63
00:02:39,575 --> 00:02:42,500
Keep this in mind before you
rush into deep learning.

64
00:02:42,500 --> 00:02:46,200
Simple approaches can
sometimes work just fine.