1
00:00:01,230 --> 00:00:04,500
Welcome back to
Practical Time Series Analysis.

2
00:00:04,500 --> 00:00:09,447
We've talked in our previous lectures
about the fundamental driving mechanisms

3
00:00:09,447 --> 00:00:13,749
that give rise to stochastic processes
that create the time series that

4
00:00:13,749 --> 00:00:16,126
you might be interested in analyzing.

5
00:00:16,126 --> 00:00:20,473
We've looked at moving average processes,
autoregressive processes,

6
00:00:20,473 --> 00:00:24,500
we've included trend,
we've included seasonality.

7
00:00:24,500 --> 00:00:27,600
We're going to take a bit of a different
approach in this lecture and

8
00:00:27,600 --> 00:00:30,570
begin to talk about forecasting.

9
00:00:30,570 --> 00:00:31,885
There are many methods here.

10
00:00:31,885 --> 00:00:36,996
We'll start with a very basic method,
Simple Exponential Smoothing,

11
00:00:36,996 --> 00:00:41,870
which does enjoy widespread
application in business and industry.

12
00:00:41,870 --> 00:00:44,052
It's something that people really do.

13
00:00:44,052 --> 00:00:48,581
We're going to try to make predictions
about the future or forecasts,

14
00:00:48,581 --> 00:00:53,620
let's say, based upon data that
we already have available to us.

15
00:00:53,620 --> 00:00:58,284
So you might be interested in predicting
sales figures for the upcoming

16
00:00:58,284 --> 00:01:03,271
holiday season based upon what you've
seen over the last several seasons.

17
00:01:03,271 --> 00:01:10,222
You might be interested in ridership
on a train, a railway system.

18
00:01:10,222 --> 00:01:14,553
There are all sorts of reasons people
have for looking at what's happening, or

19
00:01:14,553 --> 00:01:19,141
making good guesses about what's likely to
happen in the future based upon data that

20
00:01:19,141 --> 00:01:20,772
you already have available.

21
00:01:20,772 --> 00:01:25,975
In this particular lecture,
we'll use Simple Exponential Smoothing.

22
00:01:25,975 --> 00:01:29,722
And you'll be able to do this
with time series data that you

23
00:01:29,722 --> 00:01:32,632
find interesting to
make a simple forecast.

24
00:01:32,632 --> 00:01:37,509
As is often the case in these lectures,
an explicit goal is, you should be able to

25
00:01:37,509 --> 00:01:42,210
explain Simple Exponential Smoothing
to a friend or a colleague.

26
00:01:42,210 --> 00:01:42,970
What is it?

27
00:01:42,970 --> 00:01:44,130
How do you do it?

28
00:01:44,130 --> 00:01:44,894
What does it do for you?

29
00:01:47,390 --> 00:01:51,971
The data set that we'll be examining
here is on London rainfall,

30
00:01:51,971 --> 00:01:58,550
primarily in the 19th century, getting
into the 20th century a little bit.

31
00:01:58,550 --> 00:02:03,288
There's a nice discussion in
A Little Book of R for Time Series,

32
00:02:03,288 --> 00:02:06,465
and you can also access the original data.

33
00:02:10,121 --> 00:02:13,690
Rather than just using
a built-in data set from R,

34
00:02:13,690 --> 00:02:18,597
let's expand a little bit and
grab these data right off the Internet.

35
00:02:18,597 --> 00:02:23,926
R has a nice facility, the scan command,
that will allow you to go to a website,

36
00:02:23,926 --> 00:02:26,724
grab some data, and store it in an array.

37
00:02:26,724 --> 00:02:31,391
Once we have our numbers available to us,
we'll create a time

38
00:02:31,391 --> 00:02:36,522
series object to get a little bit
of structure and make some calls.

39
00:02:36,522 --> 00:02:41,262
Now, if you've never thought
about rainfall in London before,

40
00:02:41,262 --> 00:02:44,980
then it's good to do even
the most vanilla things.

41
00:02:44,980 --> 00:02:47,140
Let's get a histogram
of our rainfall data.

42
00:02:47,140 --> 00:02:49,630
We'll take a look at the distribution.

43
00:02:49,630 --> 00:02:54,270
We'll almost reflexively take a look at
whether it's normally distributed or not.

44
00:02:57,190 --> 00:03:00,725
And close, not quite normally distributed.

45
00:03:00,725 --> 00:03:06,783
Looks like a systematic departure from
normality, but nothing too extreme.

46
00:03:09,581 --> 00:03:13,433
Looking at the time series as a sequence,
we look for

47
00:03:13,433 --> 00:03:17,290
the sorts of patterns we like to observe.

48
00:03:17,290 --> 00:03:21,280
If you feel that that's very difficult
to do just by looking at the sequence,

49
00:03:21,280 --> 00:03:24,400
pull up the autocorrelation function and
take a look.

50
00:03:25,720 --> 00:03:29,475
As I'm looking at that data set,
I'm kind of seeing noise.

51
00:03:29,475 --> 00:03:34,560
I really can't make myself see much
of a structure in that data set.

52
00:03:37,184 --> 00:03:39,279
But maybe it's there, and
I'm just not seeing it.

53
00:03:39,279 --> 00:03:45,282
We'll call auto.arima to see
if we can get a nice fitted model.

54
00:03:45,282 --> 00:03:49,381
And even auto.arima says, no, sorry.

55
00:03:49,381 --> 00:03:55,464
The coefficients, the autoregressive
moving average coefficients, nothing.

56
00:03:55,464 --> 00:03:59,353
But we do get an average so
there is a model.

57
00:03:59,353 --> 00:04:00,759
It's just a very simple model.

58
00:04:00,759 --> 00:04:07,189
The model's just 24.8. So

59
00:04:07,189 --> 00:04:10,820
in light of this, we'll try to
do a little bit of forecasting.

60
00:04:12,490 --> 00:04:15,500
There are different notations
that people use here.

61
00:04:15,500 --> 00:04:18,090
And we'll look at one of the common
ones in this lecture, and

62
00:04:18,090 --> 00:04:20,890
we'll see another common
one in the next lecture.

63
00:04:20,890 --> 00:04:24,823
Here, we'll let the subscript tell
us where we'd like the forecast.

64
00:04:24,823 --> 00:04:28,739
So h is how many periods into
the future you'd like to look.

65
00:04:28,739 --> 00:04:32,341
Maybe this is Tuesday and
you'd like to look for next Tuesday.

66
00:04:32,341 --> 00:04:37,200
So h would be 7, if you have daily data.

67
00:04:37,200 --> 00:04:42,063
The superscript tells you what
data you're using when you're

68
00:04:42,063 --> 00:04:46,108
making your forecast,
data up through time step n.

69
00:04:46,108 --> 00:04:51,125
The most naive forecasting method I can
think of, is to say that what is going to

70
00:04:51,125 --> 00:04:56,229
happen tomorrow, our forecast for
tomorrow is just what was happening today.

71
00:04:56,229 --> 00:04:58,680
That's considered a naive method.

72
00:05:00,200 --> 00:05:04,714
In the notation that we've developed,
we would say x subscript n+1,

73
00:05:04,714 --> 00:05:06,423
there's the next period.

74
00:05:06,423 --> 00:05:12,380
Based upon data available at n is just
your observed value at time period n.

75
00:05:12,380 --> 00:05:19,103
Now, some data have a pretty
obvious seasonality to them.

76
00:05:19,103 --> 00:05:24,743
And we would say something like,
the forecast that we'll make for

77
00:05:24,743 --> 00:05:30,585
the next time period, n+1 based
upon data available up through and

78
00:05:30,585 --> 00:05:35,240
including time n is what was
happening one season ago.

79
00:05:35,240 --> 00:05:39,089
So if we have daily data with a weekly
pattern, capital S there would be a 7.

80
00:05:39,089 --> 00:05:43,893
Another way of thinking about
your forecasting is to say that

81
00:05:43,893 --> 00:05:48,791
what's going to
happen in the next period is just

82
00:05:48,791 --> 00:05:52,388
an average of what's happened previously.
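The three baseline forecasts just described can be sketched in a few lines. This is only an illustration, written in Python rather than the R used in this lecture. The function names and the two weeks of daily numbers are made up for the example; they are not the rainfall data.

```python
from statistics import mean

def naive_forecast(x):
    """Forecast for period n+1 is the last observed value, x_n."""
    return x[-1]

def seasonal_naive_forecast(x, season_length):
    """Forecast for period n+1 is the value one season earlier, x_{n+1-S}."""
    return x[-season_length]

def mean_forecast(x):
    """Forecast for period n+1 is the plain average of all observations."""
    return mean(x)

# Hypothetical daily data covering two weeks (S = 7).
series = [12, 15, 14, 13, 18, 25, 27, 11, 16, 15, 12, 19, 26, 28]

print("naive:", naive_forecast(series))
print("seasonal naive:", seasonal_naive_forecast(series, 7))
print("mean:", mean_forecast(series))
```

The naive forecast looks only at the last value, the seasonal version looks back S periods, and the mean treats every past value equally.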

83
00:05:52,388 --> 00:05:56,474
Simple Exponential Smoothing
tries to do a little bit

84
00:05:56,474 --> 00:05:59,390
better than just a plain old average.

85
00:05:59,390 --> 00:06:04,570
It's going to develop
a weighting of previous values.

86
00:06:04,570 --> 00:06:10,040
So in our current data set,
we're going to try to predict

87
00:06:10,040 --> 00:06:14,720
the rainfall, the London rainfall in
a future period based upon the data.

88
00:06:14,720 --> 00:06:19,600
And we'll try to be aware of
updating on our data set.

89
00:06:19,600 --> 00:06:23,300
So instead of just taking an average and
including all of the data points equally,

90
00:06:23,300 --> 00:06:27,980
what we'll try to do is try to weight
the data points that are closer to

91
00:06:27,980 --> 00:06:33,230
us a little bit more and those that
are further away a little bit less.

92
00:06:33,230 --> 00:06:34,953
We're more formal in the readings.

93
00:06:34,953 --> 00:06:38,110
In the readings,
we'll deal with geometric series and

94
00:06:38,110 --> 00:06:42,660
see that we can weight our
averages through geometric series.

95
00:06:42,660 --> 00:06:45,560
We'll also show, and
this is not very deep,

96
00:06:45,560 --> 00:06:50,300
that rather than include the infinite
number of data points, what you can do is

97
00:06:50,300 --> 00:06:56,430
treat this as a weighted average of, we'll
say, for instance, you can see right here.

98
00:06:56,430 --> 00:06:59,377
We'll start with some data value, x sub 1.

99
00:06:59,377 --> 00:07:03,017
And we'll make a forecast for
x sub 2 just based upon x sub 1;

100
00:07:03,017 --> 00:07:07,090
we have a pretty meager amount of
information as we just get started.

101
00:07:08,390 --> 00:07:13,298
Then we will say, okay, so
if you would like to make a forecast for

102
00:07:13,298 --> 00:07:18,121
time period three based upon data
available in time period two,

103
00:07:18,121 --> 00:07:21,605
let's take our previous
smooth level value,

104
00:07:21,605 --> 00:07:26,733
our previous averaged value,
give that a weighting of 1-alpha.

105
00:07:26,733 --> 00:07:33,630
But we'll update it by looking at
the freshly available data point x sub 2.

106
00:07:33,630 --> 00:07:35,510
This is the common pattern.

107
00:07:35,510 --> 00:07:39,062
We'll take alpha times
your freshest data point

108
00:07:39,062 --> 00:07:43,750
+ 1-alpha times your previous forecast.

109
00:07:43,750 --> 00:07:48,226
So if you'd like to make
a forecast about time period 4,

110
00:07:48,226 --> 00:07:52,332
you'll take alpha times your
new value at time 3 and

111
00:07:52,332 --> 00:07:58,144
add to it 1-alpha times your previous
level or your previous forecast.

112
00:07:58,144 --> 00:08:02,376
Some of us learn by writing code, and
that's what we'll try to do right now.

113
00:08:02,376 --> 00:08:05,410
It's rather simple,
we just need a for loop.

114
00:08:06,440 --> 00:08:12,152
So in our little DIY, do it yourself code,
we'll let alpha = .2.

115
00:08:12,152 --> 00:08:15,140
And that's totally
unmotivated at this point.

116
00:08:15,140 --> 00:08:17,507
We'll see how to choose a good
alpha in just a moment.

117
00:08:17,507 --> 00:08:19,870
But we'll let alpha = .2.

118
00:08:19,870 --> 00:08:24,890
We'll create a vector forecast.values,
and we'll set it to NULL.

119
00:08:24,890 --> 00:08:28,251
We're just trying to establish an
array that we can use in our loop.

120
00:08:28,251 --> 00:08:32,030
n, of course, is the length of the data
that you have available to you.

121
00:08:33,250 --> 00:08:38,276
Your first forecast is just
going to be your first data point.

122
00:08:38,276 --> 00:08:41,619
And now, we'll loop to get more forecasts.

123
00:08:41,619 --> 00:08:47,958
So we'll create forecast values as alpha
times your updated, freshly available

124
00:08:47,958 --> 00:08:55,370
data point, + (1-alpha) times your
previous forecast, your previous level.

125
00:08:55,370 --> 00:08:57,975
A little bit of formatting:
we'll use the paste command so

126
00:08:57,975 --> 00:09:00,744
it looks nice on the screen when
we actually give our forecast.
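The loop just described can be sketched as follows. This is a minimal version in Python, not the R code shown on screen. The name ses_forecasts and the three data values are assumptions made for the example, and alpha = 0.2 is the same unmotivated choice as above.

```python
def ses_forecasts(x, alpha):
    """One-step-ahead forecasts for periods 2 through n+1.

    The first forecast is the first data point. Each later forecast is
    alpha * (freshest data point) + (1 - alpha) * (previous forecast).
    """
    forecasts = [x[0]]  # forecast for period 2, based only on x_1
    for value in x[1:]:
        forecasts.append(alpha * value + (1 - alpha) * forecasts[-1])
    return forecasts

# Hypothetical data, not the London rainfall series.
data = [10.0, 20.0, 30.0]
alpha = 0.2

forecasts = ses_forecasts(data, alpha)
# The last entry is the forecast for the period after the data ends.
print(f"Forecast for period {len(data) + 1}: {forecasts[-1]:.2f}")
```

Only the previous forecast and the newest data point are needed at each step, which is why a single loop is enough.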

127
00:09:03,264 --> 00:09:09,441
So for the year 1913,
based upon data available up through,

128
00:09:09,441 --> 00:09:16,519
including the year 1912,
the forecast value using the unmotivated,

129
00:09:16,519 --> 00:09:22,605
almost random alpha of 0.2
would be 25.3 inches of rain.

130
00:09:22,605 --> 00:09:25,450
But let's drill down on this a little.

131
00:09:25,450 --> 00:09:28,138
How could you choose alpha intelligently?

132
00:09:28,138 --> 00:09:32,050
How much weighting do you want to give
to values that are close at hand?

133
00:09:32,050 --> 00:09:35,990
And how much weighting do you want to
give to values that were further away?

134
00:09:35,990 --> 00:09:40,670
In this particular data set,
it looks like the best alpha, best in

135
00:09:40,670 --> 00:09:46,290
terms of making our sum of squared
errors, or SSE, as small as possible.

136
00:09:46,290 --> 00:09:49,420
Best alpha seems to be
really rather small.

137
00:09:49,420 --> 00:09:53,681
Back around, it's hard to read
off of this picture, and so

138
00:09:53,681 --> 00:09:57,616
I've blown it up here,
back around 0.024 or so.

139
00:09:57,616 --> 00:10:03,656
We use the SSE approach, so we'll make
a forecast for time period three.

140
00:10:03,656 --> 00:10:08,142
And then, we'll compare that to the actual
data point that we have available at time

141
00:10:08,142 --> 00:10:08,974
period three.

142
00:10:08,974 --> 00:10:10,942
Make a forecast at time period four,

143
00:10:10,942 --> 00:10:14,700
compare it to the actual data
point at time period four.

144
00:10:14,700 --> 00:10:17,831
Each time, we'll, of course,
take a difference, square it, and

145
00:10:17,831 --> 00:10:19,968
then add them up to get
an aggregate error.
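One way to sketch this search for alpha is a simple grid search over the SSE. This is an illustration in Python under stated assumptions, not the routine used in the lecture: the function names, the grid of 999 candidate values, and the short data set are all made up. The sum here also includes the period-two error, which does not depend on alpha and so does not change which alpha wins.

```python
def sse(x, alpha):
    """Sum of squared one-step-ahead forecast errors for a given alpha."""
    forecast = x[0]  # forecast for period 2 is the first data point
    total = 0.0
    for actual in x[1:]:
        total += (actual - forecast) ** 2  # compare forecast to actual
        forecast = alpha * actual + (1 - alpha) * forecast  # update level
    return total

def best_alpha(x, grid=None):
    """Return the alpha on the grid that gives the smallest SSE."""
    if grid is None:
        grid = [i / 1000 for i in range(1, 1000)]  # 0.001, ..., 0.999
    return min(grid, key=lambda a: sse(x, a))

# Hypothetical, steadily rising data: recent values matter most here,
# so the search should favor a large alpha.
data = [10.0, 20.0, 30.0]
print("SSE at alpha = 0.2:", sse(data, 0.2))
print("best alpha on grid:", best_alpha(data))
```

For a noisy series with no visible structure, you would expect the search to land on a small alpha instead, since averaging over many past values then beats chasing the latest one.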

146
00:10:24,043 --> 00:10:28,610
Now, this is such a common approach that,
of course, people have written routines for you.

147
00:10:28,610 --> 00:10:31,560
HoltWinters, these are names that will

148
00:10:31,560 --> 00:10:35,760
become famous to even us as we
look into the next lectures.

149
00:10:35,760 --> 00:10:40,560
HoltWinters is a routine available in R,
and

150
00:10:40,560 --> 00:10:45,530
it implements the work of
these two mathematicians.

151
00:10:45,530 --> 00:10:50,705
This is from the years 1957,
58, 1960, thereabouts.

152
00:10:50,705 --> 00:10:55,349
We'll grab the time series for rain.

153
00:10:55,349 --> 00:10:58,794
There are going to be three parameters
that we'll be keeping track of in the next

154
00:10:58,794 --> 00:10:59,749
couple of lectures.

155
00:10:59,749 --> 00:11:04,846
We'll deal with level,
trend, and seasonality.

156
00:11:04,846 --> 00:11:08,558
So this is a little
unmotivated at this point,

157
00:11:08,558 --> 00:11:13,519
but we'll turn the trend and
the seasonality flags to FALSE.

158
00:11:13,519 --> 00:11:17,260
And we'll just come up
with quick forecasts.

159
00:11:18,260 --> 00:11:24,162
We predicted, or we established,
a decent value of alpha, 0.024.

160
00:11:24,162 --> 00:11:29,010
And you can see here, a more sophisticated
routine is coming up with an alpha

161
00:11:29,010 --> 00:11:31,150
value really very close to that.

162
00:11:32,180 --> 00:11:33,830
So that's your smoothing parameter.

163
00:11:34,870 --> 00:11:36,760
You can make a prediction.

164
00:11:36,760 --> 00:11:37,310
You should do this.

165
00:11:37,310 --> 00:11:41,280
Take the code that we
developed a few screens ago.

166
00:11:41,280 --> 00:11:47,595
And instead of alpha is 0.2,
substitute this alpha value 0.02412151 and

167
00:11:47,595 --> 00:11:52,382
you should come up with the same
prediction that the routine does.

168
00:11:52,382 --> 00:11:56,702
So you should be able to come
up with the same forecasts for

169
00:11:56,702 --> 00:12:00,050
1913 as the HoltWinters routine does.

170
00:12:00,050 --> 00:12:05,562
What we see in this picture is
the smoothed average in red.

171
00:12:05,562 --> 00:12:07,733
These are all of your
forecasts right here.

172
00:12:07,733 --> 00:12:12,860
And I've superimposed that over
the general time series plot.

173
00:12:16,830 --> 00:12:21,380
At this point, you should be able to use
Simple Exponential Smoothing to make

174
00:12:21,380 --> 00:12:22,572
a simple forecast.

175
00:12:22,572 --> 00:12:25,859
And you should be able to,
in broad strokes,

176
00:12:25,859 --> 00:12:30,890
explain Simple Exponential Smoothing
to a friend or to a colleague.