Welcome back to Practical Time Series Analysis. In our previous lectures we've talked about the fundamental driving mechanisms that give rise to the stochastic processes behind the time series that you might be interested in analyzing. We've looked at moving average processes and autoregressive processes, and we've included trend and seasonality. We're going to take a bit of a different approach in this lecture and begin to talk about forecasting. There are many methods here. We'll start with a very basic one, Simple Exponential Smoothing, which enjoys widespread application in business and industry. It's something that people really do. We're going to try to make predictions about the future, forecasts, based upon data that we already have available to us. You might be interested in predicting sales figures for the upcoming holiday season based upon what you've seen over the last several seasons, or in ridership on a railway system. People have all sorts of reasons for making good guesses about what's likely to happen in the future based upon data they already have. In this particular lecture, we'll use Simple Exponential Smoothing, and you'll be able to apply it to time series data that you find interesting to make a simple forecast. As is often the case in these lectures, an explicit goal is that you should be able to explain Simple Exponential Smoothing to a friend or a colleague: what is it, how do you do it, and what does it do for you? The data set that we'll be examining here is on London rainfall, primarily in the 19th century, getting into the 20th century a little bit. There's a nice discussion in A Little Book of R for Time Series, and you can also access the original data. Rather than just using a built-in data set from R, let's expand a little bit and grab these data right off the Internet. 
R has a nice facility, the scan command, that will allow you to go to a website, grab some data, and store it in an array. Once we have our numbers available to us, we'll create a time series object to get a little bit of structure and make some calls. Now, if you've never thought about rainfall in London before, it's good to do even the most vanilla things. Let's get a histogram of our rainfall data and take a look at the distribution. We'll almost reflexively check whether it's normally distributed or not. And it's close: not quite normally distributed, with what looks like a systematic departure from normality, but nothing too extreme. Looking at the time series as a sequence, we look for the sorts of patterns we like to observe. If you feel that's very difficult to do just by looking at the sequence, pull up the autocorrelation function and take a look. As I'm looking at this data set, I'm mostly seeing noise. I really can't make myself see much structure in it. But maybe it's there, and I'm just not seeing it. We'll call auto.arima to see if we can get a nice fitted model. And even auto.arima says, no, sorry: no autoregressive coefficients, no moving average coefficients, nothing. But we do get an average, so there is a model. It's just a very simple model: the mean, about 24.8. So in light of this, we'll try to do a little bit of forecasting. There are different notations that people use here. We'll look at one of the common ones in this lecture, and we'll see another common one in the next lecture. We'll let the subscript tell us where we'd like the forecast: h is how many periods into the future you'd like to look. Maybe this is Tuesday and you'd like to look ahead to next Tuesday; then h would be 7, if you have daily data. The superscript tells you what data you're using when you're making your forecast: data up through time step n. 
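The first-look workflow just described might be sketched as follows. In the lecture the data come straight off the web with scan; here the sketch scans from a text connection with a few illustrative numbers so it runs offline, and the URL in the comment (from A Little Book of R for Time Series) as well as the start year are assumptions:

```r
# On the real data you would scan straight from the web, e.g.
#   rain <- scan("http://robjhyndman.com/tsdldata/hurst/precip1.dat", skip = 1)
# (URL as given in "A Little Book of R for Time Series"; treat it as an assumption.)
# Here we scan from a text connection with made-up values so the sketch runs offline:
rain <- scan(textConnection("23.56 26.07 21.86 31.24 23.65"))
rain.ts <- ts(rain, start = 1813)   # annual series; start year assumed

hist(rain.ts)   # is the distribution roughly normal?
acf(rain.ts)    # any autocorrelation structure, or just noise?

# On the full data set, forecast::auto.arima(rain.ts) reports an
# ARIMA(0,0,0) model with a non-zero mean of about 24.8 -- just an average.
```
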
The most naive forecasting method I can think of is to say that our forecast for tomorrow is just what was happening today. That's considered a naive method. In the notation that we've developed, we would say x subscript n+1, there's the next period, based upon data available at time n, is just your observed value at time period n. Now, some data have a pretty obvious seasonality to them, and for those we would say something like: the forecast that we'll make for the next time period, n+1, based upon data available up through and including time n, is what was happening one season ago. So if we're dealing with weeks, capital S there would be 7. Another way of thinking about forecasting is to say that our prediction for the next period is just an average of what's happened previously. Simple Exponential Smoothing tries to do a little bit better than just a plain old average: it's going to develop a weighting of previous values. So in our current data set, we're going to try to predict the rainfall, the London rainfall, in a future period based upon the data, and we'll try to keep updating as new data arrive. So instead of just taking an average and including all of the data points equally, what we'll try to do is weight the data points that are closer to us a little bit more and those that are further away a little bit less. We're more formal in the readings, where we'll deal with geometric series and see that we can weight our averages through a geometric series. We'll also show, and this is not very deep, that rather than including an infinite number of data points, you can compute this weighted average recursively. We'll start with some data value, x sub 1, and we'll make a forecast for x sub 2 based only upon x sub 1; we have a pretty meager amount of information as we just get started. 
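The naive and seasonal-naive forecasts above are one-liners in R. This is a minimal sketch with a hypothetical week of daily values (the vector x is made up, not the rainfall data):

```r
# Naive forecast: x-hat(n+1) = x(n).  Seasonal naive: x-hat(n+1) = x(n+1-S).
x <- c(23.6, 26.1, 24.8, 27.9, 24.3, 25.7, 22.9)  # hypothetical daily data
n <- length(x)

naive.forecast <- x[n]              # tomorrow is just today
S <- 7                              # weekly seasonality
seasonal.naive <- x[n + 1 - S]      # what happened one season ago

naive.forecast     # 22.9
seasonal.naive     # 23.6
```
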
Then we will say, okay, if you would like to make a forecast for time period three based upon data available in time period two, let's take our previous smoothed level value, our previous averaged value, and give that a weighting of 1-alpha. But we'll update it by looking at the freshly available data point x sub 2. This is the common pattern: we'll take alpha times your freshest data point plus 1-alpha times your previous forecast. So if you'd like to make a forecast about time period 4, you'll take alpha times your new value at time 3 and add to it 1-alpha times your previous level, or your previous forecast. Some of us learn by writing code, and that's what we'll try to do right now. It's rather simple, we just need a for loop. So in our little DIY, do-it-yourself code, we'll let alpha = 0.2. And that's totally unmotivated at this point; we'll see how to choose a good alpha in just a moment. We'll create a vector forecast.values, and we'll set it to NULL. We're just trying to establish an array that we can use in our loop. n, of course, is the length of the data that you have available to you. Your first forecast is just going to be your first data point. And now, we'll loop to get more forecasts. So we'll create each forecast value as alpha times your freshly available data point, plus (1-alpha) times your previous forecast, your previous level. A little bit of formatting: we'll use the paste command so it looks nice on the screen when we actually give our forecast. So for the year 1913, based upon data available up through and including the year 1912, the forecast value using the unmotivated, almost random alpha of 0.2 would be 25.3 inches of rain. But let's drill down on this a little. How could you choose alpha intelligently? How much weighting do you want to give to values that are close at hand? And how much weighting do you want to give to values that are further away? 
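The do-it-yourself loop narrated above might look like this. The variable names follow the narration; the rain vector here holds a few illustrative numbers, not the full London series, so the final number differs from the 25.3 quoted for the real data:

```r
alpha <- 0.2                      # unmotivated for now; we'll tune it shortly
rain <- c(23.56, 26.07, 21.86, 31.24, 23.65, 23.88, 26.41)  # illustrative values
n <- length(rain)

forecast.values <- NULL           # establish the array we'll fill in the loop
forecast.values[1] <- rain[1]     # first forecast is just the first data point

for (i in 1:n) {
  # new forecast = alpha * freshest data point + (1 - alpha) * previous forecast
  forecast.values[i + 1] <- alpha * rain[i] + (1 - alpha) * forecast.values[i]
}

# a little formatting with paste so the forecast reads nicely on screen
paste("Forecast for next period:", round(forecast.values[n + 1], 2))
```
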
In this particular data set, it looks like the best alpha, best in terms of making our sum of squared errors, or SSE, as small as possible, is really rather small. It's hard to read off of this picture, and so I've blown it up here: back around 0.024 or so. We use the SSE approach: we'll make a forecast for time period three and compare that to the actual data point that we have available at time period three, make a forecast at time period four and compare it to the actual data point at time period four, and so on. Each time, we'll, of course, take the difference, square it, and then add the squares up to get an aggregate error. Now, this is such a common approach that, of course, people have written routines for you. HoltWinters: Holt and Winters are names that will become familiar even to us as we look into the next lectures. HoltWinters is a routine available in R, and it implements the work of these two mathematicians, from the years 1957, '58, 1960, thereabouts. We'll grab the time series for rain. There are going to be three parameters that we'll be keeping track of in the next couple of lectures: level, trend, and seasonality. So this is a little unmotivated at this point, but we'll turn the trend and the seasonality flags to FALSE, and we'll just come up with quick forecasts. We established a decent value of alpha, 0.024, and you can see here that a more sophisticated routine is coming up with an alpha value really very close to that. So that's your smoothing parameter. You can make a prediction, and you should do this: take the code that we developed a few screens ago, and instead of alpha = 0.2, substitute this alpha value 0.02412151, and you should come up with the same prediction that the routine does. So you should be able to come up with the same forecast for 1913 as the HoltWinters routine does. What we see in this picture is the smoothed average in red. These are all of your forecasts right here. 
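The SSE scan over candidate alphas and the HoltWinters call might be sketched like this, again with an assumed illustrative rain vector. Turning the beta (trend) and gamma (seasonality) flags to FALSE reduces HoltWinters to simple exponential smoothing; note the DIY SSE below starts the level at the first observation, which differs slightly from HoltWinters' own initialization:

```r
rain <- c(23.56, 26.07, 21.86, 31.24, 23.65, 23.88, 26.41)  # illustrative values

# Sum of squared one-step-ahead errors for a given alpha:
# forecast period i from data through i-1, compare to the actual x[i], square, add up.
sse <- function(alpha, x) {
  level <- x[1]                       # initial level: the first data point
  total <- 0
  for (i in 2:length(x)) {
    total <- total + (x[i] - level)^2           # squared forecast error
    level <- alpha * x[i] + (1 - alpha) * level # update the smoothed level
  }
  total
}

alphas <- seq(0.001, 0.999, by = 0.001)
best.alpha <- alphas[which.min(sapply(alphas, sse, x = rain))]

# The built-in routine: trend and seasonality switched off
fit <- HoltWinters(ts(rain), beta = FALSE, gamma = FALSE)
fit$alpha                   # the optimizer's choice of smoothing parameter
predict(fit, n.ahead = 1)   # one-step-ahead forecast
```
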
And I've superimposed that over the general time series plot. At this point, you should be able to use Simple Exponential Smoothing to make a simple forecast. And you should be able to, in broad strokes, explain Simple Exponential Smoothing to a friend or to a colleague.