1
00:00:11,070 --> 00:00:16,680
In this lecture, we are going to discuss a very simple technique in time series analysis called the

2
00:00:16,680 --> 00:00:18,060
simple moving average.

3
00:00:18,540 --> 00:00:19,740
The way it works is this.

4
00:00:20,160 --> 00:00:24,690
We start with a time series, then suppose we have some fixed length window.

5
00:00:25,230 --> 00:00:31,140
Then at each point in the time series, we drag the window along and we calculate the sample mean of

6
00:00:31,140 --> 00:00:32,550
all the points in that window.

7
00:00:33,030 --> 00:00:34,850
This is the simple moving average.

8
00:00:35,280 --> 00:00:36,270
Let's do an example.

9
00:00:41,100 --> 00:00:47,370
Suppose we have the numbers two, zero, eight, three, six, one, one, six, five, five, and let's say our window is

10
00:00:47,370 --> 00:00:48,380
of size three.

11
00:00:48,900 --> 00:00:51,450
So let's try to calculate the simple moving average.

12
00:00:52,020 --> 00:00:57,110
The first value is going to be the average of two, zero, and eight, which is 3.33.

13
00:00:57,810 --> 00:01:03,140
The second value is going to be the average of zero, eight, and three, which is 3.67.

14
00:01:03,810 --> 00:01:08,980
The third value is going to be the average of eight, three, and six, which is 5.67.

15
00:01:09,600 --> 00:01:14,160
I would encourage you to try and calculate the rest of the values yourself in case you want a better

16
00:01:14,160 --> 00:01:15,900
understanding of how this is done.

17
00:01:16,830 --> 00:01:22,170
Note that we cannot fill in the first two values, since by definition we don't have three values in

18
00:01:22,170 --> 00:01:23,700
which to calculate an average.

19
00:01:24,210 --> 00:01:29,700
Therefore, this works like the financial return, where we leave the values we can't calculate as NaN.
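The worked example above can be checked in a few lines of plain Python. This is only a sketch: the function name `simple_moving_average` is made up for illustration, it uses just the first five numbers of the example, and it assumes the positions without a full window are filled with NaN.

```python
import math

def simple_moving_average(values, window):
    """Trailing mean at each position; NaN where the window is incomplete."""
    result = []
    for i in range(len(values)):
        if i + 1 < window:
            # Fewer than `window` points seen so far, so no average is defined.
            result.append(math.nan)
        else:
            chunk = values[i + 1 - window : i + 1]
            result.append(sum(chunk) / window)
    return result

sma = simple_moving_average([2, 0, 8, 3, 6], window=3)
print([round(x, 2) for x in sma])  # [nan, nan, 3.33, 3.67, 5.67]
```

The first two entries stay NaN, and the remaining three match the averages computed by hand.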

20
00:01:34,810 --> 00:01:40,030
So what is the purpose of the simple moving average? Although it might seem extremely simple, it can

21
00:01:40,030 --> 00:01:41,290
actually be quite useful.

22
00:01:41,800 --> 00:01:45,730
For example, suppose that we need to calculate the mean and variance of a stock's returns.

23
00:01:46,120 --> 00:01:49,670
We might use these values to characterise that stock's distribution.

24
00:01:50,620 --> 00:01:53,650
Well, we know that stock returns exhibit volatility clustering.

25
00:01:53,800 --> 00:01:55,350
So we have two options.

26
00:01:55,780 --> 00:02:01,120
One, we can use all the values in our data set to calculate the mean and variance, or two, we can just

27
00:02:01,120 --> 00:02:04,460
use the most recent values specified by the window size.

28
00:02:04,960 --> 00:02:10,450
This might give us a better estimate, since the return from six months ago or six years ago

29
00:02:10,450 --> 00:02:12,540
might not be relevant for the return today.

30
00:02:13,150 --> 00:02:16,940
Therefore, a moving average might be more useful than an overall average.
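To make the two options concrete, here is a rough sketch comparing statistics over the whole data set with statistics over only the most recent window. The return values and the window size of six are invented for illustration (a calm stretch followed by a volatile one); they are not real stock data.

```python
import statistics

# Hypothetical daily returns: a calm stretch followed by a volatile one.
returns = [0.001, -0.002, 0.001, 0.000, -0.001, 0.002,
           0.030, -0.040, 0.050, -0.030, 0.040, -0.050]
window = 6  # assumed window size, chosen only for illustration

recent = returns[-window:]  # option two: only the most recent values

overall_mean = statistics.mean(returns)
overall_var = statistics.variance(returns)  # sample variance
recent_mean = statistics.mean(recent)
recent_var = statistics.variance(recent)

print(f"overall: mean={overall_mean:.5f}, variance={overall_var:.6f}")
print(f"recent:  mean={recent_mean:.5f}, variance={recent_var:.6f}")
```

With data like this, the recent variance comes out larger than the overall variance, because the overall figure is diluted by the earlier calm period.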

31
00:02:17,890 --> 00:02:23,710
In addition, learning the basic calculation behind the simple moving average also sets us up to learn

32
00:02:23,710 --> 00:02:28,300
about the exponentially weighted moving average, which then sets us up to learn more complicated

33
00:02:28,300 --> 00:02:30,110
time series forecasting methods.

34
00:02:30,520 --> 00:02:34,870
So think of this as a stepping stone to the more advanced methods in this section.

35
00:02:39,550 --> 00:02:44,980
Let's now discuss how the simple moving average will look in code. For this, it's convenient to use

36
00:02:44,980 --> 00:02:46,860
pandas. In pandas,

37
00:02:46,870 --> 00:02:52,280
we use the rolling function to specify that we want to do some calculation on a rolling window.

38
00:02:52,810 --> 00:02:55,460
The input argument to this is the window size.

39
00:02:56,110 --> 00:03:00,730
Note that this doesn't return the simple moving average right away, since there are actually multiple

40
00:03:00,730 --> 00:03:03,560
things you might want to calculate from the rolling window.

41
00:03:03,970 --> 00:03:07,840
In fact, this just returns the rolling window itself. In pandas,

42
00:03:07,850 --> 00:03:11,880
this is a Rolling object. From the rolling window return value,

43
00:03:12,130 --> 00:03:17,470
we can calculate different kinds of statistics, such as the sum, the mean, the min, the max, the

44
00:03:17,470 --> 00:03:18,840
variance and so forth.
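As a minimal sketch of what this looks like, assuming pandas is installed, the snippet below builds a rolling window of size three over the first five numbers from the earlier example and then asks it for a few statistics. The variable names are only illustrative.

```python
import pandas as pd

s = pd.Series([2, 0, 8, 3, 6])  # first five numbers from the earlier example

rolling = s.rolling(3)          # a Rolling object, not yet a statistic
print(type(rolling).__name__)

sma = rolling.mean()            # the simple moving average
print(sma.round(2).tolist())

# Other statistics come from the same rolling window.
print(rolling.sum().tolist())
print(rolling.min().tolist())
print(rolling.max().tolist())
print(rolling.var().tolist())
```

The first two entries of each result are NaN, since a window of size three is not yet full at those positions.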

45
00:03:19,780 --> 00:03:25,030
If you have a rolling window over multiple columns, you can also calculate multi-dimensional statistics

46
00:03:25,210 --> 00:03:27,490
such as the covariance and correlation.
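For the multi-column case, here is a small sketch, again assuming pandas. The two column names and their values are made up for illustration. It computes the rolling covariance and correlation between one column and the other using the `cov` and `corr` methods of the rolling window.

```python
import pandas as pd

# Made-up returns for two hypothetical tickers, for illustration only.
df = pd.DataFrame({
    "AAA": [0.01, -0.02, 0.03, 0.00, 0.02, -0.01],
    "BBB": [0.02, -0.01, 0.02, 0.01, 0.01, -0.02],
})

window = 3  # assumed window size

# Pairwise statistics between the two columns over each window.
rolling_cov = df["AAA"].rolling(window).cov(df["BBB"])
rolling_corr = df["AAA"].rolling(window).corr(df["BBB"])

print(rolling_cov)
print(rolling_corr)
```

Calling `cov()` or `corr()` directly on `df.rolling(window)` is another option; that returns a full matrix for every window rather than a single series.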

47
00:03:28,120 --> 00:03:31,630
In the next lecture, we will apply this code to actual data.
