1
00:00:11,080 --> 00:00:17,320
One of the most popular applications of time series analysis is in finance. Although this is not the

2
00:00:17,320 --> 00:00:22,510
only application of time series, it's one that I know a lot of you are thinking about now.

3
00:00:22,510 --> 00:00:24,580
Unfortunately, a lot of courses out there

4
00:00:24,580 --> 00:00:30,440
simply take the price of a stock and treat it like any other time series. That seems completely reasonable.

5
00:00:31,090 --> 00:00:33,680
What's the difference between one time series and another?

6
00:00:34,330 --> 00:00:38,140
Well, in this course you're going to learn that there's actually a huge difference.

7
00:00:39,070 --> 00:00:43,780
None of this is obvious at first, but I hope that throughout this course you will build your skill

8
00:00:43,780 --> 00:00:46,590
to a level necessary to understand what I'm saying.

9
00:00:47,170 --> 00:00:52,210
You might start this course with a naive perspective, but by the end of this course, this perspective

10
00:00:52,210 --> 00:00:53,120
will mature.

11
00:00:53,590 --> 00:00:59,560
If your logic right now is LSTMs plus stock price prediction equals profit, then I would consider

12
00:00:59,560 --> 00:01:02,510
you as someone who falls into this naive category.

13
00:01:03,520 --> 00:01:06,550
Now, don't consider this to be a bad thing, but a good thing.

14
00:01:07,180 --> 00:01:09,410
You are going to get the most out of this course.

15
00:01:10,090 --> 00:01:13,120
So this lecture is a financial time series primer.

16
00:01:13,510 --> 00:01:18,490
There are certain operations and definitions you simply have to know about when you're dealing with

17
00:01:18,490 --> 00:01:19,520
stock price data.

18
00:01:20,200 --> 00:01:25,120
Now, I've taught financial engineering in depth elsewhere, so you can think of this lecture

19
00:01:25,120 --> 00:01:27,160
as like a summary of those concepts.

20
00:01:31,870 --> 00:01:37,990
OK, so clearly the stock price over time is an example of a time series. It's continuous-valued,

21
00:01:37,990 --> 00:01:40,150
and it can be thought of as discrete time.

22
00:01:40,660 --> 00:01:43,540
For example, when you download stock price data from Yahoo!

23
00:01:43,540 --> 00:01:48,610
Finance, you might get daily data or hourly data in regularly spaced intervals.

24
00:01:49,000 --> 00:01:50,410
In more advanced courses,

25
00:01:50,530 --> 00:01:55,510
you can think of stock prices in continuous time, although, as mentioned, that would be a separate

26
00:01:55,510 --> 00:01:56,140
course.

27
00:02:00,750 --> 00:02:06,640
Now, in practice, what we are interested in is not the stock price, but rather the stock return.

28
00:02:07,290 --> 00:02:09,500
Let's think about the intuition behind this.

29
00:02:10,080 --> 00:02:15,500
We learned earlier that when we talk about forecasting metrics, it's nice when they are scale invariant.

30
00:02:16,020 --> 00:02:20,170
If your guess for the price of a one million dollar house is off by one thousand dollars,

31
00:02:20,220 --> 00:02:21,180
that's not so bad.

32
00:02:21,630 --> 00:02:26,640
But if your guess for the price of a five dollar coffee is off by five dollars, that's a pretty bad

33
00:02:26,640 --> 00:02:27,160
guess.

34
00:02:27,660 --> 00:02:30,720
So it's more natural to think in terms of percentages.

35
00:02:31,230 --> 00:02:36,280
The percent change tells us how much money we've made or lost on a stock that we own.

36
00:02:36,870 --> 00:02:38,510
We call this the stock return.

37
00:02:39,030 --> 00:02:40,470
The equation is very simple.

38
00:02:40,470 --> 00:02:42,240
And of course, you've seen this before.

39
00:02:42,660 --> 00:02:47,310
It's the final price minus the initial price divided by the initial price.

40
00:02:51,960 --> 00:02:58,380
Now, in practice, because we index our prices by time measured in periodic intervals, it's common

41
00:02:58,380 --> 00:03:02,400
to consider the return for each of those periods, also indexed by time.

42
00:03:03,030 --> 00:03:09,630
So we say R_t = (P_t - P_{t-1}) / P_{t-1}, where P_t is the price at time t.

43
00:03:10,350 --> 00:03:15,480
Note that this is also equal to P_t / P_{t-1} - 1.

44
00:03:16,560 --> 00:03:21,930
And also note that we sometimes call this the net return, although I usually won't make this distinction.
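
As a minimal sketch of the per-period return just described (current price minus previous price, divided by previous price), here is one way to compute it in plain Python. The function name `net_returns` and the price list are hypothetical and chosen only for illustration.

```python
def net_returns(prices):
    """Per-period net returns: (P_t - P_{t-1}) / P_{t-1} for each consecutive pair."""
    return [(p_t - p_prev) / p_prev for p_prev, p_t in zip(prices, prices[1:])]


# Hypothetical closing prices for four consecutive periods.
prices = [100.0, 110.0, 99.0, 99.0]
returns = net_returns(prices)

# Equivalent form: P_t / P_{t-1} - 1 (equal up to floating-point rounding).
returns_alt = [p_t / p_prev - 1 for p_prev, p_t in zip(prices, prices[1:])]

print(returns)
print(returns_alt)
```

Note that a series of N prices yields N - 1 returns, since the first price has no predecessor.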

45
00:03:26,690 --> 00:03:33,770
So here's one modern reason why understanding returns is important. Very often you see people considering

46
00:03:33,770 --> 00:03:40,340
investment in cryptocurrencies. They think Bitcoin is expensive because one Bitcoin costs fifty

47
00:03:40,340 --> 00:03:41,350
thousand dollars.

48
00:03:41,780 --> 00:03:45,880
On the other hand, some random cryptocurrency only costs a few cents.

49
00:03:46,310 --> 00:03:51,780
So they think that this random cryptocurrency is a good investment because it's quote unquote cheap.

50
00:03:52,220 --> 00:03:54,320
Of course, this fact is irrelevant.

51
00:03:54,680 --> 00:03:59,270
If you have one thousand dollars to spend, then you'll buy one fiftieth of a Bitcoin.

52
00:03:59,750 --> 00:04:05,280
Or if you buy random coin, then you'll buy 10000 random coins, assuming each one costs 10 cents.

53
00:04:06,350 --> 00:04:12,310
But owning ten thousand random coins doesn't give you more value than owning one fiftieth of a Bitcoin.

54
00:04:12,830 --> 00:04:18,290
If random coin goes down to five cents, you've lost 50 percent of your wealth and the value of your

55
00:04:18,290 --> 00:04:20,960
investment is now just five hundred dollars.

56
00:04:21,620 --> 00:04:26,660
If Bitcoin goes up to one hundred thousand, then you double your wealth and the value of your investment

57
00:04:26,840 --> 00:04:28,440
is now two thousand dollars.

58
00:04:28,820 --> 00:04:32,060
So, as Albert Einstein said, everything is relative.
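
The arithmetic in this example can be checked with a short sketch. The prices and the one thousand dollar budget come from the example above; the variable names are hypothetical.

```python
budget = 1_000.0  # dollars available to invest

# Illustrative prices from the example: Bitcoin at $50,000, the random coin at $0.10.
btc_price, coin_price = 50_000.0, 0.10

btc_units = budget / btc_price    # one fiftieth of a Bitcoin
coin_units = budget / coin_price  # roughly ten thousand coins

# Position values after the price moves in the example.
btc_value = btc_units * 100_000.0  # Bitcoin doubles to $100,000
coin_value = coin_units * 0.05     # the random coin halves to $0.05

# Net return on each position: what matters is the percentage change, not the unit price.
btc_return = (btc_value - budget) / budget
coin_return = (coin_value - budget) / budget

print(btc_value, coin_value)
print(btc_return, coin_return)
```

Both positions start with the same dollar value, so the number of units held says nothing about which one is the better investment.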

59
00:04:36,610 --> 00:04:41,390
Now, you recall that a common time series transformation is to take the log of the data.

60
00:04:41,950 --> 00:04:44,900
In fact, this is central to financial analysis.

61
00:04:45,400 --> 00:04:48,570
The log of the price is simply called the log price.

62
00:04:49,690 --> 00:04:55,240
Note that we typically use lowercase letters for the log variable and uppercase letters for the original.

63
00:04:59,980 --> 00:05:05,650
Now, before we get to log returns, we're going to define another kind of return called the gross return.

64
00:05:06,220 --> 00:05:09,560
The gross return is simply one plus the return from before.

65
00:05:10,210 --> 00:05:11,530
So why is this useful?

66
00:05:12,070 --> 00:05:15,710
Well, it's a convenient way to see how much our wealth has multiplied.

67
00:05:16,150 --> 00:05:21,610
So, for example, if I invested one hundred dollars and I got back one hundred twenty dollars, then

68
00:05:21,610 --> 00:05:23,740
my gross return is one point two.

69
00:05:24,280 --> 00:05:27,430
In other words, my wealth was multiplied by one point two.

70
00:05:28,360 --> 00:05:33,610
If I invested one hundred dollars and I lost twenty dollars, then I now have eighty dollars and my gross

71
00:05:33,610 --> 00:05:35,050
return is zero point eight.

72
00:05:35,500 --> 00:05:38,620
In other words, my wealth was multiplied by zero point eight.

73
00:05:39,460 --> 00:05:42,880
So a gross return less than one is a loss, and a gross return

74
00:05:42,880 --> 00:05:44,400
greater than one is a gain.
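
A small sketch of the gross return, using the one hundred dollar examples above. The helper name `gross_return` is hypothetical. The last step, chaining two periods by multiplying their gross returns, is not covered in the lecture but follows directly from reading the gross return as a wealth multiplier.

```python
import math


def gross_return(initial, final):
    """Gross return: final wealth divided by initial wealth, i.e. 1 + net return."""
    return final / initial


gain = gross_return(100.0, 120.0)  # wealth multiplied by about 1.2
loss = gross_return(100.0, 80.0)   # wealth multiplied by about 0.8

# Gross returns over consecutive periods compound by multiplication.
total = math.prod([gain, loss])

print(gain, loss, total)
```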

75
00:05:49,010 --> 00:05:52,170
The log return is simply the log of the gross return.

76
00:05:52,820 --> 00:05:56,840
So why do we take the log of the gross return and not the log of the net return?

77
00:05:57,470 --> 00:06:00,180
Well, let's see why this is the most natural thing to do.

78
00:06:00,740 --> 00:06:03,830
We can start by noticing that the net return can be negative.

79
00:06:04,160 --> 00:06:09,410
If you lose twenty dollars on a one hundred dollar investment, then your net return is minus 20 percent.

80
00:06:09,950 --> 00:06:12,740
And of course, you can't take the log of minus 20 percent.

81
00:06:13,880 --> 00:06:19,130
Furthermore, you'll recognize that the log return corresponds to the log transformation where we add

82
00:06:19,130 --> 00:06:20,720
one before taking the log.

83
00:06:21,170 --> 00:06:26,120
So this provides some intuition behind why we add one and not some other random number.

84
00:06:27,440 --> 00:06:31,850
Finally, notice how the log return is simply the difference in log prices.

85
00:06:32,300 --> 00:06:38,000
This is very convenient, since in computers, adding and subtracting is much more efficient and numerically

86
00:06:38,000 --> 00:06:40,260
stable than multiplying and dividing.

87
00:06:40,880 --> 00:06:45,770
In fact, the first difference is a very important operation in time series analysis.

88
00:06:46,110 --> 00:06:48,540
We'll see it applied again and again in this course.

89
00:06:49,040 --> 00:06:54,500
So it's kind of a happy coincidence that these financial concepts, such as taking the log and taking

90
00:06:54,500 --> 00:06:59,420
differences, happen to also be critical operations in time series analysis.
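
To make the link between log returns and differencing concrete, here is a sketch with hypothetical prices. It checks that the log of the gross return equals the first difference of the log prices, and that log returns add up across periods.

```python
import math

prices = [100.0, 120.0, 96.0]  # hypothetical prices for three periods

# Log prices (lowercase p in the usual notation).
log_prices = [math.log(p) for p in prices]

# Log return as the first difference of log prices.
log_returns = [b - a for a, b in zip(log_prices, log_prices[1:])]

# The same quantity computed as the log of the gross return P_t / P_{t-1}.
log_of_gross = [math.log(b / a) for a, b in zip(prices, prices[1:])]

# Log returns sum to the log of the total gross return over the whole span.
total_log_return = sum(log_returns)

print(log_returns)
print(log_of_gross)
print(total_log_return, math.log(prices[-1] / prices[0]))
```

The two lists agree up to floating-point rounding, which is why working with log prices turns compounding into simple addition.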

91
00:07:04,140 --> 00:07:09,270
OK, so in the next part of this lecture, we are going to take a quick look at what financial data

92
00:07:09,450 --> 00:07:10,620
actually looks like.

93
00:07:11,220 --> 00:07:17,330
The most common format for stock price data is open, high, low, close, adjusted close, and volume.

94
00:07:17,910 --> 00:07:19,470
Note again how time goes along

95
00:07:19,470 --> 00:07:22,510
the rows and different attributes go along the columns.

96
00:07:23,070 --> 00:07:24,600
So what are these attributes?

97
00:07:25,650 --> 00:07:29,400
Well, recall that each row of data corresponds to a period in time.

98
00:07:29,580 --> 00:07:34,200
For example, one day or one hour. The open price is the price

99
00:07:34,200 --> 00:07:37,440
at the beginning of the period. The close price is the price

100
00:07:37,440 --> 00:07:42,870
at the end of the period. The high price is the maximum price for the period and the low price is the

101
00:07:42,870 --> 00:07:44,400
minimum price for the period.

102
00:07:45,600 --> 00:07:48,730
Volume is the number of shares traded during the period.

103
00:07:49,260 --> 00:07:53,270
So if you've ever looked at a candlestick chart, you'll recognize these quantities.

104
00:07:53,610 --> 00:07:58,020
These charts basically give you a picture of what happened in the market for that period.

105
00:07:58,470 --> 00:08:03,360
We use the color red when the close price is less than the open price and green when the close price

106
00:08:03,360 --> 00:08:04,950
is greater than the open price.
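
As a rough sketch of how these columns relate to the underlying trades, the code below builds one bar from a hypothetical list of (price, shares) trades within a single period and picks the candlestick color. The `Bar` class, `make_bar`, and `candle_color` are illustrative names, not part of any data provider's API. Volume is summed here as shares traded, which is an assumption about how the data provider reports it.

```python
from dataclasses import dataclass


@dataclass
class Bar:
    open: float
    high: float
    low: float
    close: float
    volume: int


def make_bar(trades):
    """Build one bar from time-ordered (price, shares) trades in a single period."""
    prices = [price for price, _ in trades]
    return Bar(
        open=prices[0],     # first price of the period
        high=max(prices),   # maximum price of the period
        low=min(prices),    # minimum price of the period
        close=prices[-1],   # last price of the period
        volume=sum(shares for _, shares in trades),
    )


def candle_color(bar):
    """Red when the close is below the open, green when it is above."""
    if bar.close < bar.open:
        return "red"
    if bar.close > bar.open:
        return "green"
    return "neutral"  # close == open is not covered in the lecture


# Hypothetical trades within one period, in time order.
trades = [(10.0, 100), (10.5, 50), (9.8, 200), (10.2, 150)]
bar = make_bar(trades)
print(bar, candle_color(bar))
```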

107
00:08:09,580 --> 00:08:16,270
So what is adjusted close? Adjusted close is a special column that accounts for stock splits and dividends

108
00:08:16,270 --> 00:08:17,440
in the close price.

109
00:08:17,980 --> 00:08:22,870
Note that the close price is typically what is used for analysis, which is why there's no adjusted

110
00:08:22,870 --> 00:08:24,340
open or adjusted low.

111
00:08:24,970 --> 00:08:29,890
So basically, dividends are amounts that are paid in cash into your cash account.

112
00:08:30,460 --> 00:08:34,120
This is money you earn, but it effectively makes the stock price less.

113
00:08:34,630 --> 00:08:39,940
So the return you compute from the close price is lower if you do not account for the dividend payment.

114
00:08:40,660 --> 00:08:44,640
The net return, which takes into account dividend payments, is shown here.

115
00:08:44,830 --> 00:08:49,900
But note that we will not use this in the course since it just adds coding work without any benefit.

116
00:08:50,860 --> 00:08:55,780
We would have to spend extra effort in finding the dividend payments, which typically do not come with

117
00:08:55,780 --> 00:08:56,760
these data sets.

118
00:08:57,160 --> 00:09:00,760
You could potentially compute them on your own, but again, that takes work.

119
00:09:01,780 --> 00:09:07,510
Note that this equation makes sense because, again, D_t, the dividend, is money that you actually earn.
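
The equation on the slide is not reproduced in this transcript. A common form, assumed here, adds the dividend D_t to the final price: R_t = (P_t + D_t - P_{t-1}) / P_{t-1}. The sketch below uses that assumed form with hypothetical numbers. The function name is illustrative only.

```python
def net_return_with_dividend(p_prev, p_t, d_t=0.0):
    """Net return including a dividend d_t paid during the period (assumed form)."""
    return (p_t + d_t - p_prev) / p_prev


# Hypothetical period: price moves from 100 to 101 and a 2 dollar dividend is paid.
price_only = net_return_with_dividend(100.0, 101.0)
with_dividend = net_return_with_dividend(100.0, 101.0, d_t=2.0)

print(price_only, with_dividend)
```

Ignoring the dividend understates the return, which matches the point made above about returns computed from the close price alone.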

120
00:09:08,830 --> 00:09:14,080
Now, some resources out there suggest using the adjusted close when computing the return, which

121
00:09:14,080 --> 00:09:16,370
gives you an approximation to the true return.

122
00:09:17,320 --> 00:09:22,150
I'll leave it for you as an exercise to check whether or not they are equal. In practice,

123
00:09:22,150 --> 00:09:26,830
if you're building a trading bot, it would be my preference to use the true values and actually accumulate

124
00:09:26,830 --> 00:09:29,830
the dividends instead of using the adjusted close.

125
00:09:34,510 --> 00:09:38,360
So the final component of the adjusted close is the stock split.

126
00:09:38,920 --> 00:09:44,530
Now, I mentioned this for informational purposes, but note that it's not actually needed in our analysis

127
00:09:44,770 --> 00:09:49,630
because all stock prices in our API will already be adjusted for stock splits.

128
00:09:50,290 --> 00:09:52,680
Basically, the reason for stock splits is this.

129
00:09:53,260 --> 00:09:56,400
Imagine that a share of a stock is one hundred thousand dollars.

130
00:09:56,770 --> 00:10:01,960
This is too large for many people to afford and it's not possible to buy fractional shares.

131
00:10:02,830 --> 00:10:09,010
So to ameliorate this problem, the stock will be split, for example, with a two-for-one split or a

132
00:10:09,010 --> 00:10:09,680
three-for-one split.

133
00:10:10,540 --> 00:10:15,910
This will result in the stock price going down by a factor of two for a two-for-one split or three for

134
00:10:15,910 --> 00:10:16,800
a three-for-one split.

135
00:10:17,470 --> 00:10:22,480
If you already own shares of the stock, you'll now own two or three times more so that the value of

136
00:10:22,480 --> 00:10:23,800
what you own is the same.
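
The split mechanics can be sketched in a few lines. The function `apply_split` and the holding below are hypothetical. The point is only that the price is divided by the split ratio while the share count is multiplied by it.

```python
def apply_split(price, shares, ratio):
    """Apply a ratio-for-one split: price is divided by ratio, shares are multiplied."""
    return price / ratio, shares * ratio


# Hypothetical holding before a two-for-one split.
price, shares = 100_000.0, 3
new_price, new_shares = apply_split(price, shares, 2)

value_before = price * shares
value_after = new_price * new_shares

print(new_price, new_shares)
print(value_before, value_after)
```

The value of the holding is the same before and after, so a split on its own is neither a gain nor a loss.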

137
00:10:28,410 --> 00:10:34,410
So, as mentioned, in this course we will focus on the non-adjusted close price. If you want to do

138
00:10:34,410 --> 00:10:40,050
an exact analysis, you're always welcome to download dividend data separately using whatever API you

139
00:10:40,050 --> 00:10:41,040
normally use.

140
00:10:41,640 --> 00:10:47,100
The reason we want to use the non-adjusted close price is that, as you recall, the other columns are not

141
00:10:47,100 --> 00:10:47,720
adjusted.

142
00:10:48,000 --> 00:10:53,970
So if you want to do a multi-dimensional analysis, this is not possible using the adjusted close since

143
00:10:53,970 --> 00:10:56,910
it's not on the same scale as the open, high and low values.
