1
00:00:00,000 --> 00:00:02,790
In the previous videos,
you experimented with

2
00:00:02,790 --> 00:00:05,580
using RNNs to predict
values in your sequence.

3
00:00:05,580 --> 00:00:07,890
The results were good but
they did need a bit of

4
00:00:07,890 --> 00:00:08,970
improvement as you were hitting

5
00:00:08,970 --> 00:00:11,370
strange plateaus in
your predictions.

6
00:00:11,370 --> 00:00:13,110
You experimented with using

7
00:00:13,110 --> 00:00:15,855
different hyperparameters and
you saw some improvement,

8
00:00:15,855 --> 00:00:18,030
but perhaps a better approach
would be to use

9
00:00:18,030 --> 00:00:21,150
LSTMs instead of RNNs
to see the impact.

10
00:00:21,150 --> 00:00:23,740
We'll explore that in this video.

11
00:00:23,990 --> 00:00:26,730
If you remember when
you looked at RNNs,

12
00:00:26,730 --> 00:00:28,350
they looked a little
bit like this.

13
00:00:28,350 --> 00:00:31,425
They had cells that took
batches as inputs or X,

14
00:00:31,425 --> 00:00:33,030
and they calculated a Y output

15
00:00:33,030 --> 00:00:34,545
as well as the state vector,

16
00:00:34,545 --> 00:00:36,450
that fed into the cell along with

17
00:00:36,450 --> 00:00:38,820
the next X which then
resulted in the Y,

18
00:00:38,820 --> 00:00:41,235
and the state vector and so on.

19
00:00:41,235 --> 00:00:44,360
The impact of this
is that while state

20
00:00:44,360 --> 00:00:46,745
is a factor in
subsequent calculations,

21
00:00:46,745 --> 00:00:50,350
its impact can diminish
greatly over timesteps.

22
00:00:50,350 --> 00:00:52,790
LSTMs add a cell state

23
00:00:52,790 --> 00:00:54,560
to this that keeps
a state throughout

24
00:00:54,560 --> 00:00:56,450
the life of the training so that

25
00:00:56,450 --> 00:00:58,580
the state is passed
from cell to cell,

26
00:00:58,580 --> 00:01:01,745
timestep to timestep, and
it can be better maintained.

27
00:01:01,745 --> 00:01:03,170
This means that the data

28
00:01:03,170 --> 00:01:04,880
from earlier in
the window can have

29
00:01:04,880 --> 00:01:05,930
a greater impact on

30
00:01:05,930 --> 00:01:09,515
the overall projection
than in the case of RNNs.

31
00:01:09,515 --> 00:01:12,095
The state can also
be bidirectional

32
00:01:12,095 --> 00:01:14,585
so that state moves
forwards and backwards.

33
00:01:14,585 --> 00:01:17,340
In the case of text,
this was really powerful.

34
00:01:17,340 --> 00:01:19,200
Within the prediction
of numeric sequences,

35
00:01:19,200 --> 00:01:20,430
it may or may not be, and it'll

36
00:01:20,430 --> 00:01:22,680
be interesting to
experiment with.

37
00:01:22,680 --> 00:01:25,805
I'm not going to go into
a lot of detail here but

38
00:01:25,805 --> 00:01:28,445
Andrew's videos around
LSTMs are terrific.

39
00:01:28,445 --> 00:01:29,690
From there, you can really

40
00:01:29,690 --> 00:01:32,580
understand how they
work under the hood.