1
00:00:00,000 --> 00:00:02,070
So let's now take
a look at how to

2
00:00:02,070 --> 00:00:04,155
implement LSTMs in code.

3
00:00:04,155 --> 00:00:06,030
Here's my model where I've added

4
00:00:06,030 --> 00:00:08,505
the second layer as an LSTM.

5
00:00:08,505 --> 00:00:13,245
I use tf.keras.layers.LSTM
to do so.

6
00:00:13,245 --> 00:00:15,645
The parameter passed
in is the number of

7
00:00:15,645 --> 00:00:17,940
outputs that I desire
from that layer,

8
00:00:17,940 --> 00:00:19,815
in this case it's 64.

9
00:00:19,815 --> 00:00:24,585
If I wrap that with
tf.keras.layers.Bidirectional,

10
00:00:24,585 --> 00:00:27,870
it will make my cell state
go in both directions.

11
00:00:27,870 --> 00:00:30,480
You'll see this when you
explore the model summary,

12
00:00:30,480 --> 00:00:32,925
which looks like this.

13
00:00:32,925 --> 00:00:34,980
We have our embedding and our

14
00:00:34,980 --> 00:00:37,335
bidirectional
containing the LSTM,

15
00:00:37,335 --> 00:00:39,795
followed by the two dense layers.

16
00:00:39,795 --> 00:00:41,810
Notice that the output from

17
00:00:41,810 --> 00:00:44,705
the bidirectional is now 128,

18
00:00:44,705 --> 00:00:47,825
even though we told
our LSTM that we wanted 64,

19
00:00:47,825 --> 00:00:51,560
the bidirectional doubles
this up to 128.

20
00:00:51,560 --> 00:00:54,360
You can also stack LSTMs like

21
00:00:54,360 --> 00:00:57,440
any other Keras layer by
using code like this.

22
00:00:57,440 --> 00:01:00,980
But when you feed an LSTM
into another one,

23
00:01:00,980 --> 00:01:03,544
you do have to set the
return_sequences=True

24
00:01:03,544 --> 00:01:06,395
parameter on
the first one.

25
00:01:06,395 --> 00:01:08,270
This ensures that the outputs of

26
00:01:08,270 --> 00:01:12,185
the LSTM match the desired
inputs of the next one.

27
00:01:12,185 --> 00:01:15,395
The summary of the model
will look like this.

28
00:01:15,395 --> 00:01:18,130
Let's look at the impact
of using an LSTM

29
00:01:18,130 --> 00:01:20,755
on the model that we looked
at in the last module,

30
00:01:20,755 --> 00:01:23,480
where we had subword tokens.