So let's now take
a look at how to implement LSTMs in code. Here's my model where I've added the second layer as an LSTM. I use the tf.keras.layers.LSTM
to do so. The parameter passed
in is the number of outputs that I desire
from that layer; in this case, it's 64. If I wrap that with
tf.keras.layers.Bidirectional, the sequence is processed both forwards and backwards, so my cell state
carries context in both directions. You'll see this when you
explore the model summary, which looks like this. We have our embedding and our bidirectional
containing the LSTM, followed by the two dense layers. Notice that the output from the bidirectional layer is now 128. Even though we told
our LSTM that we wanted 64, the bidirectional wrapper concatenates the forward and backward outputs, which doubles
this up to 128. You can also stack LSTMs like any other Keras layer by
using code like this. But when you feed an LSTM
into another one, you do have to set
the return_sequences=True parameter
on the first one. This makes the first LSTM return an output for every time step rather than only the last one, so that its outputs match the
inputs the next one expects. The summary of the model
will look like this. Let's look at the impact
of using an LSTM on the model that we looked
at in the last module, where we had subword tokens.
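
As a minimal sketch of the two models described above, the code below builds the single bidirectional LSTM model and the stacked version. Only the 64-unit LSTM, the Bidirectional wrapper, and return_sequences=True come from the lecture. The vocabulary size, embedding dimension, dense layer sizes and activations, and the 32 units in the second LSTM are assumptions chosen for illustration.

```python
import tensorflow as tf

# Assumed placeholder values; the lecture does not state them here.
VOCAB_SIZE = 8000
EMBEDDING_DIM = 64

# Single bidirectional LSTM between the embedding and the two dense layers.
single_model = tf.keras.Sequential([
    tf.keras.Input(shape=(None,)),  # variable-length sequences of token ids
    tf.keras.layers.Embedding(VOCAB_SIZE, EMBEDDING_DIM),
    tf.keras.layers.Bidirectional(tf.keras.layers.LSTM(64)),
    tf.keras.layers.Dense(64, activation="relu"),    # assumed size/activation
    tf.keras.layers.Dense(1, activation="sigmoid"),  # assumed binary output
])

# Stacked version: the first LSTM must return the full sequence so the
# second LSTM receives one input per time step.
stacked_model = tf.keras.Sequential([
    tf.keras.Input(shape=(None,)),
    tf.keras.layers.Embedding(VOCAB_SIZE, EMBEDDING_DIM),
    tf.keras.layers.Bidirectional(
        tf.keras.layers.LSTM(64, return_sequences=True)),
    tf.keras.layers.Bidirectional(tf.keras.layers.LSTM(32)),  # assumed 32 units
    tf.keras.layers.Dense(64, activation="relu"),
    tf.keras.layers.Dense(1, activation="sigmoid"),
])

# Check the shapes on a dummy batch: 2 sequences of 10 token ids each.
dummy_batch = tf.zeros((2, 10), dtype=tf.int32)
embedded = tf.keras.layers.Embedding(VOCAB_SIZE, EMBEDDING_DIM)(dummy_batch)

# Default Bidirectional merge mode concatenates forward and backward
# outputs, so 64 units become 128 features.
bi_last = tf.keras.layers.Bidirectional(tf.keras.layers.LSTM(64))(embedded)
bi_seq = tf.keras.layers.Bidirectional(
    tf.keras.layers.LSTM(64, return_sequences=True))(embedded)

print(bi_last.shape)  # last step only: batch x 128
print(bi_seq.shape)   # every step: batch x time steps x 128

single_model.summary()
stacked_model.summary()
```

Without return_sequences=True, the first LSTM would hand a 2D tensor to the second one, which expects a 3D tensor of batch, time steps, and features, and Keras would raise an error when building the model.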