There can be a limitation when approaching
text classification in this way. Consider the following.
Here's a sentence: "Today has a beautiful blue..." What do you think would
come next? Probably "sky," right? "Today has
a beautiful blue sky." Why would you say that? Well, there's a big clue
in the word blue. In a context like this, it's quite likely that when we're talking about a beautiful
blue something, we mean a beautiful blue sky. So, the context word
that helps us understand the next word is very close to the word that
we're interested in. But what about
a sentence like this: "I lived in Ireland,
so at school they made me learn how
to speak [something]"? How would you finish
that sentence? Well, you might say "Irish," but you'd be much more
accurate if you said, "I lived in Ireland, so at school they made me learn
how to speak Gaelic." First, of course, is
the syntactic issue: Irish describes the people, Gaelic describes the language. But more important
in the ML context is the key word that gives us the details about the language. That's the word "Ireland," which appears much
earlier in the sentence. So, if we're looking
at a sequence of words, we might lose that context. With that in mind, an update
to RNNs called LSTM, long short-term memory,
has been created. In addition to the context
being passed along as it is in RNNs, LSTMs have an additional pipeline of context called cell state. This can pass through
the network to impact it. This helps keep context from
earlier tokens relevant in later ones, so issues like the one that we just
discussed can be avoided. Cell states can also
be bidirectional, so later contexts can impact earlier ones, as we'll see
when we look at the code. The details of LSTMs
are beyond the scope of this course, but
you can learn more about them in
this video from Andrew.
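
To make the cell state idea a bit more concrete, here is a minimal sketch in plain Python. It is not the code from this course, just a toy: a scalar LSTM cell using the standard gate equations, with hand-picked, hypothetical weights (the WEIGHTS dictionary and the helper names lstm_step, run, and run_bidirectional are all assumptions made for illustration). An input of 1.0 stands in for a clue word like "Ireland," and 0.0 stands in for words that carry no clue. The bidirectional helper reuses the same weights in both directions to keep things short, whereas a real bidirectional layer would typically learn separate weights for each direction.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def lstm_step(x, h_prev, c_prev, w):
    """One time step of a scalar LSTM cell (standard gate equations)."""
    f = sigmoid(w["wf"] * x + w["uf"] * h_prev + w["bf"])    # forget gate
    i = sigmoid(w["wi"] * x + w["ui"] * h_prev + w["bi"])    # input gate
    o = sigmoid(w["wo"] * x + w["uo"] * h_prev + w["bo"])    # output gate
    g = math.tanh(w["wg"] * x + w["ug"] * h_prev + w["bg"])  # candidate value
    c = f * c_prev + i * g   # cell state: keep part of the old, add some new
    h = o * math.tanh(c)     # hidden state passed to the next step
    return h, c


# Hypothetical, hand-picked weights (not trained values).
WEIGHTS = {
    "wf": 0.0, "uf": 0.0, "bf": 5.0,    # forget gate near 1: hold on to state
    "wi": 10.0, "ui": 0.0, "bi": -5.0,  # input gate opens only on a clue
    "wo": 0.0, "uo": 0.0, "bo": 0.0,
    "wg": 2.0, "ug": 0.0, "bg": 0.0,
}


def run(seq, w):
    """Run the cell over a sequence and return the cell state at each step."""
    h, c = 0.0, 0.0
    states = []
    for x in seq:
        h, c = lstm_step(x, h, c, w)
        states.append(c)
    return states


def run_bidirectional(seq, w):
    """Pair each position's forward cell state with its backward one."""
    forward = run(seq, w)
    backward = run(seq[::-1], w)[::-1]  # read right to left, then realign
    return list(zip(forward, backward))


if __name__ == "__main__":
    # Clue at the start, followed by nine tokens with no clue.
    early_clue = [1.0] + [0.0] * 9
    print("cell state at the last token:", run(early_clue, WEIGHTS)[-1])

    # Clue at the end: only the backward pass can carry it to position 0.
    late_clue = [0.0, 0.0, 0.0, 1.0]
    print("(forward, backward) at the first token:",
          run_bidirectional(late_clue, WEIGHTS)[0])
```

With these assumed weights, the forget gate stays close to 1, so the clue written into the cell state at the first token is still mostly there at the last one. In the bidirectional run, the backward pass is what lets a clue at the end of the sequence show up at the first position.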