So, let's take a look at this in a slightly more sophisticated example. I'm going to take the tokenizer as we had before, but I'm also going to introduce the pad_sequences tool. The idea behind pad_sequences is that it lets you use sentences of different lengths, and use padding or truncation to make all of the sentences the same length. In this case I have the same sentences as before: I love my dog, I love my cat, you love my dog. But I've added a new sentence, do you think my dog is amazing, which is a different length from the others. Those all had four words; this one has more.

I'm going to create my tokenizer as before, but I'm also going to use a parameter called the OOV token. The idea here is that I create a special token to use for words that aren't recognized, words that aren't in the word_index itself. I want something unique that I wouldn't expect to see in the corpus, something like OOV in brackets, and I specify that as my OOV token. Then I call tokenizer.fit_on_texts with the sentences and take a look at the word_index. Let's actually run this. We'll see now that in my word_index, OOV is value 1, my is value 2, love is 3, et cetera. We have a total of 11 entries in the word_index: it's actually ten unique words plus the OOV token.

On the tokenizer, I can then convert the words in those sentences to sequences of tokens by calling the texts_to_sequences method. That produces the sequences I'm printing out here. The first sentence, I love my dog, becomes 5, 3, 2, 4, and so on. So the sequences are {5, 3, 2, 4}, {5, 3, 2, 7}, {6, 3, 2, 4} and {8, 6, 9, 2, 4, 10, 11}. Now, we can see these are different lengths, but we want to make them all the same length.
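To make the word_index and the sequences concrete, here is a minimal plain-Python sketch that mimics what the walkthrough describes. It is not the Keras Tokenizer itself: the helper names (split_words, build_word_index, texts_to_sequences), the exact OOV spelling "<OOV>", and the simple regex-based word splitting are assumptions made for illustration. The only rules it relies on are the ones visible in the output above: the OOV token takes index 1, and the remaining words are numbered from 2 in order of descending frequency.

```python
import re
from collections import Counter

OOV_TOKEN = "<OOV>"  # assumed spelling of the "OOV in brackets" token

sentences = [
    "I love my dog",
    "I love my cat",
    "You love my dog!",
    "Do you think my dog is amazing?",
]

def split_words(text):
    # Simplified splitting: lowercase, keep runs of letters/apostrophes,
    # which drops punctuation such as "!" and "?".
    return re.findall(r"[a-z']+", text.lower())

def build_word_index(texts, oov_token=OOV_TOKEN):
    counts = Counter()
    for text in texts:
        counts.update(split_words(text))
    # The OOV token is reserved as index 1; real words start at 2.
    word_index = {oov_token: 1}
    # most_common() sorts by descending count; ties keep first-seen order.
    for rank, (word, _count) in enumerate(counts.most_common(), start=2):
        word_index[word] = rank
    return word_index

def texts_to_sequences(texts, word_index, oov_token=OOV_TOKEN):
    oov_id = word_index[oov_token]
    # Unknown words fall back to the OOV index instead of being dropped.
    return [[word_index.get(w, oov_id) for w in split_words(t)] for t in texts]

word_index = build_word_index(sentences)
sequences = texts_to_sequences(sentences, word_index)

print(word_index)  # "<OOV>" is 1, "my" is 2, "love" is 3, ...
print(sequences)   # [[5, 3, 2, 4], [5, 3, 2, 7], [6, 3, 2, 4], [8, 6, 9, 2, 4, 10, 11]]
```

In the Keras version shown in the video, the same steps are the Tokenizer created with an oov_token argument, followed by fit_on_texts and texts_to_sequences.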
So that's where pad_sequences comes into it. I'm going to say here that my padded set is pad_sequences with the sequences, and let's make it a maximum length of five words. This maximum length of five means that our four-word sentences end up being pre-padded with a 0, and the seven-word sentence ends up having its first two words cut off, because we did say maximum length equals 5. If I said maximum length equals 8, for example, and then ran this, we'd see that they're all pre-padded with zeros, including the long sentence, which is pre-padded with a single 0. There are parameters on pad_sequences that we saw in the lessons that allow us to do it post, if we wanted to do so, and then the zeros would appear afterwards.

Now, let's take a look at words that the tokenizer wasn't fit to. For example, my test data is I really love my dog, and my dog loves my manatee. If I tokenize them and create sequences out of that, we'll see {5, 1, 3, 2, 4} for the first sentence. 5 is I, 1 is out of vocabulary, because really wasn't in the word_index, and {3, 2, 4} is still love my dog. So this is how the out-of-vocabulary token comes into it: when the tokenizer sees a word that wasn't in the word_index, it just uses the out-of-vocabulary token, 1, for that word. Similarly, for my dog loves my manatee, I get {2, 4, 1, 2, 1}. The word loves is not in the index, even though the word love is, and of course manatee isn't in it either. So I end up with just 2, 4, 2 as the words that really have meaning here, which is my, dog, my. Loves and manatee both become out-of-vocabulary tokens.

And of course here you can see I'm also padding them. My {5, 1, 3, 2, 4} gets padded and my {2, 4, 1, 2, 1} also gets padded, because I said maxlen=10. If I set that to 2, for example, we'll see they end up getting truncated, and I'm getting just the last two words of each.
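The padding, truncation, and out-of-vocabulary behavior described above can be sketched in plain Python as follows. This is an illustrative stand-in, not the Keras implementation: pad_sketch and to_sequence are hypothetical helper names, the word_index literal is copied from the values read out earlier, and the "<OOV>" spelling is an assumption. The padding and truncating arguments are named after the pad_sequences options of the same name, with "pre" as the default in both cases, which matches the results seen in the walkthrough.

```python
# Word index as read out in the walkthrough ("<OOV>" spelling is assumed).
word_index = {
    "<OOV>": 1, "my": 2, "love": 3, "dog": 4, "i": 5, "you": 6,
    "cat": 7, "do": 8, "think": 9, "is": 10, "amazing": 11,
}

def to_sequence(text, word_index, oov_id=1):
    # Words missing from the index map to the OOV id rather than being dropped.
    return [word_index.get(w, oov_id) for w in text.lower().split()]

def pad_sketch(sequences, maxlen, padding="pre", truncating="pre", value=0):
    padded = []
    for seq in sequences:
        seq = list(seq)
        if len(seq) > maxlen:
            # "pre" truncation drops tokens from the front, keeping the last maxlen.
            seq = seq[-maxlen:] if truncating == "pre" else seq[:maxlen]
        fill = [value] * (maxlen - len(seq))
        # "pre" padding puts the zeros in front; "post" puts them at the end.
        padded.append(fill + seq if padding == "pre" else seq + fill)
    return padded

sequences = [[5, 3, 2, 4], [5, 3, 2, 7], [6, 3, 2, 4], [8, 6, 9, 2, 4, 10, 11]]

padded5 = pad_sketch(sequences, maxlen=5)
print(padded5[0])   # [0, 5, 3, 2, 4]
print(padded5[3])   # [9, 2, 4, 10, 11]  (first two tokens cut off)

padded8 = pad_sketch(sequences, maxlen=8)
print(padded8[3])   # [0, 8, 6, 9, 2, 4, 10, 11]

post8 = pad_sketch(sequences, maxlen=8, padding="post")
print(post8[0])     # [5, 3, 2, 4, 0, 0, 0, 0]

test_data = ["i really love my dog", "my dog loves my manatee"]
test_seq = [to_sequence(t, word_index) for t in test_data]
print(test_seq)     # [[5, 1, 3, 2, 4], [2, 4, 1, 2, 1]]

print(pad_sketch(test_seq, maxlen=10))  # five leading zeros on each sequence
print(pad_sketch(test_seq, maxlen=2))   # [[2, 4], [2, 1]]
```

Keeping the end of a long sentence rather than its start is only the default in this sketch; passing truncating="post" keeps the first maxlen tokens instead.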
So that's a basic introduction to how the tokenizer works and how padding works, so that you can get your sentences all the same length. I hope this was useful for you.