1
00:00:02,240 --> 00:00:06,530
Hey everyone, and welcome back to this class, Natural Language Processing in Python.

2
00:00:11,600 --> 00:00:16,490
In this lecture, we will continue looking at our code to implement a cipher decryption algorithm.

3
00:00:17,210 --> 00:00:20,780
We are finally ready to look at the results. As you recall,

4
00:00:20,930 --> 00:00:25,970
we have already stored the best mapping, which we can use to produce our final decoded message.

5
00:00:31,590 --> 00:00:37,380
First, what we'd like to do is print out the log likelihood of this decoded message, along with the

6
00:00:37,380 --> 00:00:39,270
log likelihood of the true message.

7
00:00:40,700 --> 00:00:43,910
What is also interesting is to check which letters we got wrong.

8
00:00:45,300 --> 00:00:47,640
To do that, I'm going to loop through the true mapping.

9
00:00:48,270 --> 00:00:52,320
We'll call the key true and the value v. Inside the loop,

10
00:00:52,410 --> 00:00:55,830
we'll get the prediction by indexing the best map by v.

11
00:00:56,580 --> 00:00:58,020
We'll call the result pred.

12
00:00:58,830 --> 00:01:04,860
Remember that the best map is a reverse mapping because we were using it for decoding rather than encoding.

13
00:01:05,400 --> 00:01:10,470
So pred is the value in the reverse mapping, whereas true was the key in the true mapping.

14
00:01:11,250 --> 00:01:14,190
Then we check if true is not equal to pred.

15
00:01:14,190 --> 00:01:16,590
And if that's the case, we print out both letters.
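As a rough sketch of the loop just described, assuming a toy four-letter alphabet: the names true_mapping and best_map follow the lecture, while the helper wrong_letters and the data are made up for illustration.

```python
# Hypothetical toy data: a 4-letter alphabet instead of the full 26 letters.
# true_mapping encodes plaintext -> ciphertext.
# best_map is the reverse direction: ciphertext -> plaintext.
true_mapping = {"a": "x", "b": "y", "c": "z", "d": "w"}
best_map = {"x": "a", "y": "c", "z": "b", "w": "d"}  # b and c are swapped


def wrong_letters(true_mapping, best_map):
    """Return (true, pred) pairs where decoding does not recover the letter."""
    wrong = []
    for true, v in true_mapping.items():
        pred = best_map[v]  # decode the ciphertext letter v
        if true != pred:
            wrong.append((true, pred))
    return wrong


for true, pred in wrong_letters(true_mapping, best_map):
    print("true: %s, pred: %s" % (true, pred))
```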

16
00:01:17,190 --> 00:01:20,580
As you can see, the results are close, but not exactly perfect.

17
00:01:22,000 --> 00:01:26,380
In fact, strangely, our log likelihood is higher than the true log likelihood.

18
00:01:26,890 --> 00:01:30,760
This seems strange because it seems like the real answer should be the maximum.

19
00:01:31,420 --> 00:01:33,610
We'll discuss the reasons for that in the conclusion.

20
00:01:35,630 --> 00:01:37,730
We can also see that we got a few letters wrong.

21
00:01:38,510 --> 00:01:40,070
We'll see why that is very shortly.

22
00:01:46,000 --> 00:01:48,940
In the next block of code, we do the somewhat obvious thing:

23
00:01:49,090 --> 00:01:53,050
print out the actual decoded message and compare it with the true message.

24
00:01:53,650 --> 00:01:56,260
As you can see, most of it is correct.

25
00:01:57,470 --> 00:02:02,090
There are some things that are incorrect, as you can tell, since our mapping does not match the true

26
00:02:02,090 --> 00:02:03,140
mapping precisely.

27
00:02:03,980 --> 00:02:06,290
All right, so let's read the decoded message.

28
00:02:07,970 --> 00:02:14,000
I then lounged down the street and found, as I expected, that there was a mews in a lane which runs

29
00:02:14,000 --> 00:02:15,650
down by one wall of the garden.

30
00:02:16,280 --> 00:02:22,550
I lent the ostlers a hand in rubbing down their horses and received in exchange twopence, a glass of half

31
00:02:22,550 --> 00:02:29,120
and half, two fills of shag tobacco, and as much information as I could desire about Miss Adler, to

32
00:02:29,120 --> 00:02:31,430
say nothing of half a doken.

33
00:02:31,460 --> 00:02:32,450
So that one's wrong.

34
00:02:33,020 --> 00:02:38,720
Other people in the neighborhood in whom I was not in the least interested, but whose biographies I

35
00:02:38,720 --> 00:02:40,010
was compelled to listen to.

36
00:02:40,850 --> 00:02:45,770
So you can see that we got one character wrong here, even though there are more entries in the mapping

37
00:02:45,770 --> 00:02:46,310
that are wrong.

38
00:02:48,240 --> 00:02:53,760
In fact, if we did get something wrong, it's necessary for at least two entries in the mapping to

39
00:02:53,760 --> 00:02:54,330
be wrong.

40
00:02:55,050 --> 00:02:58,110
This is because, of course, the letters in the mapping are unique.

41
00:02:58,440 --> 00:03:03,060
Therefore, if one is wrong, it must be replaced by some other letter, which is also wrong.

42
00:03:03,780 --> 00:03:07,950
Therefore, if we get something wrong, there must be at least two wrong entries.
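A quick brute-force check of this claim, assuming a small hypothetical four-letter alphabet so that every possible one-to-one mapping can be enumerated. The helper count_mismatches is an illustrative name and is not part of the lecture's script.

```python
from itertools import permutations

letters = "abcd"  # small hypothetical alphabet so brute force stays cheap


def count_mismatches(perm):
    """Count positions where a permutation differs from the identity ordering."""
    return sum(1 for original, mapped in zip(letters, perm) if original != mapped)


# Collect the mismatch counts that occur over every one-to-one mapping.
mismatch_counts = sorted({count_mismatches(p) for p in permutations(letters)})
print(mismatch_counts)  # [0, 2, 3, 4] -- exactly one wrong entry never occurs
```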

43
00:03:08,580 --> 00:03:14,190
And you saw that in this particular run, we got three wrong entries, and this will vary from run to

44
00:03:14,190 --> 00:03:14,520
run.

45
00:03:16,000 --> 00:03:21,250
Actually, it's worth noting that when you're using genetic algorithms or evolutionary algorithms,

46
00:03:21,520 --> 00:03:26,380
it's important to run them multiple times to check whether you've really arrived at the best possible

47
00:03:26,380 --> 00:03:26,800
answer.

48
00:03:27,460 --> 00:03:30,220
This is just due to the inherent randomness in the algorithm.

49
00:03:30,940 --> 00:03:35,020
Sometimes you may get stuck at a local maximum or just start from a bad spot,

50
00:03:35,910 --> 00:03:38,310
in which case running it again would be necessary.

51
00:03:39,240 --> 00:03:43,740
It's easy to see why we got this word wrong and why our model got some mappings wrong.

52
00:03:45,300 --> 00:03:50,390
The word we are looking for is "dozen", and we mistakenly used a K instead of a Z.

53
00:03:51,690 --> 00:03:57,960
That makes sense because OZ and ZE are probably very rare bigrams. Being rare,

54
00:03:57,990 --> 00:04:01,770
their probabilities are very small, leading to a smaller likelihood.

55
00:04:03,470 --> 00:04:10,880
The bigrams OK and KE are probably not as rare, simply because K is a more common letter than Z.

56
00:04:11,540 --> 00:04:16,519
Therefore, they lead to a higher log likelihood. Thus, under our model assumptions,

57
00:04:16,790 --> 00:04:18,649
using a K makes more sense.

58
00:04:19,130 --> 00:04:22,420
Our model doesn't know that "doken" is not a real word.
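To make the bigram argument concrete, here is a minimal sketch. The probabilities below are invented purely for illustration (the real model estimates them from a text corpus), and the first-letter term is left out because it is the same for both words.

```python
import math

# Invented toy probabilities, chosen only to illustrate the effect.
toy_bigram_prob = {
    ("d", "o"): 0.05,
    ("o", "z"): 0.0005,  # assumed rare
    ("z", "e"): 0.001,   # assumed rare
    ("o", "k"): 0.01,
    ("k", "e"): 0.02,
    ("e", "n"): 0.08,
}
FLOOR = 1e-6  # fallback probability for bigrams missing from the toy table


def word_log_prob(word):
    """Sum the log bigram probabilities over consecutive letter pairs."""
    pairs = zip(word, word[1:])
    return sum(math.log(toy_bigram_prob.get(pair, FLOOR)) for pair in pairs)


print("dozen:", word_log_prob("dozen"))
print("doken:", word_log_prob("doken"))
```

Under these assumed numbers, the non-word scores higher than the real word, which is the behavior described above.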

59
00:04:28,510 --> 00:04:33,910
You'll also notice that even though we only got one letter in the message wrong, our mapping has three

60
00:04:33,910 --> 00:04:34,500
letters wrong.

61
00:04:35,110 --> 00:04:37,930
Specifically, the extra letter here seems to be a Q.

62
00:04:38,590 --> 00:04:40,060
It makes sense to get that wrong.

63
00:04:40,450 --> 00:04:40,960
Why?

64
00:04:41,530 --> 00:04:44,380
Well, because there is no Q in the actual message.

65
00:04:44,830 --> 00:04:50,050
Therefore, we would have no idea where a Q should go because it doesn't contribute anything to the log

66
00:04:50,050 --> 00:04:50,650
likelihood.
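A small sketch of why unused letters cannot be pinned down, assuming a hypothetical shift cipher and a short message that contains neither q nor j. Swapping how those two letters decode changes the mapping but not the decoded text, so any score computed from the decoded text is unchanged as well.

```python
import string


def decode(ciphertext, decode_map):
    """Apply a ciphertext -> plaintext map, leaving other characters unchanged."""
    return "".join(decode_map.get(ch, ch) for ch in ciphertext)


# Hypothetical cipher: shift every letter by 3 places (any permutation would do).
alphabet = string.ascii_lowercase
encode_map = {p: alphabet[(i + 3) % 26] for i, p in enumerate(alphabet)}
decode_map = {c: p for p, c in encode_map.items()}

message = "half a dozen other people"  # contains no q and no j
ciphertext = "".join(encode_map.get(ch, ch) for ch in message)

# Swap the decodings of the two ciphertext letters that stand for q and j.
swapped = dict(decode_map)
cq, cj = encode_map["q"], encode_map["j"]
swapped[cq], swapped[cj] = swapped[cj], swapped[cq]

print(swapped != decode_map)                   # the mappings differ
print(decode(ciphertext, swapped) == message)  # the decoded text is identical
```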

67
00:04:52,910 --> 00:04:58,490
Finally, we have one last thing in the script, which is something I always like to do for every iterative

68
00:04:58,490 --> 00:04:59,000
algorithm.

69
00:05:00,100 --> 00:05:02,680
That is to plot the objective per iteration.

70
00:05:03,400 --> 00:05:09,340
We expect it to go up very fast at the beginning and then taper off at the end as the algorithm converges.

71
00:05:10,000 --> 00:05:14,290
And this is the behavior you usually expect for most machine learning algorithms.

72
00:05:14,950 --> 00:05:20,200
And in fact, that's exactly what we see, which is a good sanity check for whether or not our algorithm

73
00:05:20,200 --> 00:05:20,800
is working.
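A minimal plotting sketch, assuming matplotlib is installed and that one score per iteration was collected in a list. The values in scores are hypothetical placeholders, and plot_objective is an illustrative helper rather than the lecture's own code.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend so this runs without a display
import matplotlib.pyplot as plt

# Hypothetical scores: fast gains early on, then a plateau.
scores = [-2500.0, -1800.0, -1400.0, -1200.0, -1100.0, -1050.0, -1030.0, -1025.0]


def plot_objective(scores, path="objective.png"):
    """Plot the objective per iteration and save the figure to path."""
    fig, ax = plt.subplots()
    ax.plot(scores)
    ax.set_xlabel("iteration")
    ax.set_ylabel("log likelihood")
    fig.savefig(path)
    plt.close(fig)
    return path


plot_objective(scores)
```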

