1
00:00:11,640 --> 00:00:17,430
So, interestingly, although we are doing something like binary classification, our model does not actually

2
00:00:17,430 --> 00:00:19,590
output a binary decision.

3
00:00:19,590 --> 00:00:23,460
It only outputs distances between matches and non-matches.

4
00:00:23,700 --> 00:00:31,650
Thus, it's actually up to us to choose a threshold at which we consider two images to be a match or not.

5
00:00:31,650 --> 00:00:37,980
Now, you might think that since we push the matches toward zero and the non-matches to at least one, the

6
00:00:37,980 --> 00:00:41,270
threshold should be in the middle, like 0.5.

7
00:00:41,340 --> 00:00:44,690
But in fact, the threshold we choose is a hyperparameter.

8
00:00:44,820 --> 00:00:46,370
It could be less than 0.5.

9
00:00:46,380 --> 00:00:48,150
Or it could be more.

10
00:00:48,420 --> 00:00:53,760
One thing you'll see is that we are going to get really good separation on the train set, but not as

11
00:00:53,760 --> 00:01:01,290
good on the test set. What this tells us is that it may be worthwhile to have three datasets: train,

12
00:01:01,290 --> 00:01:03,280
validation, and test.

13
00:01:03,280 --> 00:01:09,520
And this way, we could use the validation set to choose a threshold, which is often what we use to evaluate

14
00:01:09,760 --> 00:01:12,440
all other hyperparameters as well.
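Here's a minimal sketch of what choosing the threshold on a validation set could look like. The function name choose_threshold, the balanced-accuracy score, and the toy distances are all assumptions made for illustration; the lecture doesn't say how the threshold is actually picked.

```python
import numpy as np

def choose_threshold(pos_dist, neg_dist, candidates):
    """Return the candidate threshold with the best balanced accuracy.

    Hypothetical helper: balanced accuracy (mean of TPR and TNR) is one
    reasonable score for imbalanced classes, not necessarily the one
    used in the course.
    """
    pos_dist = np.asarray(pos_dist)
    neg_dist = np.asarray(neg_dist)
    best_t, best_score = None, -1.0
    for t in candidates:
        tpr = np.mean(pos_dist < t)   # matches that fall below the threshold
        tnr = np.mean(neg_dist >= t)  # non-matches at or above the threshold
        score = (tpr + tnr) / 2
        if score > best_score:        # keep the first best candidate on ties
            best_t, best_score = t, score
    return best_t, best_score

# Toy validation distances, made up for illustration only
val_pos = np.array([0.05, 0.15, 0.25, 0.95])
val_neg = np.array([0.65, 1.05, 1.25, 1.55])

threshold, score = choose_threshold(val_pos, val_neg, np.linspace(0.0, 1.5, 16))
print(threshold, score)
```

The threshold found this way is then held fixed when the model is evaluated on the test set.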

15
00:01:12,550 --> 00:01:20,310
Then, when we've chosen our hyperparameters, we can test our model on the true test set. Before we start,

16
00:01:20,500 --> 00:01:25,960
let's remember that when we're working with PyTorch, we need to carry our arrays from NumPy land to

17
00:01:25,960 --> 00:01:29,860
PyTorch land before making predictions, and vice versa.

18
00:01:29,860 --> 00:01:34,120
And so here's a convenience function called predict that does all that for us.

19
00:01:34,390 --> 00:01:38,020
It takes in two NumPy arrays of images, x1 and x2.

20
00:01:38,200 --> 00:01:44,990
Then, inside the function, it converts x1 and x2 to torch tensors and moves them to the GPU.

21
00:01:45,040 --> 00:01:49,320
Next, we pass x1 and x2 into our model to get a prediction,

22
00:01:49,510 --> 00:01:53,440
bring it back to the CPU, and then bring it back to NumPy land.

23
00:01:53,440 --> 00:01:57,550
Finally, we call the flatten function, since the model returns an N-by-1 output,

24
00:01:57,700 --> 00:01:59,530
but we just want a one-dimensional array.
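The steps just described can be sketched roughly like this. It's only a sketch under some assumptions: the model is passed in as an argument (the lecture's predict takes just x1 and x2), the device falls back to the CPU when no GPU is available, and ToyDistance is a made-up stand-in for the real network.

```python
import numpy as np
import torch

# Use the GPU if one is available, otherwise stay on the CPU (assumption)
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

def predict(model, x1, x2):
    # NumPy land -> PyTorch land, then onto the device
    t1 = torch.from_numpy(x1).float().to(device)
    t2 = torch.from_numpy(x2).float().to(device)

    model.eval()
    with torch.no_grad():          # no gradients needed for prediction
        dist = model(t1, t2)       # shape (N, 1)

    # Back to the CPU, back to NumPy land, and flatten to shape (N,)
    return dist.cpu().numpy().flatten()

class ToyDistance(torch.nn.Module):
    """Hypothetical stand-in model: Euclidean distance between raw images."""
    def forward(self, a, b):
        diff = (a - b).flatten(1)                      # (N, features)
        return diff.pow(2).sum(dim=1, keepdim=True).sqrt()  # (N, 1)

model = ToyDistance().to(device)
x1 = np.zeros((4, 1, 8, 8), dtype=np.float32)
x2 = np.ones((4, 1, 8, 8), dtype=np.float32)
distances = predict(model, x1, x2)
print(distances.shape)
```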

25
00:02:02,660 --> 00:02:07,310
So let's look at this code to calculate the accuracy of our model. Now,

26
00:02:07,330 --> 00:02:13,350
you recall that with this particular dataset, one inherent property of it is that we have imbalanced

27
00:02:13,350 --> 00:02:14,560
classes.

28
00:02:14,670 --> 00:02:17,700
There are many more non-matches than matches.

29
00:02:17,700 --> 00:02:23,240
Therefore, accuracy is not a reliable indicator of how good the model is.

30
00:02:23,250 --> 00:02:29,370
Instead, we'll report the true positive rate and the true negative rate, also known as the sensitivity

31
00:02:29,430 --> 00:02:31,190
and specificity.

32
00:02:31,320 --> 00:02:37,620
So that's one of the things we're going to do in this function, get train accuracy. As you can see, it

33
00:02:37,620 --> 00:02:43,620
accepts one argument, the threshold, which will be used to assign a classification decision.

34
00:02:43,700 --> 00:02:49,610
We're also going to make a histogram plot of both the matches and non-matches so that we can appropriately

35
00:02:49,610 --> 00:02:56,330
choose a threshold. We'll store the distances we find for the matches in a list called positive distances,

36
00:02:56,840 --> 00:03:01,790
and we'll store the distances we find for the non-matches in a list called negative distances.

37
00:03:04,520 --> 00:03:10,820
We initialize TP, TN, FP, and FN to zero, and we'll accumulate these as we loop through the data.

38
00:03:14,210 --> 00:03:19,470
So first, we're going to initialize two arrays, x batch 1 and x batch 2.

39
00:03:19,490 --> 00:03:24,920
Note that in this case, we want to loop through every sample we have, not just sample from the non-match

40
00:03:24,920 --> 00:03:26,770
class, like we did in the generator.

41
00:03:27,290 --> 00:03:31,430
So that's why we have a custom loop here, and we're not using the generators we defined earlier.

42
00:03:34,200 --> 00:03:34,980
Inside the loop,

43
00:03:34,980 --> 00:03:38,550
we grab a batch of indices, so that's pos batch indices.

44
00:03:41,670 --> 00:03:46,140
And you'll notice that we have two loops, so there's one for the positives list and there's one for the

45
00:03:46,140 --> 00:03:54,420
negatives list. So the tuples we encounter right now are positive matches.

46
00:03:54,450 --> 00:04:04,570
Next, we fill up x batch 1 and x batch 2. That's pretty simple; it's the same as before.

47
00:04:04,640 --> 00:04:13,100
Next, we use our predict function to get the distances for each pair of images, so that's distances.

48
00:04:13,470 --> 00:04:18,080
And then we convert these into a list and store it in our positive distances list.

49
00:04:19,850 --> 00:04:24,300
Finally, we check how many of these distances are less than the threshold.

50
00:04:24,410 --> 00:04:27,320
That's how many true positives we have.

51
00:04:27,320 --> 00:04:30,480
Then we check how many distances are greater than the threshold.

52
00:04:30,560 --> 00:04:32,530
And that's how many false negatives we have.

53
00:04:36,700 --> 00:04:44,010
Next, we have the same loop, but this time over the train negatives list. This loop is largely the

54
00:04:44,010 --> 00:04:44,910
same.

55
00:04:44,910 --> 00:04:51,060
We gather the batch of data, we pass it to model.predict to get the distances, but this time we add

56
00:04:51,060 --> 00:04:57,920
the distances to the negative distances list. Any distances which are less than the threshold are now

57
00:04:57,920 --> 00:05:01,860
false positives, and any distances greater than the threshold

58
00:05:01,940 --> 00:05:03,140
are true negatives.

59
00:05:08,010 --> 00:05:11,100
Once we are outside the loop, we are done counting.

60
00:05:11,220 --> 00:05:14,280
So we can calculate the true positive rate and the true negative rate.
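As a rough sketch, the counting and the two rates can be written like this. The function name count_and_rates and the toy distances are made up for illustration, and this version works on the full lists at once rather than batch by batch. A distance exactly equal to the threshold is treated as a non-match here, which is my own choice so that every pair is counted exactly once.

```python
import numpy as np

def count_and_rates(pos_dist, neg_dist, threshold):
    """Count TP, FN, FP, TN from distances and return the TPR and TNR."""
    pos_dist = np.asarray(pos_dist)
    neg_dist = np.asarray(neg_dist)

    tp = int((pos_dist < threshold).sum())   # matches predicted as matches
    fn = int((pos_dist >= threshold).sum())  # matches predicted as non-matches
    fp = int((neg_dist < threshold).sum())   # non-matches predicted as matches
    tn = int((neg_dist >= threshold).sum())  # non-matches predicted as non-matches

    tpr = tp / (tp + fn)  # sensitivity
    tnr = tn / (tn + fp)  # specificity
    return tp, fn, fp, tn, tpr, tnr

# Toy distances, made up for illustration only
positive_distances = [0.1, 0.2, 0.9, 0.3]
negative_distances = [1.1, 0.7, 1.4, 0.95, 1.2]

tp, fn, fp, tn, tpr, tnr = count_and_rates(
    positive_distances, negative_distances, threshold=0.8
)
print(f"TP={tp} FN={fn} FP={fp} TN={tn} TPR={tpr:.2f} TNR={tnr:.2f}")
# prints "TP=3 FN=1 FP=1 TN=4 TPR=0.75 TNR=0.80"
```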

61
00:05:15,960 --> 00:05:22,940
And finally, we plot a histogram of the negative distances and positive distances. Now, because there will

62
00:05:22,940 --> 00:05:26,330
be many more values in the negative distances list,

63
00:05:26,360 --> 00:05:31,790
technically speaking, this histogram should be much taller than the other one. But we can pass in the

64
00:05:31,790 --> 00:05:33,840
argument density=True

65
00:05:33,980 --> 00:05:39,440
so that Matplotlib will normalize the height of both histograms, essentially making them into probability

66
00:05:39,440 --> 00:05:45,380
distributions.
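To see what that normalization does, here's a small sketch using np.histogram, which takes the same density argument as Matplotlib's hist. The sample sizes, bin edges, and random distances are assumptions chosen only to mimic the class imbalance.

```python
import numpy as np

rng = np.random.default_rng(0)

# Made-up distances: few matches near 0, many non-matches near 1
pos = rng.normal(0.2, 0.1, size=100)
neg = rng.normal(1.1, 0.2, size=10_000)

bins = np.linspace(-0.5, 2.0, 51)
widths = np.diff(bins)

# Raw counts: the non-match histogram towers over the match histogram
pos_counts, _ = np.histogram(pos, bins=bins)
neg_counts, _ = np.histogram(neg, bins=bins)

# density=True rescales each histogram so that its total area is 1
pos_density, _ = np.histogram(pos, bins=bins, density=True)
neg_density, _ = np.histogram(neg, bins=bins, density=True)

pos_area = (pos_density * widths).sum()
neg_area = (neg_density * widths).sum()
print(pos_counts.max(), neg_counts.max())
print(pos_area, neg_area)
```

With both areas equal to one, the two histograms can be drawn on the same axes and compared by shape rather than by how many samples each class has.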

67
00:05:45,410 --> 00:05:46,450
Next, we have a function,

68
00:05:46,460 --> 00:05:50,630
get test accuracy, which does the exact same thing but over the test set.

69
00:05:54,110 --> 00:05:57,290
And next, we have the results.

70
00:05:57,400 --> 00:06:02,480
Note that this takes some time because we have to loop over all the data points and generate the data

71
00:06:02,480 --> 00:06:03,630
on the fly.

72
00:06:03,710 --> 00:06:07,790
So I've printed out the batch number on each iteration of the loop.

73
00:06:18,510 --> 00:06:19,390
Once we're done,

74
00:06:19,410 --> 00:06:24,520
you can see that we get very good separation between the positive distances and the negative distances.

75
00:06:24,520 --> 00:06:30,780
All the positive distances are close to zero, and all the negative distances are close to one or bigger

76
00:06:30,780 --> 00:06:33,090
than one, as they should be.

77
00:06:33,270 --> 00:06:38,900
Theoretically, we could choose a threshold that would give us 100 percent accuracy on the train set.

78
00:06:39,300 --> 00:06:44,580
But, as I said earlier, you probably want to do that with the validation set instead.

79
00:06:44,580 --> 00:06:47,750
Let's see why.

80
00:06:47,830 --> 00:06:53,200
So now we're going to look at the function get test accuracy, and I've precomputed this threshold

81
00:06:53,200 --> 00:06:53,940
of 0.8.

82
00:07:03,690 --> 00:07:04,010
Okay.

83
00:07:04,060 --> 00:07:09,070
So you can see that we don't get as good separation between the positive distances and the negative

84
00:07:09,070 --> 00:07:10,230
distances.

85
00:07:10,240 --> 00:07:11,650
There is a bit of an overlap.

86
00:07:12,460 --> 00:07:17,650
Nonetheless, we still get above 90 percent for both the true positive rate and the true negative rate.

87
00:07:20,000 --> 00:07:24,770
Keep in mind that the results here will be slightly different each time you run the script due to various

88
00:07:24,770 --> 00:07:26,130
factors.

89
00:07:26,240 --> 00:07:30,310
First, it depends on the images that we chose for the train set and the test set.

90
00:07:30,560 --> 00:07:32,140
It also depends on the neural network's

91
00:07:32,150 --> 00:07:36,580
initial parameters, hyperparameters, and how long it was trained for.

92
00:07:36,620 --> 00:07:40,420
So I've gotten results better than this, but I've also gotten results worse than this.

93
00:07:40,430 --> 00:07:43,300
So test it out on your own and see what you can get.