1
00:00:14,850 --> 00:00:21,900
So now let's talk about the all-important lesson, which is assessing model performance. So we're going

2
00:00:21,900 --> 00:00:26,280
to take a look at the metrics we use to assess model performance.

3
00:00:26,910 --> 00:00:30,090
So let's do a quick recap on something here.

4
00:00:30,690 --> 00:00:36,720
Remember, when we're training a model, you have to split your dataset into two or even three segments.

5
00:00:37,350 --> 00:00:40,860
So generally, let's just think about it as two segments.

6
00:00:40,860 --> 00:00:45,690
For now, even though this diagram has three, I will mention the third one afterward in the explanation.

7
00:00:45,690 --> 00:00:46,500
So you understand.

8
00:00:46,980 --> 00:00:52,320
But remember, we're training a model with some data; that's the training data.

9
00:00:52,740 --> 00:00:56,040
The model sees that data and tries to learn from that data.

10
00:00:56,580 --> 00:00:57,390
That's important.

11
00:00:57,690 --> 00:01:01,320
That's basically what the model uses as its foundation for knowledge.

12
00:01:01,970 --> 00:01:09,090
However, how do we know that the model hasn't only learned what the training data says and then can't

13
00:01:09,090 --> 00:01:15,000
actually reproduce its results in the real world with unseen data? Because you can have a model that performs

14
00:01:15,000 --> 00:01:16,950
very, very well on our training dataset.

15
00:01:17,340 --> 00:01:20,500
However, when it sees new data, it performs very poorly.

16
00:01:21,000 --> 00:01:25,860
And that is called overfitting, by the way, which we'll talk about in later examples.

17
00:01:26,280 --> 00:01:28,980
But for now, this is how we split.

18
00:01:28,980 --> 00:01:31,080
So you have a training dataset.

19
00:01:31,530 --> 00:01:32,970
You have a validation dataset.

20
00:01:32,970 --> 00:01:37,890
The validation dataset is what you've seen, especially when training a Keras model, because they

21
00:01:37,890 --> 00:01:44,700
actually specifically state validation loss and validation accuracy after each epoch, that is,

22
00:01:44,880 --> 00:01:46,440
after the input data,

23
00:01:46,770 --> 00:01:48,750
all of it, has been passed through the network.

24
00:01:49,230 --> 00:01:54,690
You then test the performance of your network on the validation data just to see how well it does.

25
00:01:55,230 --> 00:01:57,420
And that's basically the training process here.

26
00:01:58,320 --> 00:02:00,300
So what about a test dataset?

27
00:02:00,690 --> 00:02:06,120
Well, oftentimes you'll see people use validation and test interchangeably.

28
00:02:06,690 --> 00:02:08,280
And I tend to do it as well.

29
00:02:08,280 --> 00:02:12,960
Everyone tends to do it because most times you don't have to split your dataset into three.

30
00:02:13,440 --> 00:02:17,700
You can split it into two, just train and test, which is generally what it's called.

31
00:02:18,270 --> 00:02:25,860
And if you were testing that on a third dataset, that's a third, unseen, separate dataset, which

32
00:02:25,860 --> 00:02:26,850
we will call the test set here.

33
00:02:27,690 --> 00:02:30,930
That's basically a way to assess your final model performance.

34
00:02:31,500 --> 00:02:37,470
A lot of people who have trained Kaggle models will be familiar with that concept, where you can have

35
00:02:37,470 --> 00:02:42,330
a training dataset, you can split that training dataset into train and test, and then you have

36
00:02:42,330 --> 00:02:48,210
your final performance test or validation dataset at the end of the competition.

37
00:02:49,650 --> 00:02:52,350
So let's take a look at an example of the training process here.

38
00:02:52,890 --> 00:03:00,450
So remember, we pass all of the images through our network, forward propagate, then get the losses,

39
00:03:00,840 --> 00:03:04,950
then use backpropagation with gradient descent to optimize the weights.

40
00:03:05,190 --> 00:03:11,320
That's basically what happens during one epoch, and then we test it on our validation dataset to get

41
00:03:11,340 --> 00:03:15,990
the loss and accuracy, basically to see how it performs at that point.

42
00:03:16,950 --> 00:03:23,700
Then we do this again for epoch two, epoch three, and all the way up to whatever epoch N is.

43
00:03:24,240 --> 00:03:25,220
So that's when we end.

44
00:03:25,230 --> 00:03:27,690
Typically, that could be 50 epochs, but it's up to you.

45
00:03:27,690 --> 00:03:35,040
It's up to the network, how good you want the model to be, and how much time you have

46
00:03:35,040 --> 00:03:36,330
to devote to your model training.

47
00:03:36,930 --> 00:03:39,240
But generally, 50 is a good number to start with.

48
00:03:40,470 --> 00:03:44,560
And then in the end, you can actually test that model, after

49
00:03:44,580 --> 00:03:50,490
you've trained it here, on an unseen test dataset to get the loss and accuracy.

50
00:03:50,880 --> 00:03:54,270
That's the best practice when training deep learning networks.

51
00:03:55,140 --> 00:03:59,220
So let's take a look at some of the basic performance metrics we use.

52
00:03:59,850 --> 00:04:04,650
So what do we use in backpropagation?

53
00:04:04,650 --> 00:04:07,890
Backpropagation uses our training loss.

54
00:04:08,220 --> 00:04:10,040
That's how it computes the gradients used to update the

55
00:04:10,170 --> 00:04:10,440
weights

56
00:04:10,440 --> 00:04:16,830
of the network, as we apply the chain rule back through the network, going right to left,

57
00:04:17,250 --> 00:04:18,480
during training.

58
00:04:18,960 --> 00:04:20,730
So we use the training loss for that.

59
00:04:21,090 --> 00:04:23,370
So that's the first metric which you've seen before.

60
00:04:23,790 --> 00:04:27,600
Training accuracy is the other metric that we tend to look at.

61
00:04:28,470 --> 00:04:34,380
Training accuracy simply tells you how well your model is performing on the training dataset.

62
00:04:34,390 --> 00:04:39,900
So how many of the training examples our model classifies correctly during the training

63
00:04:39,900 --> 00:04:46,500
process. Then we have a test or validation loss and a test or validation accuracy as well.

64
00:04:46,800 --> 00:04:49,470
So you can see these are all key metrics here.

65
00:04:50,670 --> 00:04:52,740
So take a look at something here.

66
00:04:53,430 --> 00:04:55,560
What if our handwritten

67
00:04:55,560 --> 00:05:03,120
digit model had issues with identifying eights and threes? So you can see this is the actual label here,

68
00:05:03,630 --> 00:05:07,680
and it predicted a three when it's actually an eight.

69
00:05:08,430 --> 00:05:12,210
And in this case here, the actual label was also an eight, and it predicted a three.

70
00:05:12,480 --> 00:05:14,220
I mean, to be fair, I think this one

71
00:05:14,350 --> 00:05:18,700
is a three, but let's assume maybe the pen skipped over here and it was an eight.

72
00:05:19,150 --> 00:05:22,690
This looks more like an eight, but it can easily be mistaken for a three.

73
00:05:23,290 --> 00:05:27,250
But either way, how do we identify such problems with our model?

74
00:05:27,580 --> 00:05:30,910
How do you even know which classes are performing the worst, which are performing the best?

75
00:05:31,240 --> 00:05:32,440
So how do you even know?

76
00:05:32,530 --> 00:05:36,730
How do you identify this problem, that your model is predicting threes instead of eights?

77
00:05:37,120 --> 00:05:40,480
Can we use the accuracy that we saw before?

78
00:05:41,580 --> 00:05:43,230
Can we use these training metrics?

79
00:05:43,650 --> 00:05:45,390
Well, no, we can't.

80
00:05:45,630 --> 00:05:52,470
What we need is something called the confusion matrix and the classification report to identify these sorts

81
00:05:52,470 --> 00:05:57,240
of intricate, model-specific predispositions.

82
00:05:57,780 --> 00:06:03,510
So in the next section, we'll take a look at both the confusion matrix and classification report.

83
00:06:03,870 --> 00:06:08,940
This is a very important section in understanding how our models perform in the real world.

84
00:06:09,300 --> 00:06:11,430
So go over this lesson again.

85
00:06:11,430 --> 00:06:12,270
Go over the slides.

86
00:06:12,570 --> 00:06:16,050
Watch the videos a couple of times just to make sure you understand these lessons.

87
00:06:16,200 --> 00:06:16,620
Thank you.