1
00:00:00,890 --> 00:00:08,520
‫Our next step is to create the architecture for our model. We will be using almost the same architecture.

2
00:00:09,120 --> 00:00:11,890
‫We will be using the Keras Sequential API.

3
00:00:12,360 --> 00:00:15,630
‫And we will be adding four conv layers.

4
00:00:15,930 --> 00:00:17,980
‫We will start with 32 filters.

5
00:00:18,240 --> 00:00:19,740
‫Then 64 filters.

6
00:00:19,920 --> 00:00:21,480
‫Then 128 filters.

7
00:00:21,810 --> 00:00:23,880
‫Then again, 128 filters.

8
00:00:25,780 --> 00:00:28,660
‫And after each of these conv layers,

9
00:00:29,080 --> 00:00:31,080
‫we are also applying a pooling layer.

10
00:00:33,370 --> 00:00:37,840
‫And as always, the activation function is ReLU for all these layers.

11
00:00:39,430 --> 00:00:43,630
‫And after that, this time, we are also applying a dropout layer.

12
00:00:46,080 --> 00:00:55,650
‫So what this layer will do is deactivate 50 percent of the neurons during each training step.

13
00:00:56,490 --> 00:01:01,170
‫It will randomly pick 50 percent of our neurons and deactivate them.

14
00:01:01,770 --> 00:01:08,190
‫And we will be training the model with the remaining 50 percent of neurons during that step.

15
00:01:08,910 --> 00:01:15,860
‫So for each step, a random 50 percent of these neurons stay active.
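The dropout behaviour described here can be sketched in plain Python. This is only an illustration, not the Keras implementation: the function name `dropout` and the toy activations are made up. Two details are worth knowing: each neuron is dropped independently with probability 0.5, so the dropped share is 50 percent on average rather than exactly, and the surviving values are scaled up by 1 / (1 - rate), which is what the Keras `Dropout` layer also does during training. A fresh random mask is drawn on every call, that is, on every training step.

```python
import random


def dropout(activations, rate=0.5, rng=None):
    """Inverted dropout for one training step (illustrative sketch).

    Each activation is zeroed with probability `rate`; survivors are
    scaled by 1 / (1 - rate) so the expected sum stays the same.
    """
    rng = rng or random.Random()
    keep_prob = 1.0 - rate
    return [a / keep_prob if rng.random() < keep_prob else 0.0
            for a in activations]


rng = random.Random(0)        # seeded only to make the sketch repeatable
activations = [1.0] * 8       # toy layer output, hypothetical values

step_1 = dropout(activations, rate=0.5, rng=rng)  # mask for one step
step_2 = dropout(activations, rate=0.5, rng=rng)  # new mask for the next step
print(step_1)
print(step_2)
```

At prediction time nothing is dropped, which corresponds to calling the sketch with `rate=0.0`.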

16
00:01:17,720 --> 00:01:24,780
‫We are using dropout here because dropout is a very effective layer to avoid overfitting in our model

17
00:01:24,780 --> 00:01:24,990
‫.

18
00:01:28,290 --> 00:01:30,710
‫So this is our model architecture.
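A minimal sketch of the architecture as described, using the Keras Sequential API. The filter counts (32, 64, 128, 128), the pooling after each conv layer, ReLU, and the 50 percent dropout come from the narration. Everything else is an assumption made for illustration: the 150x150 RGB input shape, the 3x3 kernels, the 2x2 max pooling, the position of the dropout layer after flattening, the 512-unit dense layer, and the single sigmoid output.

```python
from tensorflow import keras
from tensorflow.keras import layers


def build_model(input_shape=(150, 150, 3)):  # input size is an assumption
    """Four conv blocks (32, 64, 128, 128 filters), each followed by pooling."""
    return keras.Sequential([
        keras.Input(shape=input_shape),
        layers.Conv2D(32, (3, 3), activation="relu"),
        layers.MaxPooling2D((2, 2)),
        layers.Conv2D(64, (3, 3), activation="relu"),
        layers.MaxPooling2D((2, 2)),
        layers.Conv2D(128, (3, 3), activation="relu"),
        layers.MaxPooling2D((2, 2)),
        layers.Conv2D(128, (3, 3), activation="relu"),
        layers.MaxPooling2D((2, 2)),
        layers.Flatten(),
        layers.Dropout(0.5),                    # drop 50 percent during training
        layers.Dense(512, activation="relu"),   # dense size is an assumption
        layers.Dense(1, activation="sigmoid"),  # one probability for two classes
    ])


model = build_model()
model.summary()
```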

19
00:01:32,390 --> 00:01:34,670
‫Now, the next step is to compile the model.

20
00:01:35,660 --> 00:01:38,650
‫For the loss function, we will be using binary cross-entropy,

21
00:01:39,050 --> 00:01:41,030
‫since we have two different classes.
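To make the loss concrete, here is a small plain-Python sketch of binary cross-entropy for a single example. It is not the Keras implementation; the function name and the clipping constant `eps` are illustrative choices. With label y (0 or 1) and predicted probability p, the loss is -(y·log p + (1 - y)·log(1 - p)).

```python
import math


def binary_cross_entropy(y_true, p_pred, eps=1e-7):
    """Loss for one example: y_true is 0 or 1, p_pred is P(class 1)."""
    # Clip the probability so log(0) can never occur.
    p = min(max(p_pred, eps), 1.0 - eps)
    return -(y_true * math.log(p) + (1 - y_true) * math.log(1.0 - p))


confident_right = binary_cross_entropy(1, 0.9)  # small loss
confident_wrong = binary_cross_entropy(1, 0.1)  # much larger loss
print(confident_right, confident_wrong)
```

The loss grows quickly as the model becomes confidently wrong, which is the behaviour we want for a two-class problem with a single sigmoid output.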

22
00:01:42,640 --> 00:01:50,860
‫If you remember, earlier in MNIST we had 10 different classes, and there we were using sparse categorical

23
00:01:51,250 --> 00:01:52,100
‫cross-entropy.

24
00:01:52,750 --> 00:01:59,040
‫But here, since we have only two classes, we are using binary cross-entropy.

25
00:01:59,050 --> 00:01:59,260
‫For the optimizer,

26
00:01:59,410 --> 00:02:03,820
‫we are using RMSprop with a learning rate of 0.001.

27
00:02:05,120 --> 00:02:10,390
‫And since this is a classification problem, we are tracking the accuracy metric as well.
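The compile step described above might look like the sketch below. The loss, optimizer, learning rate, and metric follow the narration. The small model is only a placeholder so the snippet runs on its own, and its input shape is an assumption. Older Keras code often spells the optimizer argument `lr`; current releases use `learning_rate`.

```python
from tensorflow import keras
from tensorflow.keras import layers

# Placeholder model so the snippet is self-contained (not the full network).
model = keras.Sequential([
    keras.Input(shape=(64, 64, 3)),            # assumed input size
    layers.Conv2D(32, (3, 3), activation="relu"),
    layers.MaxPooling2D((2, 2)),
    layers.Flatten(),
    layers.Dense(1, activation="sigmoid"),
])

model.compile(
    loss="binary_crossentropy",                           # two classes
    optimizer=keras.optimizers.RMSprop(learning_rate=0.001),
    metrics=["accuracy"],                                 # classification metric
)
```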

28
00:02:13,590 --> 00:02:16,140
‫The next step is to train our model.

29
00:02:17,830 --> 00:02:26,390
‫Since we are taking our data from the train generator, we have to use fit_generator to fit our model.

30
00:02:26,530 --> 00:02:31,240
‫So we'll be using model.fit_generator, and then the train generator.

31
00:02:32,270 --> 00:02:39,370
‫This is the data generator, which will continuously generate data in batches of 32 images.

32
00:02:41,120 --> 00:02:44,950
‫And then here we are using steps per epoch as 100.

33
00:02:46,970 --> 00:02:51,510
‫Earlier, in our last model, we were using a batch size of 20 and steps

34
00:02:51,680 --> 00:02:57,120
‫per epoch as 100, because we only had 2,000 images

35
00:02:57,140 --> 00:02:58,550
‫for training purposes.

36
00:02:59,420 --> 00:03:07,880
‫But this time, since we are randomly generating images from these transformations, we can use more than

37
00:03:08,160 --> 00:03:09,680
‫2,000 images as well.

38
00:03:10,370 --> 00:03:16,060
‫This time we are using a batch size of 32 and steps per epoch as 100.

39
00:03:16,770 --> 00:03:22,850
‫So overall, in each epoch we are feeding around 3,200 images.
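The arithmetic behind these numbers is simply batch size times steps per epoch. A tiny sketch (the helper name is made up for illustration):

```python
def images_per_epoch(batch_size, steps_per_epoch):
    """Number of images the generator feeds to the model in one epoch."""
    return batch_size * steps_per_epoch


previous_run = images_per_epoch(batch_size=20, steps_per_epoch=100)  # 2000
current_run = images_per_epoch(batch_size=32, steps_per_epoch=100)   # 3200
print(previous_run, current_run)
```

Because augmentation produces a freshly transformed image on every draw, feeding 3,200 images per epoch from 2,000 originals does not mean showing the model exact duplicates.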

40
00:03:25,270 --> 00:03:28,780
‫The number of epochs this time is 100.

41
00:03:30,240 --> 00:03:36,450
‫And similarly, we will use the validation generator to get the validation data.
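A hedged sketch of the training call. In newer TensorFlow/Keras releases `fit_generator` is deprecated and `model.fit` accepts generators directly, so the sketch uses `model.fit`. The batch size of 32 and the 100 steps per epoch follow the narration. The rest is a stand-in chosen so the snippet runs quickly on its own: `random_batches` yields random arrays instead of real augmented images, the model is deliberately tiny, `validation_steps=10` is an assumed value, and only one epoch is run instead of 100.

```python
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers


def random_batches(batch_size, image_shape=(32, 32, 3), seed=0):
    """Stand-in for the image generators: yields (images, labels) forever."""
    rng = np.random.default_rng(seed)
    while True:
        images = rng.random((batch_size, *image_shape), dtype=np.float32)
        labels = rng.integers(0, 2, size=(batch_size, 1)).astype(np.float32)
        yield images, labels


# Tiny placeholder model, not the network from the video.
model = keras.Sequential([
    keras.Input(shape=(32, 32, 3)),
    layers.Conv2D(8, (3, 3), activation="relu"),
    layers.MaxPooling2D((2, 2)),
    layers.Flatten(),
    layers.Dense(1, activation="sigmoid"),
])
model.compile(loss="binary_crossentropy", optimizer="rmsprop",
              metrics=["accuracy"])

history = model.fit(
    random_batches(batch_size=32),        # plays the role of the train generator
    steps_per_epoch=100,                  # 100 batches of 32 per epoch
    epochs=1,                             # the video trains for 100 epochs
    validation_data=random_batches(batch_size=32, seed=1),
    validation_steps=10,                  # assumed value
    verbose=0,
)
print(history.history.keys())
```

The returned `history.history` dictionary holds one value per epoch for each tracked quantity, which is what an accuracy-versus-epoch graph is drawn from.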

42
00:03:39,090 --> 00:03:46,050
‫Now, since we are running this for 100 epochs, if you are using a system with less than 16

43
00:03:46,050 --> 00:03:52,050
‫GB of RAM and without any graphics card, it may take one and a half to two hours to train

44
00:03:52,050 --> 00:03:52,580
‫this model.

45
00:03:55,020 --> 00:04:01,980
‫That's why I have already trained this model, and I have the results over 100 epochs here.

46
00:04:04,230 --> 00:04:13,740
‫You can see that our validation accuracy is increasing with each epoch, and at around the 90th to 100th

47
00:04:13,820 --> 00:04:14,110
‫epoch,

48
00:04:14,160 --> 00:04:20,340
‫we are getting a validation accuracy between 82 and 84 percent and

49
00:04:21,440 --> 00:04:25,650
‫a training accuracy of around 84 to 85 percent.

50
00:04:26,940 --> 00:04:35,190
‫So if you compare, in our last model we were getting a training accuracy of around 95 to 98 percent

51
00:04:36,000 --> 00:04:41,140
‫and a significantly lower validation accuracy of around 79 percent.

52
00:04:42,690 --> 00:04:49,460
‫In this model, we are getting almost the same validation and training accuracy, around 84 percent.
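One simple way to read these figures is to look at the gap between training and validation accuracy. The sketch below uses the approximate numbers quoted in the narration (taking the lower end of the 95 to 98 percent range); the helper name is made up for illustration.

```python
def generalization_gap(train_acc, val_acc):
    """Training accuracy minus validation accuracy; a large gap hints at overfitting."""
    return train_acc - val_acc


previous_gap = generalization_gap(train_acc=0.95, val_acc=0.79)  # about 0.16
current_gap = generalization_gap(train_acc=0.84, val_acc=0.84)   # about 0.00
print(round(previous_gap, 2), round(current_gap, 2))
```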

53
00:04:51,370 --> 00:05:00,070
‫So you can say that with our image preprocessing and creating dummy images, we have treated overfitting

54
00:05:00,160 --> 00:05:00,910
‫in our model.

55
00:05:03,660 --> 00:05:11,220
‫After running this, you can save your model with the model.save method. Now let's create this graph

56
00:05:12,150 --> 00:05:18,150
‫to see how our validation accuracy and training accuracy are changing with each epoch.

57
00:05:20,160 --> 00:05:23,280
‫So these orange and red lines are for accuracy.

58
00:05:24,240 --> 00:05:28,950
‫The orange line is for training accuracy and the red line is for validation accuracy.

59
00:05:30,510 --> 00:05:36,600
‫You can see here that the validation accuracy is more than 80 percent, and the training

60
00:05:36,600 --> 00:05:38,760
‫accuracy is also more than 80 percent.

61
00:05:40,320 --> 00:05:42,560
‫And both are moving together.

62
00:05:42,720 --> 00:05:45,420
‫So there is no evidence of overfitting

63
00:05:45,660 --> 00:05:46,890
‫in our model.

64
00:05:48,690 --> 00:05:50,670
‫And this is still increasing.

65
00:05:51,090 --> 00:06:00,930
‫So if you run it for, say, 40 or 50 more epochs, the validation accuracy may reach around 85 or 86

66
00:06:00,930 --> 00:06:01,820
‫percent as well.

67
00:06:03,760 --> 00:06:05,230
‫So that's all for this video.

68
00:06:06,310 --> 00:06:14,620
‫We saw that by augmenting our initial dataset, by applying shear, rotation, width shift, height shift,

69
00:06:15,120 --> 00:06:15,910
‫and flips,

70
00:06:16,480 --> 00:06:24,590
‫we can treat overfitting in our model and get a higher validation accuracy.

71
00:06:25,180 --> 00:06:25,630
‫Thank you.