1
00:00:00,060 --> 00:00:06,270
Hi and welcome back. In this lesson, we'll take a look at training your own Vision Transformer classifier

2
00:00:06,810 --> 00:00:07,570
in Keras.

3
00:00:07,740 --> 00:00:11,190
So let's open notebook 61 and we'll begin the lesson.

4
00:00:11,760 --> 00:00:17,640
So firstly, this Colab notebook comes from the official Keras tutorial site, and the credit goes to

5
00:00:17,640 --> 00:00:19,590
Khalid Salama for creating this.

6
00:00:20,190 --> 00:00:21,720
So what are we going to do?

7
00:00:21,750 --> 00:00:29,700
We're going to train a Vision Transformer network, a pretty simple model, on the CIFAR-100 dataset,

8
00:00:29,700 --> 00:00:35,250
so that's CIFAR; remember CIFAR-10? This one is CIFAR-100, which has 100 different image classes.

9
00:00:35,250 --> 00:00:40,600
So firstly, you need to install TensorFlow Addons, which I've already done here.

10
00:00:40,620 --> 00:00:45,450
So now let's load our libraries so we can begin the lesson.

11
00:00:46,170 --> 00:00:49,230
So as I said, we have 100 classes, and the input

12
00:00:49,230 --> 00:00:52,470
image size for this lesson is 32 by 32.

13
00:00:53,640 --> 00:00:56,720
And you can see, once we download the data, which might take...

14
00:00:57,180 --> 00:00:59,610
It shouldn't take too long; it's downloading pretty quickly.

15
00:01:00,690 --> 00:01:07,380
You can see the shape of the training dataset, as well as the test dataset here.

16
00:01:07,470 --> 00:01:14,040
So you can see we have 50,000 images in the training set and 10,000 in our test set.

17
00:01:14,520 --> 00:01:20,370
Next, we need to configure some hyperparameters, and for now, we can just leave these at the default

18
00:01:20,370 --> 00:01:21,180
settings here.

19
00:01:21,250 --> 00:01:26,580
If you want to experiment later on, you can just adjust these parameters.

20
00:01:26,610 --> 00:01:32,120
I would encourage trying different numbers of epochs or learning rates, as well as the other key parameters.

21
00:01:32,130 --> 00:01:33,960
Maybe they can help, but they shouldn't help

22
00:01:33,960 --> 00:01:36,090
too much; it should still converge eventually.

23
00:01:37,620 --> 00:01:42,100
So now we're going to create our data augmentation transform.

24
00:01:42,120 --> 00:01:48,630
So we're using random flip, resizing, normalization, random rotation, as well as random zoom.

25
00:01:49,650 --> 00:01:56,280
So let's do that. Now we need to implement our multilayer perceptron, the MLP head of the network.

26
00:01:56,820 --> 00:01:58,770
So let's do that.

27
00:02:00,450 --> 00:02:04,650
Next, we need to implement the patch creation, and we create it as a layer here.

28
00:02:04,740 --> 00:02:08,580
So let's create this class that does the patch creation here.

29
00:02:10,020 --> 00:02:13,260
Now you can display the patches if you want.

30
00:02:18,390 --> 00:02:19,380
There we go, see this.

31
00:02:19,410 --> 00:02:23,400
I have no idea what this image is. I wish someone could tell me.

32
00:02:24,420 --> 00:02:27,900
But anyway, this is what the image looks like in the patch structure here.

33
00:02:28,620 --> 00:02:31,800
So, yeah, so all good so far.

34
00:02:32,370 --> 00:02:36,600
Now we need to create the embedding, or the patch encoding layer.

35
00:02:36,780 --> 00:02:38,370
So let's do that as well.

36
00:02:38,580 --> 00:02:40,530
Remember the different building blocks we need for this?

37
00:02:41,550 --> 00:02:44,460
And finally, we need to build the ViT model.

38
00:02:45,030 --> 00:02:51,180
So this function here uses the layers we created above, the patch layers and all of those things,

39
00:02:51,630 --> 00:02:57,030
to create the multiple layers of the Transformer block, so you can create multiple layers here, defined

40
00:02:57,030 --> 00:02:58,680
by the number of transformer layers.

41
00:02:59,400 --> 00:03:01,620
And finally, it returns the model here.

42
00:03:01,770 --> 00:03:07,180
So this is a very good exercise if you want to figure out how we actually build these models here.

43
00:03:08,250 --> 00:03:13,110
Now this is where we run and train the network; you can see we've defined the optimizer.

44
00:03:13,560 --> 00:03:20,460
We compile the model here and set up a checkpoint here where we only keep the best

45
00:03:20,460 --> 00:03:21,990
weights as well.

46
00:03:23,310 --> 00:03:28,140
Then we just have the history from when the model is trained, so we can look back and plot some graphs, and

47
00:03:28,140 --> 00:03:29,460
then we just load the last...

48
00:03:29,610 --> 00:03:34,920
well, the best model, using that checkpoint file path from the checkpoint above, and then

49
00:03:34,920 --> 00:03:36,270
we just evaluate the model.

50
00:03:36,360 --> 00:03:42,270
We get accuracy and top five accuracy, and we print this out and we return the history so we can plot

51
00:03:42,270 --> 00:03:43,530
those graphs after.

52
00:03:44,160 --> 00:03:49,260
So let's create our classifier here and then we can run the experiment.

53
00:03:49,920 --> 00:03:51,780
So let's do this.

54
00:03:51,780 --> 00:03:54,840
This may take a while to train, but we'll see how it goes.

55
00:03:55,320 --> 00:04:02,790
There's a summary here that shows us that after 100 epochs, the ViT achieved around 55 percent accuracy

56
00:04:02,790 --> 00:04:04,770
and 82 percent top five accuracy.

57
00:04:05,590 --> 00:04:10,680
Now, that's pretty decent on CIFAR-100, which has 100 classes.

58
00:04:11,280 --> 00:04:18,450
However, it's not that competitive compared to something like ResNet50 version two, which achieves

59
00:04:18,450 --> 00:04:21,360
67 percent accuracy, which is much better here.

60
00:04:21,930 --> 00:04:29,670
So to get better results, what you can do is use a pre-trained ViT and then fine-tune it

61
00:04:29,670 --> 00:04:31,140
on this target dataset.

62
00:04:31,260 --> 00:04:35,700
Like I mentioned in the slides before, that's probably going to give you better results.

63
00:04:35,700 --> 00:04:41,820
I haven't actually experimented with that too much, but it would be an interesting project or homework

64
00:04:41,820 --> 00:04:43,440
lesson for some of you guys to do.

65
00:04:44,070 --> 00:04:49,620
So I'm not going to wait for this to train for the 100 epochs, so I'm going to close this off after.

66
00:04:50,160 --> 00:04:50,940
But you can.

67
00:04:51,090 --> 00:04:56,190
And if you want to examine the results and see if you get anything better, feel free to do that.

68
00:04:56,680 --> 00:05:00,960
Colab notebooks are free, and if you want to get access to the GPUs, you can upgrade to Colab

69
00:05:00,960 --> 00:05:02,370
Pro, like I have.

70
00:05:02,880 --> 00:05:08,880
It's just 10 U.S. dollars a month and I'd say it's pretty much worth it if you're going to be using

71
00:05:08,880 --> 00:05:09,870
GPUs a lot.

72
00:05:09,870 --> 00:05:15,840
So maybe for the time that you're taking this course, it would be a good idea to upgrade to Colab Pro.

73
00:05:16,110 --> 00:05:18,030
Anyway, that's it for this lesson.

74
00:05:18,060 --> 00:05:19,130
Thank you for watching.

75
00:05:19,140 --> 00:05:23,580
And in the next lesson, we'll take a look at some other classifiers.

76
00:05:23,940 --> 00:05:25,140
So stay tuned for that.

77
00:05:25,590 --> 00:05:25,860
Bye.