1
00:00:11,730 --> 00:00:17,320
At this point, we finally understand how to fully build a convolutional neural network in Python.

2
00:00:17,580 --> 00:00:21,830
Now we're ready to outline the upcoming code. As usual,

3
00:00:21,840 --> 00:00:25,920
I'm going to repeat the steps of our analysis just to be sure.

4
00:00:25,950 --> 00:00:30,260
Step number one is to load in the data. For the following examples,

5
00:00:30,270 --> 00:00:37,320
we'll be using more difficult datasets, in particular Fashion MNIST and CIFAR-10. Fashion MNIST was

6
00:00:37,320 --> 00:00:43,620
designed to be a drop-in replacement for the MNIST dataset, which today is considered to be too easy

7
00:00:43,620 --> 00:00:49,790
for modern machine learning algorithms. One might consider it to be a solved problem.

8
00:00:50,170 --> 00:00:56,560
Thus Fashion MNIST was born, and it was designed to be a standard benchmark just like MNIST, but with

9
00:00:56,560 --> 00:01:04,640
the goalposts further ahead. Just like MNIST, it's a dataset of 28 by 28 grayscale images, and it even

10
00:01:04,640 --> 00:01:07,220
has the same number of samples.

11
00:01:07,250 --> 00:01:13,020
The difference is that instead of simply handwritten digits, we have images of different types of clothing.

12
00:01:13,190 --> 00:01:17,240
For example, T-shirts, shoes, pants, and so forth.

13
00:01:17,240 --> 00:01:21,450
Your job is to classify these images into the correct category.

14
00:01:21,560 --> 00:01:25,580
The CIFAR-10 dataset is older but more difficult.

15
00:01:25,580 --> 00:01:31,670
This dataset contains color images, so the data takes up more RAM, and the images are also a bit bigger:

16
00:01:31,820 --> 00:01:33,690
32 by 32.

17
00:01:33,800 --> 00:01:38,930
It contains classes such as automobile, frog, horse, cat, and dog.

18
00:01:38,930 --> 00:01:44,900
You can already imagine that with such tiny images, it might be very easy to mistake a cat for a dog

19
00:01:44,900 --> 00:01:46,130
or a dog for a horse.

20
00:01:51,340 --> 00:01:53,410
Step number two is to build a model.

21
00:01:54,040 --> 00:01:58,660
Luckily, we just discussed that in painstaking detail over the past few lectures.

22
00:01:58,750 --> 00:02:02,320
In case you forgot, we'll be using convolutional neural networks.

23
00:02:07,500 --> 00:02:09,690
Step number three is to train the model.

24
00:02:09,690 --> 00:02:15,240
Step number four is to evaluate the model, and step number five is to use the model to make predictions.

25
00:02:16,580 --> 00:02:17,120
Just like the

26
00:02:17,120 --> 00:02:18,050
ANN section,

27
00:02:18,050 --> 00:02:19,470
nothing here changes.

28
00:02:19,490 --> 00:02:22,270
These steps are model agnostic.

29
00:02:22,520 --> 00:02:28,970
As per my rule, all machine learning interfaces are the same. One new thing you'll learn about later on

30
00:02:28,970 --> 00:02:34,730
in this section is what to do if your dataset is too large to fit into memory, or if you want to augment

31
00:02:34,730 --> 00:02:35,990
your data.

32
00:02:35,990 --> 00:02:42,180
For example, we know that a dog facing left is a dog, but a dog facing right is also a dog.

33
00:02:42,260 --> 00:02:48,140
So should our CNN not be able to recognize both of these orientations as a dog?

34
00:02:48,140 --> 00:02:49,910
The answer is yes.

35
00:02:49,910 --> 00:02:52,100
We also know that deep learning loves data.

36
00:02:52,100 --> 00:02:53,810
The more data, the better.

37
00:02:53,810 --> 00:03:00,620
This is why deep learning has come to prominence in recent years. So data augmentation is a method of

38
00:03:00,620 --> 00:03:04,460
virtually adding more data without actually having it.

39
00:03:04,490 --> 00:03:05,950
We'll discuss this more later on.

40
00:03:05,980 --> 00:03:09,140
But for our initial example, we're going to keep things simple.
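As a rough illustration of the augmentation idea, the sketch below mirrors an image left to right with some probability, so the model would see both orientations of the same dog. The image is a plain nested list, and the helper names `hflip` and `random_hflip` are made up for this example. In torchvision, the `transforms.RandomHorizontalFlip` transform plays a similar role.

```python
import random


def hflip(image):
    """Return a left-right mirrored copy of a 2D image (a list of rows)."""
    return [row[::-1] for row in image]


def random_hflip(image, p=0.5, rng=random):
    """Flip the image with probability p, otherwise return it unchanged."""
    return hflip(image) if rng.random() < p else image


# A tiny 2x3 "image" standing in for real pixel data (hypothetical values).
image = [[1, 2, 3],
         [4, 5, 6]]
flipped = hflip(image)
```

The label stays the same after the flip, which is what makes this a cheap way to get extra training samples.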

41
00:03:14,290 --> 00:03:17,490
So let's elaborate on step number one: loading in the data.

42
00:03:18,220 --> 00:03:23,710
Luckily, both of the datasets we'll be using for these examples are also included in torchvision.

43
00:03:23,890 --> 00:03:28,870
It's generally going to follow the same pattern as before. You can even probably try to guess the function

44
00:03:28,870 --> 00:03:32,110
names, and you might be right. For Fashion MNIST,

45
00:03:32,110 --> 00:03:39,440
we load in the data by calling torchvision.datasets.FashionMNIST. For CIFAR-10,

46
00:03:39,440 --> 00:03:44,020
we load in the data by calling torchvision.datasets.CIFAR10.

47
00:03:44,140 --> 00:03:45,370
Nothing too surprising there.
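A minimal sketch of what those two calls might look like is shown here. The function name `load_datasets`, the `./data` root directory, and the choice of `ToTensor` as the only transform are assumptions for illustration, not necessarily what the lecture code uses.

```python
def load_datasets(root="./data"):
    """Download (if needed) and return the train/test splits of both datasets."""
    # Imports live inside the function so that merely defining it
    # does not require torchvision or trigger any download.
    import torchvision
    import torchvision.transforms as transforms

    # Convert PIL images to float tensors scaled to [0, 1].
    to_tensor = transforms.ToTensor()

    fashion_train = torchvision.datasets.FashionMNIST(
        root=root, train=True, transform=to_tensor, download=True)
    fashion_test = torchvision.datasets.FashionMNIST(
        root=root, train=False, transform=to_tensor, download=True)

    cifar_train = torchvision.datasets.CIFAR10(
        root=root, train=True, transform=to_tensor, download=True)
    cifar_test = torchvision.datasets.CIFAR10(
        root=root, train=False, transform=to_tensor, download=True)

    return {
        "fashion_mnist": (fashion_train, fashion_test),
        "cifar10": (cifar_train, cifar_test),
    }
```

Calling `load_datasets()` downloads the data on first use, so it needs network access and some disk space.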

48
00:03:48,310 --> 00:03:53,830
Inside the training loop, we can loop over the datasets in batches by using the DataLoader, which, you

49
00:03:53,830 --> 00:04:01,460
guessed it, looks exactly the same as before.
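To make the batching idea concrete without depending on PyTorch, here is a plain-Python sketch of roughly what a data loader does: optionally shuffle the sample indices, then yield fixed-size chunks. The name `iterate_batches` is hypothetical. In PyTorch itself, this role is played by `torch.utils.data.DataLoader(dataset, batch_size=..., shuffle=...)`.

```python
import random


def iterate_batches(samples, batch_size, shuffle=False, seed=None):
    """Yield successive batches (lists) of samples; the last one may be smaller."""
    indices = list(range(len(samples)))
    if shuffle:
        # A seeded generator keeps the shuffled order reproducible.
        random.Random(seed).shuffle(indices)
    for start in range(0, len(indices), batch_size):
        yield [samples[i] for i in indices[start:start + batch_size]]


# Ten dummy samples split into batches of four.
batches = list(iterate_batches(list(range(10)), batch_size=4))
```

Each pass over all the batches corresponds to one epoch of the training loop.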

50
00:04:01,470 --> 00:04:02,790
So what changes?

51
00:04:03,060 --> 00:04:05,920
Other than changing the model and changing the data functions,

52
00:04:05,940 --> 00:04:07,290
nothing really.

53
00:04:07,320 --> 00:04:10,140
So this is a great time to remind you of my two rules.

54
00:04:10,260 --> 00:04:14,220
All data is the same, and all machine learning interfaces are the same.

55
00:04:19,330 --> 00:04:20,410
So, just as a quiz,

56
00:04:20,410 --> 00:04:25,390
let's think about which of these steps will be the same and which will be different.

57
00:04:25,390 --> 00:04:29,910
First, the training loop. As you know, we'll be doing batch gradient descent.

58
00:04:30,070 --> 00:04:32,640
Will this be the same or different?

59
00:04:32,680 --> 00:04:35,710
The answer is: the same. Next question.

60
00:04:36,220 --> 00:04:39,430
Let's consider evaluating the accuracy of the model.

61
00:04:39,460 --> 00:04:41,900
Will this be the same or different?

62
00:04:41,920 --> 00:04:45,700
The answer is: the same. Next question.

63
00:04:45,700 --> 00:04:50,900
Let's consider plotting the confusion matrix, given the model predictions and model targets.

64
00:04:50,920 --> 00:04:53,380
Will this be the same or different?

65
00:04:53,380 --> 00:04:54,610
The answer is: the same.
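Part of the reason these steps do not change is that accuracy and the confusion matrix depend only on the targets and the predictions, not on which model produced them. The sketch below computes both in plain Python. The function names and the toy labels are made up for this example.

```python
def accuracy(targets, predictions):
    """Fraction of samples whose predicted class equals the target class."""
    correct = sum(t == p for t, p in zip(targets, predictions))
    return correct / len(targets)


def confusion_matrix(targets, predictions, num_classes):
    """Count matrix where entry [t][p] is how often true class t was predicted as p."""
    matrix = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(targets, predictions):
        matrix[t][p] += 1
    return matrix


# Toy example with 3 classes and 5 samples (hypothetical values).
targets = [0, 1, 2, 2, 1]
predictions = [0, 2, 2, 2, 1]

acc = accuracy(targets, predictions)            # 4 of 5 correct
cm = confusion_matrix(targets, predictions, 3)  # rows: true class, columns: predicted class
```

The off-diagonal entries of the matrix show which pairs of classes the model confuses.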

66
00:04:59,690 --> 00:05:05,000
The only minor non-trivial thing I want to highlight is that the classes are going to be encoded as

67
00:05:05,000 --> 00:05:07,700
integers from 0 up to 9 inclusive.

68
00:05:08,180 --> 00:05:11,660
But how do we know which integer corresponds to which class?

69
00:05:11,660 --> 00:05:14,960
For example, what if zero is dress and one is T-shirt?

70
00:05:14,960 --> 00:05:16,100
Is that obvious?

71
00:05:16,100 --> 00:05:17,400
Of course not.

72
00:05:17,420 --> 00:05:24,030
I just did the easy thing and looked them up on the Internet and hardcoded them as strings. This way, when

73
00:05:24,030 --> 00:05:26,050
we look at the misclassified samples,

74
00:05:26,100 --> 00:05:31,500
it won't say something like 0 is misclassified as 1, which doesn't really mean anything, but rather

75
00:05:31,640 --> 00:05:35,100
T-shirt is misclassified as dress, which is more interpretable.
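A small sketch of that hardcoding for Fashion MNIST might look like the following. The class names follow the ordering given in the Fashion-MNIST documentation. The constant and function names are assumptions, since the lecture's exact strings are not shown here.

```python
# Index i holds the human-readable name for integer label i (0 through 9).
FASHION_MNIST_LABELS = [
    "T-shirt/top",
    "Trouser",
    "Pullover",
    "Dress",
    "Coat",
    "Sandal",
    "Shirt",
    "Sneaker",
    "Bag",
    "Ankle boot",
]


def describe_mistake(true_label, predicted_label, labels=FASHION_MNIST_LABELS):
    """Turn a pair of integer labels into a readable misclassification message."""
    return f"{labels[true_label]} misclassified as {labels[predicted_label]}"


message = describe_mistake(0, 3)
```

The same pattern would apply to CIFAR-10 with its own list of ten class names.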