1
00:00:00,840 --> 00:00:02,640
Hi and welcome back to the course.

2
00:00:03,150 --> 00:00:09,480
So what I'm going to do in this chapter is give you a general introduction to convolutional neural networks.

3
00:00:09,930 --> 00:00:15,630
It's going to be a gentle overview, and it hopefully develops your intuition and understanding of CNNs

4
00:00:15,630 --> 00:00:22,290
because we're going to go into CNNs in quite a bit of detail over the next, maybe, about an hour of

5
00:00:22,290 --> 00:00:23,220
video lectures.

6
00:00:23,730 --> 00:00:27,660
So before we start, we need to ground you in your understanding of it.

7
00:00:28,350 --> 00:00:30,150
So let's take a look at images.

8
00:00:30,300 --> 00:00:31,380
What are images?

9
00:00:31,770 --> 00:00:33,030
Let's take a look at this cat.

10
00:00:33,570 --> 00:00:34,680
You know it's a cat, right?

11
00:00:35,160 --> 00:00:35,940
What about this?

12
00:00:36,330 --> 00:00:37,240
You know it's a dog.

13
00:00:37,860 --> 00:00:43,350
But what did we see in the images that told us that this was a cat and this was a dog?

14
00:00:43,950 --> 00:00:49,770
You, as a human, have learned the differences between cats and dogs from a very young age.

15
00:00:50,130 --> 00:00:51,570
You know what a cat looks like.

16
00:00:51,570 --> 00:00:55,230
It has these whiskers, these big eyes.

17
00:00:55,260 --> 00:00:57,360
You know

18
00:00:57,360 --> 00:00:58,380
the shape of a cat.

19
00:00:58,410 --> 00:01:00,030
And you know, dogs are quite distinct.

20
00:01:00,390 --> 00:01:04,260
So there are many things that told us it was a cat or a dog.

21
00:01:05,040 --> 00:01:09,240
Things like whiskers, shape, eyes, fur color, all of the things we went through.

22
00:01:09,540 --> 00:01:14,860
But how does a computer or an algorithm or some software do this?

23
00:01:14,880 --> 00:01:17,940
How does it know whether it's a cat or a dog?

24
00:01:17,970 --> 00:01:19,830
How do we predict this from an image?

25
00:01:20,490 --> 00:01:24,150
That's what convolutional neural networks allow us to do.

26
00:01:24,630 --> 00:01:31,710
These are a type of neural network that allows us to feed images in and

27
00:01:31,710 --> 00:01:33,960
get a class output out of it.

28
00:01:34,080 --> 00:01:36,540
And that's the most basic sense of what it is.

29
00:01:37,200 --> 00:01:43,230
So let's build some intuition for how CNNs work. You can see this here.

30
00:01:43,320 --> 00:01:44,810
This looks like the number one.

31
00:01:44,820 --> 00:01:46,080
It actually is the number one,

32
00:01:46,470 --> 00:01:47,460
a handwritten digit here.

33
00:01:47,730 --> 00:01:49,620
So what digit is it?

34
00:01:49,830 --> 00:01:50,490
It's the number one.

35
00:01:50,730 --> 00:01:53,820
You know that. You know the digits from one to 10.

36
00:01:53,820 --> 00:01:57,360
You know all of the symbols and all the letters. You know a lot of things.

37
00:01:57,480 --> 00:02:00,750
So how did you actually know what it was?

38
00:02:00,990 --> 00:02:05,040
So how would you tell a computer to actually figure out what it is?

39
00:02:05,700 --> 00:02:06,840
Was it the overall shape?

40
00:02:07,260 --> 00:02:08,460
Maybe it was something else.

41
00:02:08,940 --> 00:02:11,640
So what if maybe it lies only in this region?

42
00:02:12,270 --> 00:02:13,650
If so, then it's a one.

43
00:02:14,100 --> 00:02:19,320
But there are many issues with that, because what if most of it lies in that region, as with a one,

44
00:02:19,590 --> 00:02:24,290
but there was a piece outside it, like in a seven or a different shape, you know?

45
00:02:24,290 --> 00:02:25,230
And you know, that can happen.

46
00:02:26,100 --> 00:02:27,230
So what else?

47
00:02:27,270 --> 00:02:28,230
What if it was shifted?

48
00:02:28,470 --> 00:02:30,170
What if we shifted it out of this region?

49
00:02:30,750 --> 00:02:31,230
Then you can.

50
00:02:31,290 --> 00:02:33,180
Then you know, it's not.

51
00:02:33,180 --> 00:02:39,360
It's no longer a one to your program, since you've hardcoded the window where you look for a one.

52
00:02:39,810 --> 00:02:45,240
So how is a CNN, how is a neural network, able to do this?

53
00:02:45,750 --> 00:02:53,070
Well, basically, neural networks, or rather convolutional neural networks, have many filters, and

54
00:02:53,070 --> 00:02:57,990
these filters allow us to extract features from different parts of the image.

55
00:02:58,140 --> 00:03:01,410
I'll explain this to you in the later chapters.

56
00:03:01,410 --> 00:03:05,980
But for now, think about this: go back to the number one in your head.

57
00:03:06,780 --> 00:03:10,380
I can go back to it in the slides, but I'd still have to go through all of these here.

58
00:03:10,380 --> 00:03:11,670
But think of the number one there.

59
00:03:12,150 --> 00:03:18,720
Imagine we had a filter that was looking for only like a thin, narrow, vertical shaded region.

60
00:03:19,470 --> 00:03:26,670
So imagine you have a specific filter, and by filter I mean, like, imagine a little sliding window, so

61
00:03:26,670 --> 00:03:32,310
that when it goes over those pixels, it activates. That's exactly what happens in neural networks.

62
00:03:32,730 --> 00:03:38,700
So imagine we have hundreds or thousands of these little filters, each little filter looking

63
00:03:38,700 --> 00:03:40,770
for specific features in an image.

64
00:03:41,190 --> 00:03:47,700
And depending on which of those filters turn on, that determines the class.

65
00:03:47,940 --> 00:03:52,470
So you can see it outputs a probability in the end of what the image is.

66
00:03:52,470 --> 00:03:53,940
In this case, it's a cat.

67
00:03:55,200 --> 00:04:00,720
So that should ground you in a basic understanding of what CNNs are.

68
00:04:01,320 --> 00:04:03,060
This is a very long chapter.

69
00:04:03,870 --> 00:04:04,890
So we're going to do.

70
00:04:05,460 --> 00:04:11,910
We're going to go into a lot of detail on CNNs, starting with what convolutions are, then what feature detectors are,

71
00:04:12,360 --> 00:04:18,870
how 3D filters work when you do convolutions on 3D images, that is, color images.

72
00:04:19,560 --> 00:04:26,370
We're going to take a look at kernel size, depth, padding, stride, and activation layers, which

73
00:04:26,370 --> 00:04:31,010
is the ReLU function, pooling, fully connected layers, and softmax.

74
00:04:31,020 --> 00:04:34,530
Then we're going to take a look at putting it all together to build the CNN.

75
00:04:35,160 --> 00:04:39,390
And then we're going to take a look at how we get our parameter counts from a CNN.

76
00:04:40,230 --> 00:04:46,080
And then we'll discuss why CNNs work so well on images, and then we go into the training process, which

77
00:04:46,080 --> 00:04:49,980
is the inner workings, the nitty-gritty, that make CNNs come alive.

78
00:04:50,820 --> 00:04:55,980
We talk about loss, loss functions, backpropagation, and gradient descent, and then we take a look

79
00:04:55,980 --> 00:04:59,580
at some advanced things like optimizers and learning rate schedulers.

80
00:05:00,030 --> 00:05:02,430
And then we do a summary of CNNs at the end.

81
00:05:02,850 --> 00:05:05,160
So that's it for this chapter.

82
00:05:05,580 --> 00:05:12,300
We'll see you in the next chapter where we discuss convolutions, so I'll stop there for now and I'll

83
00:05:12,300 --> 00:05:13,560
see you in the next section.

84
00:05:14,040 --> 00:05:14,460
Thank you.
