1
00:00:01,060 --> 00:00:04,780
‫In this lecture, we will understand the concept of a filter.

2
00:00:06,740 --> 00:00:14,610
‫We have been saying until now that a cell in the Convolutional layer gets information from a set of pixels

3
00:00:14,610 --> 00:00:22,010
‫or a set of cells in the previous layer. For example, that red cell in the Convolutional layer

4
00:00:22,010 --> 00:00:27,290
‫is getting information from these nine cells in this red rectangle.

5
00:00:29,550 --> 00:00:30,810
‫But what does this mean?

6
00:00:32,190 --> 00:00:35,580
‫How is it getting the information from all these pixels?

7
00:00:36,960 --> 00:00:39,080
‫We have 25 pixels here.

8
00:00:40,440 --> 00:00:48,630
‫And our cell here can have only one value, which should be the representative value for these 25 pixels.

9
00:00:50,340 --> 00:00:56,700
‫So we need to find a way to convert these 25 values of pixels into one value.

10
00:00:58,550 --> 00:01:07,320
‫This is done by using a filter. A filter is a matrix with the same dimensions as the window of our receptive field.

11
00:01:08,730 --> 00:01:11,580
‫So if the window is five cross five,

12
00:01:12,570 --> 00:01:14,950
‫the filter is also of dimension five

13
00:01:14,990 --> 00:01:15,460
‫cross five.

14
00:01:17,050 --> 00:01:21,790
‫If it is three cross three, the filter will also be three cross three.

15
00:01:24,870 --> 00:01:33,690
‫Now, we have a window of five by five pixels containing pixel values, and we have a five by five

16
00:01:34,560 --> 00:01:36,870
‫matrix containing some values.

17
00:01:39,000 --> 00:01:43,860
‫We multiply each pixel value with the corresponding filter value.

18
00:01:45,280 --> 00:01:47,500
‫And add all of these products up.

19
00:01:49,540 --> 00:01:51,370
‫So the pixel value here

20
00:01:53,500 --> 00:02:00,760
‫will be multiplied by zero point four, the next pixel value will be multiplied by zero point three,

21
00:02:01,120 --> 00:02:01,750
‫and so on.

22
00:02:02,770 --> 00:02:05,850
‫And all these products will be added up

23
00:02:07,610 --> 00:02:09,620
‫This will give us one number.

24
00:02:10,370 --> 00:02:15,230
‫And this number will represent information in these 25 pixels.
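The multiply-and-add step just described can be sketched in a few lines of Python with NumPy. This is only an illustrative sketch: the window values and the helper name `apply_filter` are hypothetical, and apart from 0.4 and 0.3 (the two filter values named in the lecture) the filter entries are arbitrary placeholders.

```python
import numpy as np

def apply_filter(window, kernel):
    """Multiply matching entries and add the products into one number."""
    return float(np.sum(window * kernel))

# Hypothetical 5x5 window of pixel values 0..24 (not the slide's values).
window = np.arange(25, dtype=float).reshape(5, 5)

# Hypothetical 5x5 filter: 0.4 and 0.3 are the two values named in the
# lecture; every other entry is an arbitrary placeholder of 0.1.
kernel = np.full((5, 5), 0.1)
kernel[0, 0] = 0.4
kernel[0, 1] = 0.3

value = apply_filter(window, kernel)
print(value)  # one number standing in for all 25 pixels
```

Whatever the actual values are, the result is always a single number per window, which is what the cell in the convolutional layer stores.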

25
00:02:18,190 --> 00:02:23,310
‫Now the question comes, how do we decide the values in this filter?

26
00:02:25,290 --> 00:02:27,360
‫The answer to this is very pleasing.

27
00:02:28,290 --> 00:02:30,330
‫We do not have to decide these values.

28
00:02:31,600 --> 00:02:34,370
‫Our network will learn these values also.

29
00:02:35,620 --> 00:02:39,610
‫So when we are training our model, these values will be learned by the network itself.

30
00:02:43,330 --> 00:02:50,290
‫Now, to demonstrate how filters work and how they are able to extract certain features,

31
00:02:51,900 --> 00:02:54,990
‫I have taken a five by five input image

32
00:02:56,510 --> 00:03:01,670
‫with binary (zero or one) pixel values and a three by three filter.

33
00:03:05,690 --> 00:03:06,680
‫Look at this filter.

34
00:03:07,810 --> 00:03:10,600
‫This filter looks like a cross.

35
00:03:11,810 --> 00:03:16,610
‫That is, the diagonal values are one and the others are zero.

36
00:03:18,600 --> 00:03:24,280
‫If we use this filter with a stride of one, we get this output.

37
00:03:27,570 --> 00:03:30,720
‫The gif below shows you how we get this output.

38
00:03:33,020 --> 00:03:40,040
‫How the filter values are multiplied and their product values are added up to get the first value.

39
00:03:40,610 --> 00:03:44,020
‫Then the next value and then the next and so on.
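The sliding procedure can be sketched as follows. The cross filter matches the description above (ones on both diagonals, zeros elsewhere), but the 5x5 image is a hypothetical binary example chosen for illustration, since the slide's pixel values are not in the transcript. The function name `feature_map` is also an assumption.

```python
import numpy as np

def feature_map(image, kernel, stride=1):
    """Slide the kernel over the image and take a sum of products at each stop."""
    kh, kw = kernel.shape
    out_h = (image.shape[0] - kh) // stride + 1
    out_w = (image.shape[1] - kw) // stride + 1
    out = np.zeros((out_h, out_w), dtype=image.dtype)
    for i in range(out_h):
        for j in range(out_w):
            r, c = i * stride, j * stride
            out[i, j] = np.sum(image[r:r + kh, c:c + kw] * kernel)
    return out

# Hypothetical 5x5 binary input image (assumed, not taken from the slide).
image = np.array([[1, 1, 1, 0, 0],
                  [0, 1, 1, 1, 0],
                  [0, 0, 1, 1, 1],
                  [0, 0, 1, 1, 0],
                  [0, 1, 1, 0, 0]])

# Cross-shaped 3x3 filter: ones on both diagonals, zeros elsewhere.
cross = np.array([[1, 0, 1],
                  [0, 1, 0],
                  [1, 0, 1]])

result = feature_map(image, cross, stride=1)
print(result)
# [[4 3 4]
#  [2 4 3]
#  [2 3 4]]
```

With a stride of one and no padding, a 3x3 filter fits into a 5x5 image at three positions per row and per column, so the output is 3x3.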

40
00:03:49,370 --> 00:03:54,800
‫This final output, which we get after applying the filter is called a feature map.

41
00:03:56,630 --> 00:04:05,110
‫It is called a feature map because each filter highlights some feature of the input image. The images on the

42
00:04:05,110 --> 00:04:10,780
‫right are demonstrating how particular features are highlighted by filters.

43
00:04:12,870 --> 00:04:21,330
‫For example, if we use a vertical filter, that is, the middle column of this matrix is one one one

44
00:04:22,470 --> 00:04:24,990
‫and the side columns are zero zero zero,

45
00:04:27,100 --> 00:04:31,150
‫this type of filter transforms the image to this image.

46
00:04:33,350 --> 00:04:40,580
‫Notice that vertical white lines are enhanced and the rest of the image is blurred.

47
00:04:42,810 --> 00:04:45,870
‫Similarly, if we use a horizontal filter,

48
00:04:47,130 --> 00:04:50,230
‫that is, the middle row is one one one

49
00:04:52,800 --> 00:04:56,310
‫and the top and bottom rows consist of zeros,

50
00:04:57,870 --> 00:05:00,060
‫if we use such a horizontal filter,

51
00:05:01,180 --> 00:05:02,260
‫we get this image.

52
00:05:03,610 --> 00:05:09,460
‫You can notice that horizontal white lines are highlighted and the rest is blurred.
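A small sketch can make the vertical versus horizontal contrast concrete. The two filters follow the descriptions above; the test image, a single vertical white line on a black background, is a hypothetical stand-in for the photographs on the slide, and `feature_map` is an assumed helper name.

```python
import numpy as np

def feature_map(image, kernel):
    """Stride-one sum of products of the kernel over every window of the image."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w), dtype=image.dtype)
    for i in range(out_h):
        for j in range(out_w):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

# Hypothetical 5x5 image: one vertical white line in the middle column.
image = np.zeros((5, 5), dtype=int)
image[:, 2] = 1

vertical = np.array([[0, 1, 0],
                     [0, 1, 0],
                     [0, 1, 0]])
horizontal = np.array([[0, 0, 0],
                       [1, 1, 1],
                       [0, 0, 0]])

vertical_map = feature_map(image, vertical)
horizontal_map = feature_map(image, horizontal)
print(vertical_map)    # strong response (3) only where the line is
print(horizontal_map)  # weak, uniform response (1) everywhere
```

The vertical filter produces a sharp peak along the line, while the horizontal filter smears it into a flat, low response, which mirrors the "enhanced" versus "blurred" effect described for the images.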

53
00:05:11,980 --> 00:05:13,450
‫This is what a filter does.

54
00:05:14,760 --> 00:05:20,910
‫A filter is a set of values which transforms the window by computing a sum of products.

55
00:05:22,520 --> 00:05:29,180
‫What we get after applying a filter is called a feature map. Each feature map has some particular

56
00:05:29,180 --> 00:05:30,400
‫feature highlighted.

57
00:05:33,320 --> 00:05:37,770
‫So what we will do is we will use many types of filters,

58
00:05:38,810 --> 00:05:44,600
‫so that each filter creates a different feature map containing different features.

59
00:05:46,340 --> 00:05:51,560
‫This means our convolutional layer is going to be a bundle of feature maps.

60
00:05:52,870 --> 00:05:56,470
‫And each feature map has some particular highlighted feature.
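The "bundle of feature maps" idea can be sketched by applying several filters to the same image and stacking the results. This is a minimal sketch with assumptions: the image is a hypothetical binary example, the three filters are hand-written rather than learned, and it relies on NumPy's `sliding_window_view` (available in NumPy 1.20 and later).

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

# Hypothetical 5x5 binary input image (assumed for illustration).
image = np.array([[1, 1, 1, 0, 0],
                  [0, 1, 1, 1, 0],
                  [0, 0, 1, 1, 1],
                  [0, 0, 1, 1, 0],
                  [0, 1, 1, 0, 0]])

# Three hand-written 3x3 filters: cross, vertical, horizontal.
# In a real network these values would be learned during training.
filters = np.array([
    [[1, 0, 1], [0, 1, 0], [1, 0, 1]],
    [[0, 1, 0], [0, 1, 0], [0, 1, 0]],
    [[0, 0, 0], [1, 1, 1], [0, 0, 0]],
])

# All 3x3 windows of the image at stride one: shape (3, 3, 3, 3).
windows = sliding_window_view(image, (3, 3))

# Sum of products of every window with every filter.
# Result axes: (filter index, output row, output column).
bundle = np.einsum("ijkl,fkl->fij", windows, filters)

print(bundle.shape)  # (3, 3, 3): three feature maps, each 3x3
print(bundle[0])     # feature map produced by the cross filter
```

Each slice `bundle[f]` is one feature map, so the first axis of the array plays the role of the bundle.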

61
00:05:57,910 --> 00:06:01,870
‫An important thing to notice here is what happens in the next layer.

62
00:06:03,390 --> 00:06:04,650
‫So this cell.

63
00:06:05,640 --> 00:06:09,000
‫In the first feature map of Convolutional layer 2.

64
00:06:10,260 --> 00:06:11,250
‫What does it see?

65
00:06:12,310 --> 00:06:17,280
‫Is it only this rectangle on the first feature map of the previous layer,

66
00:06:18,250 --> 00:06:22,500
‫or this rectangle on all feature maps in the previous layer?

67
00:06:24,410 --> 00:06:32,210
‫The answer is that each cell in Convolutional layer two will be getting information from all the feature

68
00:06:32,210 --> 00:06:34,070
‫maps in the previous layer.

69
00:06:35,610 --> 00:06:43,050
‫Because only then can these cells combine the different features to find more high level features.
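One way to picture a layer-two cell looking at every feature map is a filter that has one 3x3 slice per incoming map, with all the products added into a single number. The sketch below assumes that usual convention; the feature map values, the filter values, and the helper name `layer2_cell` are hypothetical, and bias and activation are left out.

```python
import numpy as np

# Hypothetical bundle of 3 feature maps from layer 1, each 5x5.
# Map k is filled with the constant k + 1 so the arithmetic is easy to follow.
feature_maps = np.stack([np.full((5, 5), k + 1) for k in range(3)])

# One layer-2 filter: it has a 3x3 slice for every incoming feature map.
kernel = np.ones((3, 3, 3), dtype=int)

def layer2_cell(maps, kernel, row, col):
    """Sum of products over the same rectangle on ALL feature maps."""
    size = kernel.shape[1]
    patch = maps[:, row:row + size, col:col + size]
    return int(np.sum(patch * kernel))

value = layer2_cell(feature_maps, kernel, 0, 0)
print(value)  # 9 * (1 + 2 + 3) = 54
```

Because the sum runs across all maps, a single layer-two cell can respond to combinations of features, for example a vertical edge and a horizontal edge appearing in the same rectangle.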

70
00:06:46,050 --> 00:06:47,870
‫I'll summarize again for clarity.

71
00:06:49,280 --> 00:06:55,130
‫We apply a filter on the previous layer of data to extract features.

72
00:06:58,100 --> 00:07:01,730
‫The output after applying a filter is called a feature map.

73
00:07:03,490 --> 00:07:08,710
‫We apply many different types of filters to extract many different types of features.

74
00:07:09,920 --> 00:07:12,710
‫This gives us a bundle of feature maps.

75
00:07:14,660 --> 00:07:18,480
‫The first bundle of feature maps is called Convolutional layer One.

76
00:07:21,860 --> 00:07:26,210
‫Convolutional layer 2 works on these extracted features

77
00:07:27,530 --> 00:07:30,110
‫to extract even higher-level features.

78
00:07:33,670 --> 00:07:36,670
‫Next, we are going to discuss the input layer.

79
00:07:38,250 --> 00:07:41,690
‫The input layer also has multiple layers of information.

80
00:07:42,790 --> 00:07:44,390
‫These layers are called channels.

81
00:07:44,950 --> 00:07:47,290
‫We will talk about channels in the next video.