﻿1
00:00:01,670 --> 00:00:08,930
‫Now, one of the problems that arises with a large stride is that we may not be able to cover the

2
00:00:08,930 --> 00:00:09,950
‫entire image.

3
00:00:11,270 --> 00:00:19,190
‫For example, let us try to cover this image with a five by five window moving with a stride

4
00:00:19,190 --> 00:00:19,700
‫of three.

5
00:00:22,280 --> 00:00:27,020
‫So the first neuron covers these first twenty five pixels.

6
00:00:28,070 --> 00:00:33,890
‫Now, in the next stride, we will cover these next twenty five pixels.

7
00:00:34,400 --> 00:00:38,300
‫So we have a stride of three. Starting from the fourth pixel,

8
00:00:38,420 --> 00:00:41,630
‫we count another twenty five pixels, and so on.

9
00:00:42,160 --> 00:00:46,820
‫At this point, our neuron is seeing these 25 pixels.

10
00:00:48,730 --> 00:00:52,840
‫Now, in the next stride, we do not have 25 pixels.

11
00:00:54,190 --> 00:00:58,090
‫We will run out of pixel columns that can be covered.

12
00:00:59,710 --> 00:01:05,890
‫So if I take the next stride, there are only 20 pixels available and not 25.

13
00:01:07,250 --> 00:01:10,850
‫This is a problem because we want uniformity.

14
00:01:11,450 --> 00:01:17,270
‫That is, each neuron should have the same receptive field of 25 pixels.

15
00:01:19,720 --> 00:01:21,820
‫In this situation, we have two options.

16
00:01:23,320 --> 00:01:27,400
‫The first option is to ignore these extra pixels at the border.

17
00:01:29,800 --> 00:01:38,230
‫Since we cannot cover the last two pixel columns, we leave out one pixel column on the left and one

18
00:01:38,230 --> 00:01:39,700
‫pixel column on the right.

19
00:01:41,530 --> 00:01:51,010
‫So instead of a 16 by 16 image, we will consider only a 14 by 14 image, with one line of pixels removed from

20
00:01:51,130 --> 00:01:51,880
‫all the sides.

21
00:01:53,570 --> 00:01:58,000
‫This 14 by 14 image can be covered with a five by five window

22
00:01:59,150 --> 00:02:00,180
‫and a stride of three.

23
00:02:02,310 --> 00:02:04,020
‫This particular option

24
00:02:05,200 --> 00:02:06,940
‫is called valid padding,

25
00:02:09,190 --> 00:02:12,130
‫which actually means that we are not using any padding

26
00:02:13,240 --> 00:02:20,150
‫and that we only use valid window locations and ignore the extra pixels at the border.

27
00:02:23,030 --> 00:02:29,780
‫The second option is to add extra rows and columns of dummy pixels or blank pixels.

28
00:02:31,770 --> 00:02:36,390
‫For example, here we add one more layer of pixels.

29
00:02:38,010 --> 00:02:40,760
‫We can then move forward with a stride of three.

30
00:02:41,400 --> 00:02:49,170
‫And the last neuron will also have a receptive field of twenty five pixels: twenty from the original image

31
00:02:50,100 --> 00:02:53,460
‫and five of our artificially generated blank pixels.

32
00:02:56,300 --> 00:03:05,990
‫This type of padding is called same padding, which means we pad the input in such a way that it

33
00:03:06,380 --> 00:03:10,540
‫can be fully covered by windows of the same width and height.

34
00:03:12,980 --> 00:03:20,810
‫By default, the padding argument in our software is set to valid. If you think the border of your

35
00:03:20,810 --> 00:03:28,940
‫image stores important information, only in those scenarios will we change this parameter to same.

36
00:03:29,090 --> 00:03:30,040
‫Otherwise, valid works well.

37
00:03:33,700 --> 00:03:37,170
‫Here are the definitions of the two arguments that we have covered.

38
00:03:39,330 --> 00:03:47,430
‫First is stride. Stride denotes how many pixels the window moves in each step of convolution.

39
00:03:49,130 --> 00:03:55,820
‫So if, in the first step, our first neuron is looking at this red rectangle

40
00:03:57,020 --> 00:04:01,280
‫and the second neuron is looking at this blue rectangle,

41
00:04:01,880 --> 00:04:04,850
‫then we have a stride of two.

42
00:04:06,900 --> 00:04:11,480
‫We can specify both horizontal and vertical strides separately.

43
00:04:12,940 --> 00:04:15,970
‫By default, the stride value is set to one.

44
00:04:18,690 --> 00:04:21,150
‫The second concept that we discussed is padding.

45
00:04:22,600 --> 00:04:30,940
‫Padding is the process of adding zeros to the input image so that the output keeps the same dimensions as the

46
00:04:30,970 --> 00:04:31,390
‫input.

47
00:04:32,900 --> 00:04:41,810
‫If we decide to ignore the border values, which could not be covered due to our large stride, then in that

48
00:04:41,810 --> 00:04:47,390
‫case we are not using any padding, for which the argument value is valid.

49
00:04:49,100 --> 00:04:57,200
‫If we are adding additional black pixels so that our window can also cover the border pixels, then in that

50
00:04:57,200 --> 00:04:59,480
‫scenario we are using same padding.

51
00:05:01,310 --> 00:05:06,740
‫So these are the two arguments that we will need to specify when we define our convolutional layer.
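
The arithmetic behind the 16 by 16 example can be checked with a short sketch. This is only an illustration of the lecture's numbers along one axis (width 16, window 5, stride 3); the function names are hypothetical and not taken from any library, and real libraries may use a different rule for how much padding "same" adds.

```python
import math


def valid_windows(size, window, stride):
    """Number of full windows along one axis when no padding is added."""
    if size < window:
        return 0
    return (size - window) // stride + 1


def covered_without_padding(size, window, stride):
    """How many pixels along the axis are actually reached by full windows."""
    n = valid_windows(size, window, stride)
    return 0 if n == 0 else (n - 1) * stride + window


def padding_to_cover(size, window, stride):
    """Smallest number of extra blank pixels so that every original pixel
    falls inside a full window (the lecture's second option)."""
    n = math.ceil(max(size - window, 0) / stride) + 1
    return (n - 1) * stride + window - size


size, window, stride = 16, 5, 3
print(valid_windows(size, window, stride))                   # 4 windows per row
print(size - covered_without_padding(size, window, stride))  # 2 columns left over
print(padding_to_cover(size, window, stride))                # 1 extra column needed
```

With no padding, four windows fit and two pixel columns are left over, which is why only a 14 by 14 region is used. Adding one blank column lets a fifth window fit, covering four original columns and one blank column, that is, twenty original pixels and five blank ones.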

