1
00:00:11,640 --> 00:00:16,920
In this lecture, we are going to continue our discussion of convolution by looking at a totally different

2
00:00:16,920 --> 00:00:18,720
perspective on how it works.

3
00:00:19,080 --> 00:00:24,240
This is very helpful for understanding convolution, but it doesn't teach you anything new mechanically.

4
00:00:24,540 --> 00:00:29,940
So this lecture is optional if you want to move on and just learn about how to make CNNs.

5
00:00:34,850 --> 00:00:37,910
Something I mention very often is vectorization.

6
00:00:38,330 --> 00:00:44,660
We like to vectorize operations in code because using NumPy functions is a lot more efficient than writing

7
00:00:44,660 --> 00:00:46,370
your own Python for loops.

8
00:00:46,760 --> 00:00:52,100
The pattern you're looking for when you want to vectorize an operation is usually of the form of

9
00:00:52,100 --> 00:00:53,180
a dot product.

10
00:00:53,630 --> 00:00:59,090
A dot product is an element-wise multiplication and then the summation of those results.

11
00:00:59,480 --> 00:01:04,370
So whenever you see the sum over i of a_i times b_i, that's a dot product.
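
As a minimal sketch of the point above, assuming NumPy is available and using made-up example vectors `a` and `b`, the explicit Python loop and the vectorized dot product compute the same number:

```python
import numpy as np

# Illustrative vectors (hypothetical values, not from the lecture).
a = np.array([1.0, 2.0, 3.0])
b = np.array([4.0, 5.0, 6.0])

# Explicit Python loop: multiply element by element, then add up.
loop_result = 0.0
for i in range(len(a)):
    loop_result += a[i] * b[i]

# Vectorized equivalent: a single NumPy call replaces the loop.
vec_result = a.dot(b)

print(loop_result, vec_result)
```

Both values are the sum over i of `a[i] * b[i]`; the vectorized form simply hands the loop to NumPy.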

12
00:01:05,710 --> 00:01:11,590
This also applies to matrix multiplication, since if A and B are matrices, then the matrix multiplication

13
00:01:11,590 --> 00:01:13,900
of A and B is just A dot B.

14
00:01:15,650 --> 00:01:19,190
You'll notice that for convolution we yet again have something similar.

15
00:01:19,370 --> 00:01:22,280
The only difference is that there are two summations.

16
00:01:22,670 --> 00:01:27,080
However, this is not really a relevant detail since the outcome is still the same.

17
00:01:27,140 --> 00:01:30,770
Instead of summing over one axis, we're summing over two axes,

18
00:01:30,770 --> 00:01:32,090
but it's still an

19
00:01:32,090 --> 00:01:33,200
element-wise multiply and add.
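
A small sketch of why the second summation does not change anything, assuming NumPy and a made-up 2x2 image patch and filter (`patch` and `w` are illustrative names): summing over two axes is the same as flattening both arrays and taking an ordinary dot product.

```python
import numpy as np

# Hypothetical image patch and filter of the same shape.
patch = np.array([[1.0, 2.0],
                  [3.0, 4.0]])
w = np.array([[0.5, -1.0],
              [2.0, 0.0]])

# Two nested summations, as in the convolution formula at one output position.
double_sum = 0.0
for i in range(patch.shape[0]):
    for j in range(patch.shape[1]):
        double_sum += patch[i, j] * w[i, j]

# Element-wise multiply, then add everything up.
vectorized = np.sum(patch * w)

# The same thing written as a 1-D dot product of the flattened arrays.
flattened = patch.ravel().dot(w.ravel())

print(double_sum, vectorized, flattened)
```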

20
00:01:38,080 --> 00:01:41,290
The question is: why is the dot product important?

21
00:01:41,890 --> 00:01:47,680
One definition of the dot product, other than the element-wise multiply and add, is that it's the magnitude of

22
00:01:47,680 --> 00:01:54,250
A multiplied by the magnitude of B multiplied by the cosine of the angle between A and B.

23
00:01:54,970 --> 00:02:00,970
We sometimes call the cosine part the cosine similarity or cosine distance, depending on the sign that you use.

24
00:02:05,830 --> 00:02:06,880
So how does this work

25
00:02:06,880 --> 00:02:07,870
geometrically?

26
00:02:08,350 --> 00:02:10,720
Consider just the angle for a moment.

27
00:02:11,140 --> 00:02:16,020
If the angle between the two vectors is zero, then the cosine of that angle is one.

28
00:02:16,030 --> 00:02:18,160
That's the maximum value of the cosine.

29
00:02:23,040 --> 00:02:26,610
Now imagine that the angle between two vectors is 90 degrees.

30
00:02:27,180 --> 00:02:29,550
Then the cosine of that angle is zero.

31
00:02:34,450 --> 00:02:38,980
Finally, imagine that the angle between two vectors is 180 degrees.

32
00:02:39,010 --> 00:02:41,680
This is basically as far apart as possible.

33
00:02:42,070 --> 00:02:46,960
Then the cosine of that angle is minus one, which is the minimum value of the cosine.

34
00:02:51,730 --> 00:02:55,120
So if you're just using raw cosine, then it's a similarity.

35
00:02:55,300 --> 00:02:58,270
The larger the number, the closer the two vectors are.

36
00:02:58,510 --> 00:03:01,060
The smaller the number, the further away they are.

37
00:03:01,450 --> 00:03:05,080
The maximum value when A and B are parallel is one.

38
00:03:05,380 --> 00:03:09,460
The minimum value when A and B are antiparallel is minus one.

39
00:03:09,670 --> 00:03:14,290
And if A and B are orthogonal, then the cosine similarity is just zero.

40
00:03:19,230 --> 00:03:19,590
Okay.

41
00:03:19,590 --> 00:03:20,850
So why is that important?

42
00:03:21,480 --> 00:03:28,620
Consider now how you would find the cosine of the angle between two vectors: that's just A dot B divided

43
00:03:28,620 --> 00:03:31,140
by the magnitude of A times the magnitude of B.
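
The rearranged equation can be sketched in a few lines, assuming NumPy; `cosine_similarity` is a hypothetical helper written for this illustration, and the vectors are made up to hit the three angles discussed earlier (0, 90, and 180 degrees).

```python
import numpy as np

def cosine_similarity(a, b):
    """Dot product divided by the product of the two magnitudes."""
    return a.dot(b) / (np.linalg.norm(a) * np.linalg.norm(b))

a = np.array([1.0, 0.0])

# Same direction as a (angle 0): cosine is at its maximum.
parallel = cosine_similarity(a, np.array([2.0, 0.0]))
# At a right angle to a (90 degrees): cosine is zero.
orthogonal = cosine_similarity(a, np.array([0.0, 3.0]))
# Opposite direction (180 degrees): cosine is at its minimum.
antiparallel = cosine_similarity(a, np.array([-4.0, 0.0]))

print(parallel, orthogonal, antiparallel)
```

Note that the lengths of the vectors differ in each case, but dividing by the magnitudes removes that effect and leaves only the angle.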

44
00:03:31,260 --> 00:03:33,900
So I just rearrange the equation that we had before.

45
00:03:38,820 --> 00:03:43,140
Now let's compare this to another popular measurement, the Pearson correlation.

46
00:03:43,560 --> 00:03:46,530
The Pearson correlation is defined as what you see here.

47
00:03:47,070 --> 00:03:50,190
But notice how similar this is to cosine similarity.

48
00:03:50,490 --> 00:03:54,810
The only difference is that the Pearson correlation uses mean subtraction.
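
To make the comparison concrete, here is a hedged sketch assuming NumPy: subtracting the mean from each vector and then taking the cosine similarity gives the same number as NumPy's `np.corrcoef`. The helper `cosine_similarity` and the data `x`, `y` are made up for illustration.

```python
import numpy as np

def cosine_similarity(a, b):
    """Dot product divided by the product of the two magnitudes."""
    return a.dot(b) / (np.linalg.norm(a) * np.linalg.norm(b))

# Hypothetical sample data.
x = np.array([1.0, 2.0, 3.0, 4.0])
y = np.array([2.0, 1.0, 4.0, 3.0])

# Mean subtraction is the only extra step compared with cosine similarity.
pearson_via_cosine = cosine_similarity(x - x.mean(), y - y.mean())

# Reference value from NumPy's correlation-coefficient matrix.
pearson_numpy = np.corrcoef(x, y)[0, 1]

print(pearson_via_cosine, pearson_numpy)
```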

49
00:03:55,720 --> 00:03:59,390
And so now you have two hints that convolution is really correlation.

50
00:03:59,410 --> 00:04:04,630
The first one from before was that we were actually doing what is called the cross-correlation.

51
00:04:04,990 --> 00:04:11,200
And second, now you have that the dot product is actually very similar to the Pearson correlation.

52
00:04:16,030 --> 00:04:21,100
So I hope that by this point you are convinced that the dot product, while it seems like an abstract

53
00:04:21,100 --> 00:04:24,220
concept, can be thought of as a correlation measure.

54
00:04:24,610 --> 00:04:28,330
It tells me how correlated is the first thing with the second thing.

55
00:04:28,540 --> 00:04:33,550
If they are highly positively correlated, then the dot product should be large and positive.

56
00:04:33,580 --> 00:04:37,000
That means two vectors pointing in nearly the same direction.

57
00:04:38,440 --> 00:04:43,270
If they are highly negatively correlated, then the dot product should be large and negative.

58
00:04:43,300 --> 00:04:46,690
That means the two vectors are pointing in nearly opposite directions.

59
00:04:47,670 --> 00:04:53,190
Finally, if the two vectors are orthogonal, or at right angles, then the dot product should be zero.

60
00:04:58,030 --> 00:05:03,400
The reason why this is important is you don't have to think of a filter as an abstract concept.

61
00:05:03,430 --> 00:05:05,650
In fact, it's just a pattern finder.

62
00:05:06,010 --> 00:05:09,220
This actually makes the term filter make a lot more sense.

63
00:05:09,250 --> 00:05:15,250
It filters out everything not related to the pattern contained in the filter by setting them to zero

64
00:05:15,250 --> 00:05:17,950
and keeps everything that is related to the pattern.

65
00:05:18,580 --> 00:05:24,970
So what convolution is doing is it's passing this filter along each point on the original input image

66
00:05:24,970 --> 00:05:26,320
and sliding it along.

67
00:05:27,070 --> 00:05:32,860
At each point it asks: is the pattern here? Is the pattern here? Is the pattern here? And so forth.

68
00:05:33,190 --> 00:05:39,070
Then it gives us a high number in the positions where the pattern is found and then a small number where

69
00:05:39,070 --> 00:05:40,450
the pattern is not found.

70
00:05:40,690 --> 00:05:45,070
And thus, this is your first alternative perspective on convolution.

71
00:05:45,400 --> 00:05:51,880
It's just a sliding pattern finder that passes through an entire image looking for a particular pattern.
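
The sliding pattern finder can be sketched as follows, assuming NumPy. This is only an illustration with explicit loops, not an efficient implementation; `find_pattern`, the cross-shaped filter, and the 6x6 image are all made up for the example.

```python
import numpy as np

def find_pattern(image, filt):
    """Slide filt over image; at each position, take the dot product
    (element-wise multiply and add) of the filter with the patch under it."""
    kh, kw = filt.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * filt)
    return out

# Hypothetical filter: a small cross shape.
filt = np.array([[0.0, 1.0, 0.0],
                 [1.0, 1.0, 1.0],
                 [0.0, 1.0, 0.0]])

# Blank image with one copy of the pattern, top-left corner at row 2, column 3.
image = np.zeros((6, 6))
image[2:5, 3:6] = filt

response = find_pattern(image, filt)

# The largest response marks where the pattern was found.
row, col = np.unravel_index(np.argmax(response), response.shape)
best = (int(row), int(col))
print(best, response[best])
```

The output is largest exactly where the filter lines up with the pattern, and smaller wherever the overlap is only partial or absent.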
