1
00:00:00,000 --> 00:00:01,770
In the previous lessons,

2
00:00:01,770 --> 00:00:03,930
you saw the impacts
that convolutions and

3
00:00:03,930 --> 00:00:06,735
pooling had on your network's
efficiency and learning,

4
00:00:06,735 --> 00:00:09,015
but a lot of that was
theoretical in nature.

5
00:00:09,015 --> 00:00:11,160
So I thought it'd be
interesting to hack some code

6
00:00:11,160 --> 00:00:13,995
together to show how
a convolution actually works.

7
00:00:13,995 --> 00:00:16,360
We'll also create a little
pooling algorithm,

8
00:00:16,360 --> 00:00:18,465
so you can visualize its impact.

9
00:00:18,465 --> 00:00:20,610
There's a notebook that
you can play with too,

10
00:00:20,610 --> 00:00:22,185
and I'll step through that here.

11
00:00:22,185 --> 00:00:25,000
Here's the notebook for
playing with convolutions.

12
00:00:25,000 --> 00:00:28,645
So first, we'll set up
our inputs and in particular,

13
00:00:28,645 --> 00:00:30,925
import the misc
library from SciPy.

14
00:00:30,925 --> 00:00:33,775
Now, this is a nice shortcut
for us because

15
00:00:33,775 --> 00:00:37,115
misc.ascent returns a nice image
that we can play with,

16
00:00:37,115 --> 00:00:39,715
and we don't have to worry
about managing our own.

17
00:00:39,715 --> 00:00:42,385
Matplotlib contains
the code for drawing

18
00:00:42,385 --> 00:00:43,675
an image and it will render it

19
00:00:43,675 --> 00:00:45,565
right in the browser with Colab.

20
00:00:45,565 --> 00:00:49,210
Here, we can see
the ascent image from SciPy.

21
00:00:49,210 --> 00:00:52,150
Next up, we'll take
a copy of the image,

22
00:00:52,150 --> 00:00:54,940
and we'll transform it with
our homemade convolutions,

23
00:00:54,940 --> 00:00:56,845
and we'll create
variables to keep track

24
00:00:56,845 --> 00:00:59,095
of the x and y dimensions
of the image.

25
00:00:59,095 --> 00:01:02,715
So we can see here that
it's a 512 by 512 image.
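
A minimal sketch of this setup step, for reference. The synthetic gradient image and the names image, image_transformed, size_x and size_y are assumptions for illustration; the notebook itself works on the ascent image returned by misc.ascent.

```python
import numpy as np

# Stand-in for the 512 by 512 grayscale ascent image (assumption: a
# synthetic gradient is used here so the sketch runs without SciPy).
image = np.tile(np.arange(512, dtype=np.int64) % 256, (512, 1))

# Work on a copy so the original pixels stay untouched.
image_transformed = np.copy(image)

# Keep track of the x and y dimensions for the loops that follow.
size_x = image_transformed.shape[0]
size_y = image_transformed.shape[1]
print(size_x, size_y)
```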

26
00:01:02,715 --> 00:01:04,854
So now, let's create
a convolution filter

27
00:01:04,854 --> 00:01:06,370
as a three-by-three array.

28
00:01:06,370 --> 00:01:08,395
We'll load it with values
that are pretty good

29
00:01:08,395 --> 00:01:10,795
for detecting sharp edges first.

30
00:01:10,795 --> 00:01:13,570
Here's where we'll
create the convolution.

31
00:01:13,570 --> 00:01:15,355
We iterate over the image,

32
00:01:15,355 --> 00:01:16,930
leaving a one-pixel margin.

33
00:01:16,930 --> 00:01:19,390
You'll see that the loop
starts at one and not zero,

34
00:01:19,390 --> 00:01:23,035
and it ends at size x minus
one and size y minus one.

35
00:01:23,035 --> 00:01:25,315
In the loop, it
will then calculate

36
00:01:25,315 --> 00:01:26,995
the convolution value by

37
00:01:26,995 --> 00:01:29,265
looking at the pixel
and its neighbors,

38
00:01:29,265 --> 00:01:31,165
and then by multiplying
them out by

39
00:01:31,165 --> 00:01:33,205
the values determined
by the filter,

40
00:01:33,205 --> 00:01:35,080
before finally summing it all up.
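
The loop described above can be sketched roughly as follows. This is not necessarily the notebook's exact code: the function name convolve_3x3, the edge_filter values, and the clamping of results to the 0 to 255 range are all assumptions made for illustration.

```python
import numpy as np

# Assumed filter values that respond to sharp edges (hypothetical choice).
edge_filter = [[0, 1, 0],
               [1, -4, 1],
               [0, 1, 0]]

def convolve_3x3(image, kernel):
    """Apply a 3x3 filter to every pixel except a one-pixel margin."""
    size_x, size_y = image.shape
    output = np.zeros_like(image, dtype=np.float64)
    # Start at one and stop at size minus one, so every pixel visited
    # has a full set of eight neighbors.
    for x in range(1, size_x - 1):
        for y in range(1, size_y - 1):
            total = 0.0
            # Multiply the pixel and its neighbors by the filter values,
            # then sum it all up.
            for i in range(3):
                for j in range(3):
                    total += image[x + i - 1, y + j - 1] * kernel[i][j]
            # Assumption: clamp to the valid grayscale range.
            output[x, y] = min(max(total, 0.0), 255.0)
    return output

# Tiny demo image: dark on the left, bright on the right.
demo = np.zeros((6, 6))
demo[:, 3:] = 100
edges = convolve_3x3(demo, edge_filter)
```

On the demo image, only the pixels next to the dark-to-bright boundary get a non-zero value, which is the "only certain features made it through" effect on a small scale.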

41
00:01:35,080 --> 00:01:39,205
Let's run it. It takes
just a few seconds,

42
00:01:39,205 --> 00:01:42,150
so when it's done,
let's draw the results.

43
00:01:42,150 --> 00:01:44,935
We can see that
only certain features

44
00:01:44,935 --> 00:01:46,615
made it through the filter.

45
00:01:46,615 --> 00:01:49,720
I've provided a couple more
filters, so let's try them.

46
00:01:49,720 --> 00:01:51,445
This first one is really great

47
00:01:51,445 --> 00:01:53,215
at spotting vertical lines.

48
00:01:53,215 --> 00:01:56,550
So when I run it, and
plot the results,

49
00:01:56,550 --> 00:01:58,375
we can see that
the vertical lines

50
00:01:58,375 --> 00:02:00,085
in the image made it through.

51
00:02:00,085 --> 00:02:01,765
It's really cool because

52
00:02:01,765 --> 00:02:02,975
they're not just
straight up and down,

53
00:02:02,975 --> 00:02:04,750
they also include lines that are vertical

54
00:02:04,750 --> 00:02:07,245
within the perspective
of the image itself.

55
00:02:07,245 --> 00:02:11,395
Similarly, this filter works
well for horizontal lines.

56
00:02:11,395 --> 00:02:12,850
So when I run it,

57
00:02:12,850 --> 00:02:14,605
and then plot the results,

58
00:02:14,605 --> 00:02:15,835
we can see that a lot of

59
00:02:15,835 --> 00:02:18,325
the horizontal lines
made it through.
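
As a rough illustration of why one filter picks out vertical lines and another horizontal ones, here is a sketch using Sobel-style values. The kernel values and the helper name apply_filter are assumptions; the filters provided in the notebook may differ.

```python
import numpy as np

# Sobel-style kernels (assumed values, for illustration only).
vertical_filter = np.array([[-1, 0, 1],
                            [-2, 0, 2],
                            [-1, 0, 1]])
horizontal_filter = np.array([[-1, -2, -1],
                              [0, 0, 0],
                              [1, 2, 1]])

def apply_filter(image, kernel):
    """Weighted sum over each interior pixel's 3x3 neighborhood."""
    h, w = image.shape
    out = np.zeros((h, w), dtype=np.float64)
    # Accumulate one shifted copy of the image per kernel entry; the
    # one-pixel border is left at zero.
    for i in range(3):
        for j in range(3):
            out[1:-1, 1:-1] += kernel[i, j] * image[i:h - 2 + i, j:w - 2 + j]
    # Assumption: clamp to the valid grayscale range.
    return np.clip(out, 0, 255)

# Demo image with a single vertical edge down the middle.
demo = np.zeros((6, 6))
demo[:, 3:] = 100
vertical_response = apply_filter(demo, vertical_filter)
horizontal_response = apply_filter(demo, horizontal_filter)
```

The vertical filter compares the left and right columns of each neighborhood, so it responds strongly at the edge, while the horizontal filter compares the top and bottom rows and stays at zero on this image.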

60
00:02:18,325 --> 00:02:20,440
Now, let's take
a look at pooling,

61
00:02:20,440 --> 00:02:22,005
and in this case, max pooling,

62
00:02:22,005 --> 00:02:23,675
which takes pixels in chunks of

63
00:02:23,675 --> 00:02:26,670
four and only passes
through the biggest value.

64
00:02:26,670 --> 00:02:29,995
I run the code and then
render the output.

65
00:02:29,995 --> 00:02:33,420
We can see that the features
of the image are maintained,

66
00:02:33,420 --> 00:02:35,450
but look closely at the axes,

67
00:02:35,450 --> 00:02:37,105
and we can see that
the size has been

68
00:02:37,105 --> 00:02:40,765
halved from the
500s to the 250s.

69
00:02:40,765 --> 00:02:43,390
For fun, we can try
the other filter,

70
00:02:43,390 --> 00:02:45,715
run it, and then compare

71
00:02:45,715 --> 00:02:48,780
the convolution with
its pooled version.

72
00:02:48,780 --> 00:02:50,695
Again, we can see that

73
00:02:50,695 --> 00:02:53,095
the features have not
just been maintained,

74
00:02:53,095 --> 00:02:55,510
they may have also
been emphasized a bit.

75
00:02:55,510 --> 00:02:57,925
So that's how convolutions work.

76
00:02:57,925 --> 00:03:01,495
Under the hood, TensorFlow is
trying different filters on

77
00:03:01,495 --> 00:03:03,625
your image and learning which

78
00:03:03,625 --> 00:03:06,465
ones work when looking
at the training data.

79
00:03:06,465 --> 00:03:08,880
As a result, when it works,

80
00:03:08,880 --> 00:03:10,285
you'll have greatly reduced

81
00:03:10,285 --> 00:03:12,640
information passing
through the network,

82
00:03:12,640 --> 00:03:16,135
but because it isolates
and identifies features,

83
00:03:16,135 --> 00:03:19,000
you can also get
increased accuracy.

84
00:03:20,000 --> 00:03:21,500
Have a play with the filters
in this notebook and

85
00:03:21,500 --> 00:03:22,500
see if you can come up with

86
00:03:22,500 --> 00:03:25,000
some interesting effects
of your own.