1
00:00:00,120 --> 00:00:07,860
Hi and welcome to Section 36, where we take a look at colorizing black and white photos using a pre-trained

2
00:00:07,860 --> 00:00:09,600
Caffe model in OpenCV.

3
00:00:10,140 --> 00:00:16,020
This is a very cool project, and I have actually experimented with this on some old family black and

4
00:00:16,020 --> 00:00:18,360
white photos, and the results were quite good.

5
00:00:18,720 --> 00:00:20,910
So this is what we can expect.

6
00:00:20,910 --> 00:00:23,640
This picture illustrates what this model does.

7
00:00:24,090 --> 00:00:29,700
You're taking a black and white image and accurately converting it to color, and you can see the results

8
00:00:29,700 --> 00:00:29,870
here.

9
00:00:29,870 --> 00:00:31,350
It looks really, really good.

10
00:00:31,950 --> 00:00:34,050
Now, this work was done by the

11
00:00:34,080 --> 00:00:34,680
authors here.

12
00:00:34,680 --> 00:00:36,180
Let's open the research paper.

13
00:00:36,540 --> 00:00:41,970
This was in 2016: Richard Zhang, Phillip Isola and Alexei Efros.

14
00:00:42,390 --> 00:00:44,340
These guys from UC Berkeley

15
00:00:44,340 --> 00:00:51,090
did a really, really good job at creating a deep learning colorization algorithm.

16
00:00:51,660 --> 00:00:53,670
So let's take a look at how it works.

17
00:00:53,880 --> 00:00:59,940
So what the authors did, they embraced the underlying uncertainty of the problem, which is black and

18
00:00:59,940 --> 00:01:03,810
white to color conversion, by posing the problem as a classification task.

19
00:01:04,290 --> 00:01:09,930
So if it can first understand that this is a butterfly, that reduces the number of colors that it needs

20
00:01:09,930 --> 00:01:10,440
to search.

21
00:01:10,710 --> 00:01:14,550
Basically, that's available for the algorithm to converge on.

22
00:01:14,910 --> 00:01:20,610
So that's already reduced the diversity of colors for what this picture could be.

23
00:01:21,390 --> 00:01:27,060
And the system is implemented as a feedforward CNN, which is a standard CNN that's trained on over a million

24
00:01:27,060 --> 00:01:27,900
example images.

25
00:01:28,290 --> 00:01:35,220
That's quite a lot of color photos. They evaluated the algorithm using the colorization

26
00:01:35,220 --> 00:01:44,040
Turing test, which basically asks humans to take a look at the AI-colorized image, as well

27
00:01:44,040 --> 00:01:49,470
as the ground truth color image, and lets them pick which one they believe is the actual color

28
00:01:49,470 --> 00:01:51,870
image as opposed to the AI-generated one.

29
00:01:52,260 --> 00:01:57,270
And their method successfully fooled humans in 32 percent of trials, which is significantly higher than previous

30
00:01:57,270 --> 00:01:57,660
methods.

31
00:01:58,170 --> 00:01:59,940
So that was quite an achievement.

32
00:02:00,960 --> 00:02:07,010
So now let's dive into the code and take a look and see how we implement this algorithm in OpenCV.

33
00:02:07,200 --> 00:02:08,340
So what do we do?

34
00:02:08,790 --> 00:02:14,820
Let's just download the images and files that we need to get for the project.

35
00:02:15,570 --> 00:02:21,290
So let's run this block of code here, and it just downloads our colorization model since I've already

36
00:02:21,330 --> 00:02:27,870
done it, I need to type in 'A' to just overwrite all of these images.

37
00:02:29,550 --> 00:02:30,250
So there we go.

38
00:02:30,270 --> 00:02:32,460
So that should be done quickly and simply enough.

39
00:02:32,970 --> 00:02:39,750
So what did that file contain? That file contains our colorization images, which are the images we're

40
00:02:39,750 --> 00:02:46,860
going to use here: some famous images of some popular people from the past.

41
00:02:47,370 --> 00:02:52,890
So we have Audrey Hepburn, Bruce Lee, Elvis, Jimi Hendrix, Marilyn Monroe and Queen Elizabeth.

42
00:02:53,400 --> 00:02:55,050
And you can see these files here.

43
00:02:55,410 --> 00:02:58,920
This is just a colorization notebook that we're not using here, since the

44
00:02:58,920 --> 00:02:59,700
code is in here.

45
00:03:00,360 --> 00:03:02,010
This is a prototxt file.

46
00:03:02,010 --> 00:03:03,300
This belongs to the model.

47
00:03:03,630 --> 00:03:09,270
This is the Caffe model file, and this is the pts_in_hull file, which I'm not sure what

48
00:03:09,270 --> 00:03:14,160
it's actually used for, exactly, but it's probably the kernel that is applied to the image.

49
00:03:14,670 --> 00:03:18,080
So this points file is probably like a pre-processing type kernel.

50
00:03:19,230 --> 00:03:26,340
So we just point to the folder where you've got the old black and white images, point to the kernel here, and then,

51
00:03:26,550 --> 00:03:32,820
well, in the main program here, we use cv2.dnn.readNetFromCaffe to load the Caffe model.

52
00:03:33,240 --> 00:03:35,070
So that's the model that we pointed to here.

53
00:03:35,250 --> 00:03:38,220
This is the prototxt file and the Caffe model.

54
00:03:38,910 --> 00:03:41,790
You don't need to understand what the prototxt file is right now.

55
00:03:42,180 --> 00:03:44,340
That's just a model structure file, essentially.

56
00:03:44,700 --> 00:03:48,690
But you'll learn these things in the deep learning section of the course.

57
00:03:49,110 --> 00:03:50,220
So don't worry about it

58
00:03:50,220 --> 00:03:50,460
now.

59
00:03:50,460 --> 00:03:50,970
Right now,

60
00:03:50,970 --> 00:03:56,370
this is more of an implementation cookbook-type project that we're going to

61
00:03:56,370 --> 00:03:56,610
do.

62
00:03:57,030 --> 00:04:03,840
So we load the cluster centers here, so we just load the kernel, and then we populate the cluster centers

63
00:04:03,840 --> 00:04:05,880
as a one by one convolutional kernel.

64
00:04:06,360 --> 00:04:13,080
So these things here, these are things that are basically pre-processing steps that may confuse you

65
00:04:13,080 --> 00:04:18,740
too. They definitely confused me, because every research paper has a lot of different things

66
00:04:19,110 --> 00:04:24,750
that the authors put together, which is quite difficult for a person who's not familiar with

67
00:04:24,750 --> 00:04:26,130
the research to understand.

68
00:04:27,180 --> 00:04:32,400
So you need to read the research paper quite in depth to understand what exactly is going on

69
00:04:32,400 --> 00:04:32,880
sometimes.

70
00:04:33,270 --> 00:04:35,970
So these are some layers that we're extracting here.

71
00:04:35,970 --> 00:04:41,910
You can see two layers right here, and then what we do for each image in our image path: we load

72
00:04:41,910 --> 00:04:47,010
that image, then we just run some pre-processing basically on the image.

73
00:04:47,430 --> 00:04:49,950
So we normalize it and convert it to float32.

74
00:04:50,550 --> 00:04:55,110
And then we convert that to the Lab format, which is a different color space.

75
00:04:55,410 --> 00:04:59,870
So we get the image as Lab, and then we just pull out the L channel.

76
00:05:00,430 --> 00:05:06,520
Get the height and width here, resize it for the network, because CNNs tend to be trained on specific

77
00:05:06,520 --> 00:05:11,720
sized images, so they can only take inputs of a specific size.

78
00:05:11,740 --> 00:05:12,790
So that's why we do that.

79
00:05:12,790 --> 00:05:18,580
Therefore, we resize the image for the network, which is done right here.

80
00:05:19,810 --> 00:05:21,220
We convert RGB to Lab.

81
00:05:21,220 --> 00:05:22,690
So that's what this is here.

82
00:05:22,720 --> 00:05:26,380
Converting this image to Lab, and then we extract the L channel.

83
00:05:26,380 --> 00:05:32,800
Here we subtract 50 for mean centering, which again is a pre-processing step that, depending on

84
00:05:33,280 --> 00:05:38,140
if it's a mean-centered dataset, like a normalized dataset, you have to do for any input image in

85
00:05:38,140 --> 00:05:38,530
the future.

86
00:05:39,700 --> 00:05:42,450
And then we just set the input.

87
00:05:42,470 --> 00:05:47,440
We just convert the image to a blob in one line here and we feed it into the function right here.

88
00:05:47,860 --> 00:05:53,500
Then we do the forward pass, which you've seen previously with the YOLO and SSD models that we loaded

89
00:05:53,500 --> 00:05:54,790
in OpenCV.

90
00:05:55,510 --> 00:05:57,820
And then we get the shape, we get the ab output.

91
00:05:58,270 --> 00:06:02,620
We process it to basically convert it back into BGR.

92
00:06:03,160 --> 00:06:08,650
OK, that's OpenCV's RGB-type image format, and we show the original.

93
00:06:09,160 --> 00:06:15,280
So let's run this code and see how well this performs on our images.

94
00:06:15,730 --> 00:06:18,790
And note these images I pulled off the internet.

95
00:06:19,120 --> 00:06:22,780
They aren't images that this model was trained on.

96
00:06:23,020 --> 00:06:24,820
So let's take a look and see how it performs.

97
00:06:25,120 --> 00:06:26,260
So this is Audrey Hepburn.

98
00:06:27,340 --> 00:06:30,190
And you can see it looks quite good, in my opinion.

99
00:06:30,610 --> 00:06:33,850
This is the black and white photo, and this is the color one. It gets the skin tone and hair

100
00:06:33,850 --> 00:06:34,930
color probably right.

101
00:06:35,470 --> 00:06:36,310
It looks quite good.

102
00:06:36,730 --> 00:06:42,670
Let's take a look at Elvis, and he looks quite good as well with the color enhancements.

103
00:06:43,180 --> 00:06:44,370
It looks very accurate.

104
00:06:44,410 --> 00:06:46,690
Let's look at the famous Bruce Lee.

105
00:06:47,560 --> 00:06:49,120
Again, this looks amazing.

106
00:06:50,170 --> 00:06:54,760
And take a look at Queen Elizabeth, and this one looks very accurate.

107
00:06:54,790 --> 00:06:57,250
However, you can still tell it's not ideal.

108
00:06:57,250 --> 00:07:00,170
It's not like a perfectly natural-looking color image.

109
00:07:01,300 --> 00:07:06,190
And let's take a look at Marilyn Monroe, and this looks really, really well done as well.

110
00:07:07,240 --> 00:07:10,170
And Jimi Hendrix: it gets the skin tone.

111
00:07:10,690 --> 00:07:12,460
It's got everything very, very right.

112
00:07:12,460 --> 00:07:18,040
So I'm quite impressed with how this performed on these images, and I hope you are impressed, too,

113
00:07:18,490 --> 00:07:21,790
so you can take a look and experiment with your own images with this.

114
00:07:22,690 --> 00:07:28,060
We're now going to move on to inpainting, which is another method that we're going

115
00:07:28,060 --> 00:07:29,650
to apply to old photos again.

116
00:07:29,920 --> 00:07:36,730
However, this one is basically restoring damaged photos, erasing things like creases or missing spots

117
00:07:36,730 --> 00:07:37,390
in the image.

118
00:07:37,840 --> 00:07:39,820
We're going to use inpainting to restore them.

119
00:07:40,150 --> 00:07:41,620
So stay tuned for that lesson.