1
00:00:00,150 --> 00:00:01,320
Hi and welcome back.

2
00:00:01,530 --> 00:00:07,860
So we're going to take a look at the GrabCut algorithm for background removal, which is a very cool algorithm,

3
00:00:07,870 --> 00:00:09,370
so let's take a look at this lesson.

4
00:00:09,390 --> 00:00:12,040
So let's load our libraries and images.

5
00:00:12,750 --> 00:00:18,360
We actually download the images that we're using in this lesson separately from my GitHub.

6
00:00:19,410 --> 00:00:20,910
So all of that should be done.

7
00:00:20,940 --> 00:00:21,480
There we go.

8
00:00:22,110 --> 00:00:24,990
So let's take a look at the GrabCut algorithm.

9
00:00:25,320 --> 00:00:28,680
So before we actually explain this, let me just

10
00:00:28,730 --> 00:00:30,120
show you the output of our code.

11
00:00:30,690 --> 00:00:32,190
So what do we do?

12
00:00:32,220 --> 00:00:37,620
We have to manually target it, so we define a box around the foreground

13
00:00:37,650 --> 00:00:43,920
that we wish to extract. What we want to do is extract only the woman and set everything that's around

14
00:00:43,920 --> 00:00:47,610
her to be black, so we remove the background from it, basically.

15
00:00:48,270 --> 00:00:51,720
So the output that we want is going to look like this.

16
00:00:52,560 --> 00:00:53,010
See this.

17
00:00:53,110 --> 00:00:54,480
It's fairly well done.

18
00:00:54,990 --> 00:01:00,240
And nowadays there are a lot of very cool deep learning methods, which you will see in the future in

19
00:01:00,240 --> 00:01:05,580
the other section of the course, the deep learning part of it, that do this quite well, but

20
00:01:05,580 --> 00:01:10,990
GrabCut was one of the more effective algorithms that did background removal back in the day of classical computer vision.

21
00:01:11,080 --> 00:01:17,610
When I say back in the day, I mean, like 2015, 2016 and a few years prior to that.

22
00:01:18,330 --> 00:01:25,290
So things have certainly changed since 2016, which is why this course was created, because there are

23
00:01:25,290 --> 00:01:30,390
a lot of different deep learning techniques being pioneered and implemented right now that are changing

24
00:01:30,390 --> 00:01:31,080
computer vision.

25
00:01:31,410 --> 00:01:37,080
However, it's still very good to know the classical computer vision techniques because you do tend

26
00:01:37,080 --> 00:01:39,480
to use them a lot in conjunction with deep learning.

27
00:01:39,900 --> 00:01:43,510
So let's take a look at how the GrabCut algorithm works.

28
00:01:43,530 --> 00:01:49,740
So as I said, the user defines a rectangle, and this rectangle will be taken as basically containing the foreground

29
00:01:49,740 --> 00:01:50,400
and some background.

30
00:01:50,490 --> 00:01:54,180
We don't know exactly which is which, but we just know that the foreground is in that rectangle.

31
00:01:55,080 --> 00:02:01,740
So the algorithm, using Gaussian mixture models (GMMs), basically tries to predict whether the pixels

32
00:02:01,740 --> 00:02:04,890
in that box belong to the foreground or the background.

33
00:02:06,270 --> 00:02:12,090
How it does it is that it creates a graph that is built from the pixel distribution, and the nodes in

34
00:02:12,090 --> 00:02:17,160
this graph are effectively the pixels, with two additional nodes

35
00:02:17,400 --> 00:02:19,770
added, where we have a source and a sink node.

36
00:02:20,130 --> 00:02:25,320
So what this means is that every foreground pixel is connected to the source node and every background

37
00:02:25,320 --> 00:02:26,940
pixel is connected to the sink node.

38
00:02:27,180 --> 00:02:29,550
So the sink is the end point.

39
00:02:30,390 --> 00:02:38,070
So the weights of these edges connecting the pixels to the source node and sink node are defined by the probability

40
00:02:38,070 --> 00:02:40,180
of a pixel being in the foreground or background.

41
00:02:40,800 --> 00:02:43,470
So that's how the algorithm basically separates it.

42
00:02:43,950 --> 00:02:49,230
And then if there's a large difference in the pixel color, the edge between them will get a low

43
00:02:49,230 --> 00:02:54,690
weight. Then the min-cut algorithm is used to segment the graph, so it

44
00:02:54,690 --> 00:02:59,850
cuts the graph into two, separating the source node and the sink node, using a minimum cost

45
00:02:59,850 --> 00:03:00,330
function.

46
00:03:00,870 --> 00:03:04,920
And the cost function is the sum of all the weights of the edges that are cut. This can get a bit confusing.
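To make the source, sink and min-cut idea concrete, here is a minimal pure-Python sketch on a four-pixel "image". It is not OpenCV's implementation: the foreground probabilities, the scale factor of 10 and the neighbour-weight formula are made-up assumptions for illustration, and a plain Edmonds-Karp max-flow stands in for whatever solver GrabCut uses internally.

```python
import math
from collections import deque

def min_cut(capacity, source, sink):
    """Edmonds-Karp max-flow. Returns (cut cost, nodes left on the source side)."""
    n = len(capacity)
    residual = [row[:] for row in capacity]
    total_flow = 0.0
    while True:
        # Breadth-first search for a path with spare capacity from source to sink.
        parent = [-1] * n
        parent[source] = source
        queue = deque([source])
        while queue and parent[sink] == -1:
            u = queue.popleft()
            for v in range(n):
                if parent[v] == -1 and residual[u][v] > 1e-9:
                    parent[v] = u
                    queue.append(v)
        if parent[sink] == -1:
            break  # no path left: the max flow equals the min-cut cost
        # Find the smallest residual capacity along the path.
        bottleneck = math.inf
        v = sink
        while v != source:
            bottleneck = min(bottleneck, residual[parent[v]][v])
            v = parent[v]
        # Push that much flow along the path.
        v = sink
        while v != source:
            u = parent[v]
            residual[u][v] -= bottleneck
            residual[v][u] += bottleneck
            v = u
        total_flow += bottleneck
    # Nodes still reachable from the source after the last search.
    source_side = {i for i in range(n) if parent[i] != -1}
    return total_flow, source_side

# A 1-D "image": two dark pixels followed by two bright pixels.
pixels = [10, 12, 200, 205]
# Hypothetical foreground probabilities (in GrabCut these come from the GMMs).
fg_prob = [0.1, 0.1, 0.9, 0.9]

n_pix = len(pixels)
source, sink = n_pix, n_pix + 1  # the two extra nodes
capacity = [[0.0] * (n_pix + 2) for _ in range(n_pix + 2)]

for i in range(n_pix):
    capacity[source][i] = 10 * fg_prob[i]      # link to the source (foreground)
    capacity[i][sink] = 10 * (1 - fg_prob[i])  # link to the sink (background)

for i in range(n_pix - 1):
    # Similar neighbours get a heavy edge, very different neighbours a light one.
    diff = pixels[i] - pixels[i + 1]
    weight = 10 * math.exp(-((diff / 50) ** 2))
    capacity[i][i + 1] = weight
    capacity[i + 1][i] = weight

cut_cost, source_side = min_cut(capacity, source, sink)
foreground = sorted(i for i in source_side if i < n_pix)
print("cut cost:", round(cut_cost, 3), "foreground pixels:", foreground)
```

The cheapest cut goes through the light edge between the dark and bright pixels, so the two bright pixels stay with the source (foreground) and the two dark pixels go with the sink (background).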

47
00:03:04,920 --> 00:03:10,270
So I don't expect you to fully understand the algorithm here, but we can take a look at

48
00:03:10,270 --> 00:03:13,470
these pictures here that will help illustrate the concept a bit.

49
00:03:13,470 --> 00:03:20,100
So you start with image A, and you can see this is the background here and this is the foreground, the image

50
00:03:20,100 --> 00:03:25,020
with seeds, and you can see they construct this graph to connect the pixels here.

51
00:03:25,500 --> 00:03:31,530
So we have a mapping of it here, with the object node and the sink node here, and you can see

52
00:03:31,530 --> 00:03:36,600
it assigns a probability of each pixel belonging to the foreground or background based on that.

53
00:03:37,110 --> 00:03:42,300
So you get the segmentation results because this is effectively a segmentation algorithm and then we

54
00:03:42,450 --> 00:03:43,650
implement a cut here.

55
00:03:44,220 --> 00:03:48,660
So it actually goes from B to C to D, which is the final result.

56
00:03:49,290 --> 00:03:51,300
But hopefully that helps you understand it.

57
00:03:51,810 --> 00:03:57,200
And you can take a look at the paper here, which is quite good to look at, as well as the OpenCV

58
00:03:57,210 --> 00:03:57,960
documentation.

59
00:03:58,500 --> 00:04:01,590
So firstly, let's take a look at how we implement this using OpenCV.

60
00:04:02,610 --> 00:04:10,260
So we create a mask, a black mask here, of the same size and shape as our original image, and then

61
00:04:10,260 --> 00:04:11,180
we just separately

62
00:04:11,190 --> 00:04:16,920
create these empty zeros matrices for the background model and foreground model, following the specification

63
00:04:16,920 --> 00:04:17,580
linked here.

64
00:04:17,610 --> 00:04:18,690
We don't change this layout.

65
00:04:19,170 --> 00:04:22,710
And then we set the box, which is this bounding box here.

66
00:04:23,130 --> 00:04:25,410
That's a region of interest, so ROI.

67
00:04:26,340 --> 00:04:31,650
Alternatively, if you're using this on your local system, you could use cv2.selectROI to

68
00:04:31,650 --> 00:04:33,870
actually use your mouse to select these pixels.

69
00:04:34,740 --> 00:04:37,970
We're just drawing a rectangle here, so that's how we interpret this.

70
00:04:38,070 --> 00:04:39,060
So let's run this.

71
00:04:40,350 --> 00:04:40,830
There we go.

72
00:04:42,300 --> 00:04:45,000
So now we're going to look at the GrabCut algorithm itself.

73
00:04:45,330 --> 00:04:52,200
So the inputs, you can see them here, or you can look at the reference here that I've pointed to.

74
00:04:52,770 --> 00:04:54,300
So we have the image, the mask

75
00:04:54,300 --> 00:04:59,810
we defined above, the rectangle which we drew, which is hardcoded, the background model and the foreground model,

76
00:05:00,300 --> 00:05:07,710
which are fixed, you can see them here, and then we set this, which is a parameter here that actually

77
00:05:07,710 --> 00:05:10,650
just tells it which mode we want it to use.

78
00:05:10,680 --> 00:05:14,880
OK, so now we just take this.

79
00:05:15,690 --> 00:05:19,680
We get the image multiplied by the mask and create this new image out of it.

80
00:05:20,100 --> 00:05:23,130
That's how we get the mask here.

81
00:05:24,060 --> 00:05:29,640
So we have the mask being shown here so that we can actually see the pixels a bit brighter.

82
00:05:30,420 --> 00:05:36,030
This one here, we just make everything either background or foreground at that point,

83
00:05:36,300 --> 00:05:38,040
using what GrabCut gives us.

84
00:05:39,270 --> 00:05:44,300
And then we get the final image output here, which is the output we wanted.

85
00:05:44,910 --> 00:05:45,990
So that's pretty cool.

86
00:05:46,530 --> 00:05:48,630
You can mess around with your own images with GrabCut.

87
00:05:49,080 --> 00:05:55,110
I haven't used it that much in practice, to be fair, but I know a couple of guys who do and they use

88
00:05:55,110 --> 00:05:59,460
it to good effect with separating some background stuff from photographs.

89
00:06:00,030 --> 00:06:05,760
So I wish you luck with this and I'll see you in the next lesson, where we take a look at optical character

90
00:06:05,760 --> 00:06:11,040
recognition using PyTesseract, which is a very cool library that does OCR.

91
00:06:11,160 --> 00:06:12,330
So stay tuned for that.

92
00:06:12,570 --> 00:06:13,020
Thank you.