WEBVTT

00:00.280 --> 00:00.720
All right.

00:00.720 --> 00:03.600
Let's learn how to use Segment Anything for inpainting.

00:03.600 --> 00:05.320
So that's the original image.

00:05.360 --> 00:12.720
And we're going to be able to get an exact section of the image just by saying in the prompt what we

00:12.720 --> 00:14.480
want to get, in this case the rhino's head.

00:14.560 --> 00:16.520
So it's going to create the mask for us.

00:16.560 --> 00:23.760
And then we can pass that mask in to generate a new image within that specific area.

00:23.760 --> 00:24.880
This is a little bit messed up.

00:24.880 --> 00:26.800
This one, it's got an extra hand.

00:26.800 --> 00:29.920
But I'll talk you through some of the reasons why that might be.

00:30.200 --> 00:30.520
All right.

00:30.560 --> 00:32.760
So that's what we're going to try and accomplish.

00:33.640 --> 00:36.360
And let's get started.

00:38.840 --> 00:39.160
All right.

00:39.200 --> 00:43.520
So let's run pip install fal-client.

00:45.760 --> 00:50.000
And we also need to get the environment variables.

00:50.640 --> 00:54.480
So you need your API key in your .env file.

00:55.520 --> 00:58.400
And then we just need to make the call to fal.

00:58.400 --> 01:03.760
So fal has the Segment Anything Model, which is an open-source model by Facebook.

01:04.200 --> 01:05.800
And we're going to be using that.

01:05.840 --> 01:09.830
First we need to import fal_client and requests.

01:09.870 --> 01:14.910
And then we're going to create a function just to give us updates when we...

01:17.630 --> 01:20.310
when we're waiting for the response.

01:22.510 --> 01:24.990
If isinstance(update,

01:27.110 --> 01:29.950
fal_client.InProgress).

01:30.950 --> 01:40.990
And then I'm just going to say for log in update.logs, print the log message.

01:41.910 --> 01:42.110
Okay.

01:42.150 --> 01:45.430
So that's just going to give us updates as we go along.

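NOTE
Editor's sketch of the progress callback described above, not necessarily the exact code typed in the video. The names make_log_printer and emit are hypothetical. It assumes in-progress updates carry a logs list of dicts with a "message" key, and that the fal-client package exposes that update type as fal_client.InProgress.
```python
def make_log_printer(in_progress_type, emit=print):
    """Build a callback that emits log messages from in-progress updates."""
    def on_queue_update(update):
        # Queued or completed updates are ignored; only in-progress ones carry logs.
        if isinstance(update, in_progress_type):
            for log in update.logs or []:
                emit(log["message"])
    return on_queue_update
```
With fal-client installed, this could be wired up as on_queue_update = make_log_printer(fal_client.InProgress).
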
01:45.430 --> 01:47.790
We also need to upload the image itself.

01:47.790 --> 01:49.830
So I'm just going to say image URL.

01:50.190 --> 01:55.350
We've got a rhino image that we're going to upload.

01:56.270 --> 02:02.550
And that's just to get a URL for the image that we can pass into the next model.

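NOTE
Editor's sketch of the upload step, under assumptions. The helper name resolve_image_url and the file name rhino.jpg are hypothetical. The upload argument is any callable that takes a local path and returns a hosted URL; with fal-client installed, fal_client.upload_file is assumed to fill that role.
```python
from urllib.parse import urlparse
def resolve_image_url(source, upload):
    """Return source unchanged if it is already an http(s) URL, else upload it."""
    if urlparse(source).scheme in ("http", "https"):
        return source
    # Local path: hand it to the uploader and use the URL it returns.
    return upload(source)
```
For example: image_url = resolve_image_url("rhino.jpg", fal_client.upload_file).
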
02:02.550 --> 02:07.910
So the next thing we're going to do is actually create the Segment Anything result.

02:07.950 --> 02:12.590
So I'm just going to say result equals fal_client.subscribe.

02:12.950 --> 02:15.510
The model itself is called SAM.

02:15.620 --> 02:17.420
That's one that I found that works really well.

02:17.700 --> 02:20.020
And the prompt here is just the rhino's head.

02:20.260 --> 02:23.140
I'm also going to add another couple of features.

02:23.300 --> 02:25.780
So, use Grounding DINO, or DINO.

02:26.380 --> 02:31.940
That is just deciding whether you're using SAM or Grounding DINO to do the object detection.

02:32.380 --> 02:35.980
I found that this worked a little bit better for me, but you can experiment yourself.

02:36.020 --> 02:38.540
I've also added fill holes and expand mask.

02:38.580 --> 02:45.620
That's because I want a full, solid blob rather than a mask that follows the outline too closely.

02:46.020 --> 02:48.820
And I wanted the mask to cover a bigger space, essentially.

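NOTE
Editor's sketch of the segmentation request described above, under assumptions. The function names are hypothetical, and the key spellings use_grounding_dino, fill_holes and expand_mask are guesses at how the spoken options are written; the default of 10 and its pixel unit are also assumed. The model id is left as a parameter because the video only calls the model SAM.
```python
def build_segmentation_arguments(image_url, prompt, expand_mask=10):
    """Assemble the request payload for a text-prompted segmentation call."""
    return {
        "image_url": image_url,
        "prompt": prompt,
        # Assumed key names for the options mentioned in the video.
        "use_grounding_dino": True,  # detect the object with Grounding DINO
        "fill_holes": True,          # close gaps so the mask is one solid blob
        "expand_mask": expand_mask,  # grow the mask outward (assumed pixels)
    }
def run_segmentation(model_id, arguments, on_queue_update=None):
    """Submit the request and block until the result is ready."""
    import fal_client  # imported lazily; requires `pip install fal-client`
    return fal_client.subscribe(
        model_id, arguments=arguments, with_logs=True, on_queue_update=on_queue_update
    )
```
For example: arguments = build_segmentation_arguments(image_url, "rhino's head").
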
02:49.700 --> 02:55.260
And if you do that, then you get the result.

02:55.260 --> 03:04.820
You can also get the mask URL, and let's print out the original image and the mask.

03:04.860 --> 03:06.700
Just so we can see how that looks.

03:07.820 --> 03:08.940
Let's see if this works.

03:10.060 --> 03:10.580
Okay.

03:10.620 --> 03:11.980
That's the original image.

03:12.340 --> 03:13.580
And there's the mask.

03:14.100 --> 03:18.220
So you can see it's drawn an outline around the rhino's head.

03:18.540 --> 03:22.100
And it drew the outline just like magic, which is really cool.

03:22.620 --> 03:23.620
Now why do you need this?

03:23.660 --> 03:29.420
If you want to do inpainting, you need to be able to say exactly what part of the image you want to

03:29.580 --> 03:30.780
make changes to.

03:31.180 --> 03:32.180
Here we can do that.

03:32.180 --> 03:40.540
We can say a lion in a suit, and then we're calling the inpainting model with the image URL and the

03:40.540 --> 03:41.380
mask URL.

03:41.380 --> 03:46.740
So this is a programmatic way to get out the image that we...

03:47.460 --> 03:49.740
the section of the image that we want to change.

03:50.620 --> 03:52.140
And that's super helpful.

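NOTE
Editor's sketch of the inpainting request described above, under assumptions. The function name is hypothetical and the key names image_url, mask_url and prompt are assumed; the video does not name the inpainting model, so no model id appears here.
```python
def build_inpainting_arguments(image_url, mask_url, prompt):
    """Assemble a payload that repaints only the masked region."""
    if not mask_url:
        raise ValueError("a mask URL is required for inpainting")
    return {
        "image_url": image_url,  # the original image
        "mask_url": mask_url,    # white area marks where new content is generated
        "prompt": prompt,        # what to paint inside the mask
    }
```
For example: build_inpainting_arguments(image_url, mask_url, "a lion in a suit").
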
03:52.820 --> 03:56.300
And we're just going to print out here the transformed image.

03:57.820 --> 03:59.740
So I'll just show you what that looks like.

04:01.340 --> 04:02.420
And it's going to take...

04:02.460 --> 04:05.980
And it's only going to generate within that white space, right?

04:05.980 --> 04:09.140
It doesn't change anything else about the image.

04:09.700 --> 04:13.660
And you can see here that sometimes it struggles to fit something in.

04:13.700 --> 04:15.780
So you can see there's a bit of a gap here.

04:16.860 --> 04:19.860
That is one downside of inpainting: it can only generate inside the mask.

04:20.220 --> 04:22.260
But it depends on what your use case is.

04:22.260 --> 04:27.660
For example, this is a little bit weird because it's a rhino going to a lion, and they have different shapes,

04:27.660 --> 04:31.020
but if it was like a human face, it would work a lot better.
