WEBVTT

00:00.360 --> 00:03.960
-: Hey, I'm gonna walk you through how to prompt DALL-E 3.

00:03.960 --> 00:06.900
In ChatGPT, it's only available right now through GPT-4,

00:06.900 --> 00:10.320
which is on the paid plan, which you need to have,

00:10.320 --> 00:12.900
but it's pretty accessible once you do have that.

00:12.900 --> 00:15.150
So you can literally just ask it for an image

00:15.150 --> 00:17.100
and it's gonna come up with something.

00:17.100 --> 00:21.030
Give me an image of a corgi

00:21.030 --> 00:25.053
sitting on top of the Brandenburg Gate.

00:26.100 --> 00:31.020
And it's pretty useful for these types of queries

00:31.020 --> 00:32.880
where you're actually asking it

00:32.880 --> 00:35.820
to put something on top of something or behind something.

00:35.820 --> 00:39.660
This is something that the other AI image tools

00:39.660 --> 00:40.890
are currently not very good at.

00:40.890 --> 00:44.490
So Midjourney doesn't have that much situational awareness

00:44.490 --> 00:48.570
and Stable Diffusion also struggles with this.

00:48.570 --> 00:49.860
That said, the latest version

00:49.860 --> 00:52.980
of Stable Diffusion is a diffusion transformer model

00:52.980 --> 00:56.400
and it's rumored that DALL-E is the same.

00:56.400 --> 00:59.070
And what that means is it has a transformer,

00:59.070 --> 01:02.070
like an LLM, that also powers the image generation.

01:02.070 --> 01:03.660
It's not just a diffusion model.

01:03.660 --> 01:05.160
It understands a little bit more

01:05.160 --> 01:06.690
about what's supposed to be in the image

01:06.690 --> 01:09.450
and it can reason a little bit better about that.

01:09.450 --> 01:10.590
So here we go.

01:10.590 --> 01:13.080
Now we've got this image, which is pretty cool.

01:13.080 --> 01:15.030
There's a few different things we can do here.

01:15.030 --> 01:17.130
So one is we can download the image.

01:17.130 --> 01:19.800
We could also click on the image.

01:19.800 --> 01:22.320
You can see that it's coming back here.

01:22.320 --> 01:25.770
Now we have some additional things that we can do.

01:25.770 --> 01:27.900
So we can click here to see the prompt

01:27.900 --> 01:30.030
and see what prompt it's written for us.

01:30.030 --> 01:32.790
One of the core things about accessibility

01:32.790 --> 01:34.590
is that you don't need to know prompt engineering

01:34.590 --> 01:35.640
to use DALL-E.

01:35.640 --> 01:37.950
It's gonna write this for you here.

01:37.950 --> 01:41.580
You can see that it's expanded upon

01:41.580 --> 01:45.690
our initial simple prompt to give us a much better one.

01:45.690 --> 01:48.600
But we're gonna say, first of all, we could just chat back

01:48.600 --> 01:52.560
and say the dog is supposed

01:52.560 --> 01:55.290
to look realistic.

01:55.290 --> 01:59.220
'Cause a lot of the time it's not, which is a real problem.

01:59.220 --> 02:01.020
And this is really useful

02:01.020 --> 02:04.140
because you can just talk in natural language and

02:04.140 --> 02:06.030
it's gonna write another prompt for us.

02:06.030 --> 02:07.620
It's creating the image.

02:07.620 --> 02:09.510
So we're gonna get back something different.

02:09.510 --> 02:10.343
It's gonna respond,

02:10.343 --> 02:12.330
it's gonna write a different prompt for us,

02:12.330 --> 02:13.593
which is really helpful.

02:32.850 --> 02:33.683
Okay, here we go.

02:33.683 --> 02:36.570
Now we have a realistic dog, which is much better.

02:36.570 --> 02:37.870
Let's give it a thumbs up.

02:38.970 --> 02:41.160
And it describes what's going on.

02:41.160 --> 02:42.780
But the problem is we actually wanted it

02:42.780 --> 02:44.130
on top of the gate here.

02:44.130 --> 02:45.780
So I'm gonna click into the image.

02:45.780 --> 02:48.150
And one of the cool things they added more recently

02:48.150 --> 02:50.560
is you can select

02:51.993 --> 02:53.730
a certain part of it

02:53.730 --> 02:56.943
and you can get rid of that.

03:01.080 --> 03:01.990
And

03:05.610 --> 03:07.380
you can then prompt again.

03:07.380 --> 03:11.403
You can say Brandenburg, no dog.

03:12.630 --> 03:13.623
See what that does.

03:14.610 --> 03:18.390
So it should just get rid of this part here.

03:18.390 --> 03:19.790
So this is called inpainting

03:20.760 --> 03:22.860
and this used to be available in DALL-E 2.

03:22.860 --> 03:24.840
It's not available anymore unfortunately,

03:24.840 --> 03:28.593
but we can get access to it in ChatGPT here.

03:44.280 --> 03:45.113
Okay, there we go.

03:45.113 --> 03:46.620
It's removed the dog from the image,

03:46.620 --> 03:48.210
which is really cool.

03:48.210 --> 03:52.680
Now we wanna click on this image,

03:52.680 --> 03:54.270
potentially edit this

03:54.270 --> 03:58.143
and say put the corgi here,

04:00.000 --> 04:03.930
let's say giant corgi

04:03.930 --> 04:07.053
on top of Brandenburg Gate.

04:11.220 --> 04:13.710
And again, this is just gonna change this part here.

04:13.710 --> 04:15.900
So if we're happy with the rest of the image,

04:15.900 --> 04:20.043
then this is a really useful way to improve things.

04:58.290 --> 04:59.123
Here we go.

04:59.123 --> 05:01.590
Now we have a corgi on top of the Brandenburg Gate.

05:01.590 --> 05:03.660
So that's pretty helpful.

05:03.660 --> 05:06.690
And we're not gonna go too deep into this.

05:06.690 --> 05:09.600
The functionality is changing over time.

05:09.600 --> 05:13.380
But one thing I will mention though is that you can,

05:13.380 --> 05:17.880
if you go into here, you go into Explore GPTs,

05:17.880 --> 05:21.060
you can find the specific DALL-E one.

05:21.060 --> 05:24.330
You can use these where it only does images,

05:24.330 --> 05:26.820
and there's some that people have made that are custom.

05:26.820 --> 05:28.530
So this one's Cartoonize Yourself.

05:28.530 --> 05:30.540
It's got specific prompts for that.

05:30.540 --> 05:32.310
And you know, here's one

05:32.310 --> 05:34.683
that's really good at logos, et cetera.

05:36.840 --> 05:38.340
Here, this one actually is pretty cool.

05:38.340 --> 05:40.890
It creates prompts that are good for Midjourney

05:40.890 --> 05:42.330
so you can try that.

05:42.330 --> 05:43.200
All right, enjoy.

05:43.200 --> 05:45.210
Hopefully you have a good time with DALL-E

05:45.210 --> 05:47.550
and you can also access it via API,

05:47.550 --> 05:50.733
but primarily people are using it through the interface.

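NOTE
The speaker mentions that DALL-E can also be reached via API. What follows is a minimal sketch, not the speaker's own method. It assumes the OpenAI images endpoint at https://api.openai.com/v1/images/generations and the model name "dall-e-3". The helper name build_image_request and the placeholder key are hypothetical. The request is only built here, never sent.
```python
import json
import urllib.request
API_URL = "https://api.openai.com/v1/images/generations"  # assumed endpoint
def build_image_request(prompt: str, api_key: str, size: str = "1024x1024"):
    """Build (but do not send) a POST request asking for one image."""
    payload = {"model": "dall-e-3", "prompt": prompt, "n": 1, "size": size}
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
request = build_image_request(
    "A realistic corgi sitting on top of the Brandenburg Gate",
    api_key="YOUR_API_KEY",  # placeholder, not a real key
)
# Sending it would be urllib.request.urlopen(request), which needs a valid key and network access.
```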