WEBVTT

00:01.110 --> 00:01.943
-: All right.

00:01.943 --> 00:04.470
What are Midjourney's capabilities and limitations?

00:04.470 --> 00:06.300
What's good about it and what's bad?

00:06.300 --> 00:09.030
One cool thing is that it's built into Discord,

00:09.030 --> 00:10.560
the community tool,

00:10.560 --> 00:13.050
similar to Slack. You type in your prompt:

00:13.050 --> 00:14.430
it's /imagine,

00:14.430 --> 00:16.260
and then whatever your prompt is.

00:16.260 --> 00:18.480
And that makes it easier in some ways.

00:18.480 --> 00:20.580
Like you can use it on mobile pretty easily.

00:20.580 --> 00:22.290
And you can also click to upscale

00:22.290 --> 00:24.720
or click on the variations as well.

00:24.720 --> 00:27.630
It's community first because it's built into Discord.

00:27.630 --> 00:31.560
Every single user

00:31.560 --> 00:34.890
of Midjourney is a member of this platform.

00:34.890 --> 00:36.630
And because

00:36.630 --> 00:40.170
you get a lot of people posting in there, you get

00:40.170 --> 00:42.120
to see a lot of variation in terms of

00:42.120 --> 00:44.310
what prompts are working for different people.

00:44.310 --> 00:45.540
And you can just take those prompts

00:45.540 --> 00:46.950
and then copy them as well.

00:46.950 --> 00:49.113
So it's a real hotbed of innovation.

00:50.310 --> 00:52.830
One cool thing you can do is you can add a base image

00:52.830 --> 00:53.663
into the prompt.

00:53.663 --> 00:56.280
So you can provide a kind of example,

00:56.280 --> 01:00.570
and that tends to really improve the performance of the prompts.

01:00.570 --> 01:03.630
You can also do image blending, so this is kind of unique.

01:03.630 --> 01:05.760
You can just give it two different images

01:05.760 --> 01:07.320
and blend them together.

01:07.320 --> 01:09.690
I found that that can be quite powerful.

01:09.690 --> 01:11.910
Negative prompting is really key.

01:11.910 --> 01:13.260
It's something you can't do in DALL-E,

01:13.260 --> 01:15.270
but you can in Stable Diffusion:

01:15.270 --> 01:16.950
you can remove different elements.

01:16.950 --> 01:19.500
So in this example, we've got Peppa Pig,

01:19.500 --> 01:20.430
a children's cartoon,

01:20.430 --> 01:22.200
but we've stripped

01:22.200 --> 01:23.880
out the cartoon aspect

01:23.880 --> 01:26.313
and got this kind of nice fluffy toy instead.

01:27.600 --> 01:30.480
Here are weighted terms, so this is kind

01:30.480 --> 01:34.290
of like a bit more fine-grained control

01:34.290 --> 01:35.970
over the prompt.

01:35.970 --> 01:37.290
Similar to negative prompts.

01:37.290 --> 01:38.460
Like if you put minus one,

01:38.460 --> 01:41.220
then that'd be the same as a negative prompt.

01:41.220 --> 01:44.370
But here we blended a little bit of Van Gogh

01:44.370 --> 01:45.330
and, you know,

01:45.330 --> 01:49.320
some Picasso to kind of get the ultimate style

01:49.320 --> 01:51.960
that we wanted, which is kind of interesting.
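
NOTE
A minimal sketch of how the negative and weighted prompts described above might be put together as text.
It assumes Midjourney's "::" multi-prompt weights and its "--no" parameter; build_prompt and the example
terms are hypothetical, and the script only builds the string you would type into Discord.
```python
# Hypothetical helper: assembles the text typed after /imagine in Discord.
# Assumes "::" weights and the "--no" parameter; nothing here calls Midjourney.
def build_prompt(parts, exclude=()):
    """parts: (text, weight) pairs; exclude: terms to list after --no."""
    weighted = " ".join(f"{text}::{weight:g}" for text, weight in parts)
    negative = f" --no {', '.join(exclude)}" if exclude else ""
    return f"/imagine prompt: {weighted}{negative}"
# Example: lean harder on Van Gogh than Picasso, and ask for no text in the image.
prompt = build_prompt([("portrait in the style of Van Gogh", 2), ("style of Picasso", 1)], exclude=["text"])
print(prompt)
```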

01:51.960 --> 01:54.090
And then we also have a lot of model settings.

01:54.090 --> 01:57.300
So you can choose which version you're using,

01:57.300 --> 01:59.520
if you wanna go back to an older version.

01:59.520 --> 02:00.690
They're all very different.

02:00.690 --> 02:02.760
Like v5.0 is a lot more realistic,

02:02.760 --> 02:04.770
v4.0 is a lot more cinematic,

02:04.770 --> 02:08.460
and they have a few other styles and modes.

02:08.460 --> 02:10.380
But it does have some limitations.

02:10.380 --> 02:13.860
So one thing that, I guess, could be a positive

02:13.860 --> 02:15.390
or a negative is that it tends

02:15.390 --> 02:17.160
to have this fantasy aesthetic.

02:17.160 --> 02:21.240
So everything comes out relatively fairytale-like

02:21.240 --> 02:22.740
or video gamey.

02:22.740 --> 02:25.530
And I think that that's likely to stay just

02:25.530 --> 02:30.510
because the community itself rates the images

02:30.510 --> 02:33.030
and I think the early community were a lot

02:33.030 --> 02:35.130
of digital artists

02:35.130 --> 02:36.600
and game designers, things like that.

02:36.600 --> 02:38.430
So it's kind of had that look

02:38.430 --> 02:39.420
and feel. It looks like something

02:39.420 --> 02:41.733
from DeviantArt or ArtStation.

02:42.720 --> 02:45.780
The other thing that it still doesn't do very well is

02:45.780 --> 02:47.820
faces and hands.

02:47.820 --> 02:49.200
In some cases, like if you ask

02:49.200 --> 02:52.260
for a model, it always looks too polished and perfect.

02:52.260 --> 02:54.540
The hands themselves tend to be pretty bad,

02:54.540 --> 02:58.200
although that is better in v5.0.

02:58.200 --> 03:00.510
This is a pretty interesting image,

03:00.510 --> 03:02.280
but you can

03:02.280 --> 03:03.870
see the faces tend to look really scary,

03:03.870 --> 03:06.150
not particularly realistic.

03:06.150 --> 03:07.980
Occasionally you just have

03:07.980 --> 03:10.170
a really odd thing happening, like

03:10.170 --> 03:12.060
this woman's weird face here,

03:12.060 --> 03:14.100
or this woman facing the wrong

03:14.100 --> 03:15.870
way from the MacBook.

03:15.870 --> 03:18.660
So yeah, still some issues.

03:18.660 --> 03:20.850
There's no API available unfortunately,

03:20.850 --> 03:23.610
so this is really only a consumer tool.

03:23.610 --> 03:24.810
You can't really

03:24.810 --> 03:28.500
use it programmatically for anything.

03:28.500 --> 03:29.820
But other than that,

03:29.820 --> 03:33.660
I think Midjourney is a very good tool to use.

03:33.660 --> 03:36.090
It's surprising actually how well it's done

03:36.090 --> 03:37.800
considering it doesn't have the funding

03:37.800 --> 03:40.740
of OpenAI, and

03:40.740 --> 03:42.150
it doesn't benefit from open source

03:42.150 --> 03:44.970
like Stability does with Stable Diffusion.

03:44.970 --> 03:48.600
So yeah, I think it's carved out a real niche.

03:48.600 --> 03:50.880
I think the community has driven it forward

03:50.880 --> 03:53.530
and will continue to drive it forward in the future.
