WEBVTT

00:00.690 --> 00:02.400
-: All right, Stable Diffusion.

00:02.400 --> 00:04.170
What is it good at, what is it bad at?

00:04.170 --> 00:07.350
So the really interesting thing about Stable

00:07.350 --> 00:08.820
Diffusion is that it's open source.

00:08.820 --> 00:10.980
So I think that that's been a real benefit.

00:10.980 --> 00:13.140
A lot of people have innovated on top of it

00:13.140 --> 00:15.360
and figured out interesting ways to use it.

00:15.360 --> 00:19.890
So you do have to use a GPU graphics card to run it,

00:19.890 --> 00:22.800
but you can actually do that in the cloud,

00:22.800 --> 00:24.150
which is pretty cheap.

00:24.150 --> 00:27.810
Or for free, actually, in Google Colab.

00:27.810 --> 00:31.080
So it's a great way to learn, or if you have an M1 Mac

00:31.080 --> 00:33.480
or M2 Mac, you can do that as well.

00:33.480 --> 00:35.460
You can use negative prompts, which is helpful.

00:35.460 --> 00:38.190
I think it gives you a bit more control and flexibility,

00:38.190 --> 00:40.110
so you can kind of, in this case,

00:40.110 --> 00:42.180
we've got a photograph of an astronaut riding a horse,

00:42.180 --> 00:44.580
but we don't want 'em to be in space.

00:44.580 --> 00:47.100
There's also classifier-free guidance

00:47.100 --> 00:49.550
that kind of controls

00:50.880 --> 00:52.320
how much the image matches the prompt,

00:52.320 --> 00:55.020
how creative the AI can be, essentially.

00:55.020 --> 00:57.030
But the really cool thing, I think, is DreamBooth.

00:57.030 --> 01:01.020
So that's the ability to train the AI on some concept

01:01.020 --> 01:03.420
or person, like this is a picture of me,

01:03.420 --> 01:05.670
but generated by AI.

01:05.670 --> 01:07.500
I think this is like the most advanced,

01:07.500 --> 01:10.920
like futuristic thing that AI can do at the minute.

01:10.920 --> 01:12.570
It's really cool.

01:12.570 --> 01:14.460
There are some limitations, though.

01:14.460 --> 01:16.800
I think technical ability is a big one.

01:16.800 --> 01:18.630
You really have to know how to code

01:18.630 --> 01:19.590
to mess around with this.

01:19.590 --> 01:23.190
I think otherwise you're gonna get confused pretty quickly.

01:23.190 --> 01:24.990
It also takes a long time to run.

01:24.990 --> 01:27.450
So unless you have a powerful computer,

01:27.450 --> 01:29.340
it's gonna be like really difficult for you

01:29.340 --> 01:31.953
to iterate or kind of change anything about it.

01:32.790 --> 01:35.610
It also runs into problems with faces and hands.

01:35.610 --> 01:37.920
So here are a few generations I've made

01:37.920 --> 01:39.900
where I've run into issues,

01:39.900 --> 01:42.060
although I think fine-tuning the model,

01:42.060 --> 01:44.790
like using DreamBooth, has definitely helped.

01:44.790 --> 01:46.440
So some people have made custom models

01:46.440 --> 01:48.940
that are really good at stock photos, for example.

01:49.800 --> 01:51.660
There's also kind of questionable provenance.

01:51.660 --> 01:54.600
So it's like, initially, it kind of came out

01:54.600 --> 01:56.460
as part of Stability AI,

01:56.460 --> 01:59.823
but then it was actually built by Runway,

02:01.620 --> 02:03.210
which is another AI company.

02:03.210 --> 02:05.790
But it was kind of funded by Stability AI.

02:05.790 --> 02:09.270
But then it was also a team from Germany,

02:09.270 --> 02:10.650
the LMU Munich.

02:10.650 --> 02:14.160
So I think there were actually some issues

02:14.160 --> 02:16.140
where the model got taken down

02:16.140 --> 02:17.670
from Hugging Face for a while

02:17.670 --> 02:19.860
because of an alleged IP breach.

02:19.860 --> 02:22.140
But yeah, I mean, I think it's probably all sorted out

02:22.140 --> 02:26.880
now, but definitely, it's not as clear

02:26.880 --> 02:28.320
in terms of who owns it

02:28.320 --> 02:30.963
and who's responsible for improving it.

02:32.130 --> 02:34.200
But otherwise, I think Stable Diffusion

02:34.200 --> 02:37.290
is really the only AI art tool

02:37.290 --> 02:38.880
that you could really build your business on.

02:38.880 --> 02:41.100
Because it's open source,

02:41.100 --> 02:44.910
you don't need to worry about them blocking anything, right?

02:44.910 --> 02:47.070
Like you can run it on your own computer

02:47.070 --> 02:49.890
and then you can modify it the way that you want.

02:49.890 --> 02:51.450
You can fine-tune it with DreamBooth.

02:51.450 --> 02:55.920
So I think that this has got the most commercial value out

02:55.920 --> 02:57.633
of any of the AI models.
