WEBVTT

00:00.027 --> 00:01.740
-: Hey, I'm gonna walk you through

00:01.740 --> 00:04.290
how to get Stable Diffusion web UI running,

00:04.290 --> 00:07.170
or otherwise known as AUTOMATIC1111.

00:07.170 --> 00:08.370
This is a way

00:08.370 --> 00:11.340
to get Stable Diffusion running locally on your computer.

00:11.340 --> 00:12.750
And before you do this,

00:12.750 --> 00:14.730
you will need some technical knowledge.

00:14.730 --> 00:17.160
It's not straightforward, it's a bit complicated.

00:17.160 --> 00:18.930
There's a lot of complicated parameters.

00:18.930 --> 00:20.670
You might not know how they work.

00:20.670 --> 00:23.940
There's also a requirement which is you need a GPU.

00:23.940 --> 00:25.803
So I'm running this on a Mac M2.

00:26.760 --> 00:28.710
There's also, you know, a Windows version

00:28.710 --> 00:33.300
for NVIDIA-style GPUs if you have a gaming PC.

00:33.300 --> 00:35.400
But that's something that needs to happen

00:35.400 --> 00:38.700
'cause it's a pretty heavy load

00:38.700 --> 00:42.330
and it can't really run just on your normal computer.

00:42.330 --> 00:45.690
You need a graphics processing unit, a GPU. Cool.

00:45.690 --> 00:48.450
This is Stable Diffusion web UI.

00:48.450 --> 00:50.160
Just wanna show you what it looks like when it's running

00:50.160 --> 00:52.530
and then we'll talk about how to install it.

00:52.530 --> 00:55.710
I'm in the stable diffusion web UI folder

00:55.710 --> 01:00.710
and then I just say bash webui.sh and it's running

01:00.875 --> 01:03.510
and it's gonna install anything it needs

01:03.510 --> 01:07.140
and get it to the place where it's actually running.

01:07.140 --> 01:09.000
You might get some error messages here,

01:09.000 --> 01:10.530
which you'll have to figure out. (laughs)

01:10.530 --> 01:12.390
Sometimes it's a bit complicated, like I said,

01:12.390 --> 01:14.610
it takes some technical ability.

01:14.610 --> 01:16.920
It says it's just installing the requirements

01:16.920 --> 01:20.730
and once it's running, it will be available on this URL,

01:20.730 --> 01:23.520
which you'll be able to grab from the terminal.
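
NOTE
A minimal sketch of grabbing that URL from the terminal output, in case you want to script it.
The sample log line below is an assumption about what the launcher prints, not a capture from
the video, and find_local_url is a hypothetical helper name.
```python
import re
from typing import Optional
# Assumed log line; the real wording on your machine may differ.
SAMPLE_LOG = "Running on local URL:  http://127.0.0.1:7860"
def find_local_url(log_text: str) -> Optional[str]:
    """Return the first http(s) URL found in the terminal output, if any."""
    match = re.search(r"https?://\S+", log_text)
    return match.group(0) if match else None
print(find_local_url(SAMPLE_LOG))
```
Paste whatever URL your own terminal shows into the browser; the port can differ if you change the launch settings.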

01:23.520 --> 01:26.820
So I have a few extensions installed to try,

01:26.820 --> 01:28.870
so I am bringing some of these things in.

01:29.717 --> 01:33.947
Okay, yeah, here we go. This is the URL.

01:33.947 --> 01:37.203
So you can see there, paste it.

01:39.030 --> 01:42.120
All right, we have it so we can change models up here.

01:42.120 --> 01:47.120
I'm using the standard Stable Diffusion, the v1.5,

01:47.310 --> 01:50.760
a lot of people use version 1.5 over the version two

01:50.760 --> 01:53.580
or plus models just because it's a bit more flexible.

01:53.580 --> 01:57.210
There's a lot of user generated type content.

01:57.210 --> 01:58.950
People have trained models based on this,

01:58.950 --> 02:01.170
that are quite useful and all of those,

02:01.170 --> 02:03.570
like when you download them, they show up here.

02:03.570 --> 02:04.920
I'll give you a quick example.

02:04.920 --> 02:09.160
So just say like a cat wearing a cape

02:10.260 --> 02:12.930
and hit generate and this is where you put the prompt.

02:12.930 --> 02:15.690
You can also do negative prompts in here.

02:15.690 --> 02:18.930
There's, you know, a lot of functionality

02:18.930 --> 02:21.750
in terms of the parameters and stuff you can change.

02:21.750 --> 02:23.610
Don't worry about that too much right now.

02:23.610 --> 02:25.470
We'll cover that. So here we go.

02:25.470 --> 02:27.210
There's a cat generated with a cape

02:27.210 --> 02:29.190
and then I'm just gonna show you how

02:29.190 --> 02:31.554
that looks with a different model.

02:31.554 --> 02:35.610
So this is a cat wearing a cape.

02:35.610 --> 02:36.900
And actually by the way,

02:36.900 --> 02:39.240
one thing that's interesting is that for every image

02:39.240 --> 02:40.080
that's generated,

02:40.080 --> 02:42.920
the prompt is saved as metadata in the image.

02:42.920 --> 02:44.400
So you can also see the seed

02:44.400 --> 02:47.400
and what model you used, et cetera, what sampler.
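
NOTE
A rough sketch of reading that metadata back out of a PNG using only the standard library.
Assumptions: the web UI stores the settings in a PNG text chunk under the keyword "parameters",
and the demo text below is a made-up sample, not real output. Only plain tEXt chunks are handled here.
```python
import struct
import zlib
PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"
def read_png_text(data: bytes) -> dict:
    """Collect uncompressed tEXt chunks from raw PNG bytes as {keyword: text}."""
    if not data.startswith(PNG_SIGNATURE):
        raise ValueError("not a PNG file")
    texts = {}
    pos = len(PNG_SIGNATURE)
    while pos + 8 <= len(data):
        length, kind = struct.unpack(">I4s", data[pos:pos + 8])
        body = data[pos + 8:pos + 8 + length]
        if kind == b"tEXt":
            keyword, _, text = body.partition(b"\x00")
            texts[keyword.decode("latin-1")] = text.decode("latin-1")
        pos += 12 + length  # 4 bytes length + 4 bytes type + data + 4 bytes CRC
    return texts
def make_chunk(kind: bytes, body: bytes) -> bytes:
    """Build one PNG chunk (only used to create a tiny demo file in memory)."""
    return struct.pack(">I", len(body)) + kind + body + struct.pack(">I", zlib.crc32(kind + body))
# Made-up demo data standing in for a generated image.
demo = (
    PNG_SIGNATURE
    + make_chunk(b"IHDR", struct.pack(">IIBBBBB", 1, 1, 8, 0, 0, 0, 0))
    + make_chunk(b"tEXt", b"parameters\x00a cat wearing a cape\nSteps: 20, Sampler: Euler a, Seed: 1234")
    + make_chunk(b"IEND", b"")
)
print(read_png_text(demo)["parameters"])
```
For a real file you would pass open(path, "rb").read() instead of the demo bytes.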

02:47.400 --> 02:49.050
That's a cat wearing a cape.

02:49.050 --> 02:49.920
I'm just gonna show you,

02:49.920 --> 02:52.713
I'm just gonna try the inkpunk-diffusion model.

02:57.390 --> 03:00.810
And that's working. And if you are downloading a model,

03:00.810 --> 03:01.950
lemme just hit generate here.

03:01.950 --> 03:04.740
If you're downloading a new model, say from Civitai,

03:04.740 --> 03:07.260
bear in mind that some of these are not safe for work,

03:07.260 --> 03:10.230
but if you download the model then you need

03:10.230 --> 03:13.650
to put it into your folder here.

03:13.650 --> 03:16.380
So this is stable diffusion web UI

03:16.380 --> 03:19.389
and then there's this models folder

03:19.389 --> 03:21.840
and then you have stable diffusion

03:21.840 --> 03:22.950
and this is where you stick it

03:22.950 --> 03:24.480
and when it's there,

03:24.480 --> 03:28.470
then you can just hit this refresh and it should show up

03:28.470 --> 03:29.940
or you can restart.
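
NOTE
A small sketch of that "stick it in the models folder" step. Assumptions: the checkpoint folder
is models/Stable-diffusion inside the web UI folder, and install_checkpoint plus the file names
in the comment are hypothetical.
```python
import shutil
from pathlib import Path
def install_checkpoint(downloaded: Path, webui_dir: Path) -> Path:
    """Move a downloaded checkpoint into the web UI's model folder and return its new path."""
    # Assumption: checkpoints live in models/Stable-diffusion under the web UI folder.
    target_dir = webui_dir / "models" / "Stable-diffusion"
    target_dir.mkdir(parents=True, exist_ok=True)
    target = target_dir / downloaded.name
    shutil.move(str(downloaded), str(target))
    return target
# Hypothetical usage:
# install_checkpoint(Path.home() / "Downloads" / "my-model.safetensors", Path("stable-diffusion-webui"))
```
After the move, hit the refresh button next to the model dropdown or restart, as described above.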

03:29.940 --> 03:33.210
Yeah, here we go. So this inkpunk-diffusion

03:33.210 --> 03:35.340
it didn't do anything different really

03:35.340 --> 03:37.590
'cause I didn't use the trigger word,

03:37.590 --> 03:40.803
but specifically there's a trigger word nvinkpunk.

03:41.774 --> 03:45.180
If I hit generate, it's gonna do it in this style.

03:45.180 --> 03:47.970
So this is a DreamBooth model that we can use.

03:47.970 --> 03:50.280
But there are, you know, lots of different types.

03:50.280 --> 03:52.620
Some have trigger words, some don't.

03:52.620 --> 03:54.063
You can see how that works.

03:57.540 --> 03:59.940
Yeah, here we go. So now it's in the inkpunk style

03:59.940 --> 04:02.343
where you can see it's diffusing towards that.

04:10.078 --> 04:14.100
There we go, an inkpunk cat in a cape looks a lot cooler.

04:14.100 --> 04:15.510
So you can change the sampling steps.

04:15.510 --> 04:19.050
This is how many, yeah, how long to run the diffusion process.

04:19.050 --> 04:20.670
You can also change the sampler.

04:20.670 --> 04:23.580
I just used Euler a, the a stands for ancestral,

04:23.580 --> 04:25.770
there's a lot of differences between them,

04:25.770 --> 04:27.300
but the ones that stand out,

04:27.300 --> 04:28.830
a lot of people use Euler a,

04:28.830 --> 04:33.090
a lot of people use DPM++ 2M Karras,

04:33.090 --> 04:37.020
this one here and then DDIM.

04:37.020 --> 04:38.640
That was like one of the original samplers

04:38.640 --> 04:39.930
that shipped

04:39.930 --> 04:42.060
with Stable Diffusion, I think.

04:42.060 --> 04:43.710
This is how many images you get.

04:43.710 --> 04:46.200
So this would generate four images in a row

04:46.200 --> 04:50.310
and then this batch size is like how many images per batch

04:50.310 --> 04:51.660
and those get multiplied together.

04:51.660 --> 04:53.727
If you don't have a lot of VRAM

04:53.727 --> 04:56.430
then your batch size, you wanna keep that small

04:56.430 --> 04:58.950
because it's generating them at the same time

04:58.950 --> 05:01.363
and it can really lag your computer.
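
NOTE
The batch arithmetic above, as a tiny sketch. total_images is a hypothetical helper, not part of the web UI.
```python
def total_images(batch_count: int, batch_size: int) -> int:
    """Images from one click of Generate: batch_count batches, each holding batch_size images."""
    return batch_count * batch_size
print(total_images(batch_count=4, batch_size=1))  # 4 images, generated one at a time
print(total_images(batch_count=4, batch_size=2))  # 8 images, two in memory at once
```
If VRAM is tight, raise the batch count and leave the batch size at 1; you get the same number of images with a lower peak memory load.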

05:01.363 --> 05:06.180
CFG scale is how much of a difference the prompt makes.

05:06.180 --> 05:09.270
So how close it is to the prompt versus how creative.

05:09.270 --> 05:13.397
So it's similar to temperature on the GPT-3

05:13.397 --> 05:15.060
or GPT-4 interface.
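
NOTE
A toy sketch of what the CFG scale does numerically. This is the standard classifier-free guidance
formula, not code taken from the web UI, and the two-number "predictions" are made up for illustration.
The comparison to temperature is a loose one: the scale controls how strongly the prompt steers each step.
```python
def apply_cfg(uncond, cond, cfg_scale):
    """Blend the unconditional and prompt-conditioned noise predictions."""
    return [u + cfg_scale * (c - u) for u, c in zip(uncond, cond)]
uncond = [0.0, 1.0]  # toy prediction without the prompt
cond = [1.0, 3.0]    # toy prediction with the prompt
print(apply_cfg(uncond, cond, 1.0))  # scale 1 reproduces the prompt-conditioned prediction
print(apply_cfg(uncond, cond, 7.0))  # a higher scale pushes further in the prompt's direction
```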

05:15.060 --> 05:16.980
You can add extensions here.

05:16.980 --> 05:20.220
So I've added ControlNet, which I'm not gonna

05:20.220 --> 05:21.450
walk you through.

05:21.450 --> 05:24.840
But there's a lot of like cool stuff you can add in here

05:24.840 --> 05:26.310
and they'll show up in different places

05:26.310 --> 05:28.170
in the user interface.

05:28.170 --> 05:31.118
The way that you add extensions is if you go here, oh sorry,

05:31.118 --> 05:34.523
extensions and then available, load from,

05:34.523 --> 05:38.730
and it loads it all from this, this JSON file

05:38.730 --> 05:40.980
and you can see which ones there are and what they do.

05:40.980 --> 05:42.660
And then you can install them.

05:42.660 --> 05:45.000
You can also load them from a URL

05:45.000 --> 05:47.610
if you just find one out there on the web.

05:47.610 --> 05:49.590
But obviously be careful 'cause you're loading code.

05:49.590 --> 05:52.260
Some of them show up as their own separate tabs.

05:52.260 --> 05:54.150
Inpaint Anything is one example.

05:54.150 --> 05:57.420
They have, in some cases, they have like settings here

05:57.420 --> 05:58.560
that you can change.

05:58.560 --> 06:00.480
There's also like a lot of built-in extensions,

06:00.480 --> 06:03.990
like the image to image stuff you have inpaint in here,

06:03.990 --> 06:04.920
which I won't go through.

06:04.920 --> 06:07.080
You have Interrogate CLIP,

06:07.080 --> 06:09.480
so you can reverse engineer the prompt from an image,

06:09.480 --> 06:10.980
which is pretty cool.

06:10.980 --> 06:13.800
There's a lot of parameters here you can mess around with.

06:13.800 --> 06:18.090
There's also the different scripts you can run.

06:18.090 --> 06:20.370
So like the X, Y, Z plot would generate,

06:20.370 --> 06:22.440
you could generate like say the same image

06:22.440 --> 06:27.420
but with five different values for CFG scale for example.
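
NOTE
A sketch of the idea behind that kind of plot: keep the prompt and seed fixed and vary one setting.
The dictionary keys and build_grid are hypothetical names for illustration, not the web UI's own script or API.
```python
from itertools import product
def build_grid(base: dict, axes: dict) -> list:
    """Return one settings dict per combination of the axis values."""
    names = list(axes)
    return [{**base, **dict(zip(names, combo))} for combo in product(*axes.values())]
base = {"prompt": "a cat wearing a cape", "seed": 1234, "steps": 20}
grid = build_grid(base, {"cfg_scale": [3, 5, 7, 9, 11]})
for settings in grid:
    print(settings["cfg_scale"], settings["seed"])
```
Passing a second axis, say steps, gives every combination of the two, which is what the extra plot axes are for.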

06:27.420 --> 06:30.930
So there's a bunch of different things you can do here.

06:30.930 --> 06:33.960
In the extras is where you do the scaling.

06:33.960 --> 06:37.740
So if you wanna upscale the image to a higher resolution

06:37.740 --> 06:40.170
or higher size, and then this is where you would do it.

06:40.170 --> 06:42.480
In extras you can also do batch processing

06:42.480 --> 06:45.313
for multiple images if you need to.

06:45.313 --> 06:48.930
Cool. That's the user interface.

06:48.930 --> 06:50.850
And if you ever get confused

06:50.850 --> 06:52.590
or you don't know what different things do,

06:52.590 --> 06:54.870
the wiki here is really good.

06:54.870 --> 06:58.140
Like it has a pretty detailed explanation

06:58.140 --> 07:00.210
of what these different things are.

07:00.210 --> 07:03.390
You know, check on that. But to install specifically,

07:03.390 --> 07:06.750
I'll just show you what I did to install on the Mac,

07:06.750 --> 07:08.520
and this is on the Apple Silicon.

07:08.520 --> 07:11.490
So it's the M1 and M2 Macs in particular.

07:11.490 --> 07:12.810
I used Homebrew, right?

07:12.810 --> 07:17.810
Once you have Homebrew, you run this in the terminal.

07:18.300 --> 07:21.870
So if I just press Ctrl+C to cancel this,

07:21.870 --> 07:26.870
if you run that, it will install cmake and protobuf

07:26.940 --> 07:29.340
and git and wget, if you don't have them,

07:29.340 --> 07:33.060
and then you just do a git clone when you're in your folder

07:33.060 --> 07:36.510
and that will just pull down from GitHub

07:36.510 --> 07:38.430
all the Stable Diffusion web UI code

07:38.430 --> 07:40.440
and that will give you that local folder

07:40.440 --> 07:42.330
that I was showing you before.
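
NOTE
A dry-run sketch of those two setup steps. Assumptions: the Homebrew package list only covers the
tools named in the video (the project's install page may list more), and the repository URL is the
project's GitHub address. Nothing is executed unless you pass dry_run=False.
```python
import shlex
import subprocess
# Assumed commands; check the project's own install instructions before running them for real.
SETUP_COMMANDS = [
    ["brew", "install", "cmake", "protobuf", "git", "wget"],
    ["git", "clone", "https://github.com/AUTOMATIC1111/stable-diffusion-webui"],
]
def run_setup(commands, dry_run=True):
    """Print each command; only execute it when dry_run is False."""
    for cmd in commands:
        print("$", shlex.join(cmd))
        if not dry_run:
            subprocess.run(cmd, check=True)
run_setup(SETUP_COMMANDS)  # dry run: prints the commands without executing them
```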

07:42.330 --> 07:46.650
Then you need to download the Stable Diffusion models.

07:46.650 --> 07:49.895
It has some examples here like where you can get them from.

07:49.895 --> 07:52.830
Version 1.5 is the one that I use.

07:52.830 --> 07:55.020
There's also a special one for inpainting,

07:55.020 --> 07:56.220
which is quite useful

07:56.220 --> 07:59.790
or there's version two if you want as well.

07:59.790 --> 08:01.800
And typically you just click it

08:01.800 --> 08:04.410
and it will take you out to somewhere like Hugging Face

08:04.410 --> 08:06.720
where you can, you know, download the model.

08:06.720 --> 08:08.670
The files and versions are in here

08:08.670 --> 08:10.920
and I think basically you just need this

08:10.920 --> 08:12.900
checkpoint file here.

08:12.900 --> 08:17.520
So if you click that, it'll start downloading. Okay.

08:17.520 --> 08:21.450
Then once you have, once you have that available,

08:21.450 --> 08:23.790
you drop that in your models folder

08:23.790 --> 08:25.800
and then you can just cd like change directory

08:25.800 --> 08:30.800
into stable diffusion web UI and then run bash webui.sh

08:30.989 --> 08:34.200
or just this ./webui.sh.

08:34.200 --> 08:36.180
And that will do everything for you.

08:36.180 --> 08:38.311
Now if you're running into any issues,

08:38.311 --> 08:42.083
there's actually quite a lot of information here.

08:42.083 --> 08:45.150
There's like all these issues you can search through

08:45.150 --> 08:47.760
and if you just search for the error message you're getting,

08:47.760 --> 08:50.760
quite often people have already written how to solve it.

08:50.760 --> 08:52.320
There's a lot of pull requests

08:52.320 --> 08:54.990
where they're trying to improve some of the UI

08:54.990 --> 08:56.940
and some of the issues that you run into.

08:56.940 --> 08:59.150
And there's like specific discussions.

08:59.150 --> 09:04.150
So that is how to get it running on Mac.

09:04.200 --> 09:08.310
On Windows, let's assume you have an NVIDIA GPU.

09:08.310 --> 09:09.720
Again, the process is similar.

09:09.720 --> 09:13.350
There's no Homebrew on Windows,

09:13.350 --> 09:16.410
but basically the way it works is you download this

09:16.410 --> 09:18.450
and extract the zip file

09:18.450 --> 09:21.060
and then you double click the update.bat

09:21.060 --> 09:22.020
and then that will get it

09:22.020 --> 09:23.700
to like the latest version essentially.

09:23.700 --> 09:25.350
And then there's a run.bat script.

09:25.350 --> 09:26.750
You just click that and that should launch it.

09:26.750 --> 09:29.250
It will download like all the files that you need.

09:29.250 --> 09:32.280
And then you should see this like running on URL.

09:32.280 --> 09:35.220
And it's the same URL that you get on a Mac.

09:35.220 --> 09:37.530
So that is, that should work on your computer

09:37.530 --> 09:39.060
if you have a GPU.

09:39.060 --> 09:42.150
But there's also like multiple methods if you have issues.

09:42.150 --> 09:44.940
So you can do the git clone method like you do on Mac.

09:44.940 --> 09:47.880
You know, you can also use it on Linux

09:47.880 --> 09:49.920
and there's different instructions by the way,

09:49.920 --> 09:54.420
if you have an NVIDIA GPU versus if you have an AMD GPU.

09:54.420 --> 09:57.030
So just pay attention to that.

09:57.030 --> 09:58.560
But in general, it's always the same thing.

09:58.560 --> 10:00.237
You get the code from GitHub

10:00.237 --> 10:03.634
and then run, like on Windows, the .bat file

10:03.634 --> 10:08.310
or the .sh file, depending on which operating system you're on.

10:08.310 --> 10:10.800
All right. Hopefully that's helpful.

10:10.800 --> 10:12.000
Because this is open source,

10:12.000 --> 10:13.650
you can literally just have a look

10:13.650 --> 10:15.934
and see how they do different things.

10:15.934 --> 10:19.730
If you want to know how do they do the, you know,

10:19.730 --> 10:23.190
how do they input all the prompts from a file,

10:23.190 --> 10:25.547
you can just see, you know, how they've done that.

10:25.547 --> 10:28.920
That's the code here, you can make updates to this

10:28.920 --> 10:30.270
if you want just locally

10:30.270 --> 10:32.820
or you could actually push 'em back into the main repository

10:32.820 --> 10:35.010
for other people to benefit from.

10:35.010 --> 10:36.570
So yeah, lots of fun.

10:36.570 --> 10:38.700
You're gonna learn a lot about image generation

10:38.700 --> 10:40.680
when you can just generate these images for free

10:40.680 --> 10:42.810
on your own computer and just leave it running.

10:42.810 --> 10:44.640
So you know, have fun with it,

10:44.640 --> 10:46.350
check out all the different parameters,

10:46.350 --> 10:48.360
look up the definitions of what they are.

10:48.360 --> 10:50.430
And that's really the best way to learn.

10:50.430 --> 10:52.470
It's just experimentation.

10:52.470 --> 10:54.370
All right. Hopefully that was helpful.
