WEBVTT

00:00.720 --> 00:02.560
Let's do annotated frames.

00:03.320 --> 00:03.640
All right.

00:03.640 --> 00:07.920
Let's do spatial prompting, or annotated frames, in Veo 3.

00:07.960 --> 00:11.040
And this is something I haven't really seen in the other video models.

00:11.080 --> 00:12.440
It's pretty special.

00:12.720 --> 00:18.400
I think just because Veo 3 is a really good model, it just understands what's actually in the

00:18.400 --> 00:19.760
frames of the image.

00:19.760 --> 00:24.160
And therefore you can use the frames of the image as a prompting surface area.

00:24.200 --> 00:28.120
So if I just upload this image here, it's one that I made before.

00:28.360 --> 00:33.480
I literally just went online for a simple... let me just show you this.

00:34.600 --> 00:44.080
A simple car image, and then I'll show you my changes.

00:45.280 --> 00:45.480
Yeah.

00:45.520 --> 00:49.640
So it's a simple image of just a road, right?

00:49.680 --> 00:51.760
And I want a car to drive down here.

00:51.760 --> 00:54.800
I want birds to fly in all directions as the car passes.

00:54.840 --> 00:57.040
And I want an airplane to fly that way.

00:57.280 --> 00:59.560
And I can put all that in the image.

00:59.560 --> 01:04.320
And Google will understand that as long as you put that as the first frame.

01:04.320 --> 01:07.040
And that's pretty cool that it knows how to do that.

01:07.040 --> 01:12.640
The reason why this is useful, by the way, is that quite often when you're doing video stuff, it's

01:12.640 --> 01:18.320
hard to explain, especially for complex scenes like where you want things to go or like where they're

01:18.320 --> 01:20.640
supposed to enter from, where they're going to be.

01:20.720 --> 01:26.760
Because if I said, for example, I want that tree to have birds fly out in all directions, it might

01:26.760 --> 01:30.000
think I mean one of these trees or this tree on the left.

01:30.200 --> 01:35.080
And I can highlight there and then point out where I want that to happen, which is really cool.

01:35.120 --> 01:35.360
All right.

01:35.400 --> 01:40.520
So let's say, "a car drives fast on the road."

01:41.440 --> 01:44.480
And we're just going to generate that. Note:

01:44.480 --> 01:45.960
We haven't said anything about birds.

01:45.960 --> 01:47.440
We haven't said anything about airplanes.

01:47.440 --> 01:50.160
So let's see if they show up in the final video.

01:51.040 --> 01:51.560
Okay.

01:51.600 --> 01:54.080
So we can see our image here, and the video's come back.

01:54.080 --> 01:58.880
So let's see if it followed our instructions that, remember, were not in the prompt.

01:58.920 --> 02:01.000
We just said, "car drives fast along the road."

02:01.800 --> 02:02.320
Okay.

02:03.120 --> 02:08.320
Pretty much, like, almost. The airplane was flying the wrong direction, and the birds fly out there.

02:09.560 --> 02:14.200
The drawing stayed on for a little bit longer than we wanted it to, but otherwise pretty good.

02:14.200 --> 02:16.240
And it's only going to get better from here.
