WEBVTT

00:00.180 --> 00:01.020
-: All right, welcome back.

00:01.020 --> 00:02.400
So in this video, we're gonna have a look

00:02.400 --> 00:04.320
at how you can automate product descriptions

00:04.320 --> 00:06.750
using OpenAI's Vision.

00:06.750 --> 00:08.130
We're gonna have a look at, see,

00:08.130 --> 00:09.480
the kind of package imports

00:09.480 --> 00:12.090
you'll be importing is OpenAI, os,

00:12.090 --> 00:13.440
and then we're gonna use text wrap

00:13.440 --> 00:16.920
to easily see the product descriptions that we get back.

00:16.920 --> 00:19.260
If you don't have an OpenAI API key,

00:19.260 --> 00:20.940
you'll need to insert one here

00:20.940 --> 00:23.100
and then we're gonna load a text_wrapper.

00:23.100 --> 00:25.200
And to start with, we'll use a prompt

00:25.200 --> 00:28.290
to specifically say, Act as a fashion retailer.

00:28.290 --> 00:29.520
You're responsible for writing

00:29.520 --> 00:31.380
effective product descriptions.

00:31.380 --> 00:33.870
We're gonna have a look at this black navy trousers

00:33.870 --> 00:35.760
and see how we can automatically generate

00:35.760 --> 00:37.290
product descriptions for that.

00:37.290 --> 00:40.410
So what we're gonna use is the OpenAI's Vision API,

00:40.410 --> 00:43.770
and you'll see we have to use this gpt-4-vision-preview.

00:43.770 --> 00:45.360
We've also got some interesting things.

00:45.360 --> 00:47.370
So you can see here there's this image_url,

00:47.370 --> 00:49.470
which is the image URL I just showed you.

00:49.470 --> 00:50.490
And we've got our prompt

00:50.490 --> 00:53.580
that's specifically telling the ChatGPT API,

00:53.580 --> 00:55.200
in this case the Vision API,

00:55.200 --> 00:57.000
that we want to get a product description

00:57.000 --> 00:58.740
from this specific image.

00:58.740 --> 01:00.510
Now, after loading the client,

01:00.510 --> 01:02.640
we can then see what kind of product description

01:02.640 --> 01:03.570
we get back.

01:03.570 --> 01:05.100
And you can see it takes around

01:05.100 --> 01:07.800
about maybe four to 10 seconds on average

01:07.800 --> 01:08.633
to use the vision API.

01:08.633 --> 01:09.840
It can take a little bit longer.

01:09.840 --> 01:10.980
As you can see here,

01:10.980 --> 01:13.410
we're clocking in at about 11.6 seconds.

01:13.410 --> 01:15.720
Now, we have got a product description

01:15.720 --> 01:18.480
for this specific pair of navy pants

01:18.480 --> 01:20.490
in the suit section of ASOS.

01:20.490 --> 01:21.570
But in particular,

01:21.570 --> 01:23.460
I'm not particularly happy with this prompt.

01:23.460 --> 01:25.350
And the reason why is it's quite large

01:25.350 --> 01:28.053
in terms of the product description.

01:29.010 --> 01:29.970
So we wanna limit this

01:29.970 --> 01:31.680
to maybe two to four sentences in length.

01:31.680 --> 01:35.310
So we we're gonna do is we're gonna add some examples.

01:35.310 --> 01:37.950
In this case we're not gonna have a product context,

01:37.950 --> 01:39.180
but that could be interesting

01:39.180 --> 01:41.040
if you needed to be more specific.

01:41.040 --> 01:43.170
That could be something you could add into the prompt.

01:43.170 --> 01:44.760
So you'll now see our improved prompt

01:44.760 --> 01:46.680
is acting as a fashion retailer.

01:46.680 --> 01:48.300
We're asking it to use all the images

01:48.300 --> 01:50.220
because ChatGPT vision API

01:50.220 --> 01:51.960
can accept multiple images

01:51.960 --> 01:53.910
and also we've got some examples

01:53.910 --> 01:55.260
which we've specified earlier

01:55.260 --> 01:57.750
that really helped clarify the length and tone.

01:57.750 --> 01:59.460
We've also got a rule section,

01:59.460 --> 02:00.900
so you could add a rule section here

02:00.900 --> 02:02.070
so that the product description

02:02.070 --> 02:04.440
must be between two to four length

02:04.440 --> 02:05.940
in terms of sentences

02:05.940 --> 02:08.460
and also it must be in a professional tone.

02:08.460 --> 02:10.590
We've also said use the following images.

02:10.590 --> 02:11.850
So let's have a look at this.

02:11.850 --> 02:13.410
So this is a single image,

02:13.410 --> 02:15.240
but you'll see that after this comes back

02:15.240 --> 02:17.280
already I've got a cash response here

02:17.280 --> 02:18.570
and here's another one.

02:18.570 --> 02:21.660
The actual product description length is a lot less

02:21.660 --> 02:23.760
and we've still got some good pits

02:23.760 --> 02:26.490
specifically talking here about the cargo trousers.

02:26.490 --> 02:28.440
So this is a different image.

02:28.440 --> 02:30.458
What we are looking at here

02:30.458 --> 02:32.100
is specifically the cargo trousers.

02:32.100 --> 02:34.440
And that can be interesting from our point of view

02:34.440 --> 02:35.610
because what we're trying to do

02:35.610 --> 02:37.860
is get a reduced amount of size.

02:37.860 --> 02:40.560
Now I want to go and take it another step further.

02:40.560 --> 02:42.870
And what I want to do is I want to say

02:42.870 --> 02:44.550
for this product context,

02:44.550 --> 02:47.760
the item is gonna be a cargo trousers in black

02:47.760 --> 02:49.200
and it's not gonna be a hoodie.

02:49.200 --> 02:51.990
And what we are then saying is act as a fashion retailer.

02:51.990 --> 02:53.550
And so same kind of prompt,

02:53.550 --> 02:56.010
we've got our examples in here, we've got our rules.

02:56.010 --> 02:58.650
The product context is now no longer an empty string.

02:58.650 --> 02:59.790
What's interesting about this

02:59.790 --> 03:02.520
is when you look at cargo pants in particular,

03:02.520 --> 03:05.430
you'll see that the cargo pants themselves,

03:05.430 --> 03:09.210
they actually have a top as well, a hoodie top.

03:09.210 --> 03:11.040
And we wanna make sure that ChatGPT

03:11.040 --> 03:13.290
is really not going to talk about the hoodie at all.

03:13.290 --> 03:15.360
And you'll see in this separate model picture,

03:15.360 --> 03:18.210
we really want to talk about these black cargo trousers.

03:18.210 --> 03:20.100
And so that's why you might have a section

03:20.100 --> 03:21.510
like the product context

03:21.510 --> 03:24.780
to say the product description is this cargo trousers,

03:24.780 --> 03:27.840
it's not the hoodie or some type of product title.

03:27.840 --> 03:29.940
And the other interesting thing as well

03:29.940 --> 03:31.710
is notice how we've put both the images

03:31.710 --> 03:34.770
that I just showed you into ChatGPT, vision API.

03:34.770 --> 03:36.860
So you can have multiple images

03:36.860 --> 03:40.200
to really refine those products description outputs.

03:40.200 --> 03:42.510
So let's have a look and see how long this takes.

03:42.510 --> 03:45.360
Probably should take about five to seven seconds.

03:45.360 --> 03:46.830
So again, you've got five seconds

03:46.830 --> 03:48.000
and if we have a look,

03:48.000 --> 03:49.710
it's roughly about the length that we want.

03:49.710 --> 03:52.530
You've got a nice bit here around the leisure wardrobe.

03:52.530 --> 03:54.600
We've got the ultimate blended comforts

03:54.600 --> 03:56.400
and perfect for those on-the-go moments

03:56.400 --> 03:57.270
or relaxing weekends.

03:57.270 --> 03:59.610
So I'm really happy with how this product description

03:59.610 --> 04:00.480
has turned out.

04:00.480 --> 04:02.040
So hopefully this gives you a bit of insight

04:02.040 --> 04:04.680
into how you could use this at e-commerce websites

04:04.680 --> 04:08.490
to automate your product descriptions using ChatGPT, API.