WEBVTT

00:00.330 --> 00:03.000
-: I'm gonna show you how to use ChatGPT

00:03.000 --> 00:05.280
to automate your prompting.

00:05.280 --> 00:06.113
And that's crazy

00:06.113 --> 00:07.560
because we're using prompting

00:07.560 --> 00:09.150
to automate every other industry.

00:09.150 --> 00:12.450
Why not? If ChatGPT is good at all these other tasks,

00:12.450 --> 00:15.300
why wouldn't it be good at prompt engineering as well?

00:15.300 --> 00:19.050
So this is a screenshot from a paper here

00:19.050 --> 00:22.440
and they called it Automatic Prompt Engineer

00:22.440 --> 00:26.447
and they basically used, I think it was ChatGPT

00:26.447 --> 00:30.240
or GPT-3 back then to basically infer

00:30.240 --> 00:34.170
what the instructions were if they gave example responses

00:34.170 --> 00:37.500
and then if they specific instructions,

00:37.500 --> 00:38.940
what would the instructions be.

00:38.940 --> 00:41.880
And so it comes back with the log probabilities.

00:41.880 --> 00:44.760
So these are the different potential responses

00:44.760 --> 00:46.860
and then it basically then ranks them

00:46.860 --> 00:49.200
and they found that it was at a human level

00:49.200 --> 00:51.240
in terms of prompt engineering.

00:51.240 --> 00:53.310
Then there've been a few other papers

00:53.310 --> 00:54.720
that have come out since then

00:54.720 --> 00:59.460
including one that found a better prompt

00:59.460 --> 01:01.440
than let's think step by step.

01:01.440 --> 01:03.570
It does pretty well there in terms of accuracy.

01:03.570 --> 01:06.210
And then there's a few other ones as well.

01:06.210 --> 01:08.640
If you remember there was going around the internet,

01:08.640 --> 01:10.800
this whole like take a deep breath

01:10.800 --> 01:14.550
and then think step by step prompt that people were sharing

01:14.550 --> 01:15.930
that was also using

01:15.930 --> 01:18.270
like an automatic prompt engineer option.

01:18.270 --> 01:21.600
So anyway, this is like pretty interesting.

01:21.600 --> 01:23.280
I think in some cases it might work

01:23.280 --> 01:25.230
better than humans at prompt engineering.

01:25.230 --> 01:27.150
What I found is at the very least,

01:27.150 --> 01:28.650
it's very good at getting you

01:28.650 --> 01:30.750
up to an above-average prompt straight away

01:30.750 --> 01:33.090
without you having to think very much.

01:33.090 --> 01:35.760
Let me show you, I have a couple of these

01:35.760 --> 01:39.180
but I'll start with this one.

01:39.180 --> 01:43.140
I put this together based on a lot of testing,

01:43.140 --> 01:45.150
but what I found was useful

01:45.150 --> 01:47.820
in the system prompt is to actually tell it

01:47.820 --> 01:50.100
you're an expert in prompt engineering.

01:50.100 --> 01:52.470
Previously that wasn't very helpful

01:52.470 --> 01:54.780
but now it knows what prompt engineering is.

01:54.780 --> 01:56.040
That's with the latest updates.

01:56.040 --> 01:58.920
It's updated I think to April last year.

01:58.920 --> 02:00.780
So now it knows what prompt engineering is

02:00.780 --> 02:02.130
and previously this didn't work

02:02.130 --> 02:04.710
but now it does, and we just basically tell it

02:04.710 --> 02:06.330
that you'll be given a prompt template,

02:06.330 --> 02:07.800
one or more test cases.

02:07.800 --> 02:10.127
Your job is to optimize the prompt template

02:10.127 --> 02:11.580
using prompt engineering best practices.

02:11.580 --> 02:13.920
What I've done here is I've gone a little bit further

02:13.920 --> 02:16.110
and given it the best practices

02:16.110 --> 02:17.790
and I've told it what prompt engineering is

02:17.790 --> 02:19.380
just to get it in the right part

02:19.380 --> 02:22.950
of latent space, making sure it's not thinking

02:22.950 --> 02:24.780
of some other prompt engineering.

02:24.780 --> 02:27.840
And then these are the actual best practices

02:27.840 --> 02:31.410
and these are ones that I came up with as part of my book

02:31.410 --> 02:32.730
that I'm writing for O'Reilly Media.

02:32.730 --> 02:35.820
So these are kind of the first three principles.

02:35.820 --> 02:37.350
I found that they're quite useful.

02:37.350 --> 02:39.090
And then the other thing you have to put in here

02:39.090 --> 02:41.220
is the standard stuff like respond only

02:41.220 --> 02:42.840
with your optimized prompt and nothing else,

02:42.840 --> 02:44.880
don't cheat by including test cases

02:44.880 --> 02:46.140
in the prompt, that sort of thing.

02:46.140 --> 02:49.440
The place I got this, well at least the basics

02:49.440 --> 02:51.180
of this without the principles

02:51.180 --> 02:54.190
is there's a tool called Prompts Royale

02:55.462 --> 02:57.690
and I think that they reference another paper

02:57.690 --> 03:00.690
that kind of uses this, I looked into their code,

03:00.690 --> 03:03.870
which is open source on GitHub and saw

03:03.870 --> 03:07.470
what was working for them for system messages.

03:07.470 --> 03:09.240
Feel free to take a look at that as well.

03:09.240 --> 03:12.270
They've got a bit of a different approach to this.

03:12.270 --> 03:15.720
Anyways, what I found is if I put in the prompt template

03:15.720 --> 03:17.310
and what's different with mine as well

03:17.310 --> 03:20.100
is I put like these actual variables in here,

03:20.100 --> 03:23.010
which is I think really helpful for me at least.

03:23.010 --> 03:25.350
And the way this would work is if I want a list

03:25.350 --> 03:29.400
of product names for this product,

03:29.400 --> 03:31.530
and then these are like some examples.

03:31.530 --> 03:34.890
What I found with the test cases

03:34.890 --> 03:37.710
is that you couldn't put them in JSON very easily

03:37.710 --> 03:40.260
so you had to put them in more of a YAML format

03:40.260 --> 03:41.220
or markdown format.

03:41.220 --> 03:43.950
Here we have the product description as a variable

03:43.950 --> 03:46.020
in the template and we were just trying to give

03:46.020 --> 03:47.700
kind of example values.

03:47.700 --> 03:51.240
So here I've put like a shoe that can fit any foot size

03:51.240 --> 03:52.952
and then the response would be these.

03:52.952 --> 03:55.410
And this is one-shot,

03:55.410 --> 03:57.480
so I've just given it one example.

03:57.480 --> 04:00.330
Then the other thing that I found worked really well

04:00.330 --> 04:02.910
was adding these evaluation criteria.

04:02.910 --> 04:05.670
You can put whatever evaluation criteria you want in here.

04:05.670 --> 04:07.590
It actually handles multiple if you want,

04:07.590 --> 04:09.330
but I've just used the default which is,

04:09.330 --> 04:11.940
is the submission helpful, insightful and appropriate?

04:11.940 --> 04:13.860
That I got from LangChain.

04:13.860 --> 04:16.980
That's the default evaluation criteria in LangChain

04:16.980 --> 04:18.060
when using their evals.

04:18.060 --> 04:20.790
I also reminded it that prompt engineering

04:20.790 --> 04:22.410
best practices should be used

04:22.410 --> 04:25.530
and then I found it's very helpful to include

04:25.530 --> 04:27.060
in here any variables,

04:27.060 --> 04:29.520
and you could build this programmatically like I have this

04:29.520 --> 04:32.940
for one of my libraries I use for testing locally

04:32.940 --> 04:35.190
where it just automatically builds this prompt up for me

04:35.190 --> 04:37.560
and adds them based on anything

04:37.560 --> 04:39.060
that's in curly brackets here.
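
NOTE
A minimal sketch of the "build this programmatically" idea above, not the
speaker's actual library. The helper names find_variables and
build_optimizer_prompt, and the exact prompt layout, are assumptions made
for illustration. The only detail taken from the talk is that variables are
whatever sits inside curly brackets in the template.
```python
import re
def find_variables(template: str) -> list[str]:
    """Return the names inside {curly_brackets}, in order, without repeats."""
    names = []
    for name in re.findall(r"\{(\w+)\}", template):
        if name not in names:
            names.append(name)
    return names
def build_optimizer_prompt(template: str, criteria: str) -> str:
    """Assemble a user message for the optimizer (hypothetical layout)."""
    variables = ", ".join("{" + name + "}" for name in find_variables(template))
    lines = [
        "Prompt template:",
        template,
        "Evaluation criteria:",
        criteria,
        "Variables that must stay in the template: " + variables,
        "Respond only with your optimized prompt template.",
    ]
    return "\n".join(lines)
```
Calling build_optimizer_prompt with a template that mentions
{product_description} would list that variable automatically, so it does
not have to be typed by hand each time.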

04:39.930 --> 04:40.920
Cool, so respond only

04:40.920 --> 04:42.540
with your optimized prompt template.

04:42.540 --> 04:45.303
I'm just gonna hit submit, I found it only really works

04:45.303 --> 04:48.540
with GPT-4 and only works quite well

04:48.540 --> 04:50.160
with temperature of one.

04:50.160 --> 04:52.800
At a temperature higher than that, it starts to get a bit crazy.

04:52.800 --> 04:54.720
At a lower temperature, it tends to get a bit boring,

04:54.720 --> 04:56.730
but that's usually what works for me.

04:56.730 --> 04:57.810
Here we go, it's pretty smart

04:57.810 --> 05:00.060
because you see here, imagine you're the head

05:00.060 --> 05:01.710
of a creative marketing team tasked with coming up with

05:01.710 --> 05:04.590
catchy and appealing names, the product is blank,

05:04.590 --> 05:06.330
you're looking for unique and innovative names

05:06.330 --> 05:09.750
akin to how brand names like Google, Nike or Tesla

05:09.750 --> 05:10.950
the respective industry.

05:10.950 --> 05:11.880
So it's used some examples here

05:11.880 --> 05:15.150
and it hasn't given a structure in the prompt

05:15.150 --> 05:17.400
but I think it's done a pretty good job

05:17.400 --> 05:22.350
and we could take this, just gonna copy this

05:22.350 --> 05:24.300
and then we're gonna set up another one

05:26.953 --> 05:29.273
and then we're just gonna put this as a pair

05:31.110 --> 05:36.110
of shoes that fit any foot size

05:36.390 --> 05:38.370
that was the product we wanted.

05:38.370 --> 05:40.330
And then I'm just gonna submit there

05:41.370 --> 05:44.970
and it comes up with pretty good names, so there you go.

05:44.970 --> 05:46.140
It's created a prompt.

05:46.140 --> 05:49.590
Now I didn't stop there, we have another template

05:49.590 --> 05:51.750
'cause I found that you don't always have

05:51.750 --> 05:54.690
like test cases and sometimes you just want to be able

05:54.690 --> 05:58.260
to just do it just from a task specifically.

05:58.260 --> 06:00.870
This one also follows something from Prompts Royale

06:00.870 --> 06:05.040
but you can see that it's using very similar system message

06:05.040 --> 06:07.320
and then we've still got the different

06:07.320 --> 06:08.700
best practices there.

06:08.700 --> 06:10.236
But the difference here is

06:10.236 --> 06:13.050
that we say the prompt template must take context

06:13.050 --> 06:14.940
from the user in the form of relevant input variables

06:14.940 --> 06:17.460
surrounded by curly brackets, i.e., this.

06:17.460 --> 06:18.900
That was like the hardest part to get.

06:18.900 --> 06:21.390
Once it does that, it does a pretty good job.

06:21.390 --> 06:23.820
And the other thing I had to kind of mention

06:23.820 --> 06:25.920
is these placeholders should be labeled in the template

06:25.920 --> 06:27.690
as they will be replaced with the values.

06:27.690 --> 06:29.790
'cause one thing I found that it would do

06:29.790 --> 06:32.130
is, here it says please take the product category,

06:32.130 --> 06:34.080
that's done correctly, that's labeled it.

06:34.080 --> 06:36.090
But sometimes it would just say please take the,

06:36.090 --> 06:39.510
and then product category and then it wouldn't label it.

06:39.510 --> 06:42.330
So when we insert a product category in here,

06:42.330 --> 06:44.160
it wouldn't work, right,

06:44.160 --> 06:46.680
because it would be referencing product category down here

06:46.680 --> 06:48.750
but that text wouldn't be in here anymore,

06:48.750 --> 06:50.545
if that makes sense.
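
NOTE
A rough sketch of the labeling problem described above, under assumed
names: missing_placeholders and fill are hypothetical helpers, and the two
example templates are invented to mirror the "please take the product
category" case. It checks that a generated template still contains each
expected {placeholder} before values are substituted in.
```python
import re
def missing_placeholders(template, expected):
    """List the expected variable names that have no {name} in the template."""
    found = set(re.findall(r"\{(\w+)\}", template))
    return [name for name in expected if name not in found]
def fill(template, values):
    """Replace each {name} with its value using plain string replacement."""
    for name, value in values.items():
        template = template.replace("{" + name + "}", value)
    return template
# Labeled: the placeholder survives, so the value has somewhere to go.
labeled = "Please take the product category: {product_category} and suggest names."
# Unlabeled: the text mentions the category but has no placeholder to replace.
unlabeled = "Please take the product category and suggest names."
```
With the labeled template, fill puts "shoes" where {product_category} was.
With the unlabeled one, missing_placeholders reports product_category, which
is a signal to regenerate or fix the template before using it.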

06:50.545 --> 06:52.057
Just to show you an example there.

06:52.057 --> 06:54.280
So I'm just gonna get rid of this

06:56.100 --> 07:00.970
and then so if you put in product category, shoes

07:03.030 --> 07:05.553
and then we're gonna say target audience.

07:06.912 --> 07:08.914
(person typing)

07:08.914 --> 07:09.747
Say.

07:15.027 --> 07:19.270
and then a distinct feature is just say

07:26.760 --> 07:30.030
extra comfortable and then it knows

07:30.030 --> 07:33.450
because it can see that like the target audience is up here

07:33.450 --> 07:35.880
so then it can infer from these examples

07:35.880 --> 07:37.456
if that makes sense.

07:37.456 --> 07:39.487
That was. And then here we've got some more stuff.

07:39.487 --> 07:41.536
We have to repeat the product category here

07:41.536 --> 07:45.490
in this template, put it in shoes

07:49.680 --> 07:52.690
And then target audience say at home

07:53.580 --> 07:56.020
and then this would

07:56.889 --> 08:00.660
be cool.

08:00.660 --> 08:02.550
So see how that works.

08:02.550 --> 08:04.000
Okay, cozy comfort homemaker,

08:05.689 --> 08:07.273
oh god, it's called it a homemaker.

08:07.273 --> 08:08.474
That's pretty terrible.

08:08.474 --> 08:09.674
But you get the idea.

08:09.674 --> 08:11.670
This was a task to generate product names.

08:11.670 --> 08:13.890
We could change whatever task we want here.

08:13.890 --> 08:17.520
This could be rewrite some text

08:17.520 --> 08:21.323
in the style of the author

08:22.380 --> 08:24.693
and then see what template it comes up with.

08:28.150 --> 08:30.300
And you can see it's using all the different principles.

08:30.300 --> 08:32.790
The really cool thing here is that it actually inserts

08:32.790 --> 08:35.670
these variables in so it makes it a real template

08:35.670 --> 08:37.870
and it does a pretty good job of doing that.

08:42.990 --> 08:45.360
Cool, yeah, hopefully that will make sense.

08:45.360 --> 08:47.400
There's a lot of prompt engineering principles

08:47.400 --> 08:49.170
baked into this as well

08:49.170 --> 08:50.910
and you can see that it does a good job

08:50.910 --> 08:53.160
because we've given it a lot of handholding here

08:53.160 --> 08:54.917
and I think I used to try and do this back

08:54.917 --> 08:58.169
before the update and because it didn't really know

08:58.169 --> 08:59.640
what prompt engineering was,

08:59.640 --> 09:01.140
it wasn't particularly good at it.

09:01.140 --> 09:03.420
But now it does because it's in the training data.

09:03.420 --> 09:05.460
So this opens up, I think a lot

09:05.460 --> 09:06.870
of prompt engineers

09:06.870 --> 09:08.400
haven't really tried this again recently.

09:08.400 --> 09:11.520
And this opens up quite a lot of stuff you can do.

09:11.520 --> 09:14.250
You can also, you can go even further, like you can get it

09:14.250 --> 09:17.160
to generate cases like these things that go in here.

09:17.160 --> 09:19.311
You can get it to generate good examples.

09:19.311 --> 09:21.900
You could also get it to rate the responses as well.

09:21.900 --> 09:23.640
So that's something I've been working on

09:23.640 --> 09:27.065
for my thumb library, which should be out soon.

09:27.065 --> 09:28.915
All right, hopefully that was useful.
