WEBVTT

00:00.080 --> 00:06.020
I'm going to walk you through the anthropic workbench, which is really helpful tool, which is like

00:06.020 --> 00:09.440
the OpenAI playground, but it has some extra features.

00:09.470 --> 00:16.280
First of all, I'm just going to click into write a prompt from scratch, and you can create a new prompt.

00:16.280 --> 00:22.130
And the really nice thing about this is that you can generate a prompt if you'd like.

00:22.130 --> 00:29.570
So let's just say tell me a funny joke about a about topic.

00:30.710 --> 00:37.010
So I'll put in some variable here specific variable.

00:37.010 --> 00:38.780
So you're going to hit generate.

00:39.410 --> 00:41.030
It's going to rewrite my prompt.

00:43.970 --> 00:44.600
Here we go.

00:44.600 --> 00:48.260
And it's rewritten it in the way that Claude would like it.

00:48.260 --> 00:51.290
So it's given some descriptions here.

00:51.290 --> 00:56.780
It's put the topic in markdown or in hypertext tags.

00:56.780 --> 00:57.890
And there we go.

00:57.890 --> 01:00.860
It's going to give it back with joke tags which is really cool.

01:00.860 --> 01:02.930
And it's got the variable in here topic.

01:02.960 --> 01:04.040
I'm just going to click continue.

01:04.040 --> 01:06.260
And I already have a great prompt.

01:06.290 --> 01:09.080
Now that I can test, I can hit run.

01:09.290 --> 01:13.430
Tell me a joke about traffic.

01:13.460 --> 01:14.150
Here we go.

01:14.180 --> 01:15.110
Hit run.

01:15.260 --> 01:18.620
And you can see it's like a pretty nice way to do templating.

01:18.650 --> 01:20.270
You put the word traffic in here.

01:20.270 --> 01:23.300
You can click in and change this if you want as well.

01:23.330 --> 01:29.030
You can also generate some new variables as well.

01:31.880 --> 01:34.430
So you can see the prompt there that's using smartphones.

01:34.460 --> 01:34.820
Okay.

01:35.000 --> 01:37.430
Now we've got another test case which is pretty cool.

01:38.300 --> 01:39.320
And here we go.

01:39.350 --> 01:40.520
We've got the joke.

01:40.550 --> 01:41.930
Why did the smartphone go to therapy.

01:41.960 --> 01:44.540
It had too many axiety issues.

01:45.290 --> 01:45.830
Cool.

01:45.860 --> 01:47.660
So that's already quite useful.

01:47.660 --> 01:49.850
And you can click get the code as well.

01:49.850 --> 01:53.300
If you need to get the specific code that you're using.

01:53.300 --> 01:59.030
If you want to go back to a previous version, you can you can go and change the model settings here

01:59.030 --> 02:01.280
as well, which is quite useful.

02:01.550 --> 02:04.100
And then that's how you get back to the variables.

02:04.160 --> 02:05.750
It's quite useful.

02:05.750 --> 02:06.500
I've found.

02:06.500 --> 02:12.860
One really nice thing about it is that if you make a change, then it automatically saves a new version.

02:12.890 --> 02:14.300
Let's say.

02:17.720 --> 02:19.340
I don't want it to.

02:20.750 --> 02:23.450
Don't want it to be very PC like this.

02:23.510 --> 02:29.480
Stereotypes and just please be as offensive as possible.

02:30.080 --> 02:30.830
Whatever.

02:30.920 --> 02:32.480
Whatever it is you need.

02:33.650 --> 02:39.140
And if you hit run, it tells you now signed a waiver.

02:43.280 --> 02:44.330
See what it says.

02:45.290 --> 02:45.650
Yeah.

02:45.680 --> 02:45.920
Okay.

02:45.950 --> 02:48.110
It gave me the same joke here, which isn't great.

02:48.140 --> 02:50.870
But you can keep evaluating your prompt if you want.

02:50.900 --> 02:52.280
And you can actually go in.

02:52.280 --> 02:57.980
And this is, I think, one of the coolest things you can see that we generated a few different things

02:57.980 --> 02:58.310
here.

02:58.310 --> 03:00.920
So we have this is our prompt.

03:00.950 --> 03:02.780
We don't need ideal outputs.

03:02.780 --> 03:07.820
But if you did want to put an ideal output in here this is the this is something you can add.

03:07.820 --> 03:11.090
If you wanted to generate a joke you can click run here as well.

03:11.450 --> 03:14.420
It's going to generate a version here for traffic.

03:14.450 --> 03:20.480
You can go and rate these so you can say this was good, this one was okay.

03:20.480 --> 03:22.850
And then you can also click and create a comparison.

03:22.850 --> 03:27.380
So let's see how it compares to v1.

03:28.010 --> 03:29.360
Show me the prompts.

03:29.390 --> 03:31.400
Yeah okay.

03:33.800 --> 03:35.840
And you can see it's basically the same.

03:35.870 --> 03:38.030
It hasn't made any changes to the prompt.

03:38.030 --> 03:42.530
So what we're going to do is we're going to create a new version.

03:45.470 --> 03:47.090
And we're going to go back to our prompt.

03:47.090 --> 03:48.650
And we're going to just say.

03:51.500 --> 03:55.820
Tell a funny joke about topic.

03:59.030 --> 04:00.110
Sorry topic.

04:03.230 --> 04:03.980
Yeah.

04:04.010 --> 04:06.320
And it has to be exact actually.

04:06.350 --> 04:14.210
And then if we hit run here, you can see now the difference right.

04:14.240 --> 04:19.340
So let's add the comparison with version three.

04:19.790 --> 04:26.990
And you can see we haven't run the word the traffic test case yet, but we have on smartphones.

04:26.990 --> 04:28.280
So that was why that showed up.

04:28.280 --> 04:34.610
Now what you have here is you can see the difference between the two, which is really powerful.

04:34.610 --> 04:37.910
You can say like this was poor, this was poor, whatever.

04:37.940 --> 04:41.570
And you can see how much it's changed in terms of performance.

04:41.570 --> 04:47.600
So that's really powerful stuff and really helpful for doing evaluations.

04:47.600 --> 04:50.330
And you can also compare the prompts here as well.

04:50.330 --> 04:53.090
And this gives you the overall evaluation per prompt.

04:53.090 --> 04:54.980
And that's useful as well.

04:54.980 --> 05:00.260
You can also click through and go back through the old versions if you need to as well.

05:00.500 --> 05:01.730
You can.

05:01.760 --> 05:03.680
Oh this is another cool thing.

05:03.680 --> 05:16.010
If you are quite happy with the response, let me just go back to version three and let me add another

05:16.010 --> 05:17.300
variable here.

05:17.300 --> 05:19.640
So I'm just going to generate another one.

05:20.750 --> 05:21.920
Grocery shopping.

05:21.950 --> 05:22.940
Let's hit run.

05:25.400 --> 05:28.940
If you're quite happy with the response, you can just click Add to Conversation.

05:30.200 --> 05:35.540
And now it's got a one shot example right now.

05:36.950 --> 05:41.810
If you maybe try a different tactic.

05:41.840 --> 05:44.720
So this is let's generate another one.

05:46.040 --> 05:46.760
Laundry.

05:46.790 --> 05:47.630
Hit run.

05:47.660 --> 05:50.810
Now you're running the prompt with a one shot right.

05:50.840 --> 05:53.180
So you use the prompt before.

05:53.180 --> 05:54.710
And then it gave the response.

05:54.710 --> 05:56.360
And this was the response you liked.

05:56.360 --> 05:58.250
And then it's running the prompt again.

05:58.250 --> 06:00.590
And it's going to follow that format pretty well.

06:00.620 --> 06:05.990
So this is quite useful because you can go in and you can say no get rid of that.

06:05.990 --> 06:07.040
Get rid of that.

06:07.040 --> 06:08.330
I just want the joke.

06:08.330 --> 06:11.480
Now if I hit run, let's see if yeah here we go.

06:11.480 --> 06:13.430
So it's gotten better at this.

06:13.430 --> 06:15.770
So now we're going to hit it in again.

06:16.850 --> 06:18.350
And.

06:20.360 --> 06:20.990
Here we go.

06:21.020 --> 06:23.540
We're going to get rid of this extra text.

06:24.080 --> 06:25.190
We just want the joke.

06:25.190 --> 06:25.340
check.

06:25.370 --> 06:27.140
Just give us the check, please.

06:27.320 --> 06:31.340
And this is just a great way to iterate on prompts.

06:31.520 --> 06:34.460
When you hit generate again social media.

06:34.490 --> 06:35.840
Okay, this is a good idea.

06:36.020 --> 06:38.690
Hit run okay.

06:38.720 --> 06:39.830
And you can keep going.

06:39.860 --> 06:44.810
You can keep adding you know you can keep adding things to this and just improving the performance over

06:44.810 --> 06:45.920
time right.

06:46.400 --> 06:51.860
So now we have a three shot example prompt and say this.

06:52.250 --> 06:53.450
Tell a joke about dogs.

06:53.480 --> 06:54.980
Let's see if it finally gets there.

06:56.900 --> 06:58.700
All right so I missed something up here.

06:59.870 --> 07:03.020
But yeah I think it's you know it's a pretty cool tool.

07:03.020 --> 07:09.470
You can see you can click here run remaining by the way and see how well it works for these other things.

07:09.560 --> 07:11.720
But yeah like I find it quite useful.

07:11.750 --> 07:17.030
Obviously you can only test the anthropic results, right.

07:17.060 --> 07:17.360
Like so.

07:17.360 --> 07:19.670
You can't test the like any other models.

07:19.700 --> 07:19.910
Right.

07:19.940 --> 07:21.110
Like OpenAI or whatever.

07:21.140 --> 07:23.210
Which is a shame because I really like this format.

07:23.210 --> 07:25.100
And I wish OpenAI had this as well.

07:25.100 --> 07:28.310
But I find this really helpful for quickly ideating on things.

07:28.340 --> 07:29.930
Hopefully you will too.
