WEBVTT

00:00.000 --> 00:00.833
Hey, welcome back.

00:00.833 --> 00:02.070
So, in this video, we're gonna have a look

00:02.070 --> 00:05.100
at the Responses API in a little bit more detail.

00:05.100 --> 00:06.240
What I would recommend doing is

00:06.240 --> 00:07.980
going into the GitHub repository,

00:07.980 --> 00:10.440
going to the openai_features_and_functionality.

00:10.440 --> 00:11.310
You'll be working through

00:11.310 --> 00:13.800
the responses_api_and_messages notebook.

00:13.800 --> 00:16.020
Now, I'm gonna actually recommend that if you want,

00:16.020 --> 00:18.150
you can create a new notebook and you can follow along

00:18.150 --> 00:20.910
and get that muscle memory from typing some of the commands.

00:20.910 --> 00:22.110
We'll keep some of these commands in here,

00:22.110 --> 00:24.723
like the pip installs, so feel free to run those.

00:26.670 --> 00:30.090
We're installing tiktoken and openai.

00:30.090 --> 00:32.700
And we are also then gonna import the OpenAI package

00:32.700 --> 00:36.450
and we'll set the model name to gpt-4.1-mini.

00:36.450 --> 00:38.760
Now, at this point, you will need to set your api_key.

00:38.760 --> 00:39.960
So remember to do that.

00:39.960 --> 00:41.370
Okay, so the first thing we're gonna do is

00:41.370 --> 00:42.810
we're gonna have our message history.

00:42.810 --> 00:44.640
We're gonna create a Python list

00:44.640 --> 00:47.760
and in there we are going to have one specific entry,

00:47.760 --> 00:49.020
which will be a dictionary.

00:49.020 --> 00:54.020
And you're gonna have a "role": of "user", and a "content":.

00:54.240 --> 00:58.200
We're gonna put that to "tell me a joke!"

00:58.200 --> 00:59.400
Then we're gonna call the API

00:59.400 --> 01:04.320
with response = client.responses.create().

01:04.320 --> 01:07.380
And then here we're gonna put the model=MODEL.

01:07.380 --> 01:09.190
We're also going to put the input

01:10.470 --> 01:12.220
is gonna be equal to our history

01:13.110 --> 01:16.203
and we're gonna have this store=False.

01:17.550 --> 01:20.610
Now, to tell you what the store=False means,

01:20.610 --> 01:24.210
it means that the message history won't be saved for us automatically.

01:24.210 --> 01:26.980
So, now, we can print the response.output_text

01:31.380 --> 01:33.450
and you'll see that you get back a response.

01:33.450 --> 01:34.290
So, here we go.

01:34.290 --> 01:35.670
Sure! Here's one for you:

01:35.670 --> 01:37.200
Why did the scarecrow win an award?

01:37.200 --> 01:40.320
Because he was outstanding in his field. Okay, great.

01:40.320 --> 01:42.458
If we are doing the store=False, this means

01:42.458 --> 01:46.170
that the message history won't be stored on OpenAI's servers.

01:46.170 --> 01:47.730
Okay, so the next thing that we wanna do is

01:47.730 --> 01:51.390
we want to update the message history

01:51.390 --> 01:55.800
with the assistant output/message.

01:55.800 --> 01:59.040
And we can do that by updating our history variable

01:59.040 --> 02:00.810
with the += operator.

02:00.810 --> 02:02.460
And we'll do a list comprehension

02:02.460 --> 02:05.730
over all of the response.output.

02:05.730 --> 02:10.380
So we're gonna go for el in response.output.

02:10.380 --> 02:13.170
We are then going to create a dictionary

02:13.170 --> 02:16.830
for each of these, and we'll have the "role":

02:16.830 --> 02:19.470
being equal to el.role.

02:19.470 --> 02:23.283
And we will have the "content": being equal to el.content.
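
NOTE
Editor's sketch of the history update just described, not the notebook's exact code.
The helper name extend_history and the fake_output stand-in are hypothetical, and the
sketch assumes every item in response.output has .role and .content attributes.
```python
from types import SimpleNamespace
def extend_history(history, output_items):
    """Append one {"role", "content"} dict per output item to history, in place."""
    history += [{"role": el.role, "content": el.content} for el in output_items]
    return history
# Stand-in for response.output so the sketch runs without an API call (assumption).
fake_output = [SimpleNamespace(role="assistant", content="Why did the scarecrow win an award?")]
history = [{"role": "user", "content": "tell me a joke!"}]
extend_history(history, fake_output)
print(history[-1]["role"])
```
With a real response, the call would be extend_history(history, response.output).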

02:24.540 --> 02:28.680
After this, then we need to ask for another joke.

02:28.680 --> 02:31.500
So we'll do history.append

02:31.500 --> 02:36.193
and we will put in here "role": "user", "content":

02:39.907 --> 02:41.427
"tell me another".

02:44.220 --> 02:47.290
We will then do a second response

02:48.480 --> 02:50.040
by doing a new variable called

02:50.040 --> 02:55.040
second_response = client.responses.create.

02:56.040 --> 02:58.830
We're gonna put the model=MODEL.

02:58.830 --> 03:03.207
We're gonna put the input=history

03:03.207 --> 03:06.423
and we're also going to put store=False.

03:07.830 --> 03:11.007
Then we're going to print "Second joke",

03:13.446 --> 03:15.779
second_response.output_text.

03:20.370 --> 03:21.630
So we've got that joke

03:21.630 --> 03:23.583
and then we have our second joke here.

03:24.420 --> 03:26.040
Cool, so you'll see here

03:26.040 --> 03:28.320
that when you are using the store=False,

03:28.320 --> 03:31.050
you need to make sure that you update the message history

03:31.050 --> 03:33.960
with the previous assistant message.

03:33.960 --> 03:36.510
We also then added another user message

03:36.510 --> 03:39.330
and then we made our second_response call to OpenAI.

03:39.330 --> 03:41.220
If you're doing the store=False,

03:41.220 --> 03:43.650
you will have to handle the message history yourself.
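
NOTE
Editor's sketch of one full store=False turn, under stated assumptions. The names
stateless_turn, FakeResponses and fake_client are hypothetical; the fake client only
mimics the shape of client.responses.create so the example runs offline.
```python
from types import SimpleNamespace
def stateless_turn(client, model, history, user_text):
    """Send one user message with store=False and record the reply in history."""
    history.append({"role": "user", "content": user_text})
    response = client.responses.create(model=model, input=history, store=False)
    # Nothing is kept server-side, so we add the assistant output to our own list.
    history += [{"role": el.role, "content": el.content} for el in response.output]
    return response.output_text
class FakeResponses:
    """Hypothetical stand-in for client.responses; it makes no network call."""
    def create(self, model, input, store):
        n = sum(1 for m in input if m["role"] == "user")
        text = f"joke {n}"
        item = SimpleNamespace(role="assistant", content=text)
        return SimpleNamespace(output=[item], output_text=text)
fake_client = SimpleNamespace(responses=FakeResponses())
history = []
first = stateless_turn(fake_client, "gpt-4.1-mini", history, "tell me a joke!")
second = stateless_turn(fake_client, "gpt-4.1-mini", history, "tell me another")
print(first, second, len(history))
```
To try it against the real API, pass an OpenAI() client in place of fake_client.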

03:43.650 --> 03:46.080
Okay, so let's say we're gonna store the results.

03:46.080 --> 03:46.913
This changes things.

03:46.913 --> 03:49.290
So now, we're gonna use the previous_response_id

03:49.290 --> 03:53.310
because now we're gonna store everything on OpenAI's servers.

03:53.310 --> 03:54.720
So, for example, we'll do

03:54.720 --> 03:58.830
response = client.responses.create().

03:58.830 --> 04:01.023
We're gonna do model=MODEL,

04:01.920 --> 04:05.157
and then we'll say input='tell me a joke'.

04:06.690 --> 04:09.150
After this, I want you to then print out

04:09.150 --> 04:11.043
the response.output_text.

04:12.240 --> 04:15.360
So, once you've done that, then you'll get your joke.

04:15.360 --> 04:18.930
Now, instead of actually storing it in our own Python list,

04:18.930 --> 04:22.410
we're gonna have the second_response =

04:22.410 --> 04:25.590
client.responses.create().

04:25.590 --> 04:27.690
And then in here, we're then gonna have

04:27.690 --> 04:30.570
the model=MODEL, just like we've been doing before.

04:30.570 --> 04:32.970
And that model variable is defined at the top.

04:32.970 --> 04:36.210
We're gonna have a new thing called the previous_response_id

04:36.210 --> 04:38.610
as a new parameter into this method.

04:38.610 --> 04:42.330
And then what we're gonna do is provide the old response.id.

04:42.330 --> 04:43.950
So, when you get a response back

04:43.950 --> 04:47.940
from client.responses.create, you have a .id on there,

04:47.940 --> 04:50.943
which is a unique identifier for this response.

04:52.020 --> 04:54.720
So, we're gonna have a new input, which is gonna be a list,

04:54.720 --> 04:58.830
and we'll say the "role": is equal to "user",

04:58.830 --> 05:03.830
and the "content": is "explain why this is funny".

05:04.020 --> 05:07.210
And then we're then gonna print out the second response

05:08.070 --> 05:09.423
here just below.

05:11.130 --> 05:13.710
And we're gonna do the .output_text from that.

05:13.710 --> 05:15.420
Cool. And then rerun this.

05:15.420 --> 05:18.600
And what you'll see is we get a joke being made,

05:18.600 --> 05:22.260
and then after that we end up with an explanation

05:22.260 --> 05:24.810
of why this joke is funny, right?

05:24.810 --> 05:26.130
So, the key point here is

05:26.130 --> 05:28.500
if you are not using the store=False,

05:28.500 --> 05:31.140
you can reference the old response.id

05:31.140 --> 05:33.420
and that will give you a unique identifier

05:33.420 --> 05:36.240
so that you can manually stitch conversations

05:36.240 --> 05:38.700
between different responses calls

05:38.700 --> 05:40.920
without having to have a Python list

05:40.920 --> 05:42.270
and store the state yourself.
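
NOTE
Editor's sketch of chaining with previous_response_id, under stated assumptions. The
names follow_up, FakeResponses and fake_client are hypothetical; the fake client just
records the keyword arguments it receives so the example runs offline.
```python
from types import SimpleNamespace
def follow_up(client, model, previous_response, user_text):
    """Continue a stored response by id instead of resending the whole history."""
    return client.responses.create(
        model=model,
        previous_response_id=previous_response.id,
        input=[{"role": "user", "content": user_text}],
    )
class FakeResponses:
    """Hypothetical stand-in for client.responses; it makes no network call."""
    def __init__(self):
        self.calls = []
    def create(self, **kwargs):
        self.calls.append(kwargs)
        n = len(self.calls)
        return SimpleNamespace(id=f"resp_{n}", output_text=f"reply {n}")
fake_client = SimpleNamespace(responses=FakeResponses())
response = fake_client.responses.create(model="gpt-4.1-mini", input="tell me a joke")
second_response = follow_up(fake_client, "gpt-4.1-mini", response, "explain why this is funny")
print(second_response.output_text)
```
Note that store=False is not passed here, since the first response has to be stored
for its id to be usable in the second call.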

05:42.270 --> 05:43.680
In the next video, we'll have a look

05:43.680 --> 05:45.690
at the chat completions API

05:45.690 --> 05:48.360
and how that differs from the Responses API.

05:48.360 --> 05:51.030
You might still sometimes use the chat completions API,

05:51.030 --> 05:52.290
because OpenAI have said

05:52.290 --> 05:55.080
that they will support this endpoint indefinitely.

05:55.080 --> 05:56.930
Cool, I'll see you in the next video.