Welcome back! In this video, we'll learn to unlock the chat capabilities from the OpenAI API, which underpin popular applications like ChatGPT. Let's get started!

The Chat Completion endpoint allows us to have multi-turn conversations with a model, so we can build on our previous prompts depending on how the model responds. Additionally, chat models often perform just as well as Completions models on single-turn tasks.

Compared to the Completions endpoint, Chat Completion allows for better customizability of the response through the use of roles, which we'll discuss in a moment.

Finally, there's currently a substantial cost benefit to using chat models over completions.

The cost benefit and flexibility of being able to have multi-turn conversations, means that developers quite often choose a chat model when building applications on the OpenAI API.

Roles are at the heart of how chat models function.

There are three roles: the system, the user, and the assistant.

The system role allows the user to specify a message to control the behavior of the assistant. For example, for a customer service chatbot, we could provide a system message stating that the assistant is a polite and helpful customer service assistant.

The user provides an instruction to the assistant, and the assistant responds.

One of the interesting things about chat models, is that the user can also provide assistant messages. These are often utilized to provide examples to help the model better understand the user's desired response.

We'll discuss multi-turn conversations in the next video; for now, we'll get familiar with using chat models for single-turn tasks.

Making a request to the ChatCompletion endpoint is very similar to the Completions endpoint. Instead of calling the create method on the Completion class, we call it on ChatCompletion class; there's also different models for these two endpoints.

The main difference is in the way that prompts are provided. For the Completions endpoint, the prompt is passed as a string for the model to complete. Due to the greater customizability of chat models through the use of roles, the prompt is provided in a different way.

The prompt is set up by creating a list of dictionaries, where each dictionary provides content to one of the roles.

The messages often start with the system role followed by alternating user and assistant messages.

The system role here instructs the assistant to act as a data science tutor that speaks concisely.

Let's add these messages into the request code and print the response.

Like the Completions endpoint, we receive a JSON response, where the assistant's text response is nested inside the choices key.

To extract the text, we use very similar dictionary and list subsetting as we did for the Completions endpoint, except instead of the message being attached to a key called text, it is nested inside another dictionary inside the message key.

To extract the text from the choices key at the top,

we extract the value from the key with square brackets, which returns a list with a single element.

Next, we'll subset the first element from that list,

which returns a nested dictionary.

Finally, we can access the content by subsetting the values from first the message key, and then the content key.

We can see that the assistant stayed true to the system message - only using two sentences in its concise explanation.

In the next video, you'll learn how to extend this to multi-turn tasks, but for now, time for some practice!