WEBVTT

1
00:00:00.000 --> 00:00:03.120
Great work so far!

2
00:00:03.120 --> 00:00:03.120


3
00:00:03.120 --> 00:00:03.120


4
00:00:03.120 --> 00:00:08.320
Despite being renowned for its GPT series of chat models, OpenAI

5
00:00:08.320 --> 00:00:13.800
hosts a diverse array of models capable of performing many different tasks.

6
00:00:13.800 --> 00:00:13.840


7
00:00:13.840 --> 00:00:13.840


8
00:00:13.840 --> 00:00:20.280
In this course, we'll be focusing primarily on OpenAI's text-based models, but later in the

9
00:00:20.280 --> 00:00:27.480
course, we'll also take a look at the audio transcription and translation capabilities of the Whisper model.

10
00:00:27.480 --> 00:00:27.480


11
00:00:27.480 --> 00:00:27.520


12
00:00:27.520 --> 00:00:33.800
For now, however, let's take a closer look at the text capabilities available through the API.

13
00:00:33.800 --> 00:00:33.800


14
00:00:33.800 --> 00:00:33.800


15
00:00:33.800 --> 00:00:38.080
The Completions endpoint allows users to send a prompt and receive a

16
00:00:38.080 --> 00:00:43.600
model-generated response that attempts to complete the prompt in a likely and consistent way.

17
00:00:43.600 --> 00:00:43.600


18
00:00:43.600 --> 00:00:43.600


19
00:00:43.600 --> 00:00:51.680
Completions is used for so-called single-turn tasks, as there is a single prompt and response.

20
00:00:51.680 --> 00:00:57.760
However, the models available via this endpoint are extremely flexible, and are capable of answering

21
00:00:57.760 --> 00:01:06.680
questions, performing classification tasks, determining text sentiment, explaining complex topics, and much more.

22
00:01:06.680 --> 00:01:06.680


23
00:01:06.680 --> 00:01:06.680


24
00:01:06.680 --> 00:01:11.160
The Completions endpoint is available via the openai Completion class.

25
00:01:11.160 --> 00:01:11.160

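NOTE
A minimal sketch of the single-turn request described above, assuming the legacy Completions REST route and its JSON body fields model, prompt, and max_tokens. The helper name build_completion_request and the default model name are illustrative assumptions, not something named in the course, and nothing is sent over the network. In the legacy openai package (before version 1.0), the equivalent call would be openai.Completion.create(model=..., prompt=...).

NOTE
```python
import json
# Legacy Completions route; the request is only built here, never sent.
COMPLETIONS_URL = "https://api.openai.com/v1/completions"
def build_completion_request(prompt: str, model: str = "gpt-3.5-turbo-instruct", max_tokens: int = 50) -> dict:
    """Return the URL and JSON body for one single-turn completion request."""
    body = {"model": model, "prompt": prompt, "max_tokens": max_tokens}
    return {"url": COMPLETIONS_URL, "body": json.dumps(body)}
# One prompt in, one response out: a sentiment question as a single-turn task.
request = build_completion_request("Classify the sentiment of: 'I love this course!'")
print(request["body"])
```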

26
00:01:11.160 --> 00:01:12.280


27
00:01:12.280 --> 00:01:18.360
The Chat endpoint can be used for applications that require multi-turn tasks, including assisting

28
00:01:18.360 --> 00:01:26.200
with ideation, customer support questions, personalized tutoring, translating languages, and writing code.

29
00:01:26.200 --> 00:01:26.200


30
00:01:26.200 --> 00:01:26.200


31
00:01:26.200 --> 00:01:31.120
Chat models also perform well on single-turn tasks, so

32
00:01:31.120 --> 00:01:34.760
many applications are built on top of chat models for flexibility.

33
00:01:34.760 --> 00:01:35.920


34
00:01:35.920 --> 00:01:35.920


35
00:01:35.920 --> 00:01:40.120
The openai package provides the ChatCompletion class for accessing

36
00:01:40.120 --> 00:01:44.440
the Chat endpoint, but we'll cover how to use Chat later in the course.

37
00:01:44.440 --> 00:01:45.320


38
00:01:45.320 --> 00:01:45.320


39
00:01:45.320 --> 00:01:48.880
The Moderation endpoint is used to check whether content violates

40
00:01:48.880 --> 00:01:54.280
OpenAI's usage policies, such as inciting violence or promoting hate speech.

41
00:01:54.280 --> 00:01:54.280


42
00:01:54.280 --> 00:01:55.160


43
00:01:55.160 --> 00:01:59.200
The sensitivity of the model to different types of violations can be

44
00:01:59.200 --> 00:02:04.640
customized for specific use cases that may require stricter or more lenient moderation.

45
00:02:04.640 --> 00:02:04.640

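NOTE
One way to read "customizing sensitivity" is to apply your own thresholds to the per-category scores that a moderation result returns. The sketch below assumes scores between 0 and 1 in a category_scores mapping; the function name flag_categories, the example scores, and the 0.5 default are hypothetical values chosen for illustration.

NOTE
```python
def flag_categories(category_scores: dict, thresholds: dict, default: float = 0.5) -> list:
    """Return the sorted categories whose score meets or exceeds its threshold."""
    return sorted(name for name, score in category_scores.items() if score >= thresholds.get(name, default))
# Hypothetical scores shaped like the category_scores field of a moderation result.
scores = {"hate": 0.02, "violence": 0.35, "self-harm": 0.01}
# A stricter use case lowers the threshold; a more lenient one raises it.
strict = flag_categories(scores, {"violence": 0.2})
lenient = flag_categories(scores, {"violence": 0.8})
print(strict, lenient)
```

NOTE
Keeping the thresholds in a plain dictionary means each use case can carry its own settings without changing the checking logic.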

46
00:02:04.640 --> 00:02:05.960


47
00:02:05.960 --> 00:02:12.640
For business use cases with frequent requests to the API, it's important to manage usage across the business.

48
00:02:12.640 --> 00:02:12.640


49
00:02:12.640 --> 00:02:12.640


50
00:02:12.640 --> 00:02:22.880
Setting up an organization allows for better management of access, billing, and usage limits for the API.

51
00:02:22.880 --> 00:02:22.880


52
00:02:22.880 --> 00:02:22.880


53
00:02:22.880 --> 00:02:29.400
Users can be part of multiple organizations and attribute requests to specific organizations for billing.

54
00:02:29.400 --> 00:02:29.400


55
00:02:29.400 --> 00:02:29.400


56
00:02:29.400 --> 00:02:36.560
To attribute a request to a specific organization, we only need to add one more line of code.

57
00:02:36.560 --> 00:02:36.560


58
00:02:36.560 --> 00:02:36.560


59
00:02:36.560 --> 00:02:42.480
Like the API key, the organization ID can be set before the request.

60
00:02:42.480 --> 00:02:42.480

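NOTE
A minimal sketch of attributing a request to an organization, assuming the REST API's OpenAI-Organization header. The helper name build_headers and the placeholder credentials are illustrative assumptions. In the legacy openai package (before version 1.0), the one extra line would be openai.organization = "org-..." set alongside openai.api_key.

NOTE
```python
import os
def build_headers(api_key: str, organization=None) -> dict:
    """Build request headers, adding the organization only when one is given."""
    headers = {"Authorization": f"Bearer {api_key}", "Content-Type": "application/json"}
    if organization:
        headers["OpenAI-Organization"] = organization
    return headers
# Placeholder credentials; real values would come from the environment.
api_key = os.environ.get("OPENAI_API_KEY", "sk-placeholder")
headers = build_headers(api_key, organization="org-placeholder")
print(headers["OpenAI-Organization"])
```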

61
00:02:42.480 --> 00:02:42.480


62
00:02:42.480 --> 00:02:50.840
API rate limits are another key consideration for companies building features on the OpenAI API.

63
00:02:50.840 --> 00:02:55.120
Rate limits are a cap on the frequency and size of API requests.

64
00:02:55.120 --> 00:02:55.120

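NOTE
To make "a cap on the frequency and size of requests" concrete, here is a small client-side sketch that tracks requests and tokens in a rolling window. The real limits are enforced by the API itself; the class name RateLimitWindow and the limits used below are hypothetical and chosen only for illustration.

NOTE
```python
from collections import deque
class RateLimitWindow:
    """Track requests and tokens used within a rolling time window."""
    def __init__(self, max_requests: int, max_tokens: int, window_seconds: float = 60.0):
        self.max_requests = max_requests
        self.max_tokens = max_tokens
        self.window_seconds = window_seconds
        self.events = deque()  # (timestamp, tokens) pairs, oldest first
    def allow(self, now: float, tokens: int) -> bool:
        """Record the request and return True only if it fits under both caps."""
        # Drop events that have aged out of the window.
        while self.events and now - self.events[0][0] >= self.window_seconds:
            self.events.popleft()
        used_tokens = sum(t for _, t in self.events)
        if len(self.events) + 1 > self.max_requests or used_tokens + tokens > self.max_tokens:
            return False
        self.events.append((now, tokens))
        return True
# Hypothetical limits: 2 requests and 100 tokens per 60 seconds.
limiter = RateLimitWindow(max_requests=2, max_tokens=100)
print(limiter.allow(0.0, 40), limiter.allow(1.0, 40), limiter.allow(2.0, 10))
```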

65
00:02:55.120 --> 00:02:55.120


66
00:02:55.120 --> 00:02:58.840
They are put in place to ensure fair access to the API,

67
00:02:58.840 --> 00:03:03.920
prevent misuse, and also manage the infrastructure that supports the API.

68
00:03:03.920 --> 00:03:03.920


69
00:03:03.920 --> 00:03:03.920


70
00:03:03.920 --> 00:03:11.520
For many cases, this may not be an issue, but if a feature is exposed to a large user base, or the

71
00:03:11.520 --> 00:03:17.200
requests require generating large bodies of content, the feature could be at risk of hitting the rate limits.

72
00:03:17.200 --> 00:03:17.200


73
00:03:17.200 --> 00:03:17.200


74
00:03:17.200 --> 00:03:22.560
Much of this risk can be mitigated by, instead of running multiple features

75
00:03:22.560 --> 00:03:24.280
under the same organization,

76
00:03:24.280 --> 00:03:24.280


77
00:03:24.280 --> 00:03:27.960
having separate organizations for each business

78
00:03:27.960 --> 00:03:33.400
unit or product feature, depending on the number of features built on the OpenAI API.

79
00:03:33.400 --> 00:03:33.400


80
00:03:33.400 --> 00:03:33.400


81
00:03:33.400 --> 00:03:40.640
In this example, we've created separate OpenAI organizations for three different AI-powered

82
00:03:40.640 --> 00:03:47.920
features: a customer service chatbot, a content recommendation system, and a video transcript generator.

83
00:03:47.920 --> 00:03:47.920

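NOTE
The three-feature example above can be sketched as a lookup from feature to organization ID, so each request is attributed to its own organization. The feature keys, the organization IDs, and the helper name organization_for are all hypothetical placeholders.

NOTE
```python
# Hypothetical organization IDs, one per AI-powered feature.
FEATURE_ORGANIZATIONS = {
    "customer_service_chatbot": "org-chatbot-placeholder",
    "content_recommendation": "org-recsys-placeholder",
    "video_transcript_generator": "org-transcripts-placeholder",
}
def organization_for(feature: str) -> str:
    """Look up the organization ID a feature's requests should be attributed to."""
    try:
        return FEATURE_ORGANIZATIONS[feature]
    except KeyError:
        raise ValueError(f"No organization configured for feature: {feature}") from None
print(organization_for("customer_service_chatbot"))
```

NOTE
Raising an error for an unknown feature, rather than falling back to a shared organization, keeps usage and billing separated per feature.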

84
00:03:47.920 --> 00:03:47.920


85
00:03:47.920 --> 00:03:53.800
This distributes the requests to reduce the risk of hitting the rate limit.

86
00:03:53.800 --> 00:03:58.600
It also removes the single point of failure, so an issue with one organization,

87
00:03:58.600 --> 00:04:03.280
such as a billing issue, will only result in the failure of a single feature.

88
00:04:03.280 --> 00:04:10.280
Product-separated organizations also provide more granular insights into usage and billing.

89
00:04:10.280 --> 00:04:10.280


90
00:04:10.280 --> 00:04:10.280


91
00:04:10.280 --> 00:04:14.240
Let's practice!

