WEBVTT

00:00.000 --> 00:01.140
-: Hey, welcome, and in this lesson,

00:01.140 --> 00:03.660
we're gonna have a look at orchestrator workers.

00:03.660 --> 00:05.131
This is basically a common principle

00:05.131 --> 00:09.420
where you have an LLM which acts as the orchestrator,

00:09.420 --> 00:12.930
and that LLM will then decide to generate specific types

00:12.930 --> 00:15.750
of LLM calls or specific types of tasks.

00:15.750 --> 00:18.150
We then have those that get synthesized,

00:18.150 --> 00:19.470
and then we have an output.

00:19.470 --> 00:21.660
Basically, you can think of this like the LLM

00:21.660 --> 00:25.260
will break down the tasks into various subtasks.

00:25.260 --> 00:28.320
These are dynamically determined based on the input.

00:28.320 --> 00:30.090
These are then executed in parallel,

00:30.090 --> 00:32.550
and then the orchestrator synthesizes those

00:32.550 --> 00:34.200
into the final results.

00:34.200 --> 00:36.510
Some use cases are breaking down a coding problem

00:36.510 --> 00:37.650
into subtasks,

00:37.650 --> 00:39.840
using an LLM to generate code for each subtask,

00:39.840 --> 00:43.380
and then making sure that that entire solution works,

00:43.380 --> 00:46.236
or it could be, for example, dividing a data analysis task

00:46.236 --> 00:49.770
into things like cleaning the data, identifying the trends,

00:49.770 --> 00:51.870
and generating the visualizations.

00:51.870 --> 00:55.137
Each step is separately handled by a separate work LLM,

00:55.137 --> 00:56.670
and the orchestrator, basically,

00:56.670 --> 00:58.279
will just integrate their findings

00:58.279 --> 01:00.510
into the complete analytical report.

01:00.510 --> 01:01.770
Okay, so the first thing we're gonna do

01:01.770 --> 01:03.102
is we're gonna create two Pydantic models,

01:03.102 --> 01:06.054
one called subtask, which will have the name

01:06.054 --> 01:11.040
and description of the specific type of keys,

01:11.040 --> 01:12.390
so you can see these properties here.

01:12.390 --> 01:14.516
We've got the name, which is the type of string,

01:14.516 --> 01:15.349
and that has a description of the name of the subtask.

01:16.740 --> 01:20.160
We also have the description colon string,

01:20.160 --> 01:21.643
and that is also a required field.

01:21.643 --> 01:23.550
Notice the triple dots,

01:23.550 --> 01:25.095
and then we're gonna have a brief description

01:25.095 --> 01:26.340
of the subtask.

01:26.340 --> 01:29.670
We're also going to create a separate Pydantic model

01:29.670 --> 01:33.960
called orchestrator output, and that is also a base model,

01:33.960 --> 01:36.570
and inside of that, we will have both an objective key,

01:36.570 --> 01:40.170
which is a type of string and is a required field,

01:40.170 --> 01:42.120
then we will have a description,

01:42.120 --> 01:44.610
and then a summary of the coding task.

01:44.610 --> 01:46.980
We are specifically making an orchestrator for coding.

01:46.980 --> 01:49.710
We will also have a subtask property,

01:49.710 --> 01:52.740
which will have a list of subtasks,

01:52.740 --> 01:55.020
and then that will be a required field.

01:55.020 --> 01:56.640
Notice the triple dots again,

01:56.640 --> 01:58.650
and then we'll also give this a description

01:58.650 --> 02:00.510
and say that this is a list of subtasks

02:00.510 --> 02:01.890
to solve the coding task.

02:01.890 --> 02:03.330
Okay, cool, now that we have that,

02:03.330 --> 02:05.220
we're gonna create three separate prompts,

02:05.220 --> 02:06.510
so the first prompt that we're gonna make

02:06.510 --> 02:08.610
is called an orchestrator prompt,

02:08.610 --> 02:10.934
and that's basically gonna break down our task

02:10.934 --> 02:15.330
into a variety of, "You are a skilled software engineer.

02:15.330 --> 02:16.885
Read the coding problem and break it down

02:16.885 --> 02:19.770
into subtasks in JSON format.

02:19.770 --> 02:22.020
Summarize the objective of the task,

02:22.020 --> 02:24.900
and then list the subtasks to solve the coding task.

02:24.900 --> 02:28.350
Provide your answer in JSON format with these fields."

02:28.350 --> 02:31.500
The second one that we're gonna make is the worker prompt,

02:31.500 --> 02:34.493
and for the worker prompt, we're basically gonna tell it,

02:34.493 --> 02:36.690
"You are a skilled software engineer.

02:36.690 --> 02:40.680
Read the subtask and generate code to solve the subtask,"

02:40.680 --> 02:44.760
and we've got here subtask name and the subtask description

02:44.760 --> 02:46.740
with those in there as well.

02:46.740 --> 02:48.886
We'll also tell it, "Return only the code.

02:48.886 --> 02:53.886
Make sure that the code is valid Python code."

02:54.480 --> 02:57.600
Then for the final prompt, this is the aggregator prompt,

02:57.600 --> 03:00.660
and this prompt here is specifically going to be useful

03:00.660 --> 03:03.406
for synthesizing all of those subtasks

03:03.406 --> 03:06.150
that have been completed into a final aggregation source,

03:06.150 --> 03:10.620
so, "You are an experienced integrator of code.

03:10.620 --> 03:14.340
We have code snippets from different subtasks.

03:14.340 --> 03:16.320
Your job is to integrate the code snippets

03:16.320 --> 03:17.700
into a complete solution,"

03:17.700 --> 03:20.437
and we've given it the subtasks' code, and we say,

03:20.437 --> 03:21.990
"Return only the complete code.

03:21.990 --> 03:24.630
Do not include any other text or comments."

03:24.630 --> 03:26.130
So we have our three prompts,

03:26.130 --> 03:27.930
and then we're gonna need three different functions

03:27.930 --> 03:30.180
to call each of those individual prompts,

03:30.180 --> 03:31.620
so we're gonna have the first one,

03:31.620 --> 03:35.520
which is a asynchronous def call orchestrator,

03:35.520 --> 03:37.920
which will have a problem and a model,

03:37.920 --> 03:40.380
and it will return a orchestrator output.

03:40.380 --> 03:42.180
Now, the easy thing that we can do is just say,

03:42.180 --> 03:45.090
prompt is equal to orchestrator prompt dot,

03:45.090 --> 03:46.650
and we'll just use the format,

03:46.650 --> 03:50.160
and then we will also then create a ChatGPT call here,

03:50.160 --> 03:54.900
but we will use the client.beta.chat.completions.parse,

03:54.900 --> 03:57.420
and then what we will put in here is we'll put the model,

03:57.420 --> 03:58.470
we'll put the messages,

03:58.470 --> 04:01.470
and we will also put the response format,

04:01.470 --> 04:02.730
so you can see here,

04:02.730 --> 04:05.760
and then we have done some simple validation,

04:05.760 --> 04:10.290
if not response.choices, square bracket, 0.message.content,

04:10.290 --> 04:15.290
or response.choices.message.parse is not available there,

04:15.720 --> 04:17.070
then we raise an error.

04:17.070 --> 04:20.310
Else, we just get the parsed orchestrator output out.

04:20.310 --> 04:22.380
So you'll see that here is the type

04:22.380 --> 04:23.902
of orchestrator output when you hover

04:23.902 --> 04:28.830
over the dot parse in VSCode or cursor or some other IDE.

04:28.830 --> 04:30.501
Okay, cool, so we've done the orchestrator call.

04:30.501 --> 04:33.480
The next one we're gonna want is the def call worker,

04:33.480 --> 04:35.730
which will basically take a name,

04:35.730 --> 04:38.340
it will take a description of a task,

04:38.340 --> 04:40.410
and it will also have a model,

04:40.410 --> 04:44.100
which in this case, is our model, and it returns a string.

04:44.100 --> 04:46.410
Now, in this scenario, we're just getting the worker prompt,

04:46.410 --> 04:48.930
giving it the name of the task, the description of the task,

04:48.930 --> 04:52.380
and also then making a chat completions call,

04:52.380 --> 04:54.600
but we aren't using the beta client

04:54.600 --> 04:57.150
because we don't need a responsive format out

04:57.150 --> 04:58.110
from structured outputs,

04:58.110 --> 04:59.730
so we're just using the standard client,

04:59.730 --> 05:02.400
and we're returning either a string or none type.

05:02.400 --> 05:03.660
Now, the function that we need

05:03.660 --> 05:06.030
is the call aggregator function,

05:06.030 --> 05:08.280
which will take the subtasks' code,

05:08.280 --> 05:10.140
and it'll take a model as the arguments.

05:10.140 --> 05:13.080
It will format the aggregated prompt with the subtask code,

05:13.080 --> 05:14.730
it will make a call to ChatGPT,

05:14.730 --> 05:16.470
and it will return the message.

05:16.470 --> 05:18.990
Okay, now that we have all of these functions,

05:18.990 --> 05:20.370
the next thing that we're gonna need to do

05:20.370 --> 05:22.650
is define a workflow for this,

05:22.650 --> 05:25.950
so we'll say def call orchestrator,

05:25.950 --> 05:28.260
and then basically, what this is gonna do

05:28.260 --> 05:31.416
is it's gonna call all of the orchestrator flow

05:31.416 --> 05:35.340
with the problem, and the model is gonna be passed in,

05:35.340 --> 05:37.080
and then the first thing we're gonna do,

05:37.080 --> 05:40.770
so we'll just break this down, is we'll say step one,

05:40.770 --> 05:44.340
the orchestrator breaks the main problem into subtasks,

05:44.340 --> 05:47.160
step two, we have the parallel workers work

05:47.160 --> 05:48.690
and handle each subtask,

05:48.690 --> 05:50.880
and then we aggregate those results,

05:50.880 --> 05:52.426
and then you'll see the next bit here,

05:52.426 --> 05:54.900
we've got an entry point for this,

05:54.900 --> 05:56.220
and let me just change this,

05:56.220 --> 05:57.669
so we don't want call orchestrator,

05:57.669 --> 06:02.280
we actually want this to be orchestrator workers flow,

06:02.280 --> 06:03.863
and then let's also go and change

06:03.863 --> 06:06.930
and put this code out inside of a new cell,

06:06.930 --> 06:08.160
and let's give this a run,

06:08.160 --> 06:09.714
and okay, so we've run into one error,

06:09.714 --> 06:11.427
and what I've noticed with this error

06:11.427 --> 06:14.742
is to make sure there isn't a comment after this model,

06:14.742 --> 06:16.830
'cause that was causing an error,

06:16.830 --> 06:19.980
and then after that, then you can run your code,

06:19.980 --> 06:23.520
and so what's happening is we call the orchestrator,

06:23.520 --> 06:25.470
we made it to this point here,

06:25.470 --> 06:28.290
and then after that, we will have the orchestrator output,

06:28.290 --> 06:30.690
which will have a bunch of subtasks,

06:30.690 --> 06:32.190
and then each of those subtasks

06:32.190 --> 06:34.950
will get executed in parallel,

06:34.950 --> 06:37.410
and then we aggregate those solutions

06:37.410 --> 06:38.880
so that we have the final code.

06:38.880 --> 06:40.320
So if we go and have a look at this,

06:40.320 --> 06:43.620
you can see this is the code that it decided to create,

06:43.620 --> 06:46.830
and then it's put everything together all in one block.

06:46.830 --> 06:48.845
So again, the whole point of this

06:48.845 --> 06:51.840
is that the orchestrator is responsible

06:51.840 --> 06:53.580
for breaking down the problem

06:53.580 --> 06:57.300
into a series of orchestrator-outputted subtasks.

06:57.300 --> 06:59.437
Those subtasks will then be executed

06:59.437 --> 07:02.010
in a different LLM worker.

07:02.010 --> 07:03.769
This LLM could have access to tools

07:03.769 --> 07:06.510
to be able to deal with those various subtasks,

07:06.510 --> 07:08.940
and then once all of those subtasks completed,

07:08.940 --> 07:10.470
you'll call the aggregator

07:10.470 --> 07:12.510
to then aggregate all those tasks.

07:12.510 --> 07:14.250
There are other different types of frameworks

07:14.250 --> 07:15.866
that attempt to try and do this, so BabyAGI

07:15.866 --> 07:18.750
was one of the original frameworks

07:18.750 --> 07:23.750
which allowed for task creation and then execution of tasks,

07:24.060 --> 07:26.355
so that's originally where one of these ideas

07:26.355 --> 07:28.530
for the orchestrator came about.

07:28.530 --> 07:30.653
All right, cool. See you in the next video.