WEBVTT

00:00.160 --> 00:00.960
Welcome!

00:00.960 --> 00:05.280
In this tutorial, we are going to build a personal assistant that can make phone calls for you.

00:05.560 --> 00:06.840
So let's get started.

00:06.840 --> 00:13.240
You have already seen this AI voice assistant built and integrated into an n8n workflow in

00:13.240 --> 00:18.000
one of the demos, but this time we are going to do something slightly different.

00:18.600 --> 00:20.880
First we are going to watch the full demo.

00:20.920 --> 00:26.520
Then I'm going to give you an overview of the setup and explain how it works, node by node.

00:26.520 --> 00:31.400
And instead of building it completely from scratch, we are going to use the template provided in the

00:31.400 --> 00:36.600
resources of this lesson and I will show you exactly what you need to change to make it work for you.

00:37.360 --> 00:43.000
This way you will learn how to take advantage of the templates we provide, and also how to adapt blueprints

00:43.120 --> 00:44.320
to fit your own needs.

00:44.760 --> 00:49.680
Because personally, I believe that the best way to learn n8n after you've got the basics is by

00:49.680 --> 00:55.880
modifying, recreating, and rebuilding templates so that the supervisor agent from your workflow can

00:55.880 --> 00:57.840
connect to it and take action.

00:57.840 --> 00:59.320
So now let's watch the demo.

00:59.560 --> 01:00.280
Hi everyone.

01:00.410 --> 01:04.330
Now I'm going to show you my personal assistant agent in action.

01:06.530 --> 01:07.050
Hi, Emily.

01:07.050 --> 01:12.890
Can you please call Mark and ask if we can meet tomorrow at 7 a.m. in the office to discuss the onboarding

01:12.890 --> 01:14.130
of two new clients.

01:14.570 --> 01:18.850
And if that works for him, send him a confirmation email and also update my calendar.

01:21.890 --> 01:24.850
So I should receive a call in a second from my personal assistant.

01:29.770 --> 01:36.570
So the main agent is looking for Mark's mobile number in the contacts data stored in a Google sheet,

01:37.370 --> 01:39.170
and is calling a subagent.

01:43.410 --> 01:43.810
Yep.

01:48.290 --> 01:50.850
Hi, I'm Emily, Damien's personal assistant.

01:51.370 --> 01:51.930
Hi, Emily.

01:51.930 --> 01:52.930
How can I help you?

01:54.370 --> 01:54.890
Hi, Mark.

01:54.890 --> 02:00.180
Damien asks if you can meet tomorrow at 7 a.m. in the office to discuss the onboarding of two new clients.

02:00.860 --> 02:01.180
Yes.

02:01.180 --> 02:01.740
No problem.

02:01.780 --> 02:03.140
That works for me.

02:03.180 --> 02:03.620
Great.

02:04.700 --> 02:05.260
Great.

02:05.500 --> 02:06.540
I'll inform Damien.

02:07.100 --> 02:07.620
Thank you.

02:07.660 --> 02:11.100
Can you ask him if I need to prepare anything?

02:12.820 --> 02:15.220
I will check with Damien and get back to you shortly.

02:15.780 --> 02:16.260
Thank you.

02:16.260 --> 02:16.780
Bye bye.

02:19.500 --> 02:19.900
Yes.

02:19.900 --> 02:24.060
So now the supervisor agent is calling the subagents.

02:24.060 --> 02:25.780
So email and calendar agents.

02:27.900 --> 02:31.700
And I should get a confirmation email and have a meeting booked in my calendar.

02:33.540 --> 02:35.020
So let's check my inbox.

02:38.340 --> 02:38.580
Yeah.

02:38.580 --> 02:39.300
That's great.

02:39.300 --> 02:40.300
And my calendar.

02:41.940 --> 02:42.220
Yeah.

02:42.220 --> 02:42.900
That's awesome.

02:43.740 --> 02:44.300
All right.

02:44.300 --> 02:48.700
So now I'm going to walk you through each node and explain everything.

02:49.620 --> 02:55.020
I communicate with Emily via Telegram because it's the easiest way and it's also free.

02:55.300 --> 03:01.910
But you can easily integrate with other apps such as WhatsApp, Slack, etc. As you can see here in

03:01.910 --> 03:06.670
Telegram, I got a confirmation from my agent summarizing everything.

03:07.110 --> 03:15.350
She also passed on what Mark asked her, and confirmed that she sent an email to Mark, updated my calendar,

03:15.350 --> 03:18.070
and even added a link to make it easy to access.

03:18.110 --> 03:25.670
Apart from the three subagents (the phone call, calendar, and email agents), I've added a vector database

03:25.670 --> 03:27.550
as another tool for Emily.

03:27.990 --> 03:32.070
So in short, Emily is trained on Google Docs stored in my Google Drive.

03:34.190 --> 03:38.710
So this is my company's knowledge base.

03:39.070 --> 03:43.390
So if I ask her, for example, what is adaptive AI?

03:48.670 --> 03:54.150
She knows exactly what's included in that document and can answer straight away.

03:55.070 --> 04:01.520
This means she can remind me about my agency's SOPs, past project details, training materials, and much

04:01.520 --> 04:01.840
more.

04:02.560 --> 04:03.720
So she is very helpful.

04:04.240 --> 04:11.520
You can add more tools to Emily, for example to do research, analyze competitors, or create reports.

04:11.560 --> 04:15.800
The very first step is this trigger node which listens for incoming events.

04:16.440 --> 04:22.400
So every time I send a voice or text message to Emily, the workflow is activated thanks to that node.

04:22.520 --> 04:29.160
So when a new event is detected, the trigger captures it and passes it along to the rest of the

04:29.160 --> 04:29.800
workflow.

04:30.680 --> 04:34.440
So any interaction with Emily on Telegram automatically starts the process.

04:34.800 --> 04:35.120
Great.

04:35.160 --> 04:39.320
Now, with this in place, Emily is officially ready to listen to us on Telegram.

04:40.840 --> 04:49.840
So next, once our Telegram trigger has captured an event, this node determines the content type.

04:50.200 --> 04:58.650
So it acts as a decision maker and analyzes the incoming message to figure out whether it's a text, voice

04:58.650 --> 05:00.290
note or something else.

05:00.810 --> 05:08.170
So this node uses simple rules to check for key properties in the incoming data.

05:08.610 --> 05:15.530
For example, if the message contains a text field, we know it's a regular text message, and if it

05:15.530 --> 05:21.930
has a file ID in the voice field, it's an audio message and so on.

05:24.250 --> 05:27.730
So based on these rules, the workflow branches out.

05:28.330 --> 05:30.730
Text messages go one way.

05:33.010 --> 05:33.970
Audio another.

05:34.890 --> 05:39.250
And errors like unsupported content types take a separate path.

05:40.970 --> 05:43.410
This ensures Emily can handle every input.

05:43.890 --> 05:51.530
For example, if it's a voice message, the workflow prepares to download and transcribe it.

05:52.850 --> 05:56.930
If it's text, it moves forward to process the message.

05:56.930 --> 06:00.660
So every type of input is handled in the right way.
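
NOTE
A minimal sketch of the routing rules just described, assuming a Telegram-style message dictionary.
The helper name classify_message and the branch labels are hypothetical, not taken from the template.
```python
def classify_message(message):
    """Return 'text', 'voice', or 'unsupported' for a Telegram-style message dict."""
    text = message.get("text")
    # A non-empty text field means a regular text message.
    if isinstance(text, str) and text.strip():
        return "text"
    voice = message.get("voice")
    # A file ID inside the voice field means an audio message.
    if isinstance(voice, dict) and voice.get("file_id"):
        return "voice"
    # Anything else goes down the error branch.
    return "unsupported"
print(classify_message({"text": "What's on my calendar tomorrow?"}))
print(classify_message({"voice": {"file_id": "abc123", "duration": 4}}))
print(classify_message({"photo": [{"file_id": "xyz"}]}))
```
The three return values correspond to the text, audio, and unsupported-content branches of the workflow.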

06:01.220 --> 06:13.460
Now, when someone sends a voice message, the download voice file node takes the file ID from the incoming

06:13.620 --> 06:19.740
audio message and downloads the actual audio directly from Telegram.
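
NOTE
A rough sketch of what a download step like this does behind the scenes, assuming the public Telegram Bot API.
It only builds the two URLs (getFile lookup, then file download); the token and file values are placeholders.
```python
from urllib.parse import urlencode
API_ROOT = "https://api.telegram.org"
def get_file_url(bot_token, file_id):
    """URL of the getFile call that resolves a file_id to a file_path."""
    return f"{API_ROOT}/bot{bot_token}/getFile?{urlencode({'file_id': file_id})}"
def download_url(bot_token, file_path):
    """URL from which the actual audio bytes can be downloaded."""
    return f"{API_ROOT}/file/bot{bot_token}/{file_path}"
# Placeholder values for illustration only.
print(get_file_url("123:ABC", "voice-file-id"))
print(download_url("123:ABC", "voice/file_0.oga"))
```
In n8n the Telegram node handles both requests for you once the credentials are connected.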

06:22.660 --> 06:23.060
Great.

06:23.060 --> 06:29.460
So now when the voice message is ready for the next phase of the workflow, the file is prepared so

06:29.460 --> 06:36.900
it can be transcribed into text and the AI agent can understand its content.

06:37.420 --> 06:45.820
This node uses OpenAI's Whisper model to do the transcription, and this model works great even if

06:45.820 --> 06:48.420
there is an accent or some background noise.

06:48.940 --> 06:55.220
So the node takes the audio file, processes it, and spits out plain text.

06:57.100 --> 07:03.700
For example, if I send a voice message saying what's on my calendar tomorrow, the node turns that

07:03.700 --> 07:04.900
into a text message.

07:06.380 --> 07:11.540
This text is what Emily uses to figure out what you want her to do next.

07:13.820 --> 07:14.940
So now let's move on.

07:16.980 --> 07:18.020
The next node.

07:19.060 --> 07:25.740
Combine content. So you have the input, and it can be text or a transcribed voice message, right?

07:26.180 --> 07:28.740
So this node pulls everything together.

07:28.780 --> 07:30.020
It adds some labels.

07:31.900 --> 07:37.860
So it combines it into a single variable called combined message.

07:38.980 --> 07:44.980
This way, no matter how the message was sent, Emily processes it in the same way.

07:45.660 --> 07:53.340
So it figures out the type of message, like if it's a text query or a voice message, and tags it with

07:53.540 --> 07:54.380
message type.

07:55.380 --> 08:05.670
So if the message was --, it adds a little label in source type to note that. Now that everything's

08:05.710 --> 08:09.710
sorted, we are ready to send it over to Emily's brain.

08:10.590 --> 08:12.110
So the main agent.
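
NOTE
A small sketch of the combine step, assuming the three fields mentioned above (combined message, message type, source type).
The label values and the function name combine_content are illustrative assumptions, not the template's exact strings.
```python
def combine_content(text=None, transcript=None):
    """Merge either input path into one record the main agent can consume."""
    if transcript:
        # Voice path: the transcription output becomes the message.
        return {
            "combined_message": transcript.strip(),
            "message_type": "voice_message",
            "source_type": "transcribed_audio",
        }
    if text:
        # Text path: the message is used as typed.
        return {
            "combined_message": text.strip(),
            "message_type": "text_query",
            "source_type": "direct_text",
        }
    raise ValueError("no usable content in the incoming message")
print(combine_content(text="Call Mark about tomorrow's meeting"))
print(combine_content(transcript=" What's on my calendar tomorrow? "))
```
Either way the agent receives the same shape of data, which is the point of this node.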

08:14.790 --> 08:18.910
Now, Emily processes the message and decides what needs to be done.

08:19.470 --> 08:29.190
So basically, Emily works as a supervisor agent and delegates tasks to the three subagents working

08:29.190 --> 08:29.830
under her.

08:31.430 --> 08:34.430
So first, the make phone call agent.

08:35.310 --> 08:41.710
If you ask Emily to make a call, she passes the details like the name, phone number and what to say

08:42.110 --> 08:44.350
to this agent to handle it.

08:45.870 --> 08:47.350
Next, the calendar agent.

08:47.870 --> 08:53.190
So for anything related to your schedule like checking availability, adding, or checking events,

08:53.590 --> 08:55.550
Emily hands it over to this agent.

08:56.760 --> 08:57.720
And the last one?

08:58.080 --> 08:59.000
The email agent.

08:59.600 --> 09:01.000
This one handles emails.

09:01.520 --> 09:05.360
So whether it's sending, replying or setting up drafts.

09:07.160 --> 09:16.560
But Emily is smart, so she always checks the contacts data, a Google Sheet in Google Drive, to

09:16.560 --> 09:19.920
make sure the email address is valid before sending anything.

09:21.280 --> 09:29.560
Now, a really important part of this agent and how it works is the prompt.

09:30.680 --> 09:36.320
So it's basically the set of instructions that tells Emily how to handle different tasks.

09:37.280 --> 09:44.000
So the prompt is super specific and even includes variables like the current date and others.

09:44.000 --> 09:47.120
So Emily always has the right context for what she is doing.

09:49.760 --> 09:51.200
So as you can see, it's very detailed.

09:53.520 --> 09:57.370
For example, it reminds her to check the contact list.

09:59.410 --> 10:06.850
Before sending an email, or to make sure phone numbers are formatted correctly before making a call,

10:07.210 --> 10:08.810
such as adding a country code.
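
NOTE
A minimal sketch of the phone number formatting the prompt asks for, assuming numbers should end up in +countrycode form.
The default country code of 44 and the helper name normalize_phone are assumptions for illustration.
```python
import re
def normalize_phone(raw, default_country_code="44"):
    """Return the number with a leading + and country code, digits only."""
    cleaned = re.sub(r"[^\d+]", "", raw)
    if cleaned.startswith("+"):
        # Already international: keep the plus, drop any other symbols.
        return "+" + re.sub(r"\D", "", cleaned)
    if cleaned.startswith("00"):
        # 00 is a common international prefix; swap it for a plus.
        return "+" + cleaned[2:]
    # Local number: drop the leading zero and add the default country code.
    return "+" + default_country_code + cleaned.lstrip("0")
print(normalize_phone("07700 900123"))
print(normalize_phone("+1 (555) 010-9999"))
```
In the template this rule lives in the prompt text, so the language model applies it rather than code.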

10:09.610 --> 10:17.810
So once again, Emily works as the supervisor agent and delegates tasks to the three sub agents, and

10:19.010 --> 10:23.570
each of these sub agents is included in separate workflows.

10:24.250 --> 10:27.530
So this keeps everything modular and easy to manage.

10:27.570 --> 10:29.130
Like in object oriented programming.

10:30.650 --> 10:33.690
So let's start with the make phone call agent.

10:34.610 --> 10:40.970
This agent uses Vapi, which is a voice API platform to make the calls.

10:41.570 --> 10:47.330
When Emily needs to handle a phone call request, she gathers all the necessary details like the name

10:47.330 --> 10:55.540
of the person and the type of contact, like if it's a friend or business, and specific instructions

10:55.540 --> 10:59.860
for the call, such as ask if they are available to meet tomorrow.

11:02.460 --> 11:10.860
So once Emily has all this information, she passes it to the make phone call agent, which then triggers

11:10.900 --> 11:13.660
a workflow called make a phone call path.

11:14.340 --> 11:16.660
So this is the make a phone call agent.

11:19.300 --> 11:23.020
This agent uses the API to handle the actual call.

11:23.580 --> 11:27.940
And after the call is made, the response from the workflow comes back to Emily.

11:28.140 --> 11:31.340
So she knows what happened and can continue with the task.

11:32.500 --> 11:41.180
So this setup ensures that each subagent focuses on its specific job, while Emily oversees everything

11:41.180 --> 11:44.020
and makes sure it all works together as expected.

11:45.940 --> 11:48.980
So next let's look at the calendar agent.

11:50.260 --> 11:52.260
So this is our calendar agent.

11:52.940 --> 11:58.110
So when Emily handles anything related to scheduling, like checking availability or adding events,

11:58.510 --> 12:00.310
she passes the details to this agent.

12:01.270 --> 12:05.790
Basically, this separate workflow is dedicated to managing your calendar.

12:06.190 --> 12:12.390
It processes the request, updates or retrieves the event information and sends the results back to

12:12.430 --> 12:12.950
Emily.

12:14.030 --> 12:19.910
From there, Emily uses the response to provide you with the updates or confirmations.

12:20.910 --> 12:22.110
Now the email agent.

12:23.070 --> 12:30.630
So Emily first checks the contacts data, a Google Sheet, to make sure the recipient's email address

12:30.670 --> 12:31.350
is valid.

12:32.070 --> 12:38.710
And once verified, the email agent triggers a separate workflow to handle the email task.

12:39.230 --> 12:45.070
So the workflow sends the email and then returns a confirmation and response back to Emily.

12:45.670 --> 12:48.470
Now let's look at the knowledge base node.

12:50.630 --> 12:59.520
So this node is connected to the Pinecone Vector Store, which allows Emily to access and retrieve specific

12:59.520 --> 13:01.960
information stored in the database.

13:03.000 --> 13:05.840
So you can store all sorts of documents in Pinecone.

13:06.320 --> 13:09.480
This could be Google Docs, spreadsheets like Google Sheets,

13:09.520 --> 13:18.400
company policies, etc. So you can give Emily access to, for example, frequently asked questions, client

13:18.440 --> 13:23.920
details, SOPs, training materials, or any other knowledge base.

13:25.000 --> 13:33.280
So Emily can retrieve this content and fetch accurate answers by pulling directly from your stored information.

13:34.600 --> 13:41.800
For example, if you ask Emily something about your company or any stored knowledge, this node searches

13:41.800 --> 13:50.360
through the vector database for the most relevant information and sends it back, and the response is

13:50.360 --> 13:51.440
then passed to Emily.

13:51.440 --> 13:54.130
And based on that, she can provide a relevant answer.

13:57.130 --> 14:00.770
So the Pinecone Vector Store node is where all the knowledge is kept.

14:01.170 --> 14:06.410
It stores the information in a way that makes it easy to search and find later.

14:06.730 --> 14:12.410
So all the knowledge is converted into numerical representations called embeddings.

14:14.770 --> 14:16.490
So AI can understand it.

14:18.210 --> 14:24.690
Embeddings are small, easy to search versions of your data that can help large language models find

14:24.690 --> 14:25.890
exactly what they need.

14:27.010 --> 14:35.690
And when Emily sends something, this node searches through the embeddings to find the best match and

14:35.690 --> 14:37.930
sends the results back to the knowledge base node.

14:38.290 --> 14:45.330
So from there, it's formatted and passed to Emily so she can use it to answer a question.
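
NOTE
A toy sketch of searching embeddings for the best match, assuming cosine similarity as the comparison.
The three-number vectors and document titles are made up; real embeddings come from an embeddings model and are much longer.
```python
import math
def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0
def top_match(query_vector, store):
    """Return the stored item whose vector is closest to the query."""
    return max(store, key=lambda item: cosine(query_vector, item["vector"]))
# Hypothetical knowledge base entries with made-up vectors.
store = [
    {"text": "SOP: client onboarding checklist", "vector": [0.9, 0.1, 0.0]},
    {"text": "Training: email etiquette", "vector": [0.1, 0.8, 0.2]},
]
best = top_match([0.8, 0.2, 0.1], store)
print(best["text"])
```
A hosted vector database does the same kind of nearest-match lookup, just at a much larger scale.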

14:45.610 --> 14:51.010
So this node generates embeddings for the data stored in the Pinecone vector store.

14:52.100 --> 14:58.060
And whenever you add new information to the database, this node processes it, creating embeddings

14:58.060 --> 15:00.580
that make it easy to search and retrieve later.

15:00.900 --> 15:03.740
So it's a key part of keeping the knowledge base up to date.

15:05.980 --> 15:09.700
Now let's talk about the window buffer memory node.

15:11.740 --> 15:12.300
This one.

15:14.460 --> 15:20.820
This node helps Emily keep track of the context during conversations by storing short term memory.

15:21.500 --> 15:27.380
It allows Emily to remember recent interactions so she can handle follow up questions without losing

15:27.420 --> 15:29.340
track of what you are talking about.

15:29.500 --> 15:34.180
For example, if you ask what's on my calendar tomorrow?

15:34.900 --> 15:38.380
And then follow up with can you add a meeting at 3:00 pm?

15:38.780 --> 15:42.100
Emily can connect those requests and respond.

15:43.500 --> 15:47.780
However, this is not a persistent memory like the Pinecone Vector store.

15:48.780 --> 15:52.510
Once the session ends, the memory from this node is cleared.

15:52.790 --> 15:57.150
So it's designed for short term use within the same conversation.
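
NOTE
A minimal sketch of the idea behind a window buffer memory, assuming it simply keeps the last few turns.
The class name and the window size of 2 are illustrative; this is not the node's real implementation.
```python
from collections import deque
class WindowBufferMemory:
    """Keeps only the most recent turns; older ones fall out of the window."""
    def __init__(self, window_size=5):
        self._turns = deque(maxlen=window_size)
    def add(self, role, content):
        self._turns.append({"role": role, "content": content})
    def context(self):
        # What would be handed to the model along with the next message.
        return list(self._turns)
memory = WindowBufferMemory(window_size=2)
memory.add("user", "What's on my calendar tomorrow?")
memory.add("assistant", "You have one meeting at 7 a.m.")
memory.add("user", "Can you add a meeting at 3:00 pm?")
print(memory.context())
```
Because nothing is written to disk, the context disappears when the object does, which mirrors the short-term behaviour described here.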

15:58.750 --> 16:06.030
Now, this OpenAI chat model is connected directly to Emily and acts as her brain.

16:06.350 --> 16:12.830
So when Emily needs to process a request like understanding a message or deciding what to do next,

16:13.030 --> 16:20.950
this node uses OpenAI's large language model to interpret the input and provide a response.

16:21.430 --> 16:27.830
For example, if you ask Emily to send an email, this node helps analyze the message and figure out

16:27.830 --> 16:30.550
the context and decide what to do next.

16:31.830 --> 16:37.870
I've got a quick five minute tutorial on my channel showing how to connect all OpenAI models to your

16:38.110 --> 16:38.710
agents.

16:38.870 --> 16:40.150
So be sure to check it out.

16:43.510 --> 16:47.710
While this OpenAI chat model works with the knowledge base.

16:49.910 --> 16:59.560
When the Pinecone Vector Store finds the information Emily needs, this node helps turn it into a clear response

16:59.560 --> 17:00.760
she can share with you.

17:01.280 --> 17:10.080
For example, if you ask about company policies, this node takes the data from Pinecone, processes

17:10.080 --> 17:13.800
it, and turns it into an easy to understand answer.

17:17.280 --> 17:21.080
And in this setup, both OpenAI chat model nodes.

17:22.080 --> 17:28.440
This one and this one use GPT-4o.

17:28.920 --> 17:30.280
So it's very cheap to run.

17:30.840 --> 17:33.320
I managed to achieve this with a complex prompt.

17:33.920 --> 17:36.680
Now this is the response to me node.

17:37.880 --> 17:45.240
So after Emily completes a task or finds an answer, this node sends a response back to you on Telegram.

17:45.720 --> 17:50.490
Now go ahead and download all the files from the resources section of this lesson.

17:50.690 --> 17:56.970
Once you've got them, import the first file called your personal assistant into a new workflow.

17:57.370 --> 18:01.530
If you missed the lesson on how to import workflows into n8n, no worries.

18:01.530 --> 18:05.010
Just simply create a new workflow.

18:05.050 --> 18:08.810
Go to these three dots and import from file.

18:11.410 --> 18:12.650
You should see this workflow.

18:12.650 --> 18:18.330
And now the first thing you'll need to do is connect your credentials for some of the nodes in this

18:18.330 --> 18:18.810
setup.

18:18.810 --> 18:21.890
So especially all the Telegram nodes.

18:22.210 --> 18:27.010
So if you have watched the previous lessons, you should already know how to connect your credentials

18:27.010 --> 18:31.810
like Telegram, OpenAI, and any others used in this workflow.

18:31.850 --> 18:38.770
In most cases, n8n should auto-detect and update those for you as long as you have set them up before.

18:38.810 --> 18:45.890
But if anything's missing or not working, just go to the Setup credentials section in this course and

18:45.890 --> 18:51.500
you will find short tutorials showing exactly how to connect each specific service step by step.

18:51.660 --> 18:57.300
So the first node we need to connect is this node to listen for incoming events.

18:57.300 --> 19:01.020
So make sure this node is connected to your Telegram account.

19:01.980 --> 19:08.900
Then, once you have successfully connected your Telegram account, simply double-click on all of the Telegram nodes.

19:09.220 --> 19:12.660
So n8n will automatically refresh the credentials for you.

19:18.660 --> 19:21.420
Make sure you connect the same account.

19:28.420 --> 19:30.180
And you have an output node.

19:30.900 --> 19:32.500
So respond message.

19:33.860 --> 19:34.180
All right.

19:34.180 --> 19:34.740
Great.

19:35.420 --> 19:38.820
Now make sure your OpenAI account is also connected.

19:41.380 --> 19:45.700
So we can use the Whisper model to transcribe the recording.

19:47.580 --> 19:48.420
Yeah great.

19:49.780 --> 19:55.100
And also here is a brain for our supervisor agent.

19:57.540 --> 19:57.820
All right.

19:57.820 --> 19:58.340
Awesome.

20:00.700 --> 20:03.860
Also this Pinecone node, to use the vector database.

20:05.780 --> 20:06.260
Great.

20:09.460 --> 20:16.220
Now we need to download these three sub-workflows, import them separately, and then

20:16.220 --> 20:19.180
connect them to these nodes.

20:19.420 --> 20:21.020
So our personal assistant.

20:21.220 --> 20:24.500
So the supervisor agent will be able to use them as tools.

20:25.100 --> 20:28.100
First let's import the phone call agent workflow.

20:30.020 --> 20:30.740
All right.

20:32.180 --> 20:33.700
Now let's rename it to.

20:35.900 --> 20:36.660
Agent.

20:42.460 --> 20:43.540
And save it.

20:43.580 --> 20:45.620
Now let's go back to our main workflow.

20:45.940 --> 20:46.300
Alright.

20:46.300 --> 20:46.620
Great.

20:46.660 --> 20:52.070
Now we should be able to choose this workflow in the Execute workflow node.

20:52.950 --> 20:55.110
So it's this one, for the phone call agent.

20:59.110 --> 21:00.870
So workflow from list.

21:03.710 --> 21:04.670
And we call it.

21:07.870 --> 21:09.470
Agent workflow.

21:10.310 --> 21:11.110
Alright great.

21:11.670 --> 21:15.630
Now you have to do the same thing with the other two sub-workflows.

21:17.150 --> 21:18.830
So I'm going to create a workflow.

21:25.590 --> 21:27.110
Import the calendar agent.

21:28.830 --> 21:29.670
Do the same thing.

21:29.670 --> 21:30.630
Import from file.

21:32.950 --> 21:33.390
Calendar.

21:33.390 --> 21:34.030
Agent.

21:35.310 --> 21:36.110
Alright great.

21:36.670 --> 21:38.230
Now just double click on this node.

21:38.230 --> 21:44.990
If you have connected your OpenAI account earlier, it will be automatically updated.

21:45.030 --> 21:46.550
The same for Google Calendar.

21:48.520 --> 21:52.560
Make sure you are connected to the correct Google Calendar account.

21:59.440 --> 22:01.880
Okay, now let's rename this workflow.

22:02.440 --> 22:05.040
Let's call it Personal

22:07.000 --> 22:08.400
Assistant.

22:16.760 --> 22:17.560
And hit save.

22:19.080 --> 22:21.000
Now let's go back to our workflow.

22:25.400 --> 22:31.280
And let's search for the workflow we just created in this execute workflow node.

22:40.160 --> 22:42.760
Is this one personal assistant calendar node tool.

22:44.360 --> 22:45.760
So create a new workflow.

22:58.530 --> 22:59.570
Import from file.

23:06.170 --> 23:14.850
Update the credentials and make sure you are connected to the Gmail account you want to use for this setup.

23:19.450 --> 23:20.370
Let's call it.

23:24.410 --> 23:25.210
Personal.

23:31.130 --> 23:32.010
Email tool.

23:34.650 --> 23:35.330
Hit save.

23:36.930 --> 23:38.690
And let's go back to our workflow.

23:40.290 --> 23:41.610
And the last sub-workflow.

23:41.610 --> 23:42.610
So email agent.

23:47.540 --> 23:51.060
We call this workflow personal.

23:54.300 --> 23:58.740
Tool. Perfect. Hit save.

24:02.540 --> 24:06.540
Now the next step is to create a Google Sheet with contacts data.

24:06.860 --> 24:09.300
So please go create a Google Sheet.

24:10.100 --> 24:11.540
And mine looks like that.

24:13.180 --> 24:17.260
So I have a name column, email address, and phone number.

24:17.420 --> 24:24.100
So our supervisor agent will be able to look through this Google Sheet and then find the phone number

24:24.100 --> 24:25.500
we want to call.

24:28.020 --> 24:28.340
Alright.

24:28.340 --> 24:28.860
Perfect.

24:29.340 --> 24:30.820
Now this is really important.

24:30.820 --> 24:35.980
Also make sure you are connected to the correct Google Sheets account.

24:36.020 --> 24:40.340
So our supervisor agent will be able to retrieve these contacts.

24:40.780 --> 24:45.660
Then also make sure you choose the correct document from the list.

24:45.780 --> 24:47.350
So mine is contact database.

24:47.710 --> 24:49.590
This is how I named this Google Sheet.

24:53.590 --> 24:53.990
All right.

24:54.030 --> 24:54.830
Now let's move on.

24:57.510 --> 25:00.510
Now let's start configuring our agent.

25:00.550 --> 25:07.670
So this agent node is really important because it's the one responsible for making

25:07.710 --> 25:08.390
phone calls.

25:08.870 --> 25:09.830
So let's open it.

25:14.110 --> 25:14.990
And let's scroll down.

25:18.590 --> 25:22.110
As you can see here, I added a little bit of code, but don't worry.

25:22.350 --> 25:24.110
This is just a simple code snippet.

25:24.470 --> 25:30.510
So this node triggers a separate workflow that's dedicated to handling the phone call using the API.

25:30.870 --> 25:35.710
And we already imported this workflow and connected it to our supervisor agent.

25:35.950 --> 25:43.870
So here we send it a JSON object with all the important details such as who to call, the phone number,

25:44.390 --> 25:48.120
what to say and how the voice should sound.

25:49.480 --> 25:52.240
So, for example, the tone here is set to friendly.

25:53.520 --> 26:00.240
So these are dynamic variables which will be passed to our AI voice agent on the Vapi platform.

26:00.440 --> 26:06.920
So it passes this info to the phone call agent workflow, which makes the actual call through the API

26:07.200 --> 26:08.920
and sends the response back.
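
NOTE
A rough sketch of the kind of JSON object described here, assuming one field per detail mentioned (who to call, number, what to say, tone).
All field names and default values below are hypothetical; check the imported template for the real keys.
```python
import json
def build_call_payload(first_name, phone_number, call_purpose,
                       contact_type="business", tone="friendly",
                       fallback_response="I'm not sure, but I'll check and get back to you."):
    """Bundle the call details into a JSON string for the phone call sub-workflow."""
    payload = {
        "first_name": first_name,          # who to call
        "phone_number": phone_number,      # number including country code
        "contact_type": contact_type,      # e.g. friend or business
        "call_purpose": call_purpose,      # what to say
        "tone": tone,                      # how the voice should sound
        "fallback_response": fallback_response,
    }
    return json.dumps(payload)
print(build_call_payload("Mark", "+447700900123", "Ask if he can meet tomorrow at 7 a.m. in the office"))
```
Defaults such as the tone stay fixed, while the name, number, and purpose change with every request.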

26:09.360 --> 26:16.560
So this node acts like a bridge between your supervisor agent and the AI voice agent on Vapi.

26:16.880 --> 26:24.320
So it delegates the task and waits for the result and lets the rest of the automation continue based

26:24.320 --> 26:26.280
on what happened in the call.

26:26.560 --> 26:26.920
Perfect.

26:26.920 --> 26:28.280
Let's close it for now.

26:30.320 --> 26:31.160
Oh, actually.

26:34.080 --> 26:38.440
You can change the fallback response, because for now it's:

26:39.440 --> 26:42.400
I'm not sure, but I'll check with Damien and get back to you.

26:43.600 --> 26:45.970
So in case our agent is not

26:50.370 --> 26:57.570
sure how to respond to a specific question in the call, it will say this. So you can change it to

26:57.610 --> 26:58.090
your name.

26:58.130 --> 26:58.450
All right.

26:58.490 --> 26:58.970
Awesome.

26:59.690 --> 27:00.130
All right.

27:00.170 --> 27:06.210
Now let's jump into Vapi and create our AI voice agent, the one that will actually make phone calls

27:06.210 --> 27:06.770
for us.

27:06.770 --> 27:13.850
So we are going to set up a new outbound agent, which means it will be triggered by our workflow to

27:13.890 --> 27:21.690
call someone exactly by this workflow, say what we tell it to say and send the response back.

27:21.690 --> 27:27.890
Now, in order to create your account on Vapi, please use the link included in the PDF called Essential

27:27.930 --> 27:29.050
Tools for this course.

27:29.370 --> 27:32.730
You will find it in the resources section of this lesson.

27:33.290 --> 27:40.330
So by signing up through that link, you will get 1,000 free minutes to use with your agents on Vapi, which

27:40.330 --> 27:46.340
is more than enough to build, test, and run a lot of AI voice agents. Once you have signed up and logged

27:46.340 --> 27:51.500
in, you will be ready to create your agent and connect it to your workflow.

27:51.500 --> 27:52.180
So let's move on.

27:52.220 --> 27:56.700
Once you create your account on Vapi and then log in, you will see this dashboard.

27:56.700 --> 27:59.540
So now on the left sidebar click on Assistants.

27:59.540 --> 28:03.020
So here you will build your first agent.

28:07.340 --> 28:08.700
Now let's create assistant.

28:09.780 --> 28:10.780
So hit that button.

28:12.900 --> 28:13.820
And give it a name.

28:14.540 --> 28:17.420
For example agent a voice agent.

28:20.180 --> 28:20.820
Anything.

28:25.700 --> 28:30.460
And n8n. And we want to start from a blank template.

28:31.860 --> 28:33.100
So create assistant.

28:36.580 --> 28:36.940
Alright.

28:36.940 --> 28:37.380
Great.

28:37.540 --> 28:38.900
You will land on this dashboard.

28:38.900 --> 28:41.580
And let me quickly walk you through what you are seeing.

28:42.060 --> 28:50.390
So at the top you will see the cost per minute and the latency which shows how fast your agent responds.

28:51.190 --> 28:54.350
So for most use cases, these defaults are totally fine.

28:54.750 --> 29:01.710
And below that, under the model section you will choose the brain of your assistant.

29:01.830 --> 29:07.670
So here we are using OpenAI as the provider and GPT-4 as the model, which is very powerful.

29:07.670 --> 29:13.830
And I recommend using this model for the best results, but you can easily switch to other, more affordable

29:13.830 --> 29:14.430
models.

29:14.830 --> 29:18.430
For now, let's stick with GPT-4.

29:19.150 --> 29:24.630
The first message is what your agent says when a call starts.

29:25.190 --> 29:31.830
So, as you can see: the first message that the assistant will say. This can also be a URL to a containerized

29:31.830 --> 29:32.630
audio file.

29:34.870 --> 29:37.990
So something like hi.

29:41.590 --> 29:42.350
I'm Emily.

29:45.720 --> 29:46.800
Personal assistant.

29:50.280 --> 29:53.160
And the most important part is the system prompt.

29:53.400 --> 30:00.040
This is where you tell your agent how to act, what tone to use, and how to handle different situations.

30:00.240 --> 30:06.840
And we will also be using the dynamic variables we defined in the workflow.

30:07.520 --> 30:13.600
Now I'm going to use the prompt I've added to the resources section of this lesson.

30:14.560 --> 30:21.800
So I'll take the prompt and copy it to the system prompt of the agent we just created on the platform.

30:23.320 --> 30:23.800
All right.

30:23.800 --> 30:29.000
So as a role we have: you are Emily, a voice personal assistant for Damien.

30:29.880 --> 30:34.520
And now remember the code snippet we have in our phone call agent sub workflow.

30:35.320 --> 30:35.720
So.

30:40.920 --> 30:49.370
So in this node, the JSON defines all the key variables needed to make the call, like who the supervisor

30:49.370 --> 30:56.250
agent is calling, what the purpose of the call is, and how it should speak. So the supervisor

30:56.250 --> 31:03.290
agent passes those variables to this node, which then triggers a separate workflow.

31:03.290 --> 31:07.330
So this one that handles the actual call on Vapi.

31:07.650 --> 31:15.530
Now these fields are automatically filled by n8n at runtime based on what you ask the supervisor agent to do.

31:15.810 --> 31:24.770
So now we are going to use those same dynamic variables in our system prompt to make our agent sound

31:24.770 --> 31:26.490
smart and personal on every call.

31:26.770 --> 31:29.090
So let's go back to our system prompt.

31:32.050 --> 31:36.090
So by using these variables in both your workflow.

31:36.570 --> 31:46.260
So in the phone call agent node and your prompt, you create a pretty dynamic and intelligent agent that

31:46.260 --> 31:49.620
responds to exactly what the user asked.

31:49.900 --> 31:56.660
So you will be speaking with and we are using a variable with a type.

31:56.660 --> 31:58.220
And the purpose of this call is.

31:59.060 --> 32:04.220
So we use the call purpose dynamic variable and so on.

32:04.660 --> 32:09.380
And as additional notes, we want to keep responses in a specific response style.

32:10.140 --> 32:13.420
We want to maintain a tone throughout the conversation.

32:13.460 --> 32:16.380
We want to keep pauses for natural flow.

32:16.980 --> 32:19.820
Start by introducing yourself and stating the purpose.

32:20.020 --> 32:21.740
Be concise and to the point.

32:22.020 --> 32:27.180
If unsure about something, say, and here we want to use this fallback response.

32:27.500 --> 32:29.460
So when you check the workflow.

32:32.340 --> 32:38.260
You can see we want our voice agent to say something like I'm not sure, but I will check with them

32:38.260 --> 32:39.220
and get back to you.

32:39.820 --> 32:42.150
Do not mention actions outside the call,

32:42.190 --> 32:46.710
like emails, calendar updates, etc. Do not repeat yourself, and be very polite and friendly.

32:47.310 --> 32:51.550
Now, just to make it clear, this is just the schema.

32:51.670 --> 32:58.590
So the template, and when the workflow runs, it will automatically fill in most of these fields using

32:58.590 --> 33:04.990
data pulled from earlier steps, like the contact lookup or the Telegram message.

33:06.030 --> 33:14.030
But you can set defaults here, like I've done with the tone and the speaking style and the fallback

33:14.030 --> 33:14.630
response.

33:14.830 --> 33:20.390
So every time a call is made, the agent knows to speak in a friendly tone,

33:20.590 --> 33:26.910
keep responses concise, and pause briefly between sentences, while things like first name, phone number,

33:26.910 --> 33:31.510
and instructions will be dynamically filled based on the context of your request.
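
NOTE
A rough sketch of the "defaults plus dynamic values" idea described here. The key names,
the default strings and the build_call_variables helper are assumptions for illustration;
in the lesson the actual values live in the phone call agent node of the workflow.
```python
# Hypothetical defaults, mirroring the tone, style and fallback mentioned in the lesson.
DEFAULTS = {
    "tone": "friendly",
    "response_style": "concise, with brief pauses between sentences",
    "fallback_response": "I'm not sure, but I will check with them and get back to you.",
}
def build_call_variables(dynamic: dict) -> dict:
    """Merge per-request values over the defaults; empty values do not override."""
    filled = {key: value for key, value in dynamic.items() if value not in (None, "")}
    return {**DEFAULTS, **filled}
call = build_call_variables(
    {
        "first_name": "Mark",
        "phone_number": "+12025550100",
        "instructions": "Ask about tomorrow's meeting",
        "tone": "",  # left empty, so the default tone is kept
    }
)
print(call["tone"], call["first_name"])
```
Ignoring empty values is an assumption of this sketch; it keeps a blank field from
wiping out a sensible default.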

33:31.750 --> 33:33.070
Now we are back in Vapi.

33:33.990 --> 33:34.870
Just briefly.

33:34.910 --> 33:38.870
Here you can change the transcriber so you can switch to a different provider.

33:39.270 --> 33:43.470
But I like Deepgram. You can also change the language and model.

33:44.190 --> 33:49.790
And here in the voice section you can change the provider of the voice.

33:49.950 --> 33:55.030
So you can switch to ElevenLabs and others, OpenAI for example.

33:55.030 --> 34:02.150
And here you can pick your favorite voice. Right here you can test your voice agent and talk to it to

34:02.190 --> 34:03.830
check if you like the voice, for example.

34:04.150 --> 34:10.470
Now let's look at how to set up a phone number in Vapi so your agent can actually make calls.

34:11.390 --> 34:13.230
So click on phone numbers.

34:16.590 --> 34:17.590
Create phone number.

34:18.870 --> 34:22.830
Now if you are based in the US, you can use the free Vapi number.

34:23.430 --> 34:27.230
So just enter an area code and click create.

34:27.230 --> 34:29.390
And these numbers are completely free.

34:29.390 --> 34:33.670
And you can have up to ten of them in your account.

34:33.950 --> 34:39.790
So if you are outside of the US like me, you will need to use the Import Twilio tab.

34:40.030 --> 34:42.720
So you need to go to Twilio,

34:42.920 --> 34:47.000
create an account, purchase a number, and then import it here.

34:47.760 --> 34:54.320
So once it's connected, you need to assign it to your voice agent in the phone number section like

34:54.320 --> 34:55.360
I've done with this one.

34:55.360 --> 35:02.120
And I'm going to show you how to create an account on Twilio and purchase a number in a separate tutorial.

35:02.360 --> 35:06.280
So please go and find the tutorial in this section of the course.

35:06.320 --> 35:15.560
So once you have bought the phone number on Twilio, go to the phone numbers section and scroll down.

35:18.000 --> 35:19.360
You should see account info.

35:20.200 --> 35:22.080
Just copy this Account SID.

35:26.040 --> 35:27.200
And paste it right here.

35:29.760 --> 35:33.080
The same for auth token.

35:34.400 --> 35:35.240
Copy it.

35:37.600 --> 35:38.440
Paste it here.

35:43.530 --> 35:45.170
And your telephone number.

35:49.930 --> 35:52.410
Then simply click on import from Twilio and that's it.

35:53.330 --> 35:54.170
Alright, great.

35:54.690 --> 36:02.090
Once you successfully import the number from Twilio, you have to assign the voice agent you just created

36:02.130 --> 36:02.810
right here.

36:03.050 --> 36:05.850
Alright, now let's go back to our workflow.

36:05.890 --> 36:07.050
Alright, perfect.

36:07.090 --> 36:08.850
Now we've got the full setup complete.

36:09.370 --> 36:17.570
We have built the main n8n workflow, connected the three sub-workflows for email, calendar and phone

36:17.570 --> 36:21.890
calls, and linked everything to our AI agent in Vapi.

36:22.050 --> 36:27.010
There is just one more important part I want to walk you through, and that's the system prompt for

36:27.010 --> 36:28.450
our supervisor agent.

36:28.690 --> 36:29.770
So let's have a look.

36:33.530 --> 36:34.570
Let's expand it.

36:36.370 --> 36:39.860
Now this prompt is what tells the supervisor agent exactly what to do.

36:39.900 --> 36:41.100
So the role is: you're

36:41.140 --> 36:41.580
Emily,

36:42.140 --> 36:43.500
Damian's personal assistant.

36:43.540 --> 36:47.060
Your primary role is to delegate tasks to the appropriate tool.

36:47.380 --> 36:49.380
You don't complete tasks yourself.

36:49.380 --> 36:55.220
Instead, you ensure that each request is routed correctly with all necessary details passed to the

36:55.220 --> 36:56.740
right tool for execution.

36:57.860 --> 36:59.100
Now, we listed all the tools.

36:59.100 --> 37:05.060
So the phone call agent, which initiates phone calls by following provided instructions, ensuring clear and

37:05.060 --> 37:08.700
effective communication. Contacts data:

37:08.980 --> 37:14.260
so for any task involving communication (calls, emails, scheduling), always retrieve and verify contact

37:14.260 --> 37:18.620
details using the contacts data tool before passing them to the relevant agent.

37:18.620 --> 37:25.260
So we want our supervisor agent to use the contacts data Google Sheet to retrieve all the contact details

37:25.260 --> 37:26.660
to perform all the tasks.
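
NOTE
A small sketch of the "retrieve and verify before passing on" rule for contact details.
The CONTACTS rows, the field names and both helpers are hypothetical stand-ins for the
contacts Google Sheet; the real lookup in the lesson is done by the contacts data tool.
```python
from typing import Optional
# Hypothetical rows standing in for the contacts Google Sheet.
CONTACTS = [
    {"first_name": "Mark", "phone_number": "+12025550100", "email": "mark@office.test"},
]
def find_contact(first_name: str) -> Optional[dict]:
    """Return the matching contact row, or None; never invent a placeholder."""
    wanted = first_name.strip().lower()
    for row in CONTACTS:
        if row["first_name"].lower() == wanted:
            return row
    return None
def verified_email(first_name: str) -> Optional[str]:
    """Return an email address only if the contact exists and has one on file."""
    contact = find_contact(first_name)
    if contact is None or not contact.get("email"):
        return None
    return contact["email"]
print(verified_email("Mark"), verified_email("Anna"))
```
Returning None instead of a made-up address lets the caller stop before an email or
call is attempted for an unknown contact.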

37:28.420 --> 37:29.260
Email agent.

37:29.260 --> 37:34.780
So use this for handling all email related actions such as sending messages, setting up replies, or

37:34.780 --> 37:35.980
forwarding emails.

37:35.980 --> 37:40.910
Only send emails to verified email addresses found in the contacts list.

37:40.910 --> 37:44.470
Always sign emails as Damian and never use placeholders like "your name".

37:45.070 --> 37:46.590
So of course you can modify this prompt.

37:46.590 --> 37:48.630
You can change it to your name.

37:49.630 --> 37:55.150
If an email address for a recipient cannot be found in the contact list, do not attempt to send the

37:55.150 --> 38:00.990
email or use placeholder addresses like example.com. Calendar agent:

38:01.030 --> 38:04.270
Use this for scheduling, updating, or managing calendar events.

38:04.790 --> 38:08.430
And now very important how to use the phone call agent.

38:08.430 --> 38:12.390
If the user wants to make a phone call, trigger the phone call agent tool.

38:12.710 --> 38:16.430
Ensure the following details are passed to the calling agent.

38:17.070 --> 38:20.270
First name, so the name of the person or business being called.

38:20.310 --> 38:29.310
Type, so the contact type, instructions, phone number, call purpose, response style, tone, etc.

38:29.310 --> 38:34.750
So these are the exact same variables we saw earlier in the phone call agent node.

38:34.990 --> 38:40.480
And that's because the supervisor agent is responsible for passing those values to the tool

38:40.680 --> 38:42.160
when it's time to make the call.

38:42.160 --> 38:47.360
So the calling agent already understands it is a personal assistant making the call, so only provide

38:47.360 --> 38:48.120
essential details.

38:48.120 --> 38:50.360
Avoid unnecessary context or explanations.

38:50.680 --> 38:55.880
And here we are providing examples for all the variables.
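
NOTE
A hedged sketch of a completeness check on the details the supervisor agent hands to the
phone call agent. The field list follows the variables read out above; the exact key
spellings and the missing_fields helper are assumptions, not part of the lesson template.
```python
# Field names follow the list in the prompt; the exact keys are assumed.
REQUIRED_FIELDS = (
    "first_name",
    "type",
    "instructions",
    "phone_number",
    "call_purpose",
    "response_style",
    "tone",
)
def missing_fields(payload: dict) -> list:
    """Return the required fields that are absent or blank, in a stable order."""
    missing = []
    for name in REQUIRED_FIELDS:
        value = payload.get(name)
        if value is None or not str(value).strip():
            missing.append(name)
    return missing
partial = {"first_name": "Mark", "type": "colleague", "tone": " "}
print(missing_fields(partial))
```
A check like this could run before the call is triggered, so an incomplete request is
caught in the workflow rather than during the conversation.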

38:59.480 --> 39:01.200
You also have rules and best practices.

39:01.440 --> 39:02.920
So always respond in English.

39:02.920 --> 39:08.000
Always delegate tasks using the correct tool and never complete them manually.

39:09.760 --> 39:10.560
And so on.

39:11.520 --> 39:18.920
And when a task requires using one or more of these tools, make sure to identify which tool is most

39:18.920 --> 39:19.640
appropriate.

39:19.840 --> 39:23.920
Pass along the relevant details and execute the actions needed to complete the task.

39:24.240 --> 39:29.160
Your goal is to be proactive, precise, and organized in managing these resources to provide a smooth

39:29.160 --> 39:30.400
experience for the user.

39:30.440 --> 39:31.920
And lastly, we also provide this:

39:31.960 --> 39:33.720
"Here is the current time and date."

39:34.160 --> 39:34.560
All right.

39:34.600 --> 39:35.080
Awesome.

39:35.920 --> 39:37.040
Let's close it for now.
