WEBVTT

00:00.080 --> 00:00.480
Okay.

00:00.520 --> 00:02.760
So I'm going to go over to my workflow here.

00:02.760 --> 00:09.200
And I am going to go to the webhook and I'm going to take the test URL.

00:09.200 --> 00:10.440
I'm going to copy this.

00:10.760 --> 00:17.760
I'm now going to go over here and paste it in there. That is the webhook URL.

00:17.960 --> 00:20.000
And now I'm going to start recording.

00:20.920 --> 00:21.960
Well hi there.

00:22.800 --> 00:24.160
Oh, I have to say it out loud while I use this site.

00:24.200 --> 00:24.640
Hang on.

00:25.000 --> 00:25.880
Well hi there.

00:25.920 --> 00:27.160
What is two plus two?

00:29.120 --> 00:29.880
Let's try this.

00:30.720 --> 00:31.600
Well hi there.

00:31.600 --> 00:32.800
What is two plus two?

00:33.600 --> 00:35.280
That seems to record quite nicely.

00:35.440 --> 00:37.960
And now I'm going to say Send to n8n.

00:37.960 --> 00:43.840
But first, of course, go back to n8n, go back to the beginning, and say Execute workflow.

00:43.840 --> 00:46.560
So it starts listening, waiting for you to call the test URL.

00:46.840 --> 00:49.840
Go over here, Send to n8n, back over here.

00:51.040 --> 00:53.000
And we've got a problem right away.

00:53.000 --> 00:54.720
And it is a problem we're expecting.

00:54.920 --> 00:57.200
Cannot read properties of undefined.

00:57.240 --> 00:59.160
Let's double click and see what happens.

00:59.160 --> 01:04.480
So in came something called audio, and the file it was looking for was called data.

01:04.520 --> 01:05.560
That's not right.

01:05.560 --> 01:07.840
So we have to change this to say audio.

01:08.000 --> 01:11.560
And now it knows that it's looking for a file that's called audio.
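
NOTE
A minimal sketch of the mismatch described above, assuming the webhook stores each uploaded file under the form field name the web page used.
The item layout, the file name, and the find_binary helper are hypothetical illustrations, not n8n's actual internals.
```python
# Hypothetical model of the item the webhook hands to the next node:
# uploaded files are stored under the form field name used by the web page.
incoming_item = {"binary": {"audio": {"fileName": "recording.webm", "mimeType": "audio/webm"}}}
def find_binary(item: dict, field_name: str) -> dict:
    """Return the uploaded file stored under field_name, or raise a clear error."""
    files = item.get("binary", {})
    if field_name not in files:
        raise KeyError(f"no binary field {field_name!r}; available: {sorted(files)}")
    return files[field_name]
try:
    find_binary(incoming_item, "data")  # the setting before the fix
except KeyError as err:
    print("lookup failed:", err)
print(find_binary(incoming_item, "audio")["fileName"])  # the setting after the fix
```
Changing the field name from data to audio is the same one-word fix made in the node settings.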

01:11.960 --> 01:12.480
Okay.

01:12.520 --> 01:13.400
Let's save it.

01:13.720 --> 01:14.520
Back we go.

01:14.760 --> 01:15.480
We go back in.

01:15.480 --> 01:18.720
Oh, we have to press Execute workflow back here.

01:18.720 --> 01:20.960
And now we can just press Send to n8n again.

01:22.040 --> 01:22.320
Back.

01:22.320 --> 01:22.960
Here we go.

01:23.440 --> 01:26.360
And now we've got a problem in node AI agent.

01:26.360 --> 01:27.640
It's going to be the same thing.

01:27.680 --> 01:29.440
All right, so I double click here.

01:29.480 --> 01:30.400
Let's have a look.

01:30.680 --> 01:34.280
You see the source: it thinks Connected Chat Trigger Node.

01:34.280 --> 01:35.160
That's no good.

01:35.200 --> 01:37.720
We have to click here and say Define below.

01:38.000 --> 01:41.560
Because, as before, we are going to give it some details of where to look.

01:41.560 --> 01:42.960
We're not going to tell it something fixed.

01:43.000 --> 01:45.320
We want it to evaluate an expression.

01:45.760 --> 01:48.240
And we're going to get the expression from over here.

01:48.240 --> 01:49.760
We want to take the text.

01:49.760 --> 01:50.360
Well hi there.

01:50.360 --> 01:51.520
What is two plus two?

01:51.680 --> 01:52.520
That's what we want.

01:52.560 --> 01:53.080
Isn't it cool?

01:53.080 --> 01:55.840
That is the transcribed text that came from ElevenLabs.

01:56.040 --> 01:58.440
Uh, so uh, it works.

01:58.440 --> 02:01.160
So we're going to take that and drop that in there.

02:01.160 --> 02:03.950
Or you could think in your mind, what do you think the expression is going to be?

02:04.550 --> 02:08.670
See if you were right: it is simply {{ $json.text }} as an expression.

02:08.670 --> 02:12.110
You can see the result. That is now going to work.
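
NOTE
A rough sketch of what evaluating that expression amounts to, assuming the transcription step outputs a JSON object with a text field.
The resolve function is a simplified stand-in written for illustration; it is not how n8n evaluates expressions internally.
```python
import re
def resolve(template: str, item_json: dict) -> str:
    """Replace each {{ $json.<key> }} placeholder with the matching value from item_json."""
    pattern = re.compile(r"\{\{\s*\$json\.(\w+)\s*\}\}")
    return pattern.sub(lambda m: str(item_json[m.group(1)]), template)
# Assumed shape of the transcription node's output, based on the text seen in the video.
transcription_output = {"text": "Well hi there. What is two plus two?"}
print(resolve("{{ $json.text }}", transcription_output))
```
A fixed prompt would ignore the incoming item, which is why the field is set to an expression rather than static text.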

02:12.430 --> 02:12.990
Okay.

02:13.310 --> 02:14.110
Escape back.

02:14.110 --> 02:19.470
We go to the canvas, and now we're going to try it one more time. Okay.

02:19.590 --> 02:22.030
So we are going to press Execute workflow.

02:22.070 --> 02:23.390
We're going to go back here.

02:23.390 --> 02:26.750
We're going to press Send to n8n and come back here.

02:26.750 --> 02:27.390
Off it goes.

02:27.390 --> 02:28.590
Off it goes, off it goes.

02:28.630 --> 02:29.590
Two plus two is four.

02:30.150 --> 02:30.590
Ha.

02:31.710 --> 02:32.430
There we go.

02:32.550 --> 02:33.550
Worked first time.

02:33.750 --> 02:36.830
Uh, and because that happened so fast, I'm going to do it one more time.

02:36.830 --> 02:38.310
I wasn't expecting it so fast.

02:38.510 --> 02:39.950
Uh, let's try this again.

02:39.950 --> 02:42.310
Let's, uh, press execute workflow.

02:42.910 --> 02:43.870
I'm going to go here.

02:44.510 --> 02:46.670
I'm going to go Send to n8n.

02:50.230 --> 02:50.670
Hello.

02:50.670 --> 02:51.790
Two plus two is four.

02:52.110 --> 02:53.270
How can I help you today?

02:55.750 --> 02:56.910
Fabulous.

02:57.110 --> 02:58.230
So there you have it.

02:58.270 --> 03:01.470
We have a working audio app.

03:01.710 --> 03:06.030
It was a little bit of work, but it was great for you to see this as I promised.

03:06.070 --> 03:08.110
You see that n8n is in control.

03:08.110 --> 03:09.870
You've got the full workflow here.

03:09.870 --> 03:15.470
You've got a webhook, which is where the audio is posted in from our web page; that gets transcribed

03:15.470 --> 03:18.830
with an API call, and the text goes through our AI agent.

03:18.830 --> 03:19.830
We're using Gemini.

03:19.870 --> 03:26.430
This time, Gemini converts the text back to speech again, and this node ensures that what gets replied

03:26.430 --> 03:34.070
to in this web request is in fact the audio file version of this, and that's why it replies with the

03:34.070 --> 03:34.710
audio.
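
NOTE
A minimal sketch of the flow just summarized, with each node replaced by a stub so the order of the stages is easy to see.
Every function name, the fixed transcript, and the audio/mpeg content type are assumptions for illustration; the real nodes call external APIs.
```python
from dataclasses import dataclass
@dataclass
class AudioReply:
    """What the final node sends back in the web response."""
    content_type: str
    body: bytes
def transcribe(audio: bytes) -> str:
    # Stand-in for the speech-to-text API call; returns a fixed transcript.
    return "What is two plus two?"
def run_agent(prompt: str) -> str:
    # Stand-in for the AI agent node; a real agent would call a language model.
    return "Two plus two is four." if "two plus two" in prompt.lower() else "How can I help you today?"
def synthesize(text: str) -> bytes:
    # Stand-in for text-to-speech; real audio bytes would come from an API.
    return text.encode("utf-8")
def handle_webhook(audio: bytes) -> AudioReply:
    """Chain the stages in the order the workflow wires them together."""
    text = transcribe(audio)
    answer = run_agent(text)
    return AudioReply(content_type="audio/mpeg", body=synthesize(answer))
reply = handle_webhook(b"fake-recording")
print(reply.content_type, len(reply.body))
```
The web page only ever sees the final reply, so any stage can be swapped without changing the page.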

03:34.830 --> 03:39.870
Congratulations, you've just made your first audio voice agent.

03:39.990 --> 03:41.790
I call it an audio voice agent.

03:41.790 --> 03:43.550
There are other kinds of voice agents.

03:43.550 --> 03:43.910
It is.

03:43.950 --> 03:45.430
It is a voice agent.

03:45.630 --> 03:49.550
Uh, and now, you know, the mission for you is to experiment with this.

03:49.590 --> 03:53.390
Obviously, I didn't put in a memory because we don't have, like, a chat history ID, so you'd have

03:53.390 --> 03:54.910
to futz around with that a bit.

03:54.910 --> 03:58.310
But do that, give it some memory, maybe even give it a tool or two.
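
NOTE
One possible starting point for the memory experiment, assuming the web page sends some session ID with each request.
The session ID values, the message format, and the remember helper are hypothetical and not taken from the video.
```python
from collections import defaultdict
# Conversation history keyed by a session ID that the web page would send with each request.
# The ID name and message format here are hypothetical, not taken from the video.
histories: dict[str, list[str]] = defaultdict(list)
def remember(session_id: str, role: str, message: str) -> list[str]:
    """Append one turn to the session's history and return the full history so far."""
    histories[session_id].append(f"{role}: {message}")
    return histories[session_id]
remember("browser-tab-1", "user", "What is two plus two?")
remember("browser-tab-1", "assistant", "Two plus two is four.")
print(len(histories["browser-tab-1"]))  # number of stored turns for this session
print(len(histories["browser-tab-2"]))  # a different session starts empty
```
Without a stable ID there is nothing to key the history on, which is the gap mentioned above.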

03:58.350 --> 04:04.700
You're a pro with tools. And then have a good old conversation via this web page.

04:04.900 --> 04:10.420
And so the idea is that this web page that we've set up, this kind of fake web page could be anything.

04:10.420 --> 04:12.900
You could embed this in your WordPress site.

04:12.900 --> 04:14.780
You could turn this into something fancier.

04:14.820 --> 04:17.540
ChatGPT would have an easier time making this look prettier.

04:17.540 --> 04:22.140
And you can have it be something that will allow you to speak to your n8n flow.

04:22.140 --> 04:23.100
And you should.

04:23.100 --> 04:24.980
This is great experience.

04:24.980 --> 04:28.220
It's of course not the way I recommend we do it, which is what we're going to do next.

04:28.220 --> 04:31.380
But it's still a good way to see APIs in action.

04:31.380 --> 04:36.140
And if you had a bigger flow with lots of other things going on, but at some point you wanted to call

04:36.140 --> 04:38.380
out and have the audio side of it.

04:38.380 --> 04:42.620
This is exactly how you would do it as part of a bigger workflow.

04:43.460 --> 04:47.780
Okay, so that wraps up approach number one.

04:47.980 --> 04:49.100
And approach number two.

04:49.140 --> 04:51.740
As I said, it's more complicated in some ways.

04:51.740 --> 04:56.100
It's also going to be simpler and quicker for us to do, perhaps because we've already got some

04:56.100 --> 04:57.180
experience with it.

04:57.180 --> 05:04.180
But let's put this one to bed and dive into approach number two, when ElevenLabs calls the

05:04.180 --> 05:04.860
shots.
