WEBVTT

00:00.120 --> 00:00.440
Well.

00:00.440 --> 00:01.120
Hello there.

00:01.120 --> 00:02.920
Today is a blue day.

00:02.960 --> 00:07.560
Today is about completing the Business Commercial project as part of week two.

00:07.840 --> 00:11.200
All about accelerating your business or your client's business.

00:11.440 --> 00:12.840
I put it to you.

00:12.840 --> 00:18.200
I believe that today is going to be a day that is going to be relatively low on challenge and relatively

00:18.200 --> 00:21.440
high on satisfaction, but you'll be the judge of that.

00:21.440 --> 00:22.080
Let's see.

00:22.120 --> 00:29.000
Today we are wrapping up our expert product for being an expert for your client.

00:29.000 --> 00:34.800
And we're going to be building our RAG agent and our voice agent, of course, bringing back the return

00:34.840 --> 00:36.680
of ElevenLabs and voice.

00:37.080 --> 00:42.600
And of course, I have to begin one more time with a recap on RAG, and this will be the last one for

00:42.600 --> 00:43.000
RAG.

00:43.200 --> 00:45.640
But it's always, always good stuff to go through.

00:45.640 --> 00:49.840
There's a couple of times repetition is the best way to learn, but you're pretty familiar with this

00:49.840 --> 00:50.520
stuff now.

00:50.560 --> 00:53.080
The user asks a chat question.

00:53.080 --> 00:59.640
It comes into our code, perhaps running in n8n. And the big idea behind RAG:

00:59.640 --> 01:04.950
We want to try and look up some useful information, and the way we do it is we use a different kind

01:04.950 --> 01:12.710
of LLM called an encoder, or an embedding model, or a vector embedding model, or any

01:12.710 --> 01:13.550
of those things.

01:13.710 --> 01:18.590
And it takes the text and it turns it into a set of numbers that we call a vector.

01:18.790 --> 01:28.110
And if you're using the very popular OpenAI text-embedding-3-small model, then it comes back with 1536 numbers.

01:28.550 --> 01:29.510
Correct me if I'm wrong.

01:29.710 --> 01:36.390
And you can think of that as being like a point in 1536-dimensional space.

01:36.550 --> 01:42.790
And it's done such that you can then go to all of your data sitting in a type of database that we sometimes

01:42.790 --> 01:43.790
call a knowledge base.

01:43.950 --> 01:45.070
And you have indexed it.

01:45.070 --> 01:50.550
You've got all of the bits of data that have also already had vectors calculated for them.

01:50.550 --> 01:51.590
That's what we did yesterday.

01:51.750 --> 01:55.950
So we've got all of these 1536-dimensional vectors.

01:56.150 --> 01:57.990
And we can find the ones that are closest.

01:57.990 --> 02:03.830
And that's likely to give us some material that is going to be relevant to answering the question and

02:03.830 --> 02:08.350
we take that material, we take the original text associated with it, and we send that to the LLM.

02:08.350 --> 02:14.870
And back comes a knowledgeable response, giving the illusion that the LLM itself knows about

02:14.870 --> 02:19.430
everything in the database, because whatever question you ask, we bring back relevant context for

02:19.470 --> 02:19.990
that.

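NOTE Editor's sketch of the lookup step described above, assuming toy 3-dimensional vectors in place of real 1536-dimensional embeddings and an in-memory list in place of the vector store. The rows, the vector values and the names KNOWLEDGE_BASE, cosine_similarity and top_k are hypothetical; cosine similarity is one common way to measure "closest", not necessarily the metric used in the course.
```python
import math
# Hypothetical rows: (original text, toy embedding). Real embeddings would have 1536 numbers.
KNOWLEDGE_BASE = [
    ("Returns are accepted within 30 days.", [0.9, 0.1, 0.0]),
    ("Shipping takes two business days.", [0.1, 0.9, 0.0]),
    ("Our office is closed on Sundays.", [0.0, 0.2, 0.9]),
]
def cosine_similarity(a, b):
    # Dot product divided by the product of the vector lengths.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)
def top_k(query_vector, rows, k=2):
    # Rank every stored row by similarity to the query and keep the k closest texts.
    ranked = sorted(rows, key=lambda row: cosine_similarity(query_vector, row[1]), reverse=True)
    return [text for text, _ in ranked[:k]]
# The texts returned here are what would be sent to the LLM as context.
context = top_k([1.0, 0.0, 0.0], KNOWLEDGE_BASE, k=1)
```
In a real workflow the query vector would come from the same embedding model that indexed the rows.
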
02:19.990 --> 02:24.150
And we learned about chunking and the fact that you typically take your documents and divide them up.

02:24.150 --> 02:27.910
But as it happened, none of our documents were more than 1000 characters.

02:27.910 --> 02:33.550
So we ended up with exactly one chunk per row in the Google sheet that we used as our source.

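NOTE Editor's sketch of why short documents give one chunk each, assuming plain fixed-size splitting at 1000 characters. The function name chunk_text is hypothetical, and this is a simplification: real text splitters usually also split on separators and overlap adjacent chunks.
```python
def chunk_text(text, chunk_size=1000):
    # Cut the text into consecutive pieces of at most chunk_size characters.
    if not text:
        return []
    return [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]
# A document under the limit stays whole; a longer one is divided up.
short_chunks = chunk_text("a" * 400)
long_chunks = chunk_text("b" * 2500)
```
With every row under the limit, the number of chunks equals the number of rows.
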
02:33.550 --> 02:35.070
And there are two different phases of RAG.

02:35.110 --> 02:40.950
There's the data ingest, which is what we did yesterday, and data ingest is where you have some source

02:40.950 --> 02:41.670
of data.

02:41.670 --> 02:44.510
And first of all, you extract your data from it.

02:44.510 --> 02:47.230
You transform it in our case with the field mapping node.

02:47.270 --> 02:53.430
And then you can chunk it and vectorize it, or form vector embeddings from it.

02:53.430 --> 02:56.870
And then you load that into your vector store.

02:56.870 --> 03:03.910
In our case we used a Postgres database in Supabase, the managed Postgres service.

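NOTE Editor's sketch of the ingest phase (extract, transform, vectorize, load), assuming a hard-coded row in place of the Google Sheet, a Python list in place of the Supabase table, and a placeholder in place of the embedding model. The column names Product, Description and Category and every function name are hypothetical. Chunking is skipped because each row is short enough to be a single chunk.
```python
def extract():
    # Stand-in for reading rows from the source sheet.
    return [{"Product": "Widget", "Description": "A small widget.", "Category": "Hardware"}]
def transform(row):
    # Field mapping: reshape a source row into the content and category fields.
    return {"content": f"{row['Product']}: {row['Description']}", "category": row["Category"]}
def fake_embed(text, dims=4):
    # Placeholder only: NOT a real embedding model, just deterministic numbers.
    return [float((len(text) + i) % 7) for i in range(dims)]
def ingest(rows, store):
    # Transform each row, compute its vector, and load it into the store.
    for row in rows:
        doc = transform(row)
        store.append({
            "content": doc["content"],
            "metadata": {"category": doc["category"]},
            "embedding": fake_embed(doc["content"]),
        })
    return store
vector_store = ingest(extract(), [])
```
Running ingest again on a schedule, or when a new document arrives, is what would keep the store up to date.
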
03:04.070 --> 03:08.300
And this is something that can run once as a way to load in your vector store.

03:08.300 --> 03:12.140
But it's common to have this running as kind of live data pipes.

03:12.140 --> 03:14.300
And you could do that in a few different ways.

03:14.300 --> 03:19.780
You could do it by having it run on a schedule so that it brings in things on some schedule, or the

03:19.780 --> 03:21.420
thing that you could do, should you wish.

03:21.420 --> 03:26.860
And I will really equip you to do it next week, is have something that, like, listens on a shared drive,

03:26.860 --> 03:32.460
like a Google folder, and then when a new document is dropped in there, it automatically kicks this

03:32.460 --> 03:37.540
whole process off, loads in that document, and then puts all of that document and all of its chunks

03:37.540 --> 03:38.820
into your vector store.

03:38.860 --> 03:40.420
That would be really cool.

03:40.420 --> 03:45.260
And that's the kind of thing you'd actually have running live in production, constantly keeping your

03:45.260 --> 03:46.940
vector store up to date.

03:46.940 --> 03:52.140
And of course, the other side of this is the thing that actually answers the user's question.

03:52.180 --> 03:58.700
The user asks a question, we have an AI agent that can answer it, and the agent has an LLM powering

03:58.700 --> 04:03.620
it that is used both to actually give the answer, but also to orchestrate how to answer it.

04:03.620 --> 04:08.940
And as part of doing that, it has access to a tool, and that tool is a tool that allows it to vectorize

04:08.940 --> 04:13.940
the question that's being asked, and look it up in the vector store to find relevant content.

04:13.940 --> 04:16.820
And this is, of course, what we call agentic RAG.

04:16.820 --> 04:20.620
And you could also equip it with other tools to dig through the data as well.

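NOTE Editor's sketch of the agent-plus-tool loop described above, assuming a stub function in place of the chat model and a shared-word ranking in place of the real vector lookup. DOCS, search_knowledge_base, stub_llm and run_agent are hypothetical names; a real agent lets the LLM decide when to call the tool.
```python
DOCS = ["The Widget costs 10 dollars.", "The Gadget ships in two days."]
def search_knowledge_base(query):
    # Stand-in for the vector-store tool: ranks by shared words instead of embeddings.
    words = set(query.lower().split())
    return max(DOCS, key=lambda d: len(words & set(d.lower().rstrip(".").split())))
TOOLS = {"search_knowledge_base": search_knowledge_base}
def stub_llm(question, context=None):
    # Stand-in for the chat model: first it requests the tool, then it answers from the context.
    if context is None:
        return {"tool": "search_knowledge_base", "input": question}
    return {"answer": f"Based on our records: {context}"}
def run_agent(question):
    # Orchestration loop: keep running requested tools until the model gives an answer.
    step = stub_llm(question)
    while "tool" in step:
        context = TOOLS[step["tool"]](step["input"])
        step = stub_llm(question, context)
    return step["answer"]
reply = run_agent("How much does the widget cost")
```
Further tools could be added to the TOOLS dictionary in the same way.
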
04:20.860 --> 04:24.300
And so that is the second phase, which is what we'll be focusing on today.

04:24.340 --> 04:30.660
And as well, you know, the business challenge that I set for this week is this kind of classic

04:30.900 --> 04:38.020
business commercial opportunity for agentic AI, for generative AI, which is an expert question

04:38.020 --> 04:45.100
answerer that's able to answer questions as if the LLM knows all about your business or, in our case,

04:45.100 --> 04:51.300
all about the products that your client offers and can answer any kind of question about these products.

04:51.540 --> 04:57.220
And as I say, so classic, it's so clear to see how you can use this to accelerate a business, to

04:57.220 --> 05:02.620
be able to have workers effectively at the business that are able to do the more manual, more menial

05:02.660 --> 05:03.420
tasks.

05:03.740 --> 05:10.020
By manual, I mean the more sort of manually intensive tasks that should be ripe for this kind of solution.

05:10.260 --> 05:13.380
And for our solution, we're using Supabase.

05:13.380 --> 05:20.700
It is a provider, a cloud provider of Postgres, which is a very common, scalable relational database

05:20.700 --> 05:23.980
used by many enterprises.

05:24.180 --> 05:29.900
We're using some sort of advanced approaches, but we have to copy and paste

05:29.900 --> 05:32.460
in that code, that slightly janky code.

05:32.460 --> 05:37.580
And I promised you you could just generate that with ChatGPT if you had to do it on your own, and it

05:37.580 --> 05:38.260
would do it for you.

05:38.260 --> 05:41.700
It could explain it to you, it can fix it if there are problems.

05:41.700 --> 05:44.700
So whilst this is meant to be low-code, it's not no-code.

05:44.700 --> 05:46.100
It's a low-code course.

05:46.100 --> 05:48.780
So hopefully I don't do too much coding.

05:48.780 --> 05:52.860
That was one moment when we looked at it, but it's cookie-cutter stuff.

05:52.860 --> 05:54.660
You can use it anywhere.

05:55.020 --> 05:58.820
And we worked on the data ingest side of the house yesterday.

05:58.980 --> 06:01.220
We did some interesting work to do that, that field mapping.

06:01.220 --> 06:06.340
If you remember, transforming data from one format in the Google Sheet to the format with a content

06:06.340 --> 06:10.170
and category that we wanted to put in our vector data store.

06:10.410 --> 06:11.570
And that is done.

06:11.570 --> 06:16.730
And we've loaded up our RAG, our knowledge base, in our RAG project in Supabase.

06:16.730 --> 06:22.010
It has 60 rows, which all have vectors with 1536 dimensions.

06:22.010 --> 06:22.970
I hope I'm getting that right.

06:23.050 --> 06:25.690
Uh, and we've got all that in there.

06:25.770 --> 06:27.650
Remember those dimensions?

06:27.650 --> 06:32.210
The clever part comes from the encoder, from the OpenAI embeddings

06:32.210 --> 06:33.450
small model.

06:33.690 --> 06:35.050
That's what did it.

06:35.090 --> 06:41.410
And our mission for today, of course, is to add the question answering part

06:41.410 --> 06:45.330
of this and then add a voice agent and then we will declare victory.

06:45.330 --> 06:50.090
Well, here we are back in n8n, and it's time for us to take a final look at this beautiful ingest

06:50.090 --> 06:53.730
pipeline and then make sure that you've saved it.

06:53.850 --> 07:01.330
And then go back to this screen and it's time to create a new workflow.

07:01.410 --> 07:06.650
This is going to be the home of our question answerer, our Agentic AI.

07:07.050 --> 07:13.480
And we're going to start this time by having a first step that is going to be a good old on chat message.

07:13.880 --> 07:16.120
Our original and favorite.

07:16.200 --> 07:17.000
Uh, there it is.

07:17.000 --> 07:18.120
When chat message is received.

07:18.120 --> 07:18.720
We know this one.

07:18.720 --> 07:20.280
Well, that's how we will begin.

07:20.440 --> 07:21.360
And you know the deal.

07:21.360 --> 07:24.160
The next thing we do is we hit tab to add a node.

07:24.160 --> 07:29.920
We want to add an AI node AI agent and then press escape.

07:29.920 --> 07:31.000
And here it is.

07:31.200 --> 07:36.400
And for once it's good to leave that uh, if I double click, it's fine to keep this connected chat

07:36.440 --> 07:38.040
trigger node and JSON input.

07:38.040 --> 07:39.120
That's exactly what we want.

07:39.160 --> 07:41.960
You now have a better idea about what that means and why it says that.

07:42.080 --> 07:45.360
So let's leave that and go to chat model.

07:45.400 --> 07:46.960
I guess we'll go with Gemini again.

07:46.960 --> 07:48.360
Gemini chat model.

07:48.360 --> 07:50.120
Of course, you could pick whatever chat model you want.

07:50.120 --> 07:52.240
Use OpenRouter if you like that.

07:52.400 --> 07:59.880
And we will go with models slash Gemini 3 Flash

08:00.240 --> 08:01.080
Preview.

08:01.120 --> 08:02.120
There it is.

08:02.600 --> 08:05.880
And, uh, that's the model we're picking here.

08:06.040 --> 08:07.960
Memory: we'll just use a simple memory.

08:08.400 --> 08:11.520
And it's time for us to add in the tool.
