WEBVTT

00:00.120 --> 00:05.760
And with that, it gives me enormous pleasure to introduce you to this week's business challenge.

00:06.000 --> 00:10.880
This week is, of course, all about accelerating your business or your client's business.

00:10.880 --> 00:17.000
And this is such a classic use case, something that you can immediately go out there

00:17.000 --> 00:18.760
and implement for clients.

00:18.800 --> 00:24.680
The setup is we need to build an expert that is more than a chatbot.

00:24.760 --> 00:30.080
This has got to be something that can answer detailed, in-depth questions for your client about their

00:30.080 --> 00:36.480
business. The client in this setup has a lot of information about a lot of products

00:36.600 --> 00:43.240
in a big Google Sheet, a huge document, and they want something that can quickly answer questions that

00:43.240 --> 00:45.600
pertain to any of this information.

00:45.760 --> 00:48.680
And of course, we'll be working with a fairly big, but not massive, sheet.

00:48.720 --> 00:54.000
The idea is this needs to be scalable so that you could be working with so much data.

00:54.000 --> 00:59.600
You could be working with enterprise level data if you wanted, or if you're a startup, then your company,

00:59.600 --> 01:00.920
your client's data.

01:00.930 --> 01:04.810
So it should be able to work with huge quantities of data.

01:04.810 --> 01:07.410
Classic RAG business setup.

01:07.450 --> 01:14.050
Okay, so for our solution, as I mentioned yesterday, we're going to be using this managed database

01:14.050 --> 01:15.730
provider called Supabase.

01:15.930 --> 01:17.610
There are easier ways we could do this.

01:17.610 --> 01:22.930
There are built-in databases in n8n, with nodes that make life much easier for us.

01:22.970 --> 01:25.610
And I've made life a bit harder for us by choosing Supabase.

01:25.610 --> 01:29.010
And I had a reason for that, which is that Supabase is very popular.

01:29.010 --> 01:33.370
It's a great third party database, and it's something that many other people use.

01:33.370 --> 01:38.730
And so by giving you the skills to integrate with Supabase, it's like a big tick in the integrations

01:38.730 --> 01:41.010
list for you to have worked with Supabase.

01:41.210 --> 01:45.650
And it means that it's another set of credentials that will be set up, and it's something that you'll

01:45.650 --> 01:48.370
be able to use and extend for many other purposes.

01:48.370 --> 01:51.250
So I think it's an important infrastructure step.

01:51.250 --> 01:56.450
And that's why I picked a slightly harder challenge in saying we would integrate with a Postgres database

01:56.450 --> 01:57.330
on Supabase.

01:57.330 --> 02:02.420
It has one downside, which is there is going to be one point where we have to run some code, which

02:02.420 --> 02:08.660
I'm going to give you and which you will execute in Supabase to set up the database, and you will be

02:08.660 --> 02:11.580
within your rights to think, ah, this is meant to be a low-code,

02:11.580 --> 02:12.740
no-code course.

02:12.900 --> 02:14.820
How would I be able to do this myself?

02:14.820 --> 02:16.020
And I will talk you through that at the time.

02:16.020 --> 02:17.940
I'm not going to talk through the code.

02:17.940 --> 02:21.060
I'm going to explain how you could generate this yourself.

02:21.220 --> 02:25.340
It's super easy. And I'll explain how you could handle things like problems with it.

02:25.340 --> 02:28.780
So that's just going to be an obstacle for us to get through.

02:28.780 --> 02:31.940
But apart from that, Supabase is going to be fabulous.

02:31.940 --> 02:36.340
And it's certainly very much a heavyweight, scalable solution.

02:36.340 --> 02:41.100
And when I say heavyweight, I mean in the enterprise-grade sense. It's actually quite

02:41.100 --> 02:43.020
a lightweight, easy-to-use platform.

02:43.020 --> 02:44.340
Supabase, I think you're going to like it.

02:44.580 --> 02:45.220
All right.

02:45.260 --> 02:48.420
Today is all about data ingest.

02:48.420 --> 02:53.060
And look, in one way, I'm going to dive in a bit deeper and build up some of your skills.

02:53.060 --> 03:00.180
We're going to do some data transformation using a node in n8n called Edit Fields, which used to be called Set.

03:00.300 --> 03:00.580
Uh.

03:00.580 --> 03:04.900
It's such a crucial node, and these are really good core skills.

03:04.900 --> 03:08.140
So I've done well there. In another way,

03:08.140 --> 03:14.820
I'm taking a bit of a shortcut in that the source of data we're going to use is going to be a Google

03:14.820 --> 03:15.460
sheet.

03:15.460 --> 03:17.980
And you know we've done this a few times now.

03:17.980 --> 03:20.060
We've used Google Sheets as our input data a lot.

03:20.060 --> 03:22.460
And you might be thinking, ah, that's a bit boring.

03:22.620 --> 03:24.740
I was originally thinking

03:24.740 --> 03:26.340
it would be pretty cool to have it

03:26.340 --> 03:32.420
so there'd be a Google Drive folder, and every time you drop documents into that Google Drive folder,

03:32.420 --> 03:35.780
it would automatically kick off our data ingest pipeline.

03:35.780 --> 03:37.940
And so that would indeed be cooler.

03:38.020 --> 03:43.620
But it turns out there is a bit of lift associated with that. In particular, there's some credentials

03:43.700 --> 03:44.780
and authentication stuff.

03:44.780 --> 03:49.780
It's a big OAuth 2 thing, which I think would be too much of a distraction right now.

03:49.820 --> 03:51.700
We're going to do all of that next week.

03:51.700 --> 03:54.300
We're going to set it up so that you can do that.

03:54.300 --> 04:00.860
And so you could absolutely then come back and beef this up so that it's just got that extra step that

04:00.860 --> 04:06.430
you don't just update a Google Sheet, but rather you can just drop a document into a Google Drive

04:06.430 --> 04:10.270
folder, and that would automatically kick off your workflow.

04:10.310 --> 04:11.870
And I do think that would be quite cool.

04:11.870 --> 04:16.950
So I would encourage you, after we've built up those skills next week, to come

04:16.950 --> 04:22.070
back and further improve this. But for now, it seemed to me that it was too much of a distraction.

04:22.070 --> 04:27.950
It would be like a one-hour sidebar to get all of that set up, and I would rather we focus on RAG and

04:27.950 --> 04:29.030
get this thing done.

04:29.030 --> 04:33.270
So the shortcut is that our source of data is going to be a Google Sheet, which we already know

04:33.270 --> 04:33.910
how to do.

04:33.910 --> 04:35.390
So that part's going to be easy.

04:35.390 --> 04:40.830
And we're going to be spending our main time focusing on the agentic RAG build-out.

04:40.830 --> 04:46.230
And tomorrow we will also, of course, add in the question answering part of this.

04:46.230 --> 04:51.070
So we'll have done the data ingest today, we'll add the question answering tomorrow, and that will

04:51.070 --> 04:54.510
complete an agentic RAG commercial build.

04:54.830 --> 04:57.790
And for sure we're going to add in a voice agent tomorrow.

04:57.790 --> 04:59.430
And then it's also going to connect with you.

04:59.430 --> 05:06.000
I think it will land as to why I've talked about building the business logic in n8n and then having

05:06.000 --> 05:10.240
ElevenLabs be built as the voice agent that drives the workflow.

05:10.280 --> 05:11.800
I think that's going to make a lot of sense.

05:12.040 --> 05:16.600
We're doing that second approach for how we integrate ElevenLabs with n8n.

05:16.680 --> 05:20.040
The n8n workflow focuses on the business logic.

05:20.400 --> 05:23.240
All right, enough with the talk.

05:23.560 --> 05:24.720
Time for the doing.

05:24.920 --> 05:28.560
Let's go and set up our data ingest.

05:28.600 --> 05:29.080
Okay.

05:29.080 --> 05:33.280
Let me start by introducing you to some data.

05:33.600 --> 05:36.920
The data that is sitting right here.

05:37.120 --> 05:40.320
This is a Google Sheet that your client has shared with you

05:40.440 --> 05:41.520
in our example.

05:41.520 --> 05:48.720
And it's a sheet full of the products that they sell in their online store selling computer accessories.

05:48.920 --> 05:55.960
And you can see the names of the products, their categories, their SKUs, and the price

05:55.960 --> 05:59.120
and what they are: the description.

05:59.360 --> 06:00.800
And there's a bunch of them.

06:00.800 --> 06:02.290
There are, in fact, not that many.

06:02.330 --> 06:06.810
There are 60, because I don't want to do something that's

06:07.050 --> 06:08.970
expensive in any way.

06:09.210 --> 06:11.050
I want to give this as an example.

06:11.050 --> 06:16.450
The point I want to show you is that this is 60, but it could easily be 60,000.

06:16.490 --> 06:19.930
What we're going to do is going to be incredibly scalable.

06:19.930 --> 06:21.210
And that's the point.

06:21.210 --> 06:25.690
And should you wish to generate more data or use your own, I would love that.

06:25.730 --> 06:27.370
That's very much to be encouraged.

06:27.410 --> 06:31.930
We're using Google Sheets for this, as I say, because it's something we've already integrated with.

06:31.930 --> 06:34.210
So it's super easy for us to pull in this data.

06:34.570 --> 06:41.010
Of course, a definitely more advanced mission for you would be to try and

06:41.010 --> 06:46.690
already solve that next-step problem of having it be any document dropped in a shared Google Drive.

06:46.690 --> 06:47.770
That would be really cool.

06:47.810 --> 06:51.130
But it's also great just to imagine you've been given this sheet.

06:51.130 --> 06:52.970
It's the kind of thing that really happens.

06:52.970 --> 06:54.810
And so that's what we're working with.

06:54.850 --> 06:56.530
This is the data we've got.

06:56.530 --> 07:02.420
And we should get used to it, because we're going to be reading it in and analyzing it and doing stuff

07:02.420 --> 07:03.140
with it.

07:03.580 --> 07:05.260
Let's go and get started with that right now.

07:05.380 --> 07:06.020
Okay.

07:06.060 --> 07:08.740
So we're going over to n8n.

07:09.340 --> 07:11.340
And we're going to press the sign in button.

07:11.340 --> 07:13.820
Come into our workspace here.

07:14.460 --> 07:17.420
And here we are in the home screen.

07:17.620 --> 07:21.660
And we're going to create a brand new workflow.

07:22.220 --> 07:22.940
Okay.

07:23.300 --> 07:26.100
So first of all, what's going to trigger this workflow?

07:26.140 --> 07:29.380
Well, there are a lot of interesting things we could do to trigger this workflow.

07:29.420 --> 07:34.420
Like when a sheet changes, or when a file is dropped in the Google Drive. I'm going to keep it quite boring

07:34.420 --> 07:35.620
and trigger it manually.

07:35.660 --> 07:40.300
We're going to have plenty of time to add more interesting triggers in the future, but we're focusing

07:40.300 --> 07:43.860
on RAG today, so we're just going to focus on the data pipeline.

07:43.860 --> 07:44.460
But feel free.

07:44.460 --> 07:48.580
You can make it so that when you send a Slack message or a Telegram message, that's what wakes it up.

07:48.580 --> 07:54.500
Whatever we want. This is going to kick off our data ingest pipeline, load in the data from that

07:54.500 --> 07:58.460
Google Sheet and put it in Supabase in a vector store.

07:58.580 --> 07:59.980
That's what we're building right now.

08:00.020 --> 08:05.940
And at this point I'm going to roll up my sleeves because we're about to get deep into some

08:05.940 --> 08:08.100
data, which I always enjoy doing.

08:08.100 --> 08:09.020
And so should you.

08:09.180 --> 08:13.260
If you're wearing a short-sleeve T-shirt, then you can just imagine it; you can just go through the

08:13.260 --> 08:17.260
motions as we get into our data set.

08:17.260 --> 08:22.340
And we're going to start by having a Google Sheets node in here.

08:22.340 --> 08:25.420
We are going to want to read our rows.

08:25.660 --> 08:28.380
So that is Get row(s) in sheet.

08:28.700 --> 08:29.660
If you remember this.

08:29.660 --> 08:31.380
Well, we're going to get rows.

08:31.380 --> 08:37.620
I'll fill this in so that we can pull in all of the stuff in that product sheet.

08:37.660 --> 08:38.060
Here we go.

08:38.060 --> 08:39.980
So I've selected products from here.

08:39.980 --> 08:42.260
And now, obviously, there is only one sheet.

08:42.260 --> 08:43.220
It's sheet one.

08:43.460 --> 08:46.340
And this all sounds great. Okay.

08:46.380 --> 08:48.820
That is going to get the rows from our Google Sheet.

08:48.820 --> 08:50.620
Let's press Execute Workflow.

08:50.620 --> 08:55.540
And if you've got this manual "When clicking Execute workflow" trigger, you just press

08:55.540 --> 08:56.780
the button and it runs.

08:56.780 --> 08:57.820
And we've just loaded this in.

08:57.820 --> 09:01.380
And so if we double click here you can see that it's run.

09:01.380 --> 09:05.190
And you see here the 60 items; there were indeed 60 items.

09:05.190 --> 09:06.830
And this would be one of them.

09:06.990 --> 09:10.590
The Nova key tactile keyboard, coming in like this.

09:10.790 --> 09:13.550
So, great.

09:13.590 --> 09:15.030
We've brought in our data.

09:15.070 --> 09:15.910
What next?

09:15.950 --> 09:22.470
Well, this, of course, is the extract stage of ETL (extract, transform, load), which means the next

09:22.470 --> 09:24.310
step is transform.

09:24.310 --> 09:29.190
And so we want to just massage this data to be in the format that's going to make most sense for what

09:29.190 --> 09:30.150
we have to do.
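
NOTE
A minimal Python sketch of the three ETL stages just described, offered as an illustration only.
It is not the n8n workflow itself. The column names (Name, Category, SKU, Price, Description),
the sample rows, and the in-memory "store" list are all assumptions made up for this sketch.
```python
from typing import Any, Dict, List
# Hypothetical rows standing in for the client's Google Sheet (invented values).
SAMPLE_ROWS: List[Dict[str, str]] = [
    {"Name": "Example Keyboard", "Category": "Keyboards", "SKU": "KB-001",
     "Price": "49.99", "Description": " A compact keyboard. "},
    {"Name": "Example Mouse", "Category": "Mice", "SKU": "MS-002",
     "Price": "19.50", "Description": "A wireless mouse."},
]
#
def extract() -> List[Dict[str, str]]:
    """Extract: read the raw rows (here, copied from the in-memory sample)."""
    return [dict(row) for row in SAMPLE_ROWS]
#
def transform(rows: List[Dict[str, str]]) -> List[Dict[str, Any]]:
    """Transform: lower-case the keys, trim whitespace, and parse the price."""
    cleaned: List[Dict[str, Any]] = []
    for row in rows:
        item: Dict[str, Any] = {key.lower(): value.strip() for key, value in row.items()}
        item["price"] = float(item["price"])
        cleaned.append(item)
    return cleaned
#
def load(rows: List[Dict[str, Any]], store: List[Dict[str, Any]]) -> int:
    """Load: append the cleaned rows to the destination and report how many were written."""
    store.extend(rows)
    return len(rows)
#
store: List[Dict[str, Any]] = []
count = load(transform(extract()), store)
```
In the real build the destination would be the Supabase vector store rather than a Python list.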

09:30.350 --> 09:33.150
And to do that we need to look up a node.

09:33.150 --> 09:36.750
In the previous version, this used to be called Set.

09:36.950 --> 09:42.230
In the new version it is called Edit Fields, and look, they put Set in brackets afterwards because that's

09:42.230 --> 09:42.910
what it used to be called.

09:42.910 --> 09:44.270
This used to be the Set node.

09:44.310 --> 09:46.990
Now it's the Edit Fields node.

09:47.310 --> 09:52.030
And so what we see here is that on the left we've got the data coming in.

09:52.190 --> 09:58.590
And on the right is whatever we're going to massage it into, which would be the best possible format

09:58.590 --> 09:59.910
for our knowledge base.

09:59.910 --> 10:02.790
And so that mapping is what we're going to do now.
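
NOTE
A hedged Python sketch of the kind of mapping an Edit Fields step performs: incoming row on the
left, reshaped record on the right. The exact output fields used in the course are not shown in
this section, so the "content" and "metadata" fields, the helper name to_knowledge_entry, and the
column names and sample values below are assumptions for illustration.
```python
from typing import Any, Dict
#
def to_knowledge_entry(row: Dict[str, Any]) -> Dict[str, Any]:
    """Map one sheet row to a text block plus metadata for a knowledge base."""
    content = (
        f"Product: {row['Name']}\n"
        f"Category: {row['Category']}\n"
        f"Price: {row['Price']}\n"
        f"Description: {row['Description']}"
    )
    metadata = {"sku": row["SKU"], "category": row["Category"]}
    return {"content": content, "metadata": metadata}
#
# Hypothetical input row (invented values, not taken from the client's sheet).
example_row = {
    "Name": "Example Keyboard",
    "Category": "Keyboards",
    "SKU": "KB-001",
    "Price": 49.99,
    "Description": "A compact keyboard.",
}
entry = to_knowledge_entry(example_row)
print(entry["content"])
```
Putting the readable text in one field and the identifiers in another is one common shape for
records that will later be embedded and searched; other layouts would work as well.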
