WEBVTT

00:00.510 --> 00:03.000
-: We're gonna go a little bit advanced with SAMMO.

00:03.000 --> 00:05.880
We're going to be talking about minibatching and metaprompts

00:05.880 --> 00:08.520
and this is really the way

00:08.520 --> 00:11.250
that people recommend using SAMMO.

00:11.250 --> 00:15.390
And we're gonna pull in here the OpenAIChat,

00:15.390 --> 00:16.380
which we need,

00:16.380 --> 00:19.000
and we're gonna pull in the

00:20.880 --> 00:21.790
DataTable

00:24.690 --> 00:26.887
and we're going to get,

00:28.053 --> 00:29.020
also need

00:33.300 --> 00:34.210
an EvaluationScore

00:42.460 --> 00:43.330
and

00:45.408 --> 00:46.770
os,

00:46.770 --> 00:47.603
pandas,

00:49.470 --> 00:50.303
pd.

00:50.303 --> 00:53.002
Okay, we're going to set up our runner, and this is,

00:53.002 --> 00:55.350
you know, there's no difference to how we do this.

00:55.350 --> 00:57.690
This is typical for SAMMO,

00:57.690 --> 01:01.530
but we're also gonna define a function which loads the data,

01:01.530 --> 01:04.140
because we're gonna pull our data from a spreadsheet.

01:04.140 --> 01:07.110
This is, more likely, where you're gonna have your data.

01:07.110 --> 01:10.440
In my experience, this is what I usually have,

01:10.440 --> 01:13.380
some sort of CSV that I've gotten from the team

01:13.380 --> 01:16.080
with all the inputs and outputs, the examples,

01:16.080 --> 01:18.780
and then you're gonna pull that data back

01:18.780 --> 01:22.620
into a DataTable, and that puts it into the SAMMO format.

01:22.620 --> 01:26.130
I'm gonna say, oh, sorry, it's from_pandas,

01:26.130 --> 01:27.683
and then there are a few different things we need to do.

01:27.683 --> 01:30.800
One is, we need to add the input_fields,

01:30.800 --> 01:33.150
in this case, it's the "description".

01:33.150 --> 01:35.010
We're going to pull in...

01:36.930 --> 01:38.590
Actually, this doesn't need to be

01:41.190 --> 01:42.090
a list,

01:42.090 --> 01:45.000
can be a string, 'cause in this case we just have one,

01:45.000 --> 01:47.430
and then we're gonna put in the "classification",

01:47.430 --> 01:48.263
and then

01:50.112 --> 01:52.830
the other thing we're gonna add is the constants.

01:52.830 --> 01:55.170
So, and this is like stuff

01:55.170 --> 01:57.900
that you can pull into your prompt afterwards,

01:57.900 --> 01:59.310
that will always be the same

01:59.310 --> 02:01.020
for every row in your DataTable.

02:01.020 --> 02:03.660
So, in this case, we're gonna use it

02:03.660 --> 02:05.910
to put our "instructions" for our prompt,

02:05.910 --> 02:09.717
'Determine how to classify these transactions',

02:11.970 --> 02:16.020
Okay, and this is gonna make more sense in a minute.

02:16.020 --> 02:18.393
Let me see, return mydata,

02:19.229 --> 02:21.660
so, that's how we're gonna get our data,

02:21.660 --> 02:23.190
and then we just need one more function,

02:23.190 --> 02:24.900
which we're gonna use, and then we'll dive into it.
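
NOTE
A minimal sketch of the loading step described above, assuming only the Python standard library. It does not use pandas or SAMMO's DataTable. The names load_rows and CSV_TEXT and the dictionary layout are hypothetical, chosen only to show the idea of one input column, one output column, and constants shared by every row. The three sample rows reuse transactions and labels that are read out later in the video.
```python
import csv
import io
# Hypothetical stand-in for the spreadsheet file.
CSV_TEXT = """description,classification
insurance claim refund,Other
bought coffee and snacks from cafe,Food
pay for streaming service,Entertainment
"""
def load_rows(csv_text, input_field="description", output_field="classification"):
    """Split each CSV row into an input and an output, plus constants shared by all rows."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    return {
        "inputs": [row[input_field] for row in rows],
        "outputs": [row[output_field] for row in rows],
        # Constants are the same for every row, so they are stored once.
        "constants": {"instructions": "Determine how to classify these transactions"},
    }
mydata = load_rows(CSV_TEXT)
print(mydata["inputs"][1], "|", mydata["outputs"][1])
```
Keeping the instructions in the constants means the prompt text can change without touching the rows.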

02:24.900 --> 02:27.840
So, we need an evaluation metric,

02:27.840 --> 02:30.060
which we're gonna call accuracy

02:30.060 --> 02:31.990
and we take in two different things.

02:31.990 --> 02:36.990
We're gonna take in y_true, which is gonna be a DataTable,

02:38.390 --> 02:39.810
then y_pred is gonna be a DataTable,

02:39.810 --> 02:43.800
and then we're gonna pass out an EvaluationScore at the end.

02:43.800 --> 02:45.600
And the way that will work,

02:45.600 --> 02:50.487
is we're gonna say y_true = y_true.outputs.values

02:57.150 --> 02:59.823
and then same for y_pred.

02:59.823 --> 03:00.810
So, we're just gonna get the values

03:00.810 --> 03:03.120
and then we're gonna do this EvaluationScore,

03:03.120 --> 03:06.150
well actually, we need to say how many of them are correct.

03:06.150 --> 03:09.993
So, we're gonna say, and in this case we'll calculate,

03:11.280 --> 03:15.390
and that would be n_correct, let's call it that.

03:15.390 --> 03:18.670
So, we need to sum up all the situations where

03:24.053 --> 03:26.890
y_p == y_t for y_p,

03:34.958 --> 03:35.791
y_t

03:39.030 --> 03:42.780
in a zip of the two things.

03:42.780 --> 03:44.430
So, if that doesn't make any sense,

03:44.430 --> 03:45.900
I'll just explain it again.

03:45.900 --> 03:49.298
We're gonna pull in the y_pred and y_true,

03:49.298 --> 03:52.500
then we're going to zip that together, so they're in tuples,

03:52.500 --> 03:55.340
and then we're gonna grab the values of those tuples,

03:55.340 --> 03:58.080
y_p and y_t and we're just gonna check if they're the same.

03:58.080 --> 04:00.510
So, simple, kind of output here.

04:00.510 --> 04:03.540
And then we're just gonna return the EvaluationScore,

04:03.540 --> 04:04.373
which is just

04:05.952 --> 04:10.257
n_correct / len(y_true),

04:11.190 --> 04:12.963
so that's how many are correct.
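
NOTE
A minimal sketch of the accuracy calculation described above, assuming plain Python lists of labels instead of DataTable objects and returning a bare float instead of an EvaluationScore. The length check and the empty-list guard are additions for illustration and are not mentioned in the video.
```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that exactly match the expected label."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        return 0.0
    # zip pairs each prediction with its label; True counts as 1 when summed.
    n_correct = sum(y_p == y_t for y_p, y_t in zip(y_pred, y_true))
    return n_correct / len(y_true)
score = accuracy(["Rent", "Food", "Other", "Food"], ["Rent", "Food", "Food", "Food"])
print(score)
```
Three of the four made-up predictions match, so the score is 0.75.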

04:15.120 --> 04:15.953
Okay,

04:19.230 --> 04:20.430
great stuff.

04:20.430 --> 04:21.720
So, these are the two functions.

04:21.720 --> 04:23.220
That's what we need,

04:23.220 --> 04:26.313
and we've got everything we need now to do this.

04:27.683 --> 04:29.583
load_data, let's see what the data is.

04:31.860 --> 04:36.307
So you can see here, this is a bunch of transactions,

04:37.230 --> 04:39.060
cash deposit, withdrew money,

04:39.060 --> 04:41.790
insurance claim refund, whatever,

04:41.790 --> 04:43.020
and then some classifications.

04:43.020 --> 04:47.100
So this is what we want to call those transactions.

04:47.100 --> 04:49.800
We also have our instruction from the DataTable,

04:49.800 --> 04:52.860
so we're gonna use those in the prompt.

04:52.860 --> 04:54.420
All right, so how do we set up our prompts?

04:54.420 --> 04:56.520
So, this is how to do it manually.

04:56.520 --> 04:59.130
This is the standard way.

04:59.130 --> 05:01.530
Then I'm gonna show you the metaprompt way,

05:01.530 --> 05:05.100
which I think is much more powerful and easier to read.

05:05.100 --> 05:07.920
Don't worry if you find this a little bit hard to follow.

05:07.920 --> 05:09.700
We're gonna import Output

05:10.620 --> 05:13.327
and then we're gonna put the GenerateText.

05:15.260 --> 05:19.080
We're gonna get the extractor, the ExtractRegex.

05:19.080 --> 05:21.120
That's what we're going to use for this.

05:21.120 --> 05:22.500
And then we're gonna import Template,

05:22.500 --> 05:25.023
and we need to set up our prompt,

05:27.330 --> 05:28.950
prompt,

05:28.950 --> 05:31.830
and that's gonna just be the GenerateText thing,

05:31.830 --> 05:33.843
and it's gonna be a template inside,

05:35.160 --> 05:38.373
and I'm just gonna copy this template across.

05:42.180 --> 05:44.310
So, what do we have in our template?

05:44.310 --> 05:46.140
We have two things.

05:46.140 --> 05:47.210
We have...

05:50.550 --> 05:51.990
We have the instructions.

05:51.990 --> 05:54.570
We're gonna get the constants.instructions

05:54.570 --> 05:56.370
that comes from the DataTable,

05:56.370 --> 05:59.250
and then we've told it what output labels we want.

05:59.250 --> 06:01.110
These are the different classifications it could be.

06:01.110 --> 06:04.620
It could be Rent, Other, Food, Entertainment, Utilities.

06:04.620 --> 06:09.620
And then we've just added in this inputs here.

06:09.750 --> 06:13.050
This is basically iterating for each input.

06:13.050 --> 06:14.760
It's gonna create a prompt,

06:14.760 --> 06:19.760
and then it's gonna take that value and just place it.

06:19.830 --> 06:21.990
We're gonna say input and then the value.

06:21.990 --> 06:24.990
So, you know, if we iterate through here.

06:24.990 --> 06:28.230
It's gonna have like, cash deposit at local branch,

06:28.230 --> 06:29.370
withdrew money for rent payment,

06:29.370 --> 06:32.520
withdrew cash for weekend expenses, pull all those through

06:32.520 --> 06:34.530
and it's gonna iterate through them with this each.

06:34.530 --> 06:36.900
This is like a foreach function,

06:36.900 --> 06:39.330
but the cool thing it's gonna do is,

06:39.330 --> 06:40.620
'cause we set it up in this way,

06:40.620 --> 06:42.630
we can do a batch of them at the same time.

06:42.630 --> 06:44.670
We're not just doing one at a time.

06:44.670 --> 06:47.793
It's actually doing, you know, say 5 or 10 at a time.
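
NOTE
A minimal sketch of what the template does for one batch, assuming plain Python string building instead of SAMMO's Template and GenerateText. The function name render_prompt and the exact line layout are hypothetical; the label list and the instructions text are the ones mentioned in the video.
```python
LABELS = ["Rent", "Other", "Food", "Entertainment", "Utilities"]
def render_prompt(instructions, inputs):
    """Build one prompt that carries every input in the batch."""
    lines = [instructions, "Output labels: " + ", ".join(LABELS)]
    # One "Input: <value>" line per row, like the each-loop in the template.
    lines += [f"Input: {value}" for value in inputs]
    return "\n".join(lines)
prompt_text = render_prompt(
    "Determine how to classify these transactions",
    ["cash deposit at local branch", "withdrew money for rent payment"],
)
print(prompt_text)
```
Because the loop sits inside the prompt, several rows share one request instead of needing one request each.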

06:48.660 --> 06:53.660
Let's just get our data.

06:56.160 --> 06:58.360
We're just gonna take a sample of that data,

07:03.300 --> 07:04.133
10 different

07:05.280 --> 07:08.490
and seed here, so you get the same response hopefully,

07:08.490 --> 07:10.990
and then we're gonna say, set the labeling_output

07:13.470 --> 07:16.583
and to do that, we just need to ExtractRegex.

07:16.583 --> 07:18.870
So we're gonna pass in the labeling_prompt,

07:18.870 --> 07:22.230
and then we need a RegEx, which we'll grab,

07:22.230 --> 07:25.860
with the different things that we've identified.

07:25.860 --> 07:27.213
So, in this case,

07:29.640 --> 07:30.870
it should be this.

07:30.870 --> 07:32.010
And that's the RegEx

07:32.010 --> 07:35.580
that can pull out the different responses here.
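
NOTE
A minimal sketch of label extraction with a regular expression, assuming Python's re module instead of SAMMO's ExtractRegex. The pattern used in the video is not shown in the transcript, so LABEL_PATTERN here is an assumption built from the label list, and the response string is made up.
```python
import re
LABELS = ["Rent", "Other", "Food", "Entertainment", "Utilities"]
# Alternation of the escaped labels, with word boundaries so partial words do not match.
LABEL_PATTERN = re.compile(r"\b(" + "|".join(map(re.escape, LABELS)) + r")\b")
def extract_labels(response):
    """Return every known label in the order it appears in the response."""
    return LABEL_PATTERN.findall(response)
# Hypothetical model response; real responses will vary in layout.
response = "1. Output: Rent\n2) Food\n- Answer: Other"
print(extract_labels(response))
```
Matching on the labels themselves, not on the surrounding layout, is what lets the extraction survive a response that is not formatted consistently.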

07:35.580 --> 07:36.850
And then, I'm gonna say

07:38.276 --> 07:40.193
minibatch_size=10.

07:44.892 --> 07:48.843
So, it's gonna do one batch of 10 all at the same time,

07:50.460 --> 07:52.503
and put that into result,

07:54.434 --> 07:57.657
and result looks like that.

08:03.430 --> 08:05.253
I put this in the wrong place,

08:06.210 --> 08:08.603
put it in the RegEx, it should be there.

08:16.110 --> 08:16.943
Here we go.

08:16.943 --> 08:21.440
So we've output here, the actual response from that batch

08:23.760 --> 08:25.710
and we've only run one batch here.

08:25.710 --> 08:28.710
So if we said we're gonna do a minibatch_size of one

08:28.710 --> 08:31.230
that will do 10 API calls

08:31.230 --> 08:33.090
and it's gonna run them all at the same time,

08:33.090 --> 08:36.060
or if we wanted to do all 10 at the same time,

08:36.060 --> 08:36.960
we can do that.

08:36.960 --> 08:38.940
This is a really powerful way

08:38.940 --> 08:42.240
of, basically, iterating through a lot of data

08:42.240 --> 08:44.130
in a short amount of time.

08:44.130 --> 08:46.017
So, that's minibatching

08:46.017 --> 08:49.500
and that's a really powerful way to do it, right?
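
NOTE
A minimal sketch of how the minibatch size relates to the number of requests, assuming a plain list of rows. The helper name minibatches is hypothetical, and this is not how SAMMO implements batching internally; it only shows the arithmetic.
```python
def minibatches(rows, minibatch_size):
    """Split rows into consecutive chunks of at most minibatch_size items."""
    if minibatch_size < 1:
        raise ValueError("minibatch_size must be at least 1")
    return [rows[i:i + minibatch_size] for i in range(0, len(rows), minibatch_size)]
sample = [f"transaction {n}" for n in range(10)]
for size in (1, 5, 10):
    # Each chunk would become one request to the model.
    print(size, len(minibatches(sample, size)))
```
For a sample of 10 rows, a size of 1 gives 10 chunks, a size of 5 gives 2, and a size of 10 gives 1.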

08:49.500 --> 08:53.250
And then we can also figure out the accuracy as well,

08:53.250 --> 08:55.020
see if it's...

08:55.020 --> 08:58.290
So let's check, see the sample result.

08:58.290 --> 08:59.940
Yeah, it's 100% accuracy,

08:59.940 --> 09:02.370
because we're passing the sample, passing the result,

09:02.370 --> 09:03.353
so that's really good.

09:03.353 --> 09:06.150
But we can scale this up as well if we wanted to,

09:06.150 --> 09:07.650
now that we have this program.

09:07.650 --> 09:09.120
We could increase the sample.

09:09.120 --> 09:11.973
We could change the minibatch_size.

09:11.973 --> 09:14.820
Now, let's look at the prompt itself.

09:14.820 --> 09:18.720
So, we wanna just call the plot trace on this.

09:18.720 --> 09:20.730
We can see what's going on.

09:20.730 --> 09:22.470
Right, here we've got the instructions.

09:22.470 --> 09:24.330
They're coming into that prompt.

09:24.330 --> 09:26.880
We already had the output labels in the prompt

09:26.880 --> 09:29.790
and then we have the input, cash withdrawal for a holiday,

09:29.790 --> 09:33.090
mortgage payment, ATM withdrawal, cash.

09:33.090 --> 09:35.970
So we've passed in all of these things in the minibatch

09:35.970 --> 09:38.343
and then it's given the output here,

09:39.518 --> 09:41.850
and it's come out in a non-standard structure,

09:41.850 --> 09:44.130
but that's fine 'cause our RegEx could handle that,

09:44.130 --> 09:47.040
and it's pulled through all these different answers,

09:47.040 --> 09:51.330
which is really great, and that's how it works.

09:51.330 --> 09:54.540
Alright, so that is how you would do it manually,

09:54.540 --> 09:57.180
but there's a much quicker and easier way to do this

09:57.180 --> 10:01.170
through metaprompting, which I find much more useful.

10:01.170 --> 10:02.820
This is essentially the same thing.

10:02.820 --> 10:04.950
I'm just gonna copy and paste this across,

10:04.950 --> 10:06.720
but there are a few things that are different.

10:06.720 --> 10:11.190
One is, we're setting up this MetaPrompt structure.

10:11.190 --> 10:15.030
We get that in from the instructions.

10:15.030 --> 10:15.863
Great.

10:15.863 --> 10:17.640
So, from sammo.instructions we're getting MetaPrompt,

10:17.640 --> 10:20.280
Section, Paragraph, InputData, FewshotExamples.

10:20.280 --> 10:23.370
There's a bunch of different helper functions.

10:23.370 --> 10:25.080
There's a few things that are interesting.

10:25.080 --> 10:26.647
One is, we have this instructions section,

10:26.647 --> 10:29.340
and we're just getting that from the constants,

10:29.340 --> 10:30.630
so that's not different,

10:30.630 --> 10:33.330
but here we're adding in some FewshotExamples

10:33.330 --> 10:36.030
just to show you how quickly you can do this.

10:36.030 --> 10:39.780
It actually just pulls from the sample, from the data,

10:39.780 --> 10:42.510
and it pulls in these examples, which is really helpful

10:42.510 --> 10:45.240
because it's already formatted in the right way

10:45.240 --> 10:46.590
for that dataframe.

10:46.590 --> 10:49.980
We're also adding Paragraph to show the Output labels

10:49.980 --> 10:53.340
and then making a placeholder for the InputData.

10:53.340 --> 10:55.590
We can render it as "markdown"

10:55.590 --> 10:57.636
and then we're adding a data_formatter,

10:57.636 --> 10:59.520
which is gonna do the RegEx for us.

10:59.520 --> 11:02.010
We don't have to know how to use RegEx, which is nice.

11:02.010 --> 11:04.950
We just put in our list of the different things it could be,

11:04.950 --> 11:07.830
and then we can also say what happens if there's a failure.

11:07.830 --> 11:09.360
We should get an "empty_result".
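
NOTE
A minimal sketch of the idea behind a prompt assembled from named sections, assuming plain Python tuples and string joining instead of SAMMO's MetaPrompt, Section, Paragraph, FewshotExamples and InputData. The function render_markdown and the section titles are hypothetical.
```python
def render_markdown(sections):
    """Render (title, body) pairs as markdown sections, skipping empty bodies."""
    parts = [f"# {title}\n{body}" for title, body in sections if body]
    return "\n\n".join(parts)
sections = [
    ("Instructions", "Determine how to classify these transactions"),
    ("Examples", "Input: insurance claim refund\nOutput: Other"),
    ("Output labels", "Rent, Other, Food, Entertainment, Utilities"),
    ("Inputs", "Input: bought coffee and snacks from cafe"),
]
with_examples = render_markdown(sections)
# Dropping a section is a one-line change to the list.
without_examples = render_markdown([s for s in sections if s[0] != "Examples"])
print(without_examples)
```
Treating the prompt as a list of sections is what makes it easy to add, remove or reorder parts.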

11:09.360 --> 11:11.370
So, what does this look like?

11:11.370 --> 11:14.490
If we just look at the program,

11:14.490 --> 11:16.740
it's, basically, exactly the same,

11:16.740 --> 11:18.810
but it's just pulling all these different things together

11:18.810 --> 11:20.250
into a MetaPrompt,

11:20.250 --> 11:23.820
and it has the FewshotExamples as well,

11:23.820 --> 11:26.100
which we're gonna see in a second.

11:26.100 --> 11:28.613
Alright, so let's run it and just make sure we get the same.

11:30.420 --> 11:32.433
To run it, we're gonna say result,

11:37.320 --> 11:39.980
and I'm gonna paste and put in...

11:41.088 --> 11:42.633
Let's just do five this time.

11:43.735 --> 11:47.013
Let's say, on_error="empty_result".

11:47.013 --> 11:51.693
Let's say run(runner, sample), not the full data.

11:54.360 --> 11:55.890
And then, so that's done,

11:55.890 --> 11:58.563
we're just gonna check what's the result look like.

12:02.070 --> 12:03.090
Here we go.

12:03.090 --> 12:05.040
So, it worked exactly the same.

12:05.040 --> 12:06.480
In this case it did two batches,

12:06.480 --> 12:09.120
'cause we said minibatch_size=5,

12:09.120 --> 12:11.880
and here's just an example of how it worked.

12:11.880 --> 12:15.750
So, we have exactly the same prompt going through,

12:15.750 --> 12:18.720
except we added these FewshotExamples,

12:18.720 --> 12:21.210
but it's a much cleaner way of looking at it, right?

12:21.210 --> 12:25.380
Compare that to the way this is.

12:25.380 --> 12:26.940
It's very complicated to me.

12:26.940 --> 12:29.670
I'm like, it's got lots of variables in there,

12:29.670 --> 12:31.680
and it's like pretty hard for me to parse,

12:31.680 --> 12:33.360
whereas the nice thing about this is,

12:33.360 --> 12:35.250
I'm like, I can edit this so easy, right?

12:35.250 --> 12:37.380
It's just a list of different sections.

12:37.380 --> 12:39.750
I got the Instructions Section, I got the Examples,

12:39.750 --> 12:41.580
I got the Output labels,

12:41.580 --> 12:44.193
and then this is where the input stuff's gonna go.

12:45.060 --> 12:48.750
So, that's much nicer and easier in my experience,

12:48.750 --> 12:51.530
and this is how I tend to use SAMMO

12:51.530 --> 12:53.400
to string together these examples.

12:53.400 --> 12:54.277
And then that way, if you're like,

12:54.277 --> 12:57.330
"Oh, I don't want the FewshotExamples."

12:57.330 --> 12:59.670
Let's see how that works, now.

12:59.670 --> 13:01.710
It's just gonna be the same,

13:01.710 --> 13:03.900
but it's gonna be a slightly different prompt, right,

13:03.900 --> 13:07.047
and then, we're gonna... get an error in this case (laughs),

13:09.180 --> 13:10.623
but I put those back in.

13:14.880 --> 13:16.110
Now it's working.

13:16.110 --> 13:19.106
But you can quickly swap different instructions in.

13:19.106 --> 13:20.707
You could say something like,

13:20.707 --> 13:25.707
"Make sure you use the right labels to the transactions."

13:26.340 --> 13:28.290
So you quickly change the prompt,

13:28.290 --> 13:29.640
and then it's gonna run again,

13:29.640 --> 13:32.040
and you can see if it's done the right thing.

13:32.040 --> 13:33.993
So, that's the benefit of this.

13:35.820 --> 13:37.920
You can just swap these sections if you need to.

13:37.920 --> 13:41.610
And then one more thing, which is quite helpful,

13:41.610 --> 13:42.570
you may not realize,

13:42.570 --> 13:45.902
and I find this is also useful for observability.

13:45.902 --> 13:50.610
In DSPy, I find it quite hard to see what was actually up,

13:50.610 --> 13:54.720
what were the LLM calls, different requests and things

13:54.720 --> 13:55.680
that were made,

13:55.680 --> 13:59.460
but it's much easier to access them with this.

13:59.460 --> 14:02.280
Here, I just put into the result,

14:02.280 --> 14:05.430
and then I've got the, you know, llm_requests.

14:05.430 --> 14:08.940
So, if you just go into that list you can see,

14:08.940 --> 14:11.310
okay, here's what it's actually compiled into.

14:11.310 --> 14:13.020
So, we can see we've got the examples,

14:13.020 --> 14:15.267
see the FewshotExamples that were in there,

14:15.267 --> 14:16.740
got the output labels,

14:16.740 --> 14:18.720
and then these are the inputs

14:18.720 --> 14:21.330
that we want it to do something with.

14:21.330 --> 14:24.750
So, I find that's like much cleaner to be able to see.

14:24.750 --> 14:26.850
The other thing we can do, which is quite nice,

14:26.850 --> 14:30.120
is we can change the prompt relatively easily,

14:30.120 --> 14:32.523
modified_mprompt in this case.

14:33.810 --> 14:36.033
We just take the mprompt and we clone it,

14:37.260 --> 14:40.200
and then we have to use this function called rebind,

14:40.200 --> 14:44.460
but what that's gonna do is take the exact same prompt

14:44.460 --> 14:46.310
and then just change one aspect of it.

14:46.310 --> 14:49.972
So we're gonna copy in, change this "data_formatter"

14:49.972 --> 14:52.539
to a JSON data format,

14:52.539 --> 14:55.907
and then we'll run it exactly the same.

14:55.907 --> 14:59.040
We can see that the results are the same,

14:59.040 --> 15:02.973
but when you actually look at the compiled prompt,

15:04.860 --> 15:05.910
it's JSON.

15:05.910 --> 15:08.100
It gave the JSON examples instead

15:08.100 --> 15:11.100
and it's passing in the JSON.
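
NOTE
A minimal sketch of swapping only the data formatter while the rest of the prompt stays the same, assuming plain Python functions instead of SAMMO's clone and rebind. The names format_markdown, format_json and build_prompt, and the JSON layout with id and input keys, are hypothetical.
```python
import json
def format_markdown(rows):
    """One 'Input: ...' line per row."""
    return "\n".join(f"Input: {row}" for row in rows)
def format_json(rows):
    """The same rows as a JSON list of objects."""
    return json.dumps([{"id": i, "input": row} for i, row in enumerate(rows)], indent=2)
def build_prompt(instructions, rows, formatter=format_markdown):
    """Only the formatter changes between variants; the instructions stay put."""
    return instructions + "\n" + formatter(rows)
rows = ["mortgage payment", "ATM withdrawal"]
original = build_prompt("Determine how to classify these transactions", rows)
modified = build_prompt("Determine how to classify these transactions", rows, formatter=format_json)
print(modified)
```
Both variants carry the same rows, so their results can be compared directly.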

15:11.100 --> 15:13.890
This is really nice to swap things in and out.

15:13.890 --> 15:16.140
You could also swap models in and out as well,

15:16.140 --> 15:17.880
which is really helpful.

15:17.880 --> 15:20.070
One more thing I wanted to show you here

15:20.070 --> 15:22.230
is that you can actually do RAG as well.

15:22.230 --> 15:25.470
So they have support for local RAG.

15:25.470 --> 15:28.050
I haven't used it in production. Take it with a grain of salt,

15:28.050 --> 15:29.810
but for a lot of experiments

15:29.810 --> 15:31.523
and things you're running locally,

15:31.523 --> 15:33.660
I think it's quite helpful.

15:33.660 --> 15:35.973
So, you just need to make an embedder,

15:37.140 --> 15:40.830
and that's called the OpenAIEmbedding model,

15:40.830 --> 15:41.763
and then you just pass in,

15:41.763 --> 15:44.010
this is just like creating a runner,

15:44.010 --> 15:46.110
pass the cache file, etc.

15:46.110 --> 15:47.710
I'm just gonna set a rate_limit.

15:50.610 --> 15:53.027
The rate_limit's gonna be 10.

15:54.589 --> 15:57.333
That's all I need, I think, for the embedder.

16:01.800 --> 16:03.430
Actually, let's call this

16:08.104 --> 16:09.521
"embeddings.tsv"

16:09.521 --> 16:11.610
and let's call this "EMBEDDING_FILE".

16:11.610 --> 16:14.550
So we do some conflict prevention, cool.

16:14.550 --> 16:16.470
Now it's gonna do, you know, the same thing.

16:16.470 --> 16:18.750
It will get like the embeddings.

16:18.750 --> 16:20.910
Instead of calling the OpenAI generative model,

16:20.910 --> 16:22.830
it's gonna call the text embedding model.

16:22.830 --> 16:25.923
It's gonna get the embeddings for your different things.

16:29.113 --> 16:30.300
Let's look at what data we have,

16:30.300 --> 16:32.577
just gonna find what's the len(mydata),

16:34.218 --> 16:38.460
and split it into fewshot and training.

16:38.460 --> 16:41.520
Here, we just split it into training data,

16:41.520 --> 16:43.500
which we're gonna use to kind of test against,

16:43.500 --> 16:46.110
and then fewshot data, which is the examples

16:46.110 --> 16:49.350
that we can pull into the prompt dynamically.

16:49.350 --> 16:51.750
Before we had these FewshotExamples,

16:51.750 --> 16:54.270
but we're not sure whether we're pulling the right ones in.

16:54.270 --> 16:55.670
Well, we then can actually do that.

16:55.670 --> 16:58.380
We can make that happen with embeddings.

16:58.380 --> 17:01.883
So, the way that we sort our data there,

17:02.943 --> 17:05.673
just if we wanna see what that data looks like.

17:06.810 --> 17:09.900
So, let's just slice that dataframe,

17:09.900 --> 17:11.710
but then we're gonna pull in the

17:15.255 --> 17:17.338
EmbeddingFewshotExamples,

17:18.270 --> 17:20.110
sorry, FewshotExamples,

17:20.970 --> 17:23.430
and instead of using the FewshotExamples

17:23.430 --> 17:26.250
that we had before, we're gonna swap that out.

17:26.250 --> 17:29.163
If we just grab our prompt from before.

17:32.910 --> 17:34.673
This is our MetaPrompt.

17:40.543 --> 17:43.290
So, we can just get that,

17:43.290 --> 17:45.930
but then instead of this FewshotExamples,

17:45.930 --> 17:50.433
instead we're gonna call that EmbeddingFewshotExamples,

17:51.270 --> 17:53.760
and what do we need to pass in here?

17:53.760 --> 17:57.570
We need to pass in the embedder first,

17:57.570 --> 18:00.600
and then the fewshot data,

18:00.600 --> 18:04.350
and then the examples,

18:04.350 --> 18:05.183
and then

18:06.840 --> 18:11.520
budget="relative",

18:11.520 --> 18:14.370
and with these parameters what it's gonna do

18:14.370 --> 18:16.260
is it's gonna take the embedder,

18:16.260 --> 18:17.400
it's gonna find all the embeddings

18:17.400 --> 18:18.990
for everything in the dataset,

18:18.990 --> 18:22.260
and then it's gonna pull the three most relevant examples

18:22.260 --> 18:23.760
for these inputs.

18:23.760 --> 18:26.850
So, chances are you actually get the same transaction,

18:26.850 --> 18:29.850
that's already been classified, which is really helpful.
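
NOTE
A minimal sketch of picking the most similar labelled examples for an input, assuming a toy word-count embedding and cosine similarity from the standard library. It stands in for the idea only: it does not call an embedding model, and embed, cosine and nearest_examples are hypothetical names, not the SAMMO API. The labelled rows are the ones read out in the video.
```python
import math
from collections import Counter
def embed(text):
    """Toy embedding: word counts. A real embedder would call an embedding model."""
    return Counter(text.lower().split())
def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[word] * b[word] for word in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0
def nearest_examples(query, fewshot, n_examples=3):
    """Return the n_examples labelled rows most similar to the query."""
    q = embed(query)
    ranked = sorted(fewshot, key=lambda row: cosine(q, embed(row[0])), reverse=True)
    return ranked[:n_examples]
fewshot = [
    ("insurance claim refund", "Other"),
    ("bought coffee and snacks from cafe", "Food"),
    ("pay for streaming service", "Entertainment"),
]
best = nearest_examples("bought coffee and snacks", fewshot, n_examples=1)
print(best)
```
An input that closely matches an already labelled row brings that row and its label into the prompt as an example.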

18:29.850 --> 18:33.270
So, that's what this does.

18:33.270 --> 18:35.477
And then we're just gonna call this,

18:36.750 --> 18:38.550
instead of mprompt_parsed,

18:38.550 --> 18:41.845
we're gonna call this mprompt_rag_parsed,

18:41.845 --> 18:45.480
and rag as well, just to separate those two things.

18:45.480 --> 18:48.600
and if this works then we're gonna see

18:48.600 --> 18:51.033
that it's pulling in different examples here.

18:55.290 --> 18:58.620
So, that's run exactly the same as we did before,

18:58.620 --> 18:59.973
but it's got,

19:00.870 --> 19:02.240
plot the thing,

19:02.240 --> 19:04.320
the EmbeddingFewshotExamples.

19:04.320 --> 19:08.310
You can see it's pulled in insurance claim refund, Other,

19:08.310 --> 19:11.160
bought coffee and snacks from cafe, food,

19:11.160 --> 19:13.320
pay for streaming service, entertainment.

19:13.320 --> 19:16.440
So, it's pulled those examples in, and that's good,

19:16.440 --> 19:18.990
because the input examples are insurance claim refund,

19:18.990 --> 19:20.610
bought coffee and snacks.

19:20.610 --> 19:25.350
So, it actually found a lot of the same transactions

19:25.350 --> 19:27.330
from the RAG database.

19:27.330 --> 19:29.670
So, the smart thing here is,

19:29.670 --> 19:32.190
that as you classify things correctly,

19:32.190 --> 19:34.650
it's gonna build a bigger RAG database,

19:34.650 --> 19:36.180
that you can pull from.

19:36.180 --> 19:37.590
So, the chances are,

19:37.590 --> 19:40.170
if you have a lot of corrected examples,

19:40.170 --> 19:41.490
you're pulling in these.

19:41.490 --> 19:45.420
Your FewshotExamples will match the inputs perfectly,

19:45.420 --> 19:46.410
which is really great,

19:46.410 --> 19:49.260
and it means that it's very unlikely to do something wrong.

19:49.260 --> 19:51.570
So, that's a really powerful thing.

19:51.570 --> 19:53.010
I haven't explored it a lot,

19:53.010 --> 19:55.290
but to be able to just do RAG locally like that,

19:55.290 --> 19:57.720
without having to mess around with FAISS, or Chroma,

19:57.720 --> 19:59.373
or whatever, is really cool.
