WEBVTT

00:01.400 --> 00:02.800
Hey there, Eden here.

00:02.800 --> 00:06.920
And in this short video, we're going to be implementing the generation node.

00:06.920 --> 00:11.840
And the generation node is going to be the last node that is going to be executed.

00:12.320 --> 00:18.600
We execute this node after we've already retrieved the information, the relevant documents, after we've filtered

00:18.600 --> 00:25.520
out the documents that were not relevant to our query, and even performed a search for the question

00:25.520 --> 00:26.800
that we want to answer.

00:27.200 --> 00:31.920
So after we have all the documents, we can augment the original query.

00:32.200 --> 00:34.000
And now it's time to generate.

00:34.000 --> 00:39.480
So now it's time to simply stuff everything into the prompt and send it to the LLM to answer it.

00:39.880 --> 00:43.280
So this is what we're going to be implementing in this video.

00:43.320 --> 00:46.360
We're going to leverage the RAG prompt from the LangChain Hub.

00:46.560 --> 00:51.480
We're going to define a generation chain and create a node that will call this chain.

00:51.480 --> 00:53.480
And of course we're going to be writing tests.

00:54.720 --> 00:56.480
All right let's go to the code.

00:56.680 --> 01:00.440
And we want to create a new file under our chains module.

01:00.600 --> 01:02.270
And we'll call it generation.

01:03.230 --> 01:05.550
And here we'll start with the imports.

01:05.550 --> 01:10.710
We want to import hub from LangChain because we're going to download the prompt from it.

01:10.950 --> 01:19.150
And we also want to import the StrOutputParser, which is simply going to take our message.

01:19.150 --> 01:22.270
And it's going to get the content from it and turn it into a string.

01:23.590 --> 01:26.670
And we also want to import ChatOpenAI.

01:27.150 --> 01:30.150
And let's create an LLM instance from it.

01:30.590 --> 01:35.590
And now we want to pull the prompt from the LangChain Hub.

01:36.030 --> 01:38.830
And this is a very standard RAG prompt

01:38.830 --> 01:46.670
that Lance Martin from the LangChain team wrote, giving the LLM the role of an assistant for question

01:46.670 --> 01:47.350
answering.

01:47.710 --> 01:53.990
plugging in the context, which is going to be all the documents from retrieval or web search that we saw

01:54.030 --> 01:57.470
from the earlier stages, and of course, the original question.

01:57.750 --> 02:00.030
So a very standard RAG prompt.

02:00.030 --> 02:01.630
So we're going to be using that.

02:02.220 --> 02:04.260
Our chain is going to be standard as well.

02:04.300 --> 02:11.100
We're going to create a generation chain where we pipe the prompt into the LLM, and then we pipe the

02:11.100 --> 02:13.300
result into the StrOutputParser.

02:13.820 --> 02:20.740
So once we invoke this chain with the documents and the question, we're supposed to get the answer

02:20.740 --> 02:21.460
that we want.
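
NOTE
Editor's sketch of the flow just described: prompt, then model, then string parsing. It is dependency-free so it runs without an API key. format_prompt, fake_llm, parse_to_str and the Message class are hypothetical stand-ins, not LangChain APIs; the chain in the video pipes the hub prompt into ChatOpenAI and then into StrOutputParser.
```python
from dataclasses import dataclass
@dataclass
class Message:
    # Hypothetical stand-in for a chat model response object.
    content: str
def format_prompt(inputs: dict) -> str:
    # Stuff the retrieved context and the original question into one prompt string.
    return (
        "You are an assistant for question-answering tasks.\n"
        f"Question: {inputs['question']}\n"
        f"Context: {inputs['context']}\n"
        "Answer:"
    )
def fake_llm(prompt: str) -> Message:
    # Stand-in for the chat model: it returns a message object, not a plain string.
    return Message(content=f"(answer based on {len(prompt)} prompt characters)")
def parse_to_str(message: Message) -> str:
    # Mirrors the role of the string output parser: keep only the message content.
    return message.content
def generation_chain(inputs: dict) -> str:
    # Same order as the piped chain: prompt, then model, then parser.
    return parse_to_str(fake_llm(format_prompt(inputs)))
answer = generation_chain(
    {"context": "Agents can keep short-term and long-term memory.", "question": "agent memory"}
)
print(answer)
```
The input keys context and question match the two values the chain is invoked with later in the video.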

02:22.860 --> 02:23.780
Alrighty.

02:23.780 --> 02:26.100
Let's go now and write some tests.

02:26.340 --> 02:31.340
So first I want to import pprint, the pretty printer.

02:31.340 --> 02:34.660
And I also want to import the generation chain.

02:35.900 --> 02:41.820
And now let's go to the bottom of the file and let's create a test called test generation chain.

02:42.180 --> 02:47.500
And to be honest, this isn't going to be an actual test where we assert anything. We're simply going to run

02:47.500 --> 02:51.780
this chain on a topic and we're going to see what's printed.

02:52.180 --> 02:56.860
It's just to give us a sanity check that everything is working as expected.

02:56.900 --> 03:01.900
So we'll be querying agent memory and we'll retrieve the relevant documents.

03:02.260 --> 03:07.780
And now we want to run the generation chain with the context to be the documents we retrieved.

03:08.020 --> 03:09.380
And the question,

03:09.420 --> 03:10.580
the original question.

03:11.460 --> 03:12.540
And let's run it.

03:20.980 --> 03:22.660
And we can see it passed.

03:22.660 --> 03:26.380
And it gave us some summary about agent memory.

03:27.140 --> 03:29.140
So this looks fine.

03:29.620 --> 03:33.580
And we can even go to LangSmith and check out what happened.

03:33.580 --> 03:35.500
We can see we have here the retrieval.

03:35.860 --> 03:39.140
And we can see the documents that we retrieved from this query.

03:39.620 --> 03:43.100
And we can even see the documents' content.

03:43.340 --> 03:45.300
And we can see it's relevant to memory.

03:45.700 --> 03:51.140
And if we go to the RunnableSequence, we can see the question that we asked.

03:51.140 --> 03:52.500
We can see it right over here.

03:52.540 --> 03:53.580
You're an assistant.

03:53.580 --> 03:58.420
And then we plugged in the context and we gave it the question of agent memory.

03:59.060 --> 04:03.810
And we got a response back which we output parsed into a string.

04:05.490 --> 04:10.970
All right, let's now run all the tests just to see that we didn't break anything.

04:18.890 --> 04:21.370
And we can see all of the tests pass.

04:21.370 --> 04:24.170
So I love seeing those green check marks.

04:25.010 --> 04:27.530
Let's go and add a new file under nodes.

04:27.530 --> 04:29.890
And we want to call it generate.

04:30.330 --> 04:32.890
And here is going to be the generate node.

04:33.130 --> 04:34.610
And you guessed it.

04:34.610 --> 04:41.370
This node is going to simply take the question and take the documents from our state and simply run

04:41.370 --> 04:41.970
the chain.

04:42.530 --> 04:44.810
So it's going to be very straightforward.

04:46.170 --> 04:52.370
So all of the things we're doing here we already saw, but the last thing we're doing is updating the

04:52.370 --> 04:59.050
generation key in our graph state to be the generation, the answer that the LLM responded with.
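
NOTE
Editor's sketch of a generate node along the lines described above: read the question and documents from the state, run the chain, and write the answer to the generation key. The GraphState fields are assumed from the narration, and StubChain is a hypothetical stand-in for the real generation chain so the example runs without an API key.
```python
from typing import Any, Dict, List, TypedDict
class GraphState(TypedDict, total=False):
    # Assumed state shape; the field names follow the narration.
    question: str
    documents: List[str]
    generation: str
class StubChain:
    # Hypothetical stand-in exposing the same invoke() entry point as the chain.
    def invoke(self, inputs: Dict[str, Any]) -> str:
        return f"Answer to '{inputs['question']}' using {len(inputs['context'])} documents"
generation_chain = StubChain()
def generate(state: GraphState) -> Dict[str, Any]:
    # Read the question and the filtered documents from the graph state.
    question = state["question"]
    documents = state["documents"]
    # Run the chain and store its answer under the generation key.
    generation = generation_chain.invoke({"context": documents, "question": question})
    return {"documents": documents, "question": question, "generation": generation}
result = generate({"question": "agent memory", "documents": ["doc one", "doc two"]})
print(result["generation"])
```
Returning a plain dictionary of the keys to update keeps the node easy to test in isolation, since the chain can be swapped for a stub as done here.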