WEBVTT

00:00.240 --> 00:01.680
Hey there, Ethan here.

00:01.680 --> 00:06.720
And in this section we're going to build something which is called a Reflexion agent.

00:07.000 --> 00:13.480
It's going to extend our previous example of the reflection agent, but it's going to incorporate tools,

00:13.480 --> 00:19.280
for example, a search tool which is going to search online for real time data to enrich our answer.

00:19.640 --> 00:26.120
And we're going to review some advanced prompt engineering techniques in order for our Reflexion agent

00:26.160 --> 00:32.080
to be able to digest the feedback correctly and to really address it and improve through the iterations,

00:32.320 --> 00:39.400
because creating a critique is not that hard, but really leveraging the LLM to incorporate that

00:39.400 --> 00:43.160
critique and improve over time is challenging.

00:43.160 --> 00:46.840
And for that, we're going to show you some very cool tricks that will help us do so.

00:47.240 --> 00:49.840
Now, where does this architecture come from?

00:49.840 --> 00:57.040
It comes from a paper which is called Reflexion, which is a joint paper from Northeastern, MIT and

00:57.040 --> 00:57.720
Princeton.

00:58.000 --> 01:05.050
So the idea for this actually came from a LangChain blog that their team created covering reflection agents

01:05.050 --> 01:07.450
and specifically covering the Reflexion paper,

01:07.490 --> 01:09.850
implementing everything with LangGraph.

01:10.490 --> 01:15.970
Now, I have to say that the LangChain team, and specifically Lance from LangChain, did an amazing

01:15.970 --> 01:16.410
job.

01:16.610 --> 01:23.570
However, their implementation was, for me, a bit hard to understand, so it really took

01:23.570 --> 01:26.730
me a long time to understand what was going on.

01:26.930 --> 01:33.690
So what I did is take it and refactor it a bit in order to make it something which is much easier

01:33.690 --> 01:39.490
to explain and easier to digest, because the implementation was quite hard to understand.

01:40.010 --> 01:43.850
And of course I'll be sharing the link in the video's resources section.

01:46.610 --> 01:53.290
The goal of the Reflexion agent is to give us a very detailed article about a topic that we'll give

01:53.290 --> 01:58.930
it, and we want the article to dynamically fetch relevant information from the web.

01:59.210 --> 02:02.650
And we want to have citations for the referenced data.

02:02.650 --> 02:06.610
And of course we want to incorporate a quality write-critique loop.

02:07.020 --> 02:09.620
We basically want to get a very high quality answer.

02:10.900 --> 02:18.060
And this is an example question, right, about the AI-powered SOC, autonomous SOC problem domain, and startups

02:18.100 --> 02:20.420
that do that and raised capital.

02:20.980 --> 02:28.020
So I'm not sure if you're familiar, but AI-powered SOC, or hyperautomation and autonomous SOC, with companies

02:28.020 --> 02:34.660
like Torq, is something which is exploding right now, and it's getting a lot of attention.

02:35.020 --> 02:41.860
And the idea here is to take the security operations center and to leverage AI agents in order to start

02:41.860 --> 02:46.260
resolving tier one tickets and security incidents.

02:46.260 --> 02:52.420
So those kinds of tickets don't require a lot of reasoning, and they can leverage external tools in

02:52.420 --> 02:59.260
order to triage and resolve them, freeing up a lot of work for SOC analysts.

02:59.260 --> 03:02.900
All right, this is an example response we got from the Reflexion agent.

03:03.140 --> 03:07.060
And now we're going to discuss the architecture of that agent.

03:08.540 --> 03:14.550
So the architecture looks very similar to the reflection agent we saw in the previous section.

03:15.110 --> 03:22.630
You can see when we start the agent flow, we first have a responder node which is going to generate

03:22.670 --> 03:23.750
the initial response.

03:23.790 --> 03:25.110
You can see it right over here.

03:25.710 --> 03:33.070
However, it not only generates the original response, but it also adds a critique of the response

03:33.070 --> 03:35.950
itself and a search term.

03:36.270 --> 03:43.510
So in the search term, the agent is going to come up with ideal search queries that would be beneficial

03:43.510 --> 03:45.150
to get a better response.

03:45.910 --> 03:53.150
It's going to help us ground the output that we gave with current events and external data that is available

03:53.190 --> 03:53.750
online.
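
NOTE
A minimal sketch of the responder's output as described above: a draft answer, a critique of that draft, and search queries. The class name, field names, and sample values are hypothetical and chosen for illustration; they are not taken from the course code.
```python
from dataclasses import dataclass, field
from typing import List
# Hypothetical container for what the responder node returns.
@dataclass
class ResponderOutput:
    answer: str    # the initial draft of the article
    critique: str  # the model's own critique of that draft
    search_queries: List[str] = field(default_factory=list)  # queries for the search tool
# Made-up values, only to show the shape of the data.
draft = ResponderOutput(
    answer="AI-powered SOC platforms automate tier one triage.",
    critique="No concrete startups or funding figures are named.",
    search_queries=["AI-powered SOC startups funding"],
)
print(draft.search_queries)
```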

03:54.430 --> 04:00.030
So after we have the search queries, we're going to run the execute tools node, which is going

04:00.030 --> 04:01.310
to take our search queries

04:01.310 --> 04:05.990
and simply use a search engine to retrieve results in real time.

04:06.310 --> 04:10.710
So I'm going to be using Tavily, which is an amazing third-party

04:10.750 --> 04:18.130
search engine that is highly optimized for LLM applications, and we can easily pass

04:18.130 --> 04:20.930
the responses that we get downstream to the LLM.
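
NOTE
A rough sketch of what an execute tools step could look like: every search query goes through a search function and the results are keyed by query. The search function is injected, so a real client (Tavily, for example) could be plugged in; fake_search below is a stand-in so the sketch runs offline. All names here are assumptions, not the course implementation.
```python
from typing import Callable, Dict, List
# Hypothetical type: any function that maps a query string to a list of result snippets.
SearchFn = Callable[[str], List[str]]
def execute_tools(search_queries: List[str], search: SearchFn) -> Dict[str, List[str]]:
    """Run every query through the search function and key the results by query."""
    return {query: search(query) for query in search_queries}
# Stand-in for a real search client, returning one fake snippet per query.
def fake_search(query: str) -> List[str]:
    return [f"result for: {query}"]
results = execute_tools(["autonomous SOC startups", "AI SOC funding rounds"], fake_search)
print(results)
```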

04:21.290 --> 04:27.530
And now we're going to pass the results downstream, and we're going to go to the Revisor node.

04:27.530 --> 04:33.570
And the Revisor node is going to take the initial response, which already has an initial critique,

04:33.890 --> 04:37.290
and the results of the search engine execution.

04:37.290 --> 04:39.130
So it also has external data

04:39.330 --> 04:42.330
that is very relevant to the topic that we're working on.

04:42.730 --> 04:46.370
It's going to revisit and change our original response.

04:46.530 --> 04:51.450
Now it's going to do that while incorporating the new data and the critique.

04:51.450 --> 04:57.650
So it's going to address the suggestions that we got in the previous step to articulate a better answer.

04:57.650 --> 05:04.370
But not only that, the Revisor is also going to supply us with a new critique of the new revision of the

05:04.370 --> 05:11.050
article, and it's going to provide us with new search terms that we now want to look up, based on the

05:11.050 --> 05:17.940
revised article, that are going to be beneficial, and it's going to give us the citations from the first

05:17.940 --> 05:19.060
search that we had.

05:20.100 --> 05:24.580
And after that we're going to continue and search for the new queries.

05:24.740 --> 05:30.900
We're going to pass the new information downstream along with the critique, and we're going to revise

05:30.900 --> 05:31.580
it again.

05:31.580 --> 05:36.300
So this loop is going to keep happening until we hit a stopping condition.

05:36.580 --> 05:38.460
And after that, we're finished.

05:38.660 --> 05:44.540
And this architecture is very similar to our reflection agent architecture that we saw in the previous

05:44.540 --> 05:45.180
section.

05:45.620 --> 05:47.700
But here we added a search engine.
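
NOTE
The loop described above (respond, search, revise, repeat until a stopping condition) can be sketched in plain Python. This is a sketch under stated assumptions: the node functions are stubs with no LLM calls, the stopping condition is assumed to be a fixed number of revisions, and all names are hypothetical. It is not the LangGraph implementation from the course.
```python
from typing import Callable, Dict, List
MAX_ITERATIONS = 2  # assumed stopping condition: a fixed number of revisions
def respond(question: str) -> Dict:
    # Stub for the responder node: draft, critique, and search queries.
    return {"answer": f"draft about {question}", "critique": "needs sources",
            "search_queries": [question], "references": []}
def execute_tools(queries: List[str], search: Callable[[str], List[str]]) -> List[str]:
    # Stub for the execute tools node: collect results for every query.
    return [hit for q in queries for hit in search(q)]
def revise(state: Dict, search_results: List[str], revision: int) -> Dict:
    # Stub for the revisor node: a real one would call the LLM with the critique and results.
    return {"answer": f"{state['answer']} (revision {revision})",
            "critique": "check recency",
            "search_queries": [f"follow-up {revision}"],
            "references": state["references"] + search_results}
def run_agent(question: str, search: Callable[[str], List[str]]) -> Dict:
    # Respond once, then alternate search and revise until the iteration cap is hit.
    state = respond(question)
    for revision in range(1, MAX_ITERATIONS + 1):
        results = execute_tools(state["search_queries"], search)
        state = revise(state, results, revision)
    return state
final = run_agent("AI-powered SOC", lambda q: [f"source for {q}"])
print(final["answer"])
print(final["references"])
```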

05:49.060 --> 05:50.820
Let's discuss what we're going to use.

05:50.820 --> 05:57.500
We're going to use GPT-4 Turbo because we need a strong enough model to write the text and to write

05:57.500 --> 06:01.380
the critique, one which has good reasoning power.

06:02.060 --> 06:07.220
We'll also be leveraging function calling, which is going to be super important in this implementation.

06:07.660 --> 06:10.820
And we're going to be using Tavily as our search engine.

06:10.820 --> 06:17.580
And of course, we're going to be using LangSmith for tracing, because in this complex architecture

06:17.700 --> 06:20.180
we want to be able to trace easily.