WEBVTT

1
00:00:00.330 --> 00:00:01.440
<v Eden>Hey there, Eden here,</v>

2
00:00:01.440 --> 00:00:04.980
and I want to have a recap on what we did

3
00:00:04.980 --> 00:00:08.520
so far in the course when it comes to agents.

4
00:00:08.520 --> 00:00:13.520
And by now, I really hope that the ReAct algorithm

5
00:00:13.860 --> 00:00:16.470
architecture and implementation

6
00:00:16.470 --> 00:00:19.830
and everything around it is crystal clear.

7
00:00:19.830 --> 00:00:22.950
And I'm hoping that if I'm going to wake you up at night

8
00:00:22.950 --> 00:00:26.280
and ask you, "What's the flow of the ReAct algorithm?"

9
00:00:26.280 --> 00:00:28.650
then you'll be able to sing it.

10
00:00:28.650 --> 00:00:30.780
And I'll have a really quick recap.

11
00:00:30.780 --> 00:00:34.230
We send the query to the ReAct agent,

12
00:00:34.230 --> 00:00:36.930
the large language model then ponders

13
00:00:36.930 --> 00:00:40.260
and thinks and decides which tool to call.

14
00:00:40.260 --> 00:00:42.630
Then we go and execute that tool

15
00:00:42.630 --> 00:00:46.860
and then we go and do that until we get a final answer

16
00:00:46.860 --> 00:00:49.710
until there are no more tools to invoke.

17
00:00:49.710 --> 00:00:53.250
All right, so let's go back to the diagram

18
00:00:53.250 --> 00:00:55.230
that I showed you earlier in the course

19
00:00:55.230 --> 00:00:58.830
about the evolution of ReAct agents.

20
00:00:58.830 --> 00:01:03.030
So it all started with the ReAct paper

21
00:01:03.030 --> 00:01:05.220
and the ReAct prompt,

22
00:01:05.220 --> 00:01:09.480
where we used the LLM as a reasoning engine

23
00:01:09.480 --> 00:01:14.130
and then LangChain implemented some fancy parsing

24
00:01:14.130 --> 00:01:18.330
in order to extract which tools to execute.

25
00:01:18.330 --> 00:01:20.640
Okay, so this was the very beginning.

26
00:01:20.640 --> 00:01:23.460
And this implementation was super impressive,

27
00:01:23.460 --> 00:01:25.680
but it wasn't that reliable

28
00:01:25.680 --> 00:01:28.230
so we couldn't really use this in production

29
00:01:28.230 --> 00:01:32.460
because the models weren't that strong by that time

30
00:01:32.460 --> 00:01:36.060
and it was really hard to parse the output.

31
00:01:36.060 --> 00:01:38.700
Because the models weren't that reliable,

32
00:01:38.700 --> 00:01:41.430
then parsing the output was really hard.

33
00:01:41.430 --> 00:01:43.410
It was really non-deterministic

34
00:01:43.410 --> 00:01:45.450
so we really didn't have control

35
00:01:45.450 --> 00:01:48.180
about which token the LLM is going to generate.

36
00:01:48.180 --> 00:01:50.820
And because of that, it's enough that the LLM

37
00:01:50.820 --> 00:01:53.490
is going to generate one wrong token,

38
00:01:53.490 --> 00:01:57.660
and it can cause and mess up all the output parsing.

39
00:01:57.660 --> 00:02:01.290
And then LLMs became a bit better

40
00:02:01.290 --> 00:02:03.120
and function calling came out,

41
00:02:03.120 --> 00:02:06.960
and function calling normalized this process

42
00:02:06.960 --> 00:02:11.190
of making the LLM behave as a reasoning engine

43
00:02:11.190 --> 00:02:15.240
and suddenly we don't need this ReAct prompt anymore,

44
00:02:15.240 --> 00:02:18.150
we can rely on the vendors

45
00:02:18.150 --> 00:02:21.630
and the model's function calling capabilities,

46
00:02:21.630 --> 00:02:25.650
and the model is going to return which function to call,

47
00:02:25.650 --> 00:02:28.710
and it's going to do a very good job doing it.

48
00:02:28.710 --> 00:02:31.470
And the function to call,

49
00:02:31.470 --> 00:02:32.820
it's going to specify it

50
00:02:32.820 --> 00:02:35.850
in a very special place in the request.

51
00:02:35.850 --> 00:02:40.680
So we do not need this special ReAct prompt anymore

52
00:02:40.680 --> 00:02:43.920
and this weird output parsing that we did,

53
00:02:43.920 --> 00:02:46.770
which was really hard but not that reliable.

54
00:02:46.770 --> 00:02:49.170
So now we shift everything to the vendor,

55
00:02:49.170 --> 00:02:51.540
the vendor now is responsible to do that,

56
00:02:51.540 --> 00:02:53.310
and we get the information

57
00:02:53.310 --> 00:02:57.960
about which function to call in the LLM's response.

58
00:02:57.960 --> 00:03:02.400
Now, we solved an earlier problem of this output parsing,

59
00:03:02.400 --> 00:03:03.660
which was not reliable,

60
00:03:03.660 --> 00:03:05.730
but we introduced now a new problem

61
00:03:05.730 --> 00:03:08.670
because every vendor did what they want,

62
00:03:08.670 --> 00:03:11.700
and the information about which function to call

63
00:03:11.700 --> 00:03:15.270
was placed differently in different places.

64
00:03:15.270 --> 00:03:17.550
So one called it function calling,

65
00:03:17.550 --> 00:03:19.050
the other tool calling,

66
00:03:19.050 --> 00:03:22.050
it was in different parts of the response.

67
00:03:22.050 --> 00:03:26.010
So what LangChain did is create one single interface

68
00:03:26.010 --> 00:03:29.340
that is going to be called the tool calling interface,

69
00:03:29.340 --> 00:03:33.210
and it implemented all of the integrations to the vendor.

70
00:03:33.210 --> 00:03:36.540
So now we have only one interface of tool calling,

71
00:03:36.540 --> 00:03:40.410
and it's going to get all of the functions to call

72
00:03:40.410 --> 00:03:41.910
and the information about it

73
00:03:41.910 --> 00:03:44.130
and it's going to work for every vendor.

74
00:03:44.130 --> 00:03:46.200
So this was very useful.

75
00:03:46.200 --> 00:03:49.200
Now, when LangGraph came out,

76
00:03:49.200 --> 00:03:53.670
it had a completely different architectural approach.

77
00:03:53.670 --> 00:03:58.200
So instead of having a function-based agent loop,

78
00:03:58.200 --> 00:04:01.710
which was really a while loop abstracted

79
00:04:01.710 --> 00:04:04.890
with the agent executor class of LangChain.

80
00:04:04.890 --> 00:04:07.710
So there wasn't much flexibility

81
00:04:07.710 --> 00:04:09.930
and there wasn't much visibility

82
00:04:09.930 --> 00:04:13.830
or any way to control this loop here.

83
00:04:13.830 --> 00:04:16.710
So then LangGraph came out,

84
00:04:16.710 --> 00:04:21.690
and LangGraph modeled agents as graphs.

85
00:04:21.690 --> 00:04:24.750
So they had nodes, they had edges,

86
00:04:24.750 --> 00:04:28.260
and they had between the nodes a shared state.

87
00:04:28.260 --> 00:04:32.040
So the LangGraph ReAct agent had three components.

88
00:04:32.040 --> 00:04:34.560
It had the state, which was a dictionary

89
00:04:34.560 --> 00:04:36.780
that maintained the conversation,

90
00:04:36.780 --> 00:04:40.050
maybe some intermediate results,

91
00:04:40.050 --> 00:04:43.920
and nodes are simple Python functions

92
00:04:43.920 --> 00:04:45.870
that are receiving the state,

93
00:04:45.870 --> 00:04:47.610
perform the computation,

94
00:04:47.610 --> 00:04:50.550
like maybe to call the LLM or to execute the tool,

95
00:04:50.550 --> 00:04:53.490
and then they return an updated state.

96
00:04:53.490 --> 00:04:56.610
And the edges, they define the control flow.

97
00:04:56.610 --> 00:04:59.820
And by the way, the great motivation for LangGraph

98
00:04:59.820 --> 00:05:04.560
is that LangChain really saw that in every paper

99
00:05:04.560 --> 00:05:08.280
in almost every time you describe an agent,

100
00:05:08.280 --> 00:05:13.110
you actually describe a graph with nodes and with edges.

101
00:05:13.110 --> 00:05:16.530
So this architecture change really helped

102
00:05:16.530 --> 00:05:19.080
to unhide the control flow

103
00:05:19.080 --> 00:05:22.590
because now we have an explicit graph structure

104
00:05:22.590 --> 00:05:25.650
that we can even print and show us a picture,

105
00:05:25.650 --> 00:05:28.350
and it's really hard to understand what's happening

106
00:05:28.350 --> 00:05:29.940
and what's being executed.

107
00:05:29.940 --> 00:05:33.480
Now, if we wanted in the old agent executor

108
00:05:33.480 --> 00:05:35.280
to have some kind of state,

109
00:05:35.280 --> 00:05:36.900
it was really hard to do

110
00:05:36.900 --> 00:05:39.450
and we had to do it with keyword arguments

111
00:05:39.450 --> 00:05:41.010
with a config object,

112
00:05:41.010 --> 00:05:43.230
and it was actually very hard to do.

113
00:05:43.230 --> 00:05:46.320
But with LangGraph and the state schema

114
00:05:46.320 --> 00:05:47.700
is actually very easy to do.

115
00:05:47.700 --> 00:05:50.160
We simply defined in the state

116
00:05:50.160 --> 00:05:53.100
what is the field we want to keep track of.

117
00:05:53.100 --> 00:05:55.200
So if we wanted to know what's happening

118
00:05:55.200 --> 00:05:57.270
with our agent execution

119
00:05:57.270 --> 00:06:00.270
in the agent executor original version,

120
00:06:00.270 --> 00:06:02.100
we really didn't have a way to do it.

121
00:06:02.100 --> 00:06:05.190
So we didn't have a way to monitor and to trace

122
00:06:05.190 --> 00:06:07.110
and to keep track of what's happening.

123
00:06:07.110 --> 00:06:10.080
But with LangGraph, we have automatic checkpoints.

124
00:06:10.080 --> 00:06:12.600
So every time before we execute a node,

125
00:06:12.600 --> 00:06:14.790
LangChain is going to persist the state

126
00:06:14.790 --> 00:06:16.860
and we can have access to that

127
00:06:16.860 --> 00:06:19.860
so we can see exactly what happened and when.

128
00:06:19.860 --> 00:06:21.480
And we can even rewind

129
00:06:21.480 --> 00:06:23.340
and we can travel back in time

130
00:06:23.340 --> 00:06:25.980
and it gives us much greater flexibility.

131
00:06:25.980 --> 00:06:30.180
Now, the biggest thing that LangGraph unlocked

132
00:06:30.180 --> 00:06:35.010
is the fact that we can now compose graphs,

133
00:06:35.010 --> 00:06:38.400
which are agents, one inside another.

134
00:06:38.400 --> 00:06:42.720
So we can actually use a LangGraph graph

135
00:06:42.720 --> 00:06:44.790
as a LangGraph node,

136
00:06:44.790 --> 00:06:46.920
and this is very convenient for us.

137
00:06:46.920 --> 00:06:49.560
And it comes out-of-the-box tracing

138
00:06:49.560 --> 00:06:52.650
and all of the benefits that LangGraph gives us.

139
00:06:52.650 --> 00:06:55.200
And this is something which was very hard to do

140
00:06:55.200 --> 00:06:57.720
with the original agent executor.

141
00:06:57.720 --> 00:07:00.150
So this LangGraph agent was built

142
00:07:00.150 --> 00:07:02.250
under the LangGraph library,

143
00:07:02.250 --> 00:07:04.950
under the graph prebuilt agents

144
00:07:04.950 --> 00:07:07.380
and lived there for a while.

145
00:07:07.380 --> 00:07:10.980
We actually implemented a very similar version ourselves

146
00:07:10.980 --> 00:07:12.270
in this section.

147
00:07:12.270 --> 00:07:17.270
And then LangChain and LangGraph reached version 1.0

148
00:07:17.730 --> 00:07:21.030
and it brought a cleaner API service

149
00:07:21.030 --> 00:07:23.970
with the new create_agent function,

150
00:07:23.970 --> 00:07:27.630
and LangChain then deprecated

151
00:07:27.630 --> 00:07:31.290
and replaced the Create ReAct agent

152
00:07:31.290 --> 00:07:33.540
and the LangGraph prebuilt agent

153
00:07:33.540 --> 00:07:37.260
and simply put everything under this create_agent function,

154
00:07:37.260 --> 00:07:40.560
which returns a compiled graph,

155
00:07:40.560 --> 00:07:42.750
which is LangGraph under the hood,

156
00:07:42.750 --> 00:07:46.167
but with a simple interface for people to use it.

157
00:07:46.167 --> 00:07:47.520
And in this interface,

158
00:07:47.520 --> 00:07:50.610
we can simply give it to the models and the tools,

159
00:07:50.610 --> 00:07:54.450
and boom, we get a ReAct agent which is ready to go,

160
00:07:54.450 --> 00:07:58.620
which has all of the goodies that LangGraph can offer us

161
00:07:58.620 --> 00:08:01.410
with observability, with debugging.

162
00:08:01.410 --> 00:08:03.900
And it also has more flexibility

163
00:08:03.900 --> 00:08:08.400
because we can actually go and customize it ourselves.

164
00:08:08.400 --> 00:08:11.700
All righty, so I hope you enjoyed this video,

165
00:08:11.700 --> 00:08:14.970
and I hope you enjoyed the flow of learning

166
00:08:14.970 --> 00:08:17.970
in general of LangChain agents.

167
00:08:17.970 --> 00:08:20.730
And I think it's really, really important

168
00:08:20.730 --> 00:08:23.280
to know how everything started.

169
00:08:23.280 --> 00:08:26.580
And that you now know, of course,

170
00:08:26.580 --> 00:08:29.910
how to use LangChain create_agent,

171
00:08:29.910 --> 00:08:33.690
but you know exactly how it's implemented under the hood.

172
00:08:33.690 --> 00:08:35.580
And for you, it's not magic,

173
00:08:35.580 --> 00:08:37.470
you know exactly how it's implemented.

174
00:08:37.470 --> 00:08:40.440
And in fact, this interface is just an easier way

175
00:08:40.440 --> 00:08:41.760
for you to use it.

176
00:08:41.760 --> 00:08:46.050
And it's really the foundation for modern LLM agents

177
00:08:46.050 --> 00:08:47.130
and deep agents,

178
00:08:47.130 --> 00:08:49.173
which I hope to cover in the course.