1
00:00:03,210 --> 00:00:05,820
So I call it the practical video.

2
00:00:05,820 --> 00:00:12,630
But in fact we can call this the detailed video because here is where we are going to talk about how

3
00:00:12,630 --> 00:00:18,240
LangSmith helps us solve the challenges in the prototyping phase in detail.

4
00:00:18,300 --> 00:00:27,120
The practical video will be when we apply all these solutions in a real scenario, and that would be

5
00:00:27,120 --> 00:00:27,660
later.

6
00:00:27,660 --> 00:00:36,390
So remember the prototyping phase involves quick experimentation between prompts, model types, retrieval

7
00:00:36,390 --> 00:00:38,280
strategy and other parameters.

8
00:00:38,280 --> 00:00:43,260
Okay, so in this phase we are going to test different prompts.

9
00:00:43,260 --> 00:00:45,720
We are going to test different model types.

10
00:00:45,720 --> 00:00:50,370
But well you know that probably we are going to stay with OpenAI.

11
00:00:50,370 --> 00:00:53,700
But we can also test, you know, different models.

12
00:00:53,700 --> 00:00:58,170
And we can also check, you know, different retrieval strategies.

13
00:00:58,170 --> 00:00:58,500
Right.

14
00:00:58,500 --> 00:01:04,709
We have been talking about the RAG technique during the bootcamp in detail.

15
00:01:04,920 --> 00:01:15,210
So we will also pay attention to this. The ability to rapidly understand how the model is performing

16
00:01:15,210 --> 00:01:20,700
and debug where it is failing is incredibly important for this phase.

17
00:01:20,730 --> 00:01:28,770
Okay, so these are the words of LangChain in their blog post talking about the launch of LangSmith,

18
00:01:28,770 --> 00:01:37,380
etc. But in this blog post, they are unveiling everything they have learned about the process, the cycle

19
00:01:37,380 --> 00:01:39,930
of LLM app development.

20
00:01:39,930 --> 00:01:40,380
Right?

21
00:01:40,380 --> 00:01:47,490
So regarding prototyping, they are telling us, okay, this is how

22
00:01:47,490 --> 00:01:56,940
the LangChain team understands the prototyping phase, and this is where LangChain helps the LLM

23
00:01:56,940 --> 00:01:58,860
app developer teams.

24
00:01:58,860 --> 00:02:08,520
So we talk about the first challenge: when things go wrong during the prototyping phase, how do we identify

25
00:02:08,520 --> 00:02:09,690
what is failing.

26
00:02:10,500 --> 00:02:12,330
And I told you what we do.

27
00:02:12,360 --> 00:02:19,920
We have LangSmith tracing enabled when developing a new LLM application.

28
00:02:19,920 --> 00:02:20,280
Okay.

29
00:02:20,280 --> 00:02:23,700
We will see what LangSmith tracing is.

30
00:02:23,700 --> 00:02:27,510
We will see. In the next block we are going to talk about terminology.

31
00:02:27,510 --> 00:02:35,610
We will talk about traces, LLM calls, runs, etc. But now I just want you to stay with me with the

32
00:02:35,610 --> 00:02:39,330
conceptual approach, with the, you know, high-level explanation.

33
00:02:39,330 --> 00:02:51,270
So when developing new LLM applications, the LangChain team suggests having LangSmith tracing

34
00:02:51,270 --> 00:02:53,340
enabled by default.

35
00:02:54,390 --> 00:02:58,590
It isn't necessary to look at every single trace.

36
00:02:59,220 --> 00:03:08,370
However, when things go wrong: an unexpected end result, an infinite agent loop, slower than

37
00:03:08,370 --> 00:03:13,170
expected execution, higher than expected token usage, etc.

38
00:03:13,170 --> 00:03:21,060
When things go wrong, it is extremely helpful to debug by looking through the application traces.

39
00:03:21,090 --> 00:03:27,390
Okay, so what the LangChain team is telling us is like, you know what?

40
00:03:27,480 --> 00:03:36,720
When you have a problem in the prototyping phase, when something doesn't work, go to the LangSmith

41
00:03:36,750 --> 00:03:46,770
tracing dashboard, check the trace and understand under the hood what is happening in order to identify

42
00:03:46,770 --> 00:03:48,840
where the problem may be.

43
00:03:50,270 --> 00:03:58,670
They continue saying: LangSmith gives clear visibility and debugging information at each step of an

44
00:03:58,670 --> 00:04:05,240
LLM sequence, making it much easier to identify and root-cause issues.

45
00:04:05,240 --> 00:04:15,680
So you will see, for example, we will see this in practice: when we are in a RAG application, we can

46
00:04:15,680 --> 00:04:22,370
investigate through LangSmith each of the stages of the RAG technique.

47
00:04:22,370 --> 00:04:27,320
What is happening there, and by doing that it is going to be

48
00:04:28,270 --> 00:04:35,440
much easier for us to identify where the problem we may be having is.

49
00:04:36,720 --> 00:04:45,150
So LangSmith provides native rendering of chat messages, functions and retrieved documents.

50
00:04:45,150 --> 00:04:46,110
We will see this.

51
00:04:46,110 --> 00:04:55,290
So, the problem: when things go wrong, how do I identify what is failing in the prototyping phase?

52
00:04:56,520 --> 00:05:01,380
You just need to have LangSmith tracing enabled

53
00:05:02,150 --> 00:05:03,500
since day one.

54
00:05:04,610 --> 00:05:12,830
So whenever you have a problem, you go under the hood, which is LangSmith, and check what is happening.

55
00:05:14,660 --> 00:05:22,040
Second challenge, how do we iterate and experiment in a fast and easy way?

56
00:05:24,680 --> 00:05:25,700
So.

57
00:05:27,470 --> 00:05:35,690
We will see how we can use LangSmith Playground to iterate and experiment.

58
00:05:35,960 --> 00:05:42,950
And you will see that this LangSmith playground is, in fact, what LangSmith used to call the

59
00:05:42,950 --> 00:05:47,900
LangSmith Prompt Hub or LangSmith Hub or the Hub.

60
00:05:48,500 --> 00:05:59,180
So this is an area in the LangSmith dashboard, that LangSmith application

61
00:05:59,690 --> 00:06:03,770
platform where we can play with different prompts.

62
00:06:03,770 --> 00:06:13,280
And this is not just... I remember when I saw this playground, I thought, okay, we can, you

63
00:06:13,280 --> 00:06:15,920
know, we can try different prompts here.

64
00:06:16,640 --> 00:06:18,860
But it's not just that.

65
00:06:19,410 --> 00:06:29,370
You can use the trace in the LangChain tracing to go from the trace to the prompt playground.

66
00:06:29,400 --> 00:06:30,900
We will see how to do that.

67
00:06:30,900 --> 00:06:34,320
So it's much more interesting than you think.

68
00:06:34,320 --> 00:06:41,730
I was surprised by this when I saw this in action with a professional LLM application.

69
00:06:41,730 --> 00:06:42,900
You will be surprised too.

70
00:06:42,900 --> 00:06:43,650
You will see.

71
00:06:43,860 --> 00:06:44,850
So.

72
00:06:45,570 --> 00:06:46,290
A.

73
00:06:51,500 --> 00:06:56,960
Here we are saying: create a test data set with LangSmith.

74
00:06:56,990 --> 00:07:01,580
This is more like the last point.

75
00:07:01,580 --> 00:07:02,870
Let's see if we have.

76
00:07:02,870 --> 00:07:03,260
Okay.

77
00:07:03,260 --> 00:07:08,210
So here this is a change in the proper order.

78
00:07:08,210 --> 00:07:17,300
But here you have in the notebook the blog talking about how to use LangSmith Playground to iterate

79
00:07:17,300 --> 00:07:18,200
and experiment.

80
00:07:18,200 --> 00:07:25,250
So LangSmith provides a playground environment for rapid iteration and experimentation.

81
00:07:25,950 --> 00:07:34,470
This allows you to quickly test out different prompts and models. So you will see. I mean, when you

82
00:07:34,470 --> 00:07:39,360
read this, maybe you are getting the wrong idea.

83
00:07:39,360 --> 00:07:49,230
Okay, so I mean, it's not as grandiose as it seems from this sentence, but this is very good.

84
00:07:49,230 --> 00:07:49,800
It's very good.

85
00:07:49,800 --> 00:07:55,290
It's a playground where you really can test different prompts and different models.

86
00:07:55,290 --> 00:07:58,500
Okay, so we are going to see this later.

87
00:07:59,270 --> 00:08:03,830
Remember that LangSmith right now is in the first version.

88
00:08:03,830 --> 00:08:10,880
So it is going to improve, and we are going to learn where LangSmith is going because they have

89
00:08:10,880 --> 00:08:16,820
shared the roadmap they are following in order to make improvements in the current platform.

90
00:08:16,820 --> 00:08:17,780
So.

91
00:08:18,480 --> 00:08:24,000
You can open the playground from any prompt or model run in your trace.

92
00:08:24,000 --> 00:08:25,980
This is the interesting part.

93
00:08:26,070 --> 00:08:33,809
Every playground run is logged in the system and can be used to create test cases or compare with other

94
00:08:33,809 --> 00:08:34,140
runs.

95
00:08:34,140 --> 00:08:42,120
Don't worry if you are now a little bit confused about runs and traces and LLM calls. It's a little

96
00:08:42,120 --> 00:08:43,289
bit confusing.

97
00:08:43,320 --> 00:08:50,940
That's why in the next block we are going to talk about LangSmith terminology, and we will define all

98
00:08:50,940 --> 00:08:52,800
these for you in detail, okay?

99
00:08:52,800 --> 00:08:53,670
So don't worry.

100
00:08:54,630 --> 00:08:59,700
So we were talking now about the third challenge.

101
00:08:59,700 --> 00:09:08,070
And the third challenge is how to compare the performance of alternative prompts, retrieval strategies

102
00:09:08,070 --> 00:09:09,570
and model choices.

103
00:09:09,570 --> 00:09:19,290
And we told you that the recipe that the LangChain team gave us in the article is: you need to use the

104
00:09:19,290 --> 00:09:25,140
LangSmith comparison view to compare the performance of alternative approaches.

105
00:09:25,230 --> 00:09:27,600
So this is what they say about that.

106
00:09:28,620 --> 00:09:36,660
When prototyping different versions of your applications and making changes, it is important to see

107
00:09:36,660 --> 00:09:43,200
whether or not you have regressed with respect to your initial test cases.

108
00:09:45,800 --> 00:09:53,300
Oftentimes, changes in the prompt, retrieval strategy, or model choice can have huge implications in

109
00:09:53,300 --> 00:09:56,450
responses produced by your application.

110
00:09:57,460 --> 00:10:05,410
In order to get a sense for which variant is performing better, it is useful to be able to view results

111
00:10:05,410 --> 00:10:10,990
for different configurations on the same data points side by side.

112
00:10:11,530 --> 00:10:20,110
LangSmith has invested heavily in a user-friendly comparison view for test runs to track and diagnose

113
00:10:20,110 --> 00:10:26,710
regressions in test scores across multiple revisions of your application. So, well,

114
00:10:26,980 --> 00:10:31,450
LangSmith has invested heavily in a user-friendly comparison view.

115
00:10:32,140 --> 00:10:37,540
I think they are in the first stage so this is for sure going to improve.

116
00:10:37,540 --> 00:10:39,520
But right now it's in very good shape.

117
00:10:39,520 --> 00:10:43,780
So you will see that with the comparison view.

118
00:10:44,290 --> 00:10:53,560
With the comparison view feature, we can compare side by side different versions of our prototype using

119
00:10:53,560 --> 00:10:58,330
different prompts, different models and different retrieval strategies.

120
00:10:58,330 --> 00:11:00,100
So this is interesting.

121
00:11:00,100 --> 00:11:08,020
And as you see it solves the third main challenge we face in the prototyping phase.

122
00:11:08,140 --> 00:11:10,270
What about the fourth and last one?

123
00:11:10,270 --> 00:11:13,390
How to test the performance of the prototype.

124
00:11:13,390 --> 00:11:15,670
So this is where this goes.

125
00:11:16,450 --> 00:11:24,070
The LangChain team recommends that we use LangSmith to create a test data set.

126
00:11:24,070 --> 00:11:26,440
So let's see what they say.

127
00:11:26,980 --> 00:11:36,730
While many developers still ship an initial version of their application based on vibe checks. Now,

128
00:11:36,730 --> 00:11:38,980
remember we were talking about that.

129
00:11:38,980 --> 00:11:45,370
So I mean many teams are just jumping from prototype to production.

130
00:11:45,370 --> 00:11:50,560
They don't do beta testing or even, you know, a very professional prototyping phase.

131
00:11:50,560 --> 00:11:58,000
So I repeat: while many developers still ship an initial version of their application based on vibe

132
00:11:58,030 --> 00:12:06,610
checks, we have seen an increasing number of engineering teams start to adopt a more test-driven approach.

133
00:12:06,700 --> 00:12:08,830
This is the LangChain team talking.

134
00:12:08,830 --> 00:12:19,390
So these are LangChain people that have been observing professional LLM app development teams

135
00:12:19,390 --> 00:12:21,670
working for months.

136
00:12:21,670 --> 00:12:24,610
And this is what they have observed.

137
00:12:25,320 --> 00:12:27,060
So they continue.

138
00:12:27,090 --> 00:12:36,240
LangSmith allows developers to create test data sets, which are collections of inputs and reference

139
00:12:36,240 --> 00:12:41,730
outputs, and use these to run tests on their LLM applications.

140
00:12:41,730 --> 00:12:49,980
So we have been talking about this in a previous section of our bootcamp.

141
00:12:49,980 --> 00:12:51,870
Remember we talked about

142
00:12:52,690 --> 00:12:57,190
that we can use

143
00:12:58,370 --> 00:12:59,510
good,

144
00:12:59,540 --> 00:13:08,810
like, confirmed data in order to test our application. So we can select, you know, or prepare inputs

145
00:13:08,810 --> 00:13:11,690
and outputs that we know are correct.

146
00:13:11,690 --> 00:13:17,300
And with this list of inputs and outputs we can test our LLM application.

147
00:13:17,300 --> 00:13:17,720
Right.

148
00:13:17,720 --> 00:13:23,600
So this is what the test data sets of LangSmith are doing.

149
00:13:24,230 --> 00:13:32,570
So I repeat: LangSmith allows developers to create test data sets, which are collections of inputs and

150
00:13:32,570 --> 00:13:38,750
reference outputs, and use these to run tests on their LLM applications.

151
00:13:39,080 --> 00:13:49,430
These test cases can be uploaded in bulk, created on the fly, or exported from application traces.

152
00:13:49,430 --> 00:13:59,000
This is very interesting because, as we will see in the next lessons, whenever you are reviewing

153
00:13:59,000 --> 00:14:06,170
a trace, you know, or a call or a run or whatever in your LangSmith platform, you can

154
00:14:06,170 --> 00:14:07,580
see, okay, this is a good case.

155
00:14:07,580 --> 00:14:11,450
This is an interesting case for me to have in my test data set.

156
00:14:11,450 --> 00:14:18,680
And you can immediately include the particular trace or LLM call, you know, in the test data

157
00:14:18,680 --> 00:14:18,890
set.

158
00:14:18,890 --> 00:14:20,930
So we will see how to do that.

159
00:14:22,000 --> 00:14:31,960
LangSmith also makes it easy to run custom evaluations, both LLM and heuristic based, to score test

160
00:14:31,960 --> 00:14:33,130
results.

161
00:14:33,160 --> 00:14:46,480
Okay, so we have seen how LangSmith is helping us to solve the four main challenges LLM app development

162
00:14:46,480 --> 00:14:50,710
teams are facing during the prototyping phase.

163
00:14:51,630 --> 00:14:59,430
In this part of the bootcamp, we have seen how LangChain works in a conceptual way.

164
00:14:59,430 --> 00:15:06,000
In a later lesson, we are going to see LangSmith at work with real projects.

165
00:15:06,000 --> 00:15:13,830
Okay, but for now I just want you to understand what is happening when you use LangChain in the prototyping

166
00:15:13,830 --> 00:15:14,490
phase.

167
00:15:14,490 --> 00:15:22,680
In the next lesson, we are going to see how LangSmith helps us solve the main challenges we face in

168
00:15:22,680 --> 00:15:24,990
the beta testing phase.