1
00:00:03,420 --> 00:00:10,680
So in this chapter we are going to build another basic RAG application.

2
00:00:11,160 --> 00:00:15,540
But in this case with an advanced output parser.

3
00:00:16,640 --> 00:00:27,110
So until now, in the previous exercises, we have used a very simple output parser called the structured

4
00:00:27,110 --> 00:00:28,460
output parser.

5
00:00:28,910 --> 00:00:36,590
If you remember, that parser was used to format the output into a JSON dictionary.

6
00:00:37,910 --> 00:00:45,440
The structured output parser is a very simple parser that can only support strings and does not provide

7
00:00:45,440 --> 00:00:50,330
options for other data types, such as lists or integers.

8
00:00:50,630 --> 00:00:59,960
So in this case, we want to test an output parser that is capable of using

9
00:00:59,960 --> 00:01:06,470
other data types besides strings and of validating the output format.

10
00:01:06,470 --> 00:01:13,040
So we are going to use the most popular parser in the LangChain community, which

11
00:01:13,040 --> 00:01:21,260
is the Pydantic output parser, an advanced parser that admits many data types and other features

12
00:01:21,260 --> 00:01:22,550
like validators.

13
00:01:22,550 --> 00:01:23,330
We will see.

14
00:01:23,690 --> 00:01:24,770
So.

15
00:01:26,100 --> 00:01:33,990
The process we are going to follow is the one you have on the left side of the screen, and here in

16
00:01:33,990 --> 00:01:36,930
the code we are going to go step by step.

17
00:01:36,930 --> 00:01:41,760
So first we are going to load the necessary modules.

18
00:01:41,760 --> 00:01:44,490
So the Pydantic output parser.

19
00:01:45,390 --> 00:01:53,190
So a few modules from the Pydantic library and one module from the typing library.

20
00:01:53,700 --> 00:02:00,030
The first thing we are going to do is to define the desired output data structure.

21
00:02:00,030 --> 00:02:09,030
So with this class definition, what we are saying is what kind of answer do we want?

22
00:02:09,970 --> 00:02:20,500
So as you remember, the LLM is going to provide the initial answer in one simple format.

23
00:02:20,710 --> 00:02:30,220
And we use the output parsers to transform this initial format into the format we want.

24
00:02:30,490 --> 00:02:40,330
So in this case we are defining a class with the format we want and the validators we want.

25
00:02:40,450 --> 00:02:49,870
So in this case what we are saying is: okay, I want your output to be in two lists.

26
00:02:50,050 --> 00:03:02,260
The first list is a list of substitute words, and the second list is a list of reasons why you think

27
00:03:02,260 --> 00:03:04,720
these words fit.

28
00:03:05,740 --> 00:03:07,480
You will see why we say this.

29
00:03:07,480 --> 00:03:09,310
You will see the purpose of the application.

30
00:03:09,310 --> 00:03:11,620
we are building.

31
00:03:11,680 --> 00:03:21,010
So the first thing here, the first part of this class, is: okay, I want these two things in this format.

32
00:03:21,010 --> 00:03:25,930
I want words as a list and I want reasons as a list.

33
00:03:26,500 --> 00:03:32,710
And then I'm giving you two validation criteria.

34
00:03:32,980 --> 00:03:41,260
For the first item, I am going to give you this validation criterion.

35
00:03:41,590 --> 00:03:48,520
If this validation criterion is not met, you are going to throw an error.

36
00:03:49,120 --> 00:03:54,670
The first criterion is that the words cannot start with a number.

37
00:03:55,150 --> 00:04:03,250
And the second validation criterion is that the reasons you have in the second list have to end with

38
00:04:03,250 --> 00:04:03,880
a dot.

39
00:04:04,300 --> 00:04:12,700
If either of these criteria is not met, you are going to throw an error message.
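
To make the two rules concrete, here is a minimal sketch in plain Python. It is not the Pydantic class shown on screen: it is a standard-library stand-in using a dataclass, and the class name `Suggestions` and the sample reason text are assumptions made up for illustration. It only shows the idea of "two lists plus two checks that raise an error".

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Suggestions:
    """Plain-Python stand-in for the output structure (hypothetical name)."""

    words: List[str]
    reasons: List[str]

    def __post_init__(self) -> None:
        # Rule 1: a substitute word must not start with a number.
        for word in self.words:
            if word[:1].isdigit():
                raise ValueError(f"word starts with a number: {word!r}")
        # Rule 2: every reason must end with a dot.
        for reason in self.reasons:
            if not reason.endswith("."):
                raise ValueError(f"reason does not end with a dot: {reason!r}")


# A valid instance (the reason text is invented for this example).
ok = Suggestions(words=["devotion"], reasons=["It keeps the sense of commitment."])

# An invalid instance: the word starts with a digit, so an error is raised.
try:
    Suggestions(words=["1devotion"], reasons=["It keeps the sense of commitment."])
    rejected = False
except ValueError:
    rejected = True
```

In the real application, Pydantic's validators play the role of `__post_init__` here.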

40
00:04:13,090 --> 00:04:22,450
So this is the way we define the output format we want and the validation criteria we want.

41
00:04:22,840 --> 00:04:26,080
Then we are ready to create the parser.

42
00:04:26,560 --> 00:04:34,300
We create the parser using the Pydantic output parser module we imported.

43
00:04:34,780 --> 00:04:41,500
So once we have our parser created we can determine the input.

44
00:04:41,830 --> 00:04:53,770
So the purpose of this application is to give the LLM application a sentence and ask the

45
00:04:53,770 --> 00:05:01,720
LLM application to replace one word in that sentence with a good substitute word.

46
00:05:03,010 --> 00:05:10,990
So in order to determine the input, we are going to follow a process that we have been

47
00:05:10,990 --> 00:05:13,630
following in previous applications.

48
00:05:13,630 --> 00:05:18,160
So we are going to define a template.

49
00:05:19,370 --> 00:05:22,730
We are going to build a prompt template.

50
00:05:23,720 --> 00:05:28,010
And once we have that, we are going to load the user input.

51
00:05:28,130 --> 00:05:35,900
Here we are entering the user input manually, but in your application the user may enter this input

52
00:05:35,900 --> 00:05:38,060
through a form or whatever.

53
00:05:38,060 --> 00:05:38,660
Right.

54
00:05:39,020 --> 00:05:46,520
So if you look at the user input you will immediately understand the purpose of this LLM application

55
00:05:46,520 --> 00:05:47,420
we are building.

56
00:05:47,420 --> 00:05:52,040
So the user is giving us one sentence.

57
00:05:52,040 --> 00:06:00,470
The sentence is: the loyalty of the soldier was so great that even under severe torture, he refused

58
00:06:00,470 --> 00:06:02,870
to betray his comrades.

59
00:06:03,080 --> 00:06:08,750
So he's talking about a soldier in a particular situation.

60
00:06:08,750 --> 00:06:17,720
And what he's telling us is: I want to substitute the word loyalty in this sentence.

61
00:06:17,720 --> 00:06:21,470
So what we are asking the application to do is find

62
00:06:22,160 --> 00:06:28,160
words that can substitute for loyalty and

63
00:06:29,050 --> 00:06:32,380
keep the meaning of this sentence.

64
00:06:32,410 --> 00:06:33,100
Okay.

65
00:06:33,100 --> 00:06:35,320
So in order to do that.

66
00:06:35,920 --> 00:06:38,560
We are creating this prompt template.

67
00:06:38,560 --> 00:06:46,750
So in the prompt template, the instructions we are giving the LLM are: offer a list of suggestions to

68
00:06:46,750 --> 00:06:54,430
substitute the specified target word based on the present context and the reasoning for each word.

69
00:06:55,000 --> 00:07:03,880
So in this template we are saying: okay, this is the target word the user wants to replace.

70
00:07:03,880 --> 00:07:10,510
This is the context or the sentence the user has provided, and this is what you have to do.

71
00:07:10,870 --> 00:07:16,240
The format instructions are the new thing you will see here in this template.

72
00:07:16,240 --> 00:07:22,600
And this is associated with the output parser we are using.

73
00:07:22,600 --> 00:07:29,830
So you will see here that when we create the prompt template we are telling the application: okay,

74
00:07:29,830 --> 00:07:31,720
you are going to use this template.

75
00:07:31,720 --> 00:07:35,200
And in this template you see that you have three variables.

76
00:07:35,200 --> 00:07:38,860
These two variables are the classical input variables we use.

77
00:07:38,860 --> 00:07:46,180
But this one here is what we call a partial variable, which is associated with the output parser we are

78
00:07:46,180 --> 00:07:46,540
using.

79
00:07:46,540 --> 00:07:55,570
So in short, what we are telling our application here is: I want you to provide this answer in a

80
00:07:55,570 --> 00:07:57,400
particular format.

81
00:07:57,400 --> 00:08:02,950
And in order to understand what format, you need to look at this variable.

82
00:08:02,950 --> 00:08:10,900
So this variable called format instructions is telling the LLM application that we are using a parser,

83
00:08:10,900 --> 00:08:13,030
which is my parser.

84
00:08:13,030 --> 00:08:17,410
And this parser is using this definition.

85
00:08:17,680 --> 00:08:25,960
And this definition, suggestions output, as the output structure, has all these conditions.

86
00:08:25,960 --> 00:08:34,390
So we want two outputs, words and reasons, in this particular format, as lists.

87
00:08:34,390 --> 00:08:36,970
And we want these validation criteria.

88
00:08:36,970 --> 00:08:42,610
And if these validation criteria are not met, we are going to throw an error message.

89
00:08:42,610 --> 00:08:49,210
So this is the prompt template definition that we are using.
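
The idea of a partial variable can be sketched in plain Python. This is an illustration only, not the LangChain prompt template from the video: the template wording follows the instructions read out above, while the placeholder names (`format_instructions`, `target_word`, `context`) and the format instructions text are assumptions. The format instructions are filled in once, in advance, and the two input variables are supplied later by the user.

```python
from functools import partial

# Template with three placeholders: one partial variable and two input variables.
TEMPLATE = (
    "Offer a list of suggestions to substitute the specified target word "
    "based on the present context and the reasoning for each word.\n"
    "{format_instructions}\n"
    "target_word={target_word}\n"
    "context={context}\n"
)

# Hypothetical instructions; in the real application the parser generates them.
FORMAT_INSTRUCTIONS = (
    'Reply with a JSON object with two keys, "words" and "reasons", '
    "each holding a list of strings."
)

# Pre-fill the partial variable; only the input variables remain open.
fill_prompt = partial(TEMPLATE.format, format_instructions=FORMAT_INSTRUCTIONS)

prompt = fill_prompt(
    target_word="loyalty",
    context=(
        "The loyalty of the soldier was so great that even under severe "
        "torture, he refused to betray his comrades."
    ),
)
print(prompt)
```

The point of the design is that the user never has to supply the format instructions: they travel with the template and always match the parser.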

90
00:08:49,450 --> 00:08:56,680
Once we have that ready we can create the LLM instance as we have done many times before.

91
00:08:56,680 --> 00:09:01,570
And we can start querying our application.

92
00:09:01,570 --> 00:09:01,870
Okay.

93
00:09:01,870 --> 00:09:11,950
So here we are saying: okay, use this user input and apply that to the LLM instance you have created in

94
00:09:11,950 --> 00:09:13,750
order to give me a response.

95
00:09:14,200 --> 00:09:14,740
Okay.

96
00:09:15,160 --> 00:09:27,550
So what we are saying here is: once you have this response in your regular format, now apply the parser

97
00:09:27,550 --> 00:09:32,170
to the response in order to give me the response in the proper format.
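
What "apply the parser to the response" means can be sketched with the standard library alone. This is not the Pydantic output parser itself: it is a rough stand-in that assumes the model replied with JSON text. The function name `parse_suggestions` and the reason sentences in the sample reply are invented for illustration; only the four words come from the example in this chapter.

```python
import json
from typing import Dict, List

# Made-up raw reply in the model's "regular format": a plain string.
raw_response = (
    '{"words": ["devotion", "allegiance", "fidelity", "dedication"], '
    '"reasons": ["Devotion conveys deep commitment.", '
    '"Allegiance stresses duty to a group.", '
    '"Fidelity implies faithfulness.", '
    '"Dedication suggests steady commitment."]}'
)


def parse_suggestions(text: str) -> Dict[str, List[str]]:
    """Turn the raw string into two typed lists, or raise ValueError."""
    data = json.loads(text)
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    result: Dict[str, List[str]] = {}
    for key in ("words", "reasons"):
        value = data.get(key)
        # Each field must be a list whose items are all strings.
        if not isinstance(value, list) or not all(isinstance(v, str) for v in value):
            raise ValueError(f"{key!r} must be a list of strings")
        result[key] = value
    return result


parsed = parse_suggestions(raw_response)
print(parsed["words"])
print(parsed["reasons"])
```

The real parser additionally runs the validators from the class definition, so a word starting with a number or a reason without a final dot would also be rejected.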

98
00:09:32,560 --> 00:09:40,300
And as you see here, the application is giving us two lists.

99
00:09:40,300 --> 00:09:44,860
One list is called words, the other list is called reasons.

100
00:09:45,070 --> 00:09:46,390
If you look at the

101
00:09:47,360 --> 00:09:53,150
format definition, we have one list called words and another list called reasons.

102
00:09:54,110 --> 00:10:03,440
In the first list, we have four words that the application thinks can be good replacements for

103
00:10:03,440 --> 00:10:08,180
loyalty: devotion, allegiance, fidelity, and dedication.

104
00:10:08,180 --> 00:10:18,890
So the application is telling us that it thinks we can use devotion, fidelity, or allegiance

105
00:10:18,890 --> 00:10:21,800
instead of loyalty in this sentence.

106
00:10:21,800 --> 00:10:29,330
So we can say: the devotion of the soldier was so great that even under severe torture, he

107
00:10:29,330 --> 00:10:33,830
refused to betray his comrades. Or: the allegiance of the soldier.

108
00:10:33,830 --> 00:10:43,280
Okay, so the important thing here is not the response, but the format that the application is using

109
00:10:43,280 --> 00:10:44,810
to give us the response.

110
00:10:44,810 --> 00:10:51,530
And as you can see here, the application is giving us two lists.

111
00:10:51,680 --> 00:10:57,020
If you remember, the previous output parser we were using was the structured output parser.

112
00:10:57,020 --> 00:11:04,520
We were only able to format the output into a JSON dictionary.

113
00:11:04,850 --> 00:11:12,440
So the structured output parser is not able to provide two lists as a response.

114
00:11:13,620 --> 00:11:21,960
It is only able to format the output into a JSON dictionary, but if we use an advanced output parser

115
00:11:21,960 --> 00:11:32,220
like the Pydantic output parser, we can do more sophisticated things, like many output data types and

116
00:11:32,220 --> 00:11:35,370
also other features like validators.

117
00:11:35,400 --> 00:11:43,680
Okay, so this was an example of a slightly more advanced output parser we can use in LangChain.