1
00:00:00,080 --> 00:00:02,820
In this section, we'll be adding some slight

2
00:00:02,820 --> 00:00:06,040
improvements to our Travel Companion application.

3
00:00:06,040 --> 00:00:08,700
The Gemini API provides more features that

4
00:00:08,700 --> 00:00:11,140
we can take advantage of to make our application

5
00:00:11,140 --> 00:00:14,640
more efficient and our code cleaner, and we'll

6
00:00:14,640 --> 00:00:16,720
be making use of some of these features to

7
00:00:16,720 --> 00:00:20,900
make our already good application even better.

8
00:00:20,900 --> 00:00:23,180
In this video, we'll be refactoring our Gemini

9
00:00:23,180 --> 00:00:27,680
API requests to produce structured outputs.

10
00:00:27,680 --> 00:00:29,700
So far, we have been instructing the models

11
00:00:29,700 --> 00:00:33,140
about how we want our JSON output directly

12
00:00:33,140 --> 00:00:34,460
in our prompt.

13
00:00:34,460 --> 00:00:36,300
We have also been cleaning up and parsing

14
00:00:36,300 --> 00:00:37,660
the JSON string returned.

15
00:00:37,660 --> 00:00:41,540
This strategy is referred to as JSON parsing.

16
00:00:41,540 --> 00:00:44,020
And while JSON parsing is good, and we can already

17
00:00:44,020 --> 00:00:45,640
see it working in our application without

18
00:00:45,640 --> 00:00:46,700
issues,

19
00:00:46,700 --> 00:00:49,460
structured outputs help define and produce

20
00:00:49,460 --> 00:00:52,200
an already structured JSON output that we

21
00:00:52,200 --> 00:00:54,900
can use directly in our code.

22
00:00:54,900 --> 00:00:58,720
No prompt instructions or parsing logic needed.

23
00:00:58,720 --> 00:01:00,360
Sounds awesome, right?

24
00:01:00,360 --> 00:01:02,140
Let's do that right away.

25
00:01:02,140 --> 00:01:04,900
First, to retain our previous code for reference

26
00:01:04,900 --> 00:01:05,820
purposes,

27
00:01:05,820 --> 00:01:08,360
let us create a new branch for all the enhancements

28
00:01:08,360 --> 00:01:10,260
we'll be adding to this app.

29
00:01:10,260 --> 00:01:12,580
So I'm going to pull up the command line,

30
00:01:12,580 --> 00:01:15,780
kill the server for the front end,

31
00:01:15,780 --> 00:01:19,680
do the same for the back end.

32
00:01:19,680 --> 00:01:24,900
And I'm just going to go back here.

33
00:01:24,900 --> 00:01:26,360
Let's clear this.

34
00:01:26,360 --> 00:01:32,560
And I will say git checkout -b enhancements.

35
00:01:32,560 --> 00:01:34,940
This is going to be our enhancements branch.

36
00:01:34,940 --> 00:01:36,720
So hit Enter.

37
00:01:36,720 --> 00:01:38,860
And as you can see, we have switched to the

38
00:01:38,860 --> 00:01:40,100
enhancements

39
00:01:40,100 --> 00:01:41,440
branch.

40
00:01:41,440 --> 00:01:42,320
That's good.

41
00:01:42,320 --> 00:01:45,540
I can cd back into my back end.

42
00:01:45,540 --> 00:01:46,900
And when I go back to my front end,

43
00:01:46,900 --> 00:01:48,540
I see that we're still in the enhancements

44
00:01:48,540 --> 00:01:49,120
branch.

45
00:01:49,120 --> 00:01:52,800
So if I run git branch, yeah, we are on the

46
00:01:52,800 --> 00:01:54,380
enhancements branch.

47
00:01:54,380 --> 00:01:56,600
So let's just clear this.

48
00:01:56,600 --> 00:01:58,920
And I'm just going to run my front-end once

49
00:01:58,920 --> 00:01:59,220
again,

50
00:01:59,220 --> 00:02:06,540
so that that can keep running in the background.

51
00:02:06,540 --> 00:02:08,660
Good.

52
00:02:08,660 --> 00:02:12,080
Now to begin, we need to define a schema for

53
00:02:12,080 --> 00:02:13,060
our destination

54
00:02:13,060 --> 00:02:15,040
data that needs to be returned.

55
00:02:15,040 --> 00:02:17,060
We'll be doing that using Pydantic.

56
00:02:17,060 --> 00:02:20,580
So let us go ahead and get started.

57
00:02:20,580 --> 00:02:23,260
I'm going to close out of this command line.

58
00:02:23,260 --> 00:02:26,000
Then we're going to be defining two schemas,

59
00:02:26,000 --> 00:02:28,920
one schema for our destination and one schema

60
00:02:28,920 --> 00:02:30,940
for a list of our destinations, which is the

61
00:02:30,940 --> 00:02:32,360
actual schema we're going to be giving to

62
00:02:32,360 --> 00:02:34,720
our model to return.

63
00:02:34,720 --> 00:02:40,920
So just below the models we already have,

64
00:02:40,920 --> 00:02:43,220
we're going to define a new class and call

65
00:02:43,220 --> 00:02:46,100
it Destination.

66
00:02:46,100 --> 00:02:51,360
Give it BaseModel, and this will be our schema

67
00:02:51,360 --> 00:02:54,800
for the destination we want to return.

68
00:02:54,800 --> 00:02:56,880
This is going to have the same structure as

69
00:02:56,880 --> 00:02:58,320
we had in our prompt.

70
00:02:58,320 --> 00:03:00,400
The same structure that we told Gemini to

71
00:03:00,400 --> 00:03:02,140
return.

72
00:03:02,140 --> 00:03:05,580
We have name which will be a string.

73
00:03:05,580 --> 00:03:09,420
We have description which is going to be a

74
00:03:09,420 --> 00:03:11,300
string also.

75
00:03:11,300 --> 00:03:15,740
We also have best_time, which is

76
00:03:15,740 --> 00:03:18,340
also a string that tells us the best time

77
00:03:18,340 --> 00:03:21,020
the user can visit.

78
00:03:21,020 --> 00:03:22,480
We have attractions.

79
00:03:22,480 --> 00:03:26,860
Attractions will be a list of strings.

80
00:03:26,860 --> 00:03:34,740
And we have our budget level, which is a string.

81
00:03:34,740 --> 00:03:37,740
So this is our destination data structure.

82
00:03:37,740 --> 00:03:40,580
This is what our prompt told Gemini to return.

83
00:03:40,580 --> 00:03:44,140
If we scroll down to look at our prompt, not

84
00:03:44,140 --> 00:03:46,720
this one, but the endpoint that returns

85
00:03:46,720 --> 00:03:49,720
the destinations, and we go to the prompt,

86
00:03:49,720 --> 00:03:52,760
we see we have name, description, best time,

87
00:03:52,760 --> 00:03:54,520
attractions, and budget level.

88
00:03:54,520 --> 00:03:56,860
So that's the same thing that we are defining

89
00:03:56,860 --> 00:03:59,660
in our data model.

90
00:03:59,660 --> 00:04:01,180
Let's go up.

91
00:04:01,180 --> 00:04:04,560
Now we actually need our Gemini model to return

92
00:04:04,560 --> 00:04:07,240
a list of these destination objects.

93
00:04:07,240 --> 00:04:10,360
So I'm going to create another class, call

94
00:04:10,360 --> 00:04:15,940
it DestinationList, and also give it BaseModel.

95
00:04:15,940 --> 00:04:20,019
And it will contain a destinations key.

96
00:04:20,019 --> 00:04:26,020
And this will be a list of the type Destination.

97
00:04:26,020 --> 00:04:28,040
So we have these two defined and that is all

98
00:04:28,040 --> 00:04:29,140
we need.
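As a rough sketch, the two schemas described here might look like the following in Pydantic v2. The field names follow the narration; the docstrings and exact layout are assumptions, since the code itself is not part of this transcript.

```python
from pydantic import BaseModel


class Destination(BaseModel):
    """One suggested destination, mirroring the fields from the old prompt."""

    name: str
    description: str
    best_time: str
    attractions: list[str]
    budget_level: str


class DestinationList(BaseModel):
    """Wrapper schema handed to the model: a list of Destination objects."""

    destinations: list[Destination]
```

Wrapping the list in its own model gives the response a single top-level object with a destinations key, which is what the endpoint reads later.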

99
00:04:29,140 --> 00:04:32,260
The next thing we need to do is to go to our

100
00:04:32,260 --> 00:04:32,820
prompt.

101
00:04:32,820 --> 00:04:36,240
Let's go to the one for SuggestByLocation.

102
00:04:36,240 --> 00:04:38,640
And we're going to get rid of all these data

103
00:04:38,640 --> 00:04:40,140
instructions.

104
00:04:40,140 --> 00:04:42,220
So I'm just going to get a new cleaned-up prompt

105
00:04:42,220 --> 00:04:49,420
and I'm going to paste it here.

106
00:04:49,420 --> 00:04:51,660
Boom.

107
00:04:51,660 --> 00:04:53,780
So this prompt is basically the same as the

108
00:04:53,780 --> 00:04:56,500
last one but without all the data instructions.

109
00:04:56,500 --> 00:04:58,720
It just says, make sure each destination has

110
00:04:58,720 --> 00:05:00,680
exactly three attractions in the attractions

111
00:05:00,680 --> 00:05:02,880
array, which the previous one also says, and

112
00:05:02,880 --> 00:05:04,740
ensure that budget level is one of budget,

113
00:05:04,740 --> 00:05:07,540
moderate, or luxury, which we also know was

114
00:05:07,540 --> 00:05:11,560
specified in the previous prompt.

115
00:05:11,560 --> 00:05:13,820
But now we don't have any of that data stuff

116
00:05:13,820 --> 00:05:14,360
anymore.

117
00:05:14,360 --> 00:05:17,720
So that is cleaner.

118
00:05:17,720 --> 00:05:19,340
Now the next thing we need to do is to instruct

119
00:05:19,340 --> 00:05:22,480
our Gemini API request to produce structured

120
00:05:22,480 --> 00:05:23,460
outputs.

121
00:05:23,460 --> 00:05:25,860
We need to tell it about our models so that

122
00:05:25,860 --> 00:05:27,760
it can then use the models that we have given

123
00:05:27,760 --> 00:05:29,980
it to create structured outputs.

124
00:05:29,980 --> 00:05:33,320
And we can achieve this by adding a generation

125
00:05:33,320 --> 00:05:35,420
config.

126
00:05:35,420 --> 00:05:37,960
So we're just going to add a config parameter

127
00:05:37,960 --> 00:05:39,380
to this.

128
00:05:39,380 --> 00:05:41,580
And inside this generation config, we're going

129
00:05:41,580 --> 00:05:44,740
to tell it about our structured data models.

130
00:05:44,740 --> 00:05:47,240
First, we're going to tell it what we want

131
00:05:47,240 --> 00:05:50,840
back as our response MIME type.

132
00:05:50,840 --> 00:05:54,420
response_mime_type.

133
00:05:54,420 --> 00:05:58,960
I will tell it application/json so that it

134
00:05:58,960 --> 00:05:59,660
returns it

135
00:05:59,660 --> 00:06:01,580
as a JSON object.

136
00:06:01,580 --> 00:06:06,820
And we then tell it about our schemas.

137
00:06:06,820 --> 00:06:09,600
So say

138
00:06:09,600 --> 00:06:13,480
response_json_schema.

139
00:06:13,480 --> 00:06:16,860
So this is where we give it our data model.

140
00:06:16,860 --> 00:06:16,880
And

141
00:06:16,880 --> 00:06:22,660
what we want is the DestinationList model,

142
00:06:22,660 --> 00:06:26,160
which already contains the Destination model.

143
00:06:26,160 --> 00:06:33,780
And we need to call .model_json_schema().

144
00:06:33,780 --> 00:06:37,220
Very important so that it can return the JSON

145
00:06:37,220 --> 00:06:39,700
schema to the request.

146
00:06:39,700 --> 00:06:43,820
So .model_json_schema().

147
00:06:43,820 --> 00:06:45,420
So that's how we tell our request that we

148
00:06:45,420 --> 00:06:47,340
want it to use our schemas to produce structured

149
00:06:47,340 --> 00:06:48,740
outputs.
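A minimal sketch of the configuration step might look like this. The two config keys are the ones named in the narration; passing the config as a plain dict, the helper name suggest_destinations, and the client and model_name parameters are assumptions made for illustration (the client is assumed to be a google-genai Client created elsewhere in the app).

```python
from pydantic import BaseModel


class Destination(BaseModel):
    name: str
    description: str
    best_time: str
    attractions: list[str]
    budget_level: str


class DestinationList(BaseModel):
    destinations: list[Destination]


# Generation config as a plain dict. The key names follow the narration;
# model_json_schema() turns the Pydantic model into a JSON Schema dict.
structured_config = {
    "response_mime_type": "application/json",
    "response_json_schema": DestinationList.model_json_schema(),
}


def suggest_destinations(client, model_name: str, prompt: str):
    """Send the prompt with the structured-output config attached.

    `client` is assumed to be a google-genai Client and `model_name`
    whichever Gemini model the application already uses.
    """
    return client.models.generate_content(
        model=model_name,
        contents=prompt,
        config=structured_config,
    )
```

Because the schema is generated from DestinationList, the nested Destination definition travels with it, so only the outer model has to be referenced in the config.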

150
00:06:48,740 --> 00:06:50,500
Now that we have this logic in place, now

151
00:06:50,500 --> 00:06:52,060
that we are telling it to produce structured

152
00:06:52,060 --> 00:06:54,660
outputs, we don't need all the text parsing

153
00:06:54,660 --> 00:06:55,620
stuff anymore.

154
00:06:55,620 --> 00:06:58,340
So let us get rid of that.

155
00:06:58,340 --> 00:07:03,860
So from this line, down to this line, we can

156
00:07:03,860 --> 00:07:07,640
say, bye bye, say bye bye.

157
00:07:07,640 --> 00:07:10,660
And now all we need to do is get our destinations

158
00:07:10,660 --> 00:07:15,980
from the new response and return it to this

159
00:07:15,980 --> 00:07:16,580
request.

160
00:07:16,580 --> 00:07:20,820
So I'm just going to say result equals

161
00:07:20,820 --> 00:07:26,320
DestinationList, then we'll call its

162
00:07:26,320 --> 00:07:26,780
.model_validate_json

163
00:07:26,780 --> 00:07:34,020
function, and simply

164
00:07:34,020 --> 00:07:36,700
give it our response text, which I'm going

165
00:07:36,700 --> 00:07:39,800
to say response.text.

166
00:07:39,800 --> 00:07:42,000
So we get our results.

167
00:07:42,000 --> 00:07:43,460
We validate it.

168
00:07:43,460 --> 00:07:45,780
And here, we can just simply

169
00:07:45,780 --> 00:07:49,660
say result, which is our actual result object

170
00:07:49,660 --> 00:07:54,440
that comes back, and call .destinations.

171
00:07:54,440 --> 00:07:58,620
That is this destinations key here.

172
00:07:58,620 --> 00:08:00,220
That's where we get our destinations.

173
00:08:00,220 --> 00:08:00,460
And

174
00:08:00,460 --> 00:08:03,980
we simply return it back to our user.

175
00:08:03,980 --> 00:08:05,700
Simple and clean.

176
00:08:05,700 --> 00:08:09,380
And more programmatic, more native to programming.
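To illustrate the validation step, here is a small sketch that parses a JSON string into the DestinationList model. The sample_text value is a hand-written stand-in for response.text, not real model output, and the parse_or_none helper is a hypothetical addition showing one way to handle a response that does not match the schema.

```python
from pydantic import BaseModel, ValidationError


class Destination(BaseModel):
    name: str
    description: str
    best_time: str
    attractions: list[str]
    budget_level: str


class DestinationList(BaseModel):
    destinations: list[Destination]


# Hand-written stand-in for response.text (not real model output).
sample_text = """
{"destinations": [{"name": "Kyoto", "description": "Temples and gardens.",
  "best_time": "Spring", "attractions": ["A", "B", "C"],
  "budget_level": "moderate"}]}
"""

# Parse and validate in one step, then read the destinations key.
result = DestinationList.model_validate_json(sample_text)
destinations = result.destinations  # list of Destination objects


def parse_or_none(text: str):
    """Return the parsed destinations, or None if the text fails validation."""
    try:
        return DestinationList.model_validate_json(text).destinations
    except ValidationError:
        return None
```

With this approach the endpoint works with typed objects instead of hand-cleaned strings, and a malformed response surfaces as a validation error rather than a parsing bug further down.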

177
00:08:09,380 --> 00:08:11,560
Now that we're all done, let us run our server

178
00:08:11,560 --> 00:08:12,500
and test this

179
00:08:12,500 --> 00:08:13,780
out to make sure that everything

180
00:08:13,780 --> 00:08:15,100
is working as expected.

181
00:08:15,100 --> 00:08:18,080
Let's go back to our back end and say python

182
00:08:18,080 --> 00:08:20,940
main.py.

183
00:08:20,940 --> 00:08:22,980
Nothing is really going to change in the front

184
00:08:22,980 --> 00:08:23,520
end, but

185
00:08:23,520 --> 00:08:26,140
that is the success story, the fact that we're

186
00:08:26,140 --> 00:08:26,860
now using

187
00:08:26,860 --> 00:08:31,300
structured outputs to produce these responses.

188
00:08:31,300 --> 00:08:33,400
So back in our browser, first let us refresh

189
00:08:33,400 --> 00:08:35,059
our application.

190
00:08:35,059 --> 00:08:36,620
It's good.

191
00:08:36,620 --> 00:08:38,480
And let us search for Tokyo this time.

192
00:08:38,480 --> 00:08:40,820
Let's say Tokyo.

193
00:08:40,820 --> 00:08:44,059
And it gets suggestions.

194
00:08:44,059 --> 00:08:45,980
Everything should work just fine.

195
00:08:45,980 --> 00:08:47,160
We should get no errors.

196
00:08:47,160 --> 00:08:48,100
Simply perfect.

197
00:08:48,100 --> 00:08:51,160
So our structured outputs are working perfectly.

198
00:08:51,160 --> 00:08:54,820
Now they are working for the text search endpoint.

199
00:08:54,820 --> 00:08:56,780
Let us head back to VS Code and implement structured

200
00:08:56,780 --> 00:08:57,600
outputs

201
00:08:57,600 --> 00:09:01,920
also for the image search endpoint.

202
00:09:01,920 --> 00:09:03,880
Back in VS Code, first, I'm going to kill the

203
00:09:03,880 --> 00:09:06,720
server.

204
00:09:06,720 --> 00:09:11,180
And let us head to our other endpoint,

205
00:09:11,180 --> 00:09:13,160
which is search by image.

206
00:09:13,160 --> 00:09:14,860
Now, we already have our data models for the

207
00:09:14,860 --> 00:09:15,240
destination,

208
00:09:15,240 --> 00:09:17,000
so we don't need to touch those.

209
00:09:17,000 --> 00:09:18,520
The first thing we need to do is to clean

210
00:09:18,520 --> 00:09:19,420
up this prompt

211
00:09:19,420 --> 00:09:21,840
of all data instructions.

212
00:09:21,840 --> 00:09:30,880
So I'm just going to clean that up.

213
00:09:30,880 --> 00:09:31,660
Good.

214
00:09:31,660 --> 00:09:40,260
It was formatted.

215
00:09:40,260 --> 00:09:42,680
And now here we have the same prompt

216
00:09:42,680 --> 00:09:44,220
that says analyze this landmark or travel

217
00:09:44,220 --> 00:09:44,680
image

218
00:09:44,680 --> 00:09:46,420
and suggest five similar travel destinations

219
00:09:46,420 --> 00:09:48,880
with comparable features, architecture, atmosphere.

220
00:09:48,880 --> 00:09:50,720
We have our preferences text,

221
00:09:50,720 --> 00:09:54,000
what we want back as data, and no data return

222
00:09:54,000 --> 00:09:57,380
instruction, no data structure instruction.

223
00:09:57,380 --> 00:09:59,240
So we have that in place. Next, let us get

224
00:09:59,240 --> 00:10:00,520
our configuration in.

225
00:10:00,520 --> 00:10:04,420
I'm just going to copy that from here, copy

226
00:10:04,420 --> 00:10:13,940
that, go down, and give it to this request

227
00:10:13,940 --> 00:10:18,480
also, that's fine.

228
00:10:18,480 --> 00:10:24,320
And also, we get rid of all the parsing logic,

229
00:10:24,320 --> 00:10:34,960
get our result, DestinationList.model_validate_json,

230
00:10:34,960 --> 00:10:39,180
pass it our response text, which is response.text,

231
00:10:39,180 --> 00:10:43,560
directly from the request response, and here

232
00:10:43,560 --> 00:10:49,360
we can say result.destinations.

233
00:10:49,360 --> 00:10:50,120
Perfect.

234
00:10:50,120 --> 00:10:51,180
Now, let's try this out.

235
00:10:51,180 --> 00:10:52,020
First, let us rerun

236
00:10:52,020 --> 00:10:56,660
the server.

237
00:10:56,660 --> 00:10:58,320
Server is running fine.

238
00:10:58,320 --> 00:11:01,520
Now, I'll see you in the browser.

239
00:11:01,520 --> 00:11:04,700
Now, in the browser, let us refresh once again.

240
00:11:04,700 --> 00:11:07,300
And now, let us switch to the image search.

241
00:11:07,300 --> 00:11:12,620
Let's pick our London Eye image once again.

242
00:11:12,620 --> 00:11:16,680
And let us find similar destinations.

243
00:11:16,680 --> 00:11:18,320
Everything should work fine if our structured

244
00:11:18,320 --> 00:11:20,200
output is in effect.

245
00:11:20,200 --> 00:11:21,460
We should see no errors.

246
00:11:21,460 --> 00:11:22,520
Boom.

247
00:11:22,520 --> 00:11:24,660
Simply awesome.

248
00:11:24,660 --> 00:11:27,280
The Singapore Flyer, the Airgola in Las Vegas

249
00:11:27,280 --> 00:11:30,680
and the Eye of the Sahara in Mauritania.

250
00:11:30,680 --> 00:11:31,680
Perfect.

251
00:11:31,680 --> 00:11:33,540
So, we have implemented structured output

252
00:11:33,540 --> 00:11:39,340
for our search endpoints.

253
00:11:39,340 --> 00:11:39,340
Now I'll be leaving the weather endpoint as

254
00:11:39,340 --> 00:11:40,340
an assignment for you to implement structured

255
00:11:40,340 --> 00:11:41,420
outputs into.

256
00:11:41,420 --> 00:11:42,000
That should be fun.