1
00:00:00,180 --> 00:00:02,120
In this video, we'll be writing our backend

2
00:00:02,120 --> 00:00:05,200
endpoint that will take an uploaded image

3
00:00:05,200 --> 00:00:08,039
and an optional list of preferences and ship

4
00:00:08,039 --> 00:00:11,080
it to the Gemini API to return a list of travel

5
00:00:11,080 --> 00:00:12,040
suggestions.

6
00:00:12,040 --> 00:00:15,000
To begin, we need to bring in some modules to

7
00:00:15,000 --> 00:00:17,940
help us with our file upload and HTTP error

8
00:00:17,940 --> 00:00:18,860
handling.

9
00:00:18,860 --> 00:00:21,580
First, we're going to be using the File and

10
00:00:21,580 --> 00:00:25,020
UploadFile imports from FastAPI.

11
00:00:25,020 --> 00:00:28,200
We've already brought those in, so we're good

12
00:00:28,200 --> 00:00:29,960
on that front.

13
00:00:29,960 --> 00:00:32,400
Next for our image processing, we'll be importing

14
00:00:32,400 --> 00:00:35,100
the Image module from the Pillow package.

15
00:00:35,100 --> 00:00:38,440
We'll also import the io module for file operations.

16
00:00:38,440 --> 00:00:41,040
So let us import them.

17
00:00:41,040 --> 00:00:47,460
Just below json, say from PIL import Image

18
00:00:47,460 --> 00:00:52,640
and also import io.

19
00:00:52,640 --> 00:00:53,840
Good.

20
00:00:53,840 --> 00:00:55,920
We now have every module that we need to get

21
00:00:55,920 --> 00:00:56,860
started.

22
00:00:56,860 --> 00:00:58,480
So let's write our endpoint.

23
00:00:58,480 --> 00:01:04,480
Just below the search endpoint, scroll down,

24
00:01:04,480 --> 00:01:07,700
here we're going to be writing our endpoint

25
00:01:07,700 --> 00:01:10,320
and the corresponding handler.

26
00:01:10,320 --> 00:01:15,460
So here we'll say @app.post,

27
00:01:15,460 --> 00:01:21,000
forward slash API, forward slash suggest by

28
00:01:21,000 --> 00:01:23,480
image.

29
00:01:23,480 --> 00:01:25,720
And for the handler, we'll also be giving

30
00:01:25,720 --> 00:01:27,600
it an async handler because we're going to

31
00:01:27,600 --> 00:01:30,840
be performing async operations.

32
00:01:30,840 --> 00:01:34,860
We'll call it suggest_by_image.

33
00:01:34,860 --> 00:01:38,160
And it's going to be taking two parameters.

34
00:01:38,160 --> 00:01:39,760
First is file, which is the file we're going

35
00:01:39,760 --> 00:01:41,600
to be uploading.

36
00:01:41,600 --> 00:01:45,380
That's an instance of UploadFile, and we set that to

37
00:01:45,380 --> 00:01:49,220
File.

38
00:01:49,220 --> 00:01:53,460
And we'll also be getting our preferences,

39
00:01:53,460 --> 00:01:59,400
which is going to be an optional string.

40
00:01:59,400 --> 00:02:04,160
And it defaults to None.

41
00:02:04,160 --> 00:02:07,320
Let's come down here and add a little documentation,

42
00:02:07,320 --> 00:02:11,300
say it generates travel suggestions based

43
00:02:11,300 --> 00:02:12,040
on

44
00:02:12,040 --> 00:02:14,280
an uploaded image.

45
00:02:14,280 --> 00:02:16,140
To begin our request, we'll need to wrap it

46
00:02:16,140 --> 00:02:17,880
in a try/except block, so let us write that

47
00:02:17,880 --> 00:02:22,840
out.

48
00:02:22,840 --> 00:02:27,400
Let's pass in a suitable message for our error,

49
00:02:27,400 --> 00:02:34,180
use an f-string and say error processing image

50
00:02:34,180 --> 00:02:40,560
and let us just include the error.

51
00:02:40,560 --> 00:02:41,360
That's good.

52
00:02:41,360 --> 00:02:43,320
Now we can begin processing the user's request

53
00:02:43,320 --> 00:02:48,000
by first getting our image data.

54
00:02:48,000 --> 00:02:53,300
So I will say image data and set it to await

55
00:02:53,300 --> 00:02:58,420
file.read().

56
00:02:58,420 --> 00:03:01,020
That's from the file that is coming in our

57
00:03:01,020 --> 00:03:02,280
request.

58
00:03:02,280 --> 00:03:05,340
Then our image is going to be processed using

59
00:03:05,340 --> 00:03:12,140
the Image module from Pillow.

60
00:03:12,140 --> 00:03:14,200
Good.

61
00:03:14,200 --> 00:03:16,460
Next let us take care of our preferences and

62
00:03:16,460 --> 00:03:18,820
build a string of user preferences that we

63
00:03:18,820 --> 00:03:22,500
can embed inside our prompt.

64
00:03:22,500 --> 00:03:29,520
Just going to say, pref list.

65
00:03:29,520 --> 00:03:34,380
Let's get an array of what was sent.

66
00:03:34,380 --> 00:03:36,280
And once we have this, we can then build a

67
00:03:36,280 --> 00:03:42,120
string of comma-separated preferences.

68
00:03:42,120 --> 00:03:47,960
Just as we did with the text search.

69
00:03:47,960 --> 00:03:52,260
Preferences text here.

70
00:03:52,260 --> 00:03:54,420
Good, now we have a preferences text that

71
00:03:54,420 --> 00:03:57,960
we can embed inside our prompt.
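As a rough sketch of this step, the preferences string could be turned into prompt text like this. The function name, the sentence wording, and the assumption that preferences arrive as comma-separated text are all illustrative, not taken from the course files.

```python
from typing import Optional


def build_preferences_text(preferences: Optional[str]) -> str:
    """Turn an optional raw preferences string into a sentence for the prompt."""
    if not preferences:
        return ""
    # Assumption: preferences arrive as comma-separated text, e.g. "parks, shopping".
    pref_list = [p.strip() for p in preferences.split(",") if p.strip()]
    if not pref_list:
        return ""
    # Hypothetical wording; the course prompt may phrase this differently.
    return f"The user prefers: {', '.join(pref_list)}."
```

Returning an empty string when nothing was sent lets the prompt embed the result unconditionally.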

72
00:03:57,960 --> 00:03:59,860
Now speaking of the prompt, I'm also going

73
00:03:59,860 --> 00:04:01,740
to be saving time by bringing the elaborate

74
00:04:01,740 --> 00:04:04,980
prompt here from the exercise files of the

75
00:04:04,980 --> 00:04:06,720
completed version of the course.

76
00:04:06,720 --> 00:04:08,440
So in the exercise files of the course, you

77
00:04:08,440 --> 00:04:10,240
get a base project, which we are building

78
00:04:10,240 --> 00:04:11,280
on right now.

79
00:04:11,280 --> 00:04:13,500
And you also get a branch where you see the

80
00:04:13,500 --> 00:04:15,340
completed version of the course.

81
00:04:15,340 --> 00:04:17,720
So you can always grab your prompt from there,

82
00:04:17,720 --> 00:04:20,200
as I'm going to be doing right now.

83
00:04:20,200 --> 00:04:22,580
So paste the prompt here, and I'm quickly

84
00:04:22,580 --> 00:04:25,540
just going to format this.

85
00:04:25,540 --> 00:04:30,100
Let's get formatting up there, to this, to

86
00:04:30,100 --> 00:04:33,520
this, good.

87
00:04:33,520 --> 00:04:39,480
Now let us see what we have inside our prompt.

88
00:04:39,480 --> 00:04:41,800
So this prompt says analyze this landmark

89
00:04:41,800 --> 00:04:46,440
or travel image and suggest 5 similar destinations

90
00:04:46,440 --> 00:04:48,480
with comparable features, architecture or

91
00:04:48,480 --> 00:04:49,500
atmosphere.

92
00:04:49,500 --> 00:04:52,580
Then we embed our preferences text, if any,

93
00:04:52,580 --> 00:04:54,460
and it says, for each destination, provide

94
00:04:54,460 --> 00:04:56,780
the same details we wanted it to provide when

95
00:04:56,780 --> 00:04:59,420
we were doing the text-based search.

96
00:04:59,420 --> 00:05:01,640
And we also give it the same data format that

97
00:05:01,640 --> 00:05:04,600
we want back, a list of JSON objects for

98
00:05:04,600 --> 00:05:05,600
each destination.

99
00:05:05,600 --> 00:05:09,040
And we also want it to return just the JSON

100
00:05:09,040 --> 00:05:09,440
array

101
00:05:09,440 --> 00:05:11,120
and no additional text.

102
00:05:11,120 --> 00:05:12,340
Perfect.

103
00:05:12,340 --> 00:05:14,860
Now we can take this prompt and our image

104
00:05:14,860 --> 00:05:17,620
and send them as a payload to our Gemini API

105
00:05:17,620 --> 00:05:19,180
in our request.

106
00:05:19,180 --> 00:05:20,700
So let's do that.

107
00:05:20,700 --> 00:05:23,140
Let's just come down here and scroll this

108
00:05:23,140 --> 00:05:24,860
into view.

109
00:05:24,860 --> 00:05:33,820
And here we can say response equals

110
00:05:33,820 --> 00:05:37,140
genai_client.models.generate_content.

111
00:05:37,140 --> 00:05:40,240
And for our model, we're passing our GeminiModel

112
00:05:40,240 --> 00:05:43,500
constant.

113
00:05:43,500 --> 00:05:50,860
And for contents, we're passing a list of

114
00:05:50,860 --> 00:05:53,100
content types, which would include our prompt

115
00:05:53,100 --> 00:05:55,400
and our image.

116
00:05:55,400 --> 00:05:56,660
So this is all good.

117
00:05:56,660 --> 00:05:58,660
Now we will need to process the response.

118
00:05:58,660 --> 00:06:00,380
We need to get the response text.

119
00:06:00,380 --> 00:06:02,500
We need to strip off all the markdown that

120
00:06:02,500 --> 00:06:03,720
was brought with it.

121
00:06:03,720 --> 00:06:05,480
We also need to make sure that we are stripping

122
00:06:05,480 --> 00:06:08,480
all whitespace, parsing it, and returning

123
00:06:08,480 --> 00:06:10,000
our destinations.

124
00:06:10,000 --> 00:06:12,240
Now we've already done all that for our previous

125
00:06:12,240 --> 00:06:12,800
endpoint.

126
00:06:12,800 --> 00:06:15,940
So let us just copy that.

127
00:06:15,940 --> 00:06:20,660
I'll just scroll down here and we can simply

128
00:06:20,660 --> 00:06:26,960
copy everything from here down to here.

129
00:06:26,960 --> 00:06:30,540
So I'm just going to collapse this once again,

130
00:06:30,540 --> 00:06:38,020
then we come down here and simply paste this.

131
00:06:38,020 --> 00:06:39,680
Good.

132
00:06:39,680 --> 00:06:42,760
We have our response text stripped.

133
00:06:42,760 --> 00:06:47,140
We have all the markdown removed, and we strip

134
00:06:47,140 --> 00:06:51,160
it once again of whitespace, parse it as JSON,

135
00:06:51,160 --> 00:06:53,640
and return our destinations.
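A minimal sketch of that clean-up, assuming the model may wrap its JSON array in a markdown code fence. The helper name and the "name" field in the usage note are hypothetical; the code copied from the search endpoint may differ.

```python
import json
from typing import Any, List

FENCE = "`" * 3  # a markdown code fence marker


def parse_destinations(response_text: str) -> List[Any]:
    """Strip whitespace and an optional markdown fence, then parse the JSON array."""
    text = response_text.strip()
    if text.startswith(FENCE):
        # Drop the whole opening fence line, including any language tag such as "json".
        first_newline = text.find("\n")
        text = text[first_newline + 1:] if first_newline != -1 else ""
    if text.endswith(FENCE):
        text = text[: -len(FENCE)]
    destinations = json.loads(text.strip())
    if not isinstance(destinations, list):
        raise ValueError("expected a JSON array of destinations")
    return destinations
```

For example, a fenced reply containing [{"name": "Singapore Flyer"}] comes back as a plain Python list with one dictionary. Invalid JSON raises an exception, which the surrounding try/except block can turn into an HTTP error.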

136
00:06:53,640 --> 00:06:54,660
Awesome.

137
00:06:54,660 --> 00:06:56,260
Now let us test out this endpoint.

138
00:06:56,260 --> 00:06:58,840
First let me clean up this function.

139
00:06:58,840 --> 00:07:02,420
Yeah, let me also come out of it.

140
00:07:02,420 --> 00:07:05,380
And then we can save this file.

141
00:07:05,380 --> 00:07:08,760
To test our endpoint, we need to make sure

142
00:07:08,760 --> 00:07:09,900
that our backend is running.

143
00:07:09,900 --> 00:07:11,300
So let us run that.

144
00:07:11,300 --> 00:07:14,500
So python main.py,

145
00:07:14,500 --> 00:07:17,360
run it, make sure that your file is saved

146
00:07:17,360 --> 00:07:18,560
if you don't have autosave

147
00:07:18,560 --> 00:07:20,460
turned on in your editor.

148
00:07:20,460 --> 00:07:22,180
And now we can head over to our browser and

149
00:07:22,180 --> 00:07:24,780
test it out in the docs page.

150
00:07:24,780 --> 00:07:27,140
So here on the docs page of our API, we currently

151
00:07:27,140 --> 00:07:28,540
have two endpoints.

152
00:07:28,540 --> 00:07:30,600
But if we refresh that, we're now going to

153
00:07:30,600 --> 00:07:32,980
have three, including our suggest by image

154
00:07:32,980 --> 00:07:33,760
endpoint.

155
00:07:33,760 --> 00:07:34,920
So let's test this out.

156
00:07:34,920 --> 00:07:40,320
I'm going to expand this, click Try It Out,

157
00:07:40,320 --> 00:07:43,960
and as you can see, it gives me a place where

158
00:07:43,960 --> 00:07:46,600
I can select a binary file, and we can also

159
00:07:46,600 --> 00:07:48,740
add a string of preferences here.

160
00:07:48,740 --> 00:07:52,020
So I'm going to select a file, which is a

161
00:07:52,020 --> 00:07:55,520
London Eye image.

162
00:07:55,520 --> 00:07:59,140
For preferences, let's pass a string where we say

163
00:07:59,140 --> 00:08:07,580
parks and shopping. Now we can click

164
00:08:07,580 --> 00:08:11,920
to execute that, and as expected we get our

165
00:08:11,920 --> 00:08:16,860
results back: 200, which means success.

166
00:08:16,860 --> 00:08:19,440
And if we look at our response body, we can

167
00:08:19,440 --> 00:08:21,760
see our destinations. We can see that

168
00:08:21,760 --> 00:08:24,220
the Singapore Flyer also has this type

169
00:08:24,220 --> 00:08:25,460
of wheel.

170
00:08:25,460 --> 00:08:27,520
The High Roller in Las Vegas also has this

171
00:08:27,520 --> 00:08:28,420
type of wheel.

172
00:08:28,420 --> 00:08:31,180
So our API endpoint is working fine which

173
00:08:31,180 --> 00:08:32,419
is just fantastic.

174
00:08:32,419 --> 00:08:33,900
Now we can start our integration with our

175
00:08:33,900 --> 00:08:38,000
frontend using this endpoint in the next video.

