﻿1
00:00:01,100 --> 00:00:02,450
‫Welcome back.

2
00:00:02,450 --> 00:00:04,570
‫In this video and the next one

3
00:00:04,570 --> 00:00:06,820
‫I want to introduce you to the MongoDB

4
00:00:06,820 --> 00:00:10,680
‫aggregation pipeline which is an extremely powerful

5
00:00:10,680 --> 00:00:13,750
‫and extremely useful MongoDB framework

6
00:00:13,750 --> 00:00:15,433
‫for data aggregation.

7
00:00:17,180 --> 00:00:19,020
‫And the idea is that we basically

8
00:00:19,020 --> 00:00:21,900
‫define a pipeline that all documents

9
00:00:21,900 --> 00:00:24,070
‫from a certain collection go through

10
00:00:24,070 --> 00:00:26,470
‫where they are processed step by step

11
00:00:26,470 --> 00:00:30,870
‫in order to transform them into aggregated results.

12
00:00:30,870 --> 00:00:33,660
‫For example, we can use the aggregation pipeline

13
00:00:33,660 --> 00:00:36,190
‫in order to calculate averages

14
00:00:36,190 --> 00:00:39,010
‫or calculating minimum and maximum values

15
00:00:39,010 --> 00:00:41,380
‫or we can calculate distances even,

16
00:00:41,380 --> 00:00:43,590
‫and we can really do all kinds of stuff.

17
00:00:43,590 --> 00:00:46,220
‫It's really amazing how powerful

18
00:00:46,220 --> 00:00:48,440
‫this aggregation pipeline is.

19
00:00:48,440 --> 00:00:50,530
‫All right, but enough talk.

20
00:00:50,530 --> 00:00:53,900
‫Let's now actually start using the aggregation pipeline

21
00:00:53,900 --> 00:00:57,122
‫and I'm gonna create a new header function here

22
00:00:57,122 --> 00:01:00,290
‫because later on we will then define a new route

23
00:01:00,290 --> 00:01:02,583
‫and then use that function for that.

24
00:01:03,610 --> 00:01:06,093
‫But that new route, I'm gonna create it later.

25
00:01:07,495 --> 00:01:09,720
‫So I want to create a function here that's gonna

26
00:01:09,720 --> 00:01:13,756
‫calculate a couple of statistics about our tours.

27
00:01:13,756 --> 00:01:16,987
‫So I'm gonna call this one getTourStats.

28
00:01:19,070 --> 00:01:21,860
‫Okay, so Stats for statistics,

29
00:01:21,860 --> 00:01:23,770
‫and it's gonna be an async function

30
00:01:23,770 --> 00:01:28,770
‫and so a vs code now automatically prefilled off this.

31
00:01:29,140 --> 00:01:33,923
‫All right, let's create or try catch block here.

32
00:01:38,010 --> 00:01:41,970
‫Actually, I'm gonna copy this from right up here,

33
00:01:41,970 --> 00:01:45,600
‫and now in here I will create

34
00:01:45,600 --> 00:01:47,703
‫a variable called stats.

35
00:01:50,120 --> 00:01:54,600
‫Now the aggregation pipeline really is a MongoDB feature.

36
00:01:54,600 --> 00:01:57,330
‫But Mongoose, of course, gives us access to it,

37
00:01:57,330 --> 00:02:01,210
‫so that we can use it in the Mongoose driver, right?

38
00:02:01,210 --> 00:02:04,070
‫So using our tour model in order to access

39
00:02:04,070 --> 00:02:07,217
‫the tour collection, we say

40
00:02:07,217 --> 00:02:10,053
‫tour.aggregate.

41
00:02:12,580 --> 00:02:15,250
‫And so the aggregation pipeline is a bit

42
00:02:15,250 --> 00:02:18,847
‫like a regular query and so using the aggregation pipeline

43
00:02:18,847 --> 00:02:22,512
‫it's a just a bit like doing a regular query.

44
00:02:22,512 --> 00:02:25,550
‫The difference here is that in aggregations,

45
00:02:25,550 --> 00:02:28,620
‫as I already mentioned, we can manipulate the data

46
00:02:28,620 --> 00:02:30,750
‫in a couple of different steps

47
00:02:30,750 --> 00:02:33,400
‫and so let's now actually define these steps.

48
00:02:33,400 --> 00:02:37,093
‫And for that, we pass in an array of so-called stages.

49
00:02:38,150 --> 00:02:41,050
‫So we pass in an array, and then here we will

50
00:02:41,050 --> 00:02:43,110
‫then have a lot of stages.

51
00:02:43,110 --> 00:02:46,540
‫And again the documents then pass through these stages

52
00:02:46,540 --> 00:02:50,350
‫one by one, step by step in the define sequence

53
00:02:50,350 --> 00:02:52,462
‫as we define it here.

54
00:02:52,462 --> 00:02:54,860
‫So each of the elements in this array

55
00:02:54,860 --> 00:02:56,970
‫will be one of the stages.

56
00:02:56,970 --> 00:02:59,470
‫And there are a ton of different stages

57
00:02:59,470 --> 00:03:02,110
‫that we can choose from, but I will just tell you

58
00:03:02,110 --> 00:03:04,200
‫the most common ones in this lecture

59
00:03:04,200 --> 00:03:06,070
‫and also in the next one.

60
00:03:06,070 --> 00:03:08,400
‫But anyway, I now want to take a minute

61
00:03:08,400 --> 00:03:11,500
‫and take a quick look at the MongoDB documentation

62
00:03:11,500 --> 00:03:13,853
‫because I think we didn't do that yet.

63
00:03:17,285 --> 00:03:20,840
‫So MongoDB and we already know this website.

64
00:03:20,840 --> 00:03:24,460
‫So on our MongoDB website we go here to learn,

65
00:03:24,460 --> 00:03:27,840
‫then documentation and then in here we select

66
00:03:27,840 --> 00:03:32,380
‫the MongoDB Manual and this documentation is huge.

67
00:03:32,380 --> 00:03:34,480
‫It really, it has a lot of stuff in here

68
00:03:34,480 --> 00:03:36,450
‫and if you were gonna go through it all,

69
00:03:36,450 --> 00:03:39,623
‫well, then you'd become really a MongoDB master.

70
00:03:40,600 --> 00:03:44,010
‫So it gives you a nice introduction to MongoDB,

71
00:03:44,010 --> 00:03:45,760
‫you have the installation here.

72
00:03:45,760 --> 00:03:49,610
‫You have the CRUD Operations, so a nice overview

73
00:03:49,610 --> 00:03:51,340
‫about a lot of the stuff

74
00:03:51,340 --> 00:03:54,610
‫that we already learned before, right?

75
00:03:54,610 --> 00:03:56,960
‫Then you have stuff about the aggregation

76
00:03:57,800 --> 00:04:00,390
‫and in particular, the aggregation pipeline

77
00:04:00,390 --> 00:04:02,320
‫which we're gonna talk about now,

78
00:04:02,320 --> 00:04:04,730
‫but what I want to really show you here

79
00:04:04,730 --> 00:04:08,210
‫is actually this reference that we have at the bottom

80
00:04:08,210 --> 00:04:10,750
‫because in the reference we have here for example

81
00:04:10,750 --> 00:04:12,880
‫the operators and that's where we have

82
00:04:12,880 --> 00:04:16,570
‫a lot of important stuff that we've been using all the time.

83
00:04:16,570 --> 00:04:18,430
‫So let's take a look here at the query operators,

84
00:04:18,430 --> 00:04:22,200
‫for example and here you see, we actually have

85
00:04:22,200 --> 00:04:24,210
‫these operators that we already used.

86
00:04:24,210 --> 00:04:26,970
‫So for example, greater than or equal,

87
00:04:26,970 --> 00:04:29,560
‫or less than and equal, but if someday

88
00:04:29,560 --> 00:04:32,200
‫in your own applications you need some other operator,

89
00:04:32,200 --> 00:04:35,170
‫well, now you know where you can find them.

90
00:04:35,170 --> 00:04:37,570
‫Now of course, it's also always very helpful

91
00:04:37,570 --> 00:04:40,370
‫to just do a quick Google search, probably find

92
00:04:40,370 --> 00:04:43,490
‫some stack overflow post about it,

93
00:04:43,490 --> 00:04:45,687
‫but believe me, many times it's easier

94
00:04:45,687 --> 00:04:49,790
‫and faster to just quickly jump into the documentation

95
00:04:49,790 --> 00:04:52,763
‫and quickly search for whatever you're needing.

96
00:04:53,700 --> 00:04:56,350
‫You just need to know how the documentation works

97
00:04:56,350 --> 00:04:59,050
‫and once you know that, it's then gonna be quite easy

98
00:04:59,050 --> 00:05:02,473
‫for you to find what you're looking for, all right?

99
00:05:03,400 --> 00:05:05,570
‫Here you also have some logical operators

100
00:05:05,570 --> 00:05:08,920
‫for example, the OR operator we already used.

101
00:05:08,920 --> 00:05:13,920
‫Again, really a lot of stuff, geospatial operators

102
00:05:14,120 --> 00:05:16,790
‫and we're gonna use some actually, later,

103
00:05:16,790 --> 00:05:19,040
‫and yeah, really a lot of stuff.

104
00:05:19,040 --> 00:05:22,280
‫Then what I wanted to show you after this lecture actually,

105
00:05:22,280 --> 00:05:24,793
‫are the aggregation pipeline stages.

106
00:05:26,680 --> 00:05:29,140
‫So here you have a lot of them

107
00:05:29,140 --> 00:05:32,090
‫and what we're gonna use in this lecture here is match,

108
00:05:32,090 --> 00:05:35,410
‫for example and group,

109
00:05:35,410 --> 00:05:36,820
‫where is that?

110
00:05:36,820 --> 00:05:39,670
‫Yeah group, that is a very important one,

111
00:05:39,670 --> 00:05:41,660
‫and so as I mentioned, I'm gonna show you

112
00:05:41,660 --> 00:05:44,920
‫the most important ones in this lecture and the next one.

113
00:05:44,920 --> 00:05:48,270
‫But again, if you in some case need something else,

114
00:05:48,270 --> 00:05:51,110
‫then just go ahead, come here to this documentation

115
00:05:51,110 --> 00:05:53,920
‫and you will find what you're looking for.

116
00:05:53,920 --> 00:05:54,753
‫All right?

117
00:05:55,740 --> 00:05:58,630
‫Anyway, let's now learn how we can actually define

118
00:05:58,630 --> 00:06:00,940
‫one of these stages, all right?

119
00:06:00,940 --> 00:06:03,170
‫And I'm gonna start with match.

120
00:06:03,170 --> 00:06:05,330
‫And match is basically to select

121
00:06:05,330 --> 00:06:07,890
‫or to filter certain documents.

122
00:06:07,890 --> 00:06:09,510
‫And so it's very simple.

123
00:06:09,510 --> 00:06:13,210
‫It's really just like a filter object in MongoDB,

124
00:06:13,210 --> 00:06:16,380
‫such like we've been using so many times.

125
00:06:16,380 --> 00:06:19,340
‫So each of the stages is an object

126
00:06:19,340 --> 00:06:22,750
‫and then here comes the name of the stage.

127
00:06:22,750 --> 00:06:26,300
‫So this one is the match stage, all right?

128
00:06:26,300 --> 00:06:29,603
‫And as I mentioned, it's really just a query.

129
00:06:30,730 --> 00:06:33,570
‫And so let's say that for starters,

130
00:06:33,570 --> 00:06:35,200
‫we only want to select documents

131
00:06:35,200 --> 00:06:39,630
‫which have a ratings average greater or equal than 4.5.

132
00:06:39,630 --> 00:06:41,110
‫So can you do that?

133
00:06:41,110 --> 00:06:43,560
‫I'm just gonna give you a second to do that here.

134
00:06:44,730 --> 00:06:46,123
‫So I hope you did it.

135
00:06:47,090 --> 00:06:50,440
‫So ratings average, should be

136
00:06:52,240 --> 00:06:55,383
‫greater or equal than 4.5.

137
00:06:56,460 --> 00:06:59,260
‫And so that's it, that's the first stage.

138
00:06:59,260 --> 00:07:03,500
‫And usually this match stage is just a preliminary stage

139
00:07:03,500 --> 00:07:07,050
‫to then prepare for the next stages which come ahead.

140
00:07:07,050 --> 00:07:10,173
‫So the next one is now the group stage.

141
00:07:13,000 --> 00:07:14,610
‫So group and then in there

142
00:07:14,610 --> 00:07:17,840
‫we need to always pass just another object.

143
00:07:17,840 --> 00:07:20,600
‫So it looks kinda weird with all these objects,

144
00:07:20,600 --> 00:07:22,870
‫but, yeah, you've seen that before

145
00:07:22,870 --> 00:07:24,780
‫and MongoDB just works this way.

146
00:07:24,780 --> 00:07:28,293
‫It's always objects, inside of objects, inside of objects.

147
00:07:29,520 --> 00:07:32,690
‫And this group here is where the real magic happens

148
00:07:32,690 --> 00:07:34,520
‫because as the name says,

149
00:07:34,520 --> 00:07:36,870
‫it allows us to group documents together,

150
00:07:36,870 --> 00:07:38,820
‫basically using accumulators.

151
00:07:38,820 --> 00:07:40,600
‫And an accumulator is for example,

152
00:07:40,600 --> 00:07:42,850
‫even calculating an average.

153
00:07:42,850 --> 00:07:46,270
‫So if we have five tours, each of them has a rating,

154
00:07:46,270 --> 00:07:50,250
‫we can then calculate the average rating using group.

155
00:07:50,250 --> 00:07:52,680
‫And so let's do exactly that right here.

156
00:07:52,680 --> 00:07:55,190
‫Now the first thing, is we always need to specify

157
00:07:55,190 --> 00:07:58,950
‫is the id because this is where we're gonna specify

158
00:07:58,950 --> 00:08:00,803
‫what we want to group by.

159
00:08:02,420 --> 00:08:05,480
‫For now, we say null here because we want

160
00:08:05,480 --> 00:08:08,870
‫to have everything in one group so that we can calculate

161
00:08:08,870 --> 00:08:11,600
‫the statistics for all of the tours together

162
00:08:11,600 --> 00:08:13,713
‫and not separate it by groups.

163
00:08:14,590 --> 00:08:18,250
‫We will, a bit later then also group by different stuff,

164
00:08:18,250 --> 00:08:21,010
‫for example we can group by the difficulty

165
00:08:21,010 --> 00:08:23,900
‫and we can then, for example calculate the average

166
00:08:23,900 --> 00:08:26,820
‫for the easy tours, the average for the medium tours

167
00:08:26,820 --> 00:08:29,373
‫and the average for the difficult tours.

168
00:08:29,373 --> 00:08:32,720
‫So again, we can group by one of our fields

169
00:08:32,720 --> 00:08:35,750
‫and that field, we are gonna specify in here,

170
00:08:35,750 --> 00:08:39,000
‫but for now, as I said, we want to calculate these averages

171
00:08:39,000 --> 00:08:41,803
‫for all the tours together in one big group.

172
00:08:42,865 --> 00:08:46,990
‫So in that case we say _id and set it to null.

173
00:08:47,994 --> 00:08:51,600
‫Now let's actually calculate the average rating.

174
00:08:51,600 --> 00:08:54,423
‫In order to do that, we simply specify a new field,

175
00:08:55,320 --> 00:08:58,713
‫so let's simply call it the average rating,

176
00:09:01,080 --> 00:09:03,480
‫so like this, and this will be

177
00:09:05,620 --> 00:09:07,160
‫well, the average,

178
00:09:07,160 --> 00:09:11,560
‫which is yet another MongoDB operator, so this one here.

179
00:09:11,560 --> 00:09:14,233
‫You will find it in the reference if you look it up.

180
00:09:15,149 --> 00:09:17,800
‫So this is a mathematical operator calculating

181
00:09:17,800 --> 00:09:20,520
‫the average and now the name of the field.

182
00:09:20,520 --> 00:09:23,530
‫And again, I know that this is gonna look very weird,

183
00:09:23,530 --> 00:09:26,390
‫but in order to specify the field which we want

184
00:09:26,390 --> 00:09:29,110
‫to calculate the average from, we need to use

185
00:09:29,110 --> 00:09:31,900
‫the dollar sign, but in quotes here

186
00:09:31,900 --> 00:09:33,583
‫and then the name of the field.

187
00:09:34,560 --> 00:09:36,803
‫So ratings average in this case.

188
00:09:39,316 --> 00:09:41,573
‫And let's also calculate the average price.

189
00:09:43,500 --> 00:09:47,143
‫So average price, and again, did it wrong here,

190
00:09:48,200 --> 00:09:49,300
‫and so

191
00:09:50,870 --> 00:09:52,970
‫the average and then again,

192
00:09:52,970 --> 00:09:55,170
‫the name of the field and in quotes

193
00:09:55,170 --> 00:09:56,470
‫and with this dollar sign.

194
00:09:58,800 --> 00:10:00,890
‫Right, so it looks kind of weird again,

195
00:10:00,890 --> 00:10:01,900
‫but this is just the way

196
00:10:01,900 --> 00:10:04,510
‫that the aggregation pipeline works.

197
00:10:04,510 --> 00:10:06,300
‫So we have the average price.

198
00:10:06,300 --> 00:10:08,880
‫Let's actually also calculate the minimum price,

199
00:10:08,880 --> 00:10:12,560
‫so the smallest price, and the largest price.

200
00:10:12,560 --> 00:10:13,620
‫So minPrice

201
00:10:15,439 --> 00:10:19,360
‫and from now on, it's gonna all look kind of similar,

202
00:10:19,360 --> 00:10:21,190
‫so now we use the min operator

203
00:10:22,200 --> 00:10:26,603
‫and then again the field name like this.

204
00:10:29,783 --> 00:10:32,530
‫And a comma here and let's now just duplicate this one

205
00:10:32,530 --> 00:10:35,053
‫in order to calculate the maximum price.

206
00:10:37,910 --> 00:10:40,310
‫And now, let's actually take a look

207
00:10:40,310 --> 00:10:42,160
‫at what we already got at this point.

208
00:10:44,240 --> 00:10:47,430
‫So we now need to actually send out this response,

209
00:10:47,430 --> 00:10:49,250
‫and so again, I'm just gonna go head

210
00:10:49,250 --> 00:10:53,513
‫and grab some of these already done responses here.

211
00:10:59,154 --> 00:11:01,853
‫So the data that we want to send out is called stats.

212
00:11:06,740 --> 00:11:09,610
‫And now all we need to do is actually add a new route

213
00:11:09,610 --> 00:11:12,390
‫here in our tour routes and again,

214
00:11:12,390 --> 00:11:14,543
‫I'm gonna edit here, right at the top.

215
00:11:16,368 --> 00:11:18,690
‫So router.route

216
00:11:21,597 --> 00:11:26,597
‫/gettour, or actually just tour-stats, just like this

217
00:11:29,460 --> 00:11:33,730
‫because the get is already defined in the HTTP method.

218
00:11:33,730 --> 00:11:37,090
‫So no need to repeat it in the name.

219
00:11:37,090 --> 00:11:38,640
‫So tourController.getTourStats.

220
00:11:42,268 --> 00:11:46,707
‫Give it a safe and let's actually test it out now.

221
00:11:48,270 --> 00:11:52,710
‫So I'm really exciting to test it out now, actually

222
00:11:52,710 --> 00:11:57,710
‫to see if it works because again it's really an amazing

223
00:11:57,720 --> 00:11:59,370
‫and really powerful tool

224
00:11:59,370 --> 00:12:01,503
‫which allows you to do so much stuff.

225
00:12:03,130 --> 00:12:05,280
‫So tour-stats, send it

226
00:12:06,290 --> 00:12:09,000
‫and that's not really the result

227
00:12:09,000 --> 00:12:12,600
‫that we were waiting for, so that's just the pipeline

228
00:12:12,600 --> 00:12:15,750
‫that we defined, but I already know why that is.

229
00:12:15,750 --> 00:12:18,950
‫So we did not await the result.

230
00:12:18,950 --> 00:12:21,410
‫So this, basically, just like a normal query

231
00:12:21,410 --> 00:12:24,453
‫is gonna return an aggregate object.

232
00:12:25,857 --> 00:12:29,770
‫So .find is gonna return a query, and .aggregate

233
00:12:29,770 --> 00:12:32,330
‫is gonna return an aggregate object.

234
00:12:32,330 --> 00:12:34,180
‫And then only when we await it,

235
00:12:34,180 --> 00:12:36,940
‫it actually comes back with the result.

236
00:12:36,940 --> 00:12:39,850
‫So that's also why we defined this function here

237
00:12:39,850 --> 00:12:42,640
‫as an async function, so that we can then use await

238
00:12:42,640 --> 00:12:44,953
‫in there and so this is the right place.

239
00:12:45,860 --> 00:12:47,950
‫Let's try it again.

240
00:12:47,950 --> 00:12:49,850
‫And here we go!

241
00:12:49,850 --> 00:12:51,250
‫So great!

242
00:12:51,250 --> 00:12:54,910
‫We have the average rating of all our tours,

243
00:12:54,910 --> 00:12:56,720
‫we have the average price which you see is

244
00:12:56,720 --> 00:13:01,720
‫1563 and then the minimum and the maximum, right?

245
00:13:01,910 --> 00:13:04,210
‫And from our data, we can actually confirm

246
00:13:04,210 --> 00:13:06,410
‫that this is true.

247
00:13:06,410 --> 00:13:07,770
‫So great!

248
00:13:07,770 --> 00:13:09,770
‫Really, really great.

249
00:13:09,770 --> 00:13:11,950
‫Now, one more thing that I want to do here

250
00:13:11,950 --> 00:13:14,450
‫is to actually calculate the total number

251
00:13:14,450 --> 00:13:18,730
‫of ratings that we have and also the total number of tours.

252
00:13:18,730 --> 00:13:20,700
‫So we have the average rating here

253
00:13:20,700 --> 00:13:23,090
‫and let's actually do it before,

254
00:13:23,090 --> 00:13:23,973
‫so numRatings,

255
00:13:26,490 --> 00:13:27,323
‫like this

256
00:13:28,850 --> 00:13:31,950
‫and you can probably guess

257
00:13:31,950 --> 00:13:36,950
‫that this one is called sum and then the ratingsAverage,

258
00:13:38,500 --> 00:13:43,500
‫or actually not ratingsAverage, but ratingsQuantity, right?

259
00:13:43,610 --> 00:13:46,480
‫So that's where the number of ratings is stored

260
00:13:46,480 --> 00:13:49,354
‫and so the number of ratings, the total,

261
00:13:49,354 --> 00:13:51,863
‫will be the sum of all of these together.

262
00:13:52,840 --> 00:13:55,090
‫And now, the last one, is the number of tours

263
00:13:56,100 --> 00:13:57,880
‫and that one is a bit trickier,

264
00:13:57,880 --> 00:14:00,660
‫and so that's a nice one to show you here.

265
00:14:00,660 --> 00:14:04,570
‫So we still want to sum, so to add everything together,

266
00:14:04,570 --> 00:14:07,960
‫basically, so we still use sum,

267
00:14:07,960 --> 00:14:10,340
‫but what are we gonna add together?

268
00:14:10,340 --> 00:14:13,493
‫Well, we basically add one for each document,

269
00:14:14,443 --> 00:14:17,380
‫and so we say 1, and that's it.

270
00:14:17,380 --> 00:14:19,610
‫So basically for each of the document

271
00:14:19,610 --> 00:14:22,020
‫that's gonna go through this pipeline,

272
00:14:22,020 --> 00:14:24,960
‫one will be added to this num counter.

273
00:14:24,960 --> 00:14:29,430
‫Let's call it numTours.

274
00:14:29,430 --> 00:14:33,763
‫And so, yeah, let's test it out now, again.

275
00:14:36,350 --> 00:14:39,320
‫And indeed we get our nine tours

276
00:14:39,320 --> 00:14:41,303
‫and we already know that we have nine.

277
00:14:42,650 --> 00:14:46,820
‫So these are all the statistics for all the tours together,

278
00:14:46,820 --> 00:14:48,890
‫but let's now take it to the next level.

279
00:14:48,890 --> 00:14:52,450
‫As I said before, we can now group our results

280
00:14:52,450 --> 00:14:53,790
‫for different fields.

281
00:14:53,790 --> 00:14:56,870
‫And let's actually start out with the difficulty.

282
00:14:56,870 --> 00:14:59,830
‫And so that's quite similar to specifying the fields

283
00:14:59,830 --> 00:15:01,990
‫down there which is simply the dollar sign,

284
00:15:01,990 --> 00:15:03,570
‫and then the name of the field.

285
00:15:03,570 --> 00:15:07,303
‫So $difficulty, give it a safe and that's it.

286
00:15:08,350 --> 00:15:12,040
‫Send it here, and now we have the statistics

287
00:15:12,040 --> 00:15:15,950
‫all defined for each of these three difficulties.

288
00:15:15,950 --> 00:15:18,200
‫So that's really, really amazing.

289
00:15:18,200 --> 00:15:20,470
‫So this is the part that blows my mind.

290
00:15:20,470 --> 00:15:22,170
‫It's absolutely fantastic.

291
00:15:22,170 --> 00:15:25,020
‫I mean, you can start to see all the kinds of stuff,

292
00:15:25,020 --> 00:15:27,260
‫all the kinds of data manipulation

293
00:15:27,260 --> 00:15:30,890
‫that you can do using this pipeline, right?

294
00:15:30,890 --> 00:15:33,360
‫So we really can now start to analyze stuff,

295
00:15:33,360 --> 00:15:36,170
‫for example, we see that the easiest tours

296
00:15:36,170 --> 00:15:38,640
‫are the ones who get the poorest ratings

297
00:15:38,640 --> 00:15:41,370
‫with a 4.6, while the medium tours

298
00:15:41,370 --> 00:15:43,980
‫are the ones who get the highest ratings.

299
00:15:43,980 --> 00:15:46,750
‫We can also see that the difficult tours

300
00:15:46,750 --> 00:15:49,620
‫are the most expensive ones with an average price

301
00:15:49,620 --> 00:15:54,520
‫of 1997, while the easiest ones are also the cheapest ones

302
00:15:54,520 --> 00:15:58,010
‫with a price of 1272, right?

303
00:15:58,010 --> 00:15:59,910
‫We also see that the easiest tours

304
00:15:59,910 --> 00:16:01,880
‫are the ones which we have the most,

305
00:16:01,880 --> 00:16:05,400
‫so four easy ones and only two difficult ones, right?

306
00:16:05,400 --> 00:16:07,510
‫So we can use this, of course,

307
00:16:07,510 --> 00:16:10,693
‫to get all kinds of insights into our data.

308
00:16:12,350 --> 00:16:13,630
‫Let's comment this one out here

309
00:16:13,630 --> 00:16:16,583
‫and let's just try another one,

310
00:16:17,560 --> 00:16:20,130
‫for example, by ratings, why not,

311
00:16:20,130 --> 00:16:22,773
‫let's just try out, see what we get.

312
00:16:24,450 --> 00:16:26,310
‫So for example, we have two tours,

313
00:16:26,310 --> 00:16:28,353
‫so basically with a rating of 4.5,

314
00:16:29,530 --> 00:16:32,963
‫we have two tours with 4.7, one tour with 4.8,

315
00:16:34,040 --> 00:16:37,077
‫and three tours with 4.9.

316
00:16:37,077 --> 00:16:39,870
‫Oh, and also down here, one tour with 4.6,

317
00:16:39,870 --> 00:16:42,320
‫and so for example, one insight that we can get here

318
00:16:42,320 --> 00:16:45,680
‫is that the very expensive tours with this average price

319
00:16:45,680 --> 00:16:49,103
‫of 1997 get extremely good ratings.

320
00:16:50,260 --> 00:16:55,260
‫So all kinds of stuff, really that we can do here.

321
00:16:55,360 --> 00:16:59,520
‫Let's get rid of this one, let's add this one back,

322
00:16:59,520 --> 00:17:00,975
‫and we can now actually even,

323
00:17:00,975 --> 00:17:03,180
‫do some operations with this one.

324
00:17:03,180 --> 00:17:08,180
‫So just for fun, let's put this difficulty to uppercase.

325
00:17:08,550 --> 00:17:12,370
‫So this of course, is gonna be another object here

326
00:17:13,570 --> 00:17:15,810
‫and then here in front is the operator.

327
00:17:15,810 --> 00:17:18,873
‫And so the operator in this case, is called $toUpper.

328
00:17:21,150 --> 00:17:25,230
‫So just, yet another MongoDB operator.

329
00:17:25,230 --> 00:17:30,140
‫So check it out, and now it's spelled uppercase.

330
00:17:30,140 --> 00:17:31,290
‫Great!

331
00:17:31,290 --> 00:17:34,650
‫So we have the group stage here which is quite complete.

332
00:17:34,650 --> 00:17:38,180
‫So let's try another one which is a sort stage.

333
00:17:38,180 --> 00:17:42,120
‫So another object in our array for yet another stage

334
00:17:43,250 --> 00:17:45,380
‫and this one is called sort.

335
00:17:45,380 --> 00:17:49,070
‫So sort and then in there, we need yet another object,

336
00:17:49,070 --> 00:17:52,190
‫and it kind of auto-completed this one, I don't know why,

337
00:17:52,190 --> 00:17:54,860
‫and so here we can now specify which field we want

338
00:17:54,860 --> 00:17:59,183
‫to sort this by and let's actually use the average price.

339
00:18:00,580 --> 00:18:03,610
‫And now here in the sorting we actually need to use

340
00:18:03,610 --> 00:18:06,563
‫the field names that we specified up here in the group.

341
00:18:07,520 --> 00:18:10,630
‫We can no longer use the old names because at this point

342
00:18:10,630 --> 00:18:11,930
‫they are already gone.

343
00:18:11,930 --> 00:18:13,610
‫They no longer exist.

344
00:18:13,610 --> 00:18:16,400
‫So at this point, in the aggregation pipeline,

345
00:18:16,400 --> 00:18:19,410
‫we really already have these results.

346
00:18:19,410 --> 00:18:22,790
‫So these are now our documents basically.

347
00:18:22,790 --> 00:18:25,960
‫So if you want to sort by the average price,

348
00:18:25,960 --> 00:18:28,943
‫then this is the field name we gotta use.

349
00:18:30,530 --> 00:18:33,070
‫So we can say, the average price

350
00:18:33,070 --> 00:18:35,793
‫and then we can say 1 for ascending.

351
00:18:37,680 --> 00:18:38,883
‫So let's try that out.

352
00:18:41,550 --> 00:18:43,070
‫And yeah, indeed.

353
00:18:43,070 --> 00:18:46,480
‫That's the lowest one and that's the highest one.

354
00:18:46,480 --> 00:18:47,820
‫Great!

355
00:18:47,820 --> 00:18:50,210
‫So we did a lot of stuff.

356
00:18:50,210 --> 00:18:53,480
‫Let me just show you that we can also repeat stages.

357
00:18:53,480 --> 00:18:55,710
‫So let's actually do another match here

358
00:18:58,740 --> 00:19:00,930
‫and this one is really just to show you

359
00:19:00,930 --> 00:19:03,240
‫that we actually can repeat stages

360
00:19:03,240 --> 00:19:07,130
‫and I also want to show you another operator.

361
00:19:07,130 --> 00:19:11,020
‫So let's now select by id and remember that id

362
00:19:11,020 --> 00:19:12,920
‫is now the difficulty, right?

363
00:19:12,920 --> 00:19:15,023
‫So we just specified that up here.

364
00:19:16,820 --> 00:19:20,670
‫So we want the id and now a new operator

365
00:19:20,670 --> 00:19:22,500
‫that we didn't use yet.

366
00:19:22,500 --> 00:19:26,913
‫We want it to be not equal to, and let's say easy.

367
00:19:28,210 --> 00:19:29,830
‫And so, just like this we're gonna select

368
00:19:29,830 --> 00:19:32,260
‫all the documents that are not easy.

369
00:19:32,260 --> 00:19:33,880
‫So in this case, the one that says

370
00:19:33,880 --> 00:19:36,340
‫medium and difficult, right?

371
00:19:36,340 --> 00:19:41,007
‫So basically, excluding the one that says easy, right?

372
00:19:42,200 --> 00:19:43,790
‫So again, it's not really useful here

373
00:19:43,790 --> 00:19:47,950
‫because that will take away a lot of our meaningful data,

374
00:19:47,950 --> 00:19:50,090
‫but just to show you that we can of course,

375
00:19:50,090 --> 00:19:51,980
‫match multiple times.

376
00:19:51,980 --> 00:19:54,750
‫So in this case, we matched once before we actually

377
00:19:54,750 --> 00:19:58,160
‫did the group and then we matched once we were ready,

378
00:19:58,160 --> 00:19:59,475
‫doing the grouping.

379
00:19:59,475 --> 00:20:01,720
‫Now, let's actually keep this here,

380
00:20:01,720 --> 00:20:03,798
‫out of our aggregation pipeline,

381
00:20:03,798 --> 00:20:06,200
‫but I'm gonna leave it here,

382
00:20:06,200 --> 00:20:08,200
‫just so that you keep it as a reference.

383
00:20:09,580 --> 00:20:12,750
‫Give it another save, give it another try,

384
00:20:12,750 --> 00:20:14,730
‫and so these are our results

385
00:20:14,730 --> 00:20:16,590
‫for getting our tour statistics.

386
00:20:16,590 --> 00:20:19,030
‫So, really nice, really great.

387
00:20:19,030 --> 00:20:21,070
‫I hope you enjoyed this video

388
00:20:21,070 --> 00:20:23,200
‫and in the next one, we're actually gonna keep using

389
00:20:23,200 --> 00:20:26,810
‫the aggregation pipeline to do some more calculations.

390
00:20:26,810 --> 00:20:30,050
‫And to show you some more different stages.

391
00:20:30,050 --> 00:20:31,973
‫So keep tuned for that one.

