﻿1
00:00:01,150 --> 00:00:02,540
‫So in the last lecture,

2
00:00:02,540 --> 00:00:04,990
‫we learned a theory about data modeling.

3
00:00:04,990 --> 00:00:07,430
‫And, so, let's now use that theory

4
00:00:07,430 --> 00:00:09,930
‫in order to actually design the data model

5
00:00:09,930 --> 00:00:12,140
‫of our Natours application.

6
00:00:12,140 --> 00:00:15,160
‫And this is for me and for many other developers

7
00:00:15,160 --> 00:00:18,400
‫actually the most difficult part of building an app.

8
00:00:18,400 --> 00:00:21,570
‫And, so, I hope that this application will serve

9
00:00:21,570 --> 00:00:24,660
‫as a good example to you and to give you the knowledge

10
00:00:24,660 --> 00:00:27,860
‫in order to later design your own data models,

11
00:00:27,860 --> 00:00:29,663
‫basically completely on your own.

12
00:00:30,640 --> 00:00:32,130
‫So let's do it now.

13
00:00:32,130 --> 00:00:34,560
‫And let's start with all the datasets

14
00:00:34,560 --> 00:00:37,690
‫that we actually need in our application.

15
00:00:37,690 --> 00:00:39,430
‫So starting with the tours,

16
00:00:39,430 --> 00:00:41,630
‫and that's of course the most obvious one.

17
00:00:41,630 --> 00:00:44,730
‫And we already have this one implemented.

18
00:00:44,730 --> 00:00:47,150
‫Then also we need some users.

19
00:00:47,150 --> 00:00:50,590
‫And, again, we already have a users collection actually

20
00:00:50,590 --> 00:00:51,870
‫in our database.

21
00:00:51,870 --> 00:00:54,020
‫And, so, basically tours and users

22
00:00:54,020 --> 00:00:56,470
‫are two completely separate datasets.

23
00:00:56,470 --> 00:00:58,270
‫And, so, we have them normalized.

24
00:00:58,270 --> 00:01:00,593
‫And of course they're not gonna be embedded.

25
00:01:01,540 --> 00:01:04,270
‫Next up, we're also gonna have reviews,

26
00:01:04,270 --> 00:01:06,360
‫and we will also have locations.

27
00:01:06,360 --> 00:01:07,300
‫Okay?

28
00:01:07,300 --> 00:01:09,380
‫Because most tours actually have a number

29
00:01:09,380 --> 00:01:10,930
‫of different locations.

30
00:01:10,930 --> 00:01:11,763
‫Okay?

31
00:01:11,763 --> 00:01:14,600
‫And, so, that again is yet another dataset.

32
00:01:14,600 --> 00:01:17,300
‫And finally, we're also gonna have bookings.

33
00:01:17,300 --> 00:01:20,780
‫But a little bit more about why that is in a second.

34
00:01:20,780 --> 00:01:23,320
‫Okay, so, we have all these datasets.

35
00:01:23,320 --> 00:01:25,950
‫Now let's actually model the relationships

36
00:01:25,950 --> 00:01:27,480
‫that exist between them.

37
00:01:27,480 --> 00:01:29,100
‫And I'm gonna start with the relationship

38
00:01:29,100 --> 00:01:31,470
‫between users and reviews.

39
00:01:31,470 --> 00:01:36,100
‫And this relationship is clearly a one-to-many relationship

40
00:01:36,100 --> 00:01:39,260
‫because one user can write multiple reviews,

41
00:01:39,260 --> 00:01:42,360
‫but one review can only belong to one user.

42
00:01:42,360 --> 00:01:45,550
‫And the parent in this relationship is clearly the users,

43
00:01:45,550 --> 00:01:47,240
‫and the child, the reviews

44
00:01:47,240 --> 00:01:51,160
‫because again it's the parent, so the users in this case,

45
00:01:51,160 --> 00:01:53,560
‫who can be related to many reviews,

46
00:01:53,560 --> 00:01:56,730
‫but one review can only be related to one user.

47
00:01:56,730 --> 00:01:59,290
‫Anyway, I chose to model this relationship

48
00:01:59,290 --> 00:02:01,160
‫using parent referencing.

49
00:02:01,160 --> 00:02:04,830
‫And that's because a user can write a lot of reviews

50
00:02:04,830 --> 00:02:07,490
‫and also because we might actually need to query

51
00:02:07,490 --> 00:02:09,600
‫only for the reviews on their own.

52
00:02:09,600 --> 00:02:12,490
‫So the data axis pattern is really important

53
00:02:12,490 --> 00:02:16,300
‫to take into consideration in this particular relationship.

54
00:02:16,300 --> 00:02:18,940
‫Now, about the kind of referencing that we're gonna use,

55
00:02:18,940 --> 00:02:20,610
‫it is parent referencing,

56
00:02:20,610 --> 00:02:24,220
‫so basically the review keeping a reference of the user.

57
00:02:24,220 --> 00:02:26,670
‫So keeping an ID, basically.

58
00:02:26,670 --> 00:02:28,220
‫And that is as you already know

59
00:02:28,220 --> 00:02:32,510
‫because we do not want to allow a race to grow indefinitely.

60
00:02:32,510 --> 00:02:33,940
‫And that might be the case

61
00:02:33,940 --> 00:02:37,860
‫if a user writes tons and tons (laughs) of reviews.

62
00:02:37,860 --> 00:02:38,930
‫Okay?

63
00:02:38,930 --> 00:02:41,790
‫Also, it's nice to have the review knowing

64
00:02:41,790 --> 00:02:43,220
‫who actually wrote it.

65
00:02:43,220 --> 00:02:44,053
‫Okay?

66
00:02:44,053 --> 00:02:46,440
‫And, so, having the user ID right on the review

67
00:02:46,440 --> 00:02:48,273
‫will also us to do just that.

68
00:02:49,120 --> 00:02:49,953
‫All right.

69
00:02:49,953 --> 00:02:51,060
‫Next up, let's take a look

70
00:02:51,060 --> 00:02:54,310
‫at the relationship between tours and reviews.

71
00:02:54,310 --> 00:02:56,580
‫And this one is actually very similar.

72
00:02:56,580 --> 00:02:59,450
‫So, again, it's a one-to-many relationship,

73
00:02:59,450 --> 00:03:02,070
‫where one tour can have multiple reviews

74
00:03:02,070 --> 00:03:05,260
‫but one review can only be about one tour.

75
00:03:05,260 --> 00:03:06,093
‫Right?

76
00:03:06,093 --> 00:03:07,810
‫So that's the way it makes sense.

77
00:03:07,810 --> 00:03:11,180
‫And, so, we're actually gonna model it in the exact same way

78
00:03:11,180 --> 00:03:13,380
‫as the user-reviews relationship.

79
00:03:13,380 --> 00:03:15,460
‫So, again, parent referencing,

80
00:03:15,460 --> 00:03:17,670
‫so that in the end the reviews end up

81
00:03:17,670 --> 00:03:20,530
‫with a tour ID and a user ID.

82
00:03:20,530 --> 00:03:23,270
‫And, so, then once we query for reviews,

83
00:03:23,270 --> 00:03:25,040
‫we always know exactly.

84
00:03:25,040 --> 00:03:27,930
‫Great, so let's now talk about the relationship

85
00:03:27,930 --> 00:03:30,800
‫between tours and locations.

86
00:03:30,800 --> 00:03:32,230
‫So as I mentioned earlier,

87
00:03:32,230 --> 00:03:35,230
‫each tour is gonna have a couple of locations.

88
00:03:35,230 --> 00:03:38,680
‫So for example, the park camper will basically stop

89
00:03:38,680 --> 00:03:41,080
‫in like three or four national parks.

90
00:03:41,080 --> 00:03:43,150
‫And, so, each of these national parks

91
00:03:43,150 --> 00:03:45,120
‫is gonna be one location.

92
00:03:45,120 --> 00:03:45,953
‫Right?

93
00:03:45,953 --> 00:03:49,700
‫And, so, each tour will basically have a few locations.

94
00:03:49,700 --> 00:03:52,730
‫Now, following that example, one of these national parks

95
00:03:52,730 --> 00:03:55,930
‫might also be part of one of the other tours.

96
00:03:55,930 --> 00:03:58,260
‫And, so, basically this relationship here

97
00:03:58,260 --> 00:04:00,770
‫is a few-to-few relationship.

98
00:04:00,770 --> 00:04:03,630
‫And we called this relationship many-to-many before

99
00:04:03,630 --> 00:04:06,480
‫but we still can also call them few-to-few

100
00:04:06,480 --> 00:04:08,910
‫or a ton to a ton.

101
00:04:08,910 --> 00:04:10,850
‫And, so, I called them few-to-few

102
00:04:10,850 --> 00:04:15,290
‫because each tour is only gonna have three, four locations

103
00:04:15,290 --> 00:04:17,460
‫but not really like 100.

104
00:04:17,460 --> 00:04:18,370
‫Okay?

105
00:04:18,370 --> 00:04:21,540
‫And, again, each of the locations can also be part

106
00:04:21,540 --> 00:04:23,060
‫of another tour.

107
00:04:23,060 --> 00:04:26,210
‫Now, this could be a good example for actually implementing

108
00:04:26,210 --> 00:04:30,670
‫two-way referencing, so basically normalizing the locations

109
00:04:30,670 --> 00:04:32,480
‫into its own dataset.

110
00:04:32,480 --> 00:04:33,313
‫Right?

111
00:04:33,313 --> 00:04:36,330
‫But instead I'm actually gonna denormalize the locations

112
00:04:36,330 --> 00:04:39,270
‫so to embed them into the tours.

113
00:04:39,270 --> 00:04:41,350
‫And that's actually for multiple reasons.

114
00:04:41,350 --> 00:04:44,500
‫First, because there only so few locations.

115
00:04:44,500 --> 00:04:47,400
‫Also, we will not really gonna access the locations

116
00:04:47,400 --> 00:04:48,690
‫on their own.

117
00:04:48,690 --> 00:04:51,890
‫And, finally, these locations are intrinsically related

118
00:04:51,890 --> 00:04:55,400
‫to the tours because really without locations

119
00:04:55,400 --> 00:04:57,280
‫there couldn't be any tours.

120
00:04:57,280 --> 00:04:58,113
‫Right?

121
00:04:58,113 --> 00:05:00,480
‫So these datasets belong closely together.

122
00:05:00,480 --> 00:05:04,030
‫And, so, I chose to embed locations into tours

123
00:05:04,030 --> 00:05:06,580
‫and not create yet another collection for these.

124
00:05:06,580 --> 00:05:07,413
‫Right?

125
00:05:07,413 --> 00:05:10,750
‫So we will have one collection for tours, one for users,

126
00:05:10,750 --> 00:05:13,330
‫and a bit later we will also create a new collection

127
00:05:13,330 --> 00:05:14,710
‫for the reviews.

128
00:05:14,710 --> 00:05:15,543
‫All right?

129
00:05:15,543 --> 00:05:18,860
‫But for locations, again, because these will be embedded

130
00:05:18,860 --> 00:05:19,793
‫into the tours.

131
00:05:20,640 --> 00:05:23,710
‫Okay, and next up there's also a relationship

132
00:05:23,710 --> 00:05:26,250
‫between the tours and the users.

133
00:05:26,250 --> 00:05:28,780
‫And that's because we're gonna have tour guides

134
00:05:28,780 --> 00:05:33,150
‫in the tours, and these tour guides will actually be users.

135
00:05:33,150 --> 00:05:36,270
‫So remember how we actually gave users a role

136
00:05:36,270 --> 00:05:37,760
‫in our Mongoose schema?

137
00:05:37,760 --> 00:05:40,770
‫And the possibilities there contained the guide

138
00:05:40,770 --> 00:05:43,020
‫and lead guide, remember?

139
00:05:43,020 --> 00:05:44,670
‫And, so, there's gonna be a relationship

140
00:05:44,670 --> 00:05:48,210
‫between these types of users and the tours.

141
00:05:48,210 --> 00:05:52,240
‫Now, this relationship is again a few-to-few relationship

142
00:05:52,240 --> 00:05:55,550
‫because one tour can have only a few users,

143
00:05:55,550 --> 00:05:58,410
‫so a few tour guides, but at the same time,

144
00:05:58,410 --> 00:06:02,150
‫each tour guide can also be guiding a few tours.

145
00:06:02,150 --> 00:06:02,983
‫All right?

146
00:06:02,983 --> 00:06:06,490
‫And, so, again, there's a many-to-many relationship here,

147
00:06:06,490 --> 00:06:09,270
‫which I simply called here few-to-few.

148
00:06:09,270 --> 00:06:12,140
‫Now, about actually modeling this relationship,

149
00:06:12,140 --> 00:06:14,410
‫we could do it in two ways.

150
00:06:14,410 --> 00:06:17,280
‫We could use referencing or embedding.

151
00:06:17,280 --> 00:06:19,620
‫And actually I'm gonna show you how to implement

152
00:06:19,620 --> 00:06:22,830
‫both child referencing embedding using Mongoose

153
00:06:22,830 --> 00:06:24,410
‫throughout this section.

154
00:06:24,410 --> 00:06:25,620
‫Okay?

155
00:06:25,620 --> 00:06:28,800
‫And the argument for embedding is that in this case

156
00:06:28,800 --> 00:06:31,930
‫we could then have all the information about each tour

157
00:06:31,930 --> 00:06:34,310
‫containing the information about tour guides

158
00:06:34,310 --> 00:06:36,700
‫right on each tour document.

159
00:06:36,700 --> 00:06:38,710
‫But on the other hand, that would then create

160
00:06:38,710 --> 00:06:41,120
‫some extra information in the database

161
00:06:41,120 --> 00:06:43,670
‫because we will still need to have the users

162
00:06:43,670 --> 00:06:45,210
‫as a separate collection

163
00:06:45,210 --> 00:06:48,700
‫simply because we need to access them all the time

164
00:06:48,700 --> 00:06:51,250
‫for user authentication and authorization

165
00:06:51,250 --> 00:06:52,510
‫and all that stuff.

166
00:06:52,510 --> 00:06:56,290
‫So usually, users are always an entity on their own

167
00:06:56,290 --> 00:06:57,700
‫in each database.

168
00:06:57,700 --> 00:06:58,533
‫Okay?

169
00:06:58,533 --> 00:07:02,380
‫But we could still embed some of the users into the tours.

170
00:07:02,380 --> 00:07:04,750
‫So basically when the user is a tour guide

171
00:07:04,750 --> 00:07:08,190
‫for a specific tour, we could then copy all this data

172
00:07:08,190 --> 00:07:09,950
‫into the tour document.

173
00:07:09,950 --> 00:07:10,783
‫Okay?

174
00:07:10,783 --> 00:07:14,230
‫But also we would then have to update the user on the tour

175
00:07:14,230 --> 00:07:17,590
‫each time that the underlying user itself changes.

176
00:07:17,590 --> 00:07:19,710
‫So let's say that the role of a user changes

177
00:07:19,710 --> 00:07:21,690
‫from guide to a lead guide.

178
00:07:21,690 --> 00:07:24,410
‫And in that case, we would then have to go to the tour

179
00:07:24,410 --> 00:07:26,850
‫and also update that role information

180
00:07:26,850 --> 00:07:28,840
‫right there on the embedded data.

181
00:07:28,840 --> 00:07:29,673
‫Okay?

182
00:07:29,673 --> 00:07:32,320
‫And, so, that's not ideal, and so we're actually

183
00:07:32,320 --> 00:07:35,350
‫also gonna then implement child referencing.

184
00:07:35,350 --> 00:07:37,280
‫And, so, with that, we can still keep

185
00:07:37,280 --> 00:07:39,590
‫basically the information about the tour guides

186
00:07:39,590 --> 00:07:42,860
‫on the users but simply in a referenced form,

187
00:07:42,860 --> 00:07:44,930
‫so basically keeping the IDs there,

188
00:07:44,930 --> 00:07:47,630
‫which are then gonna point to the users.

189
00:07:47,630 --> 00:07:48,463
‫Okay?

190
00:07:48,463 --> 00:07:51,370
‫And of course we could also use two-way referencing,

191
00:07:51,370 --> 00:07:55,100
‫so also keeping an ID of the tour right on the user.

192
00:07:55,100 --> 00:07:56,650
‫But I think that's a bit too much

193
00:07:56,650 --> 00:07:59,140
‫for this kind of small example

194
00:07:59,140 --> 00:08:02,850
‫because not all users will actually need an ID of the tour

195
00:08:02,850 --> 00:08:05,580
‫because not all users are tour guides.

196
00:08:05,580 --> 00:08:08,870
‫And, so, this relationship here is a bit tricky to model,

197
00:08:08,870 --> 00:08:10,800
‫I think, but I believe that in the end

198
00:08:10,800 --> 00:08:14,200
‫child referencing is gonna be the best way to go.

199
00:08:14,200 --> 00:08:15,033
‫Okay?

200
00:08:15,033 --> 00:08:17,220
‫But still, I'm gonna also show you embedding

201
00:08:17,220 --> 00:08:20,120
‫because I think that's also important to learn.

202
00:08:20,120 --> 00:08:21,400
‫All right?

203
00:08:21,400 --> 00:08:23,530
‫Next up, we have our bookings.

204
00:08:23,530 --> 00:08:26,130
‫And basically a new booking will be created

205
00:08:26,130 --> 00:08:29,340
‫each time that a user purchases a tour.

206
00:08:29,340 --> 00:08:31,340
‫So this is still kind of a relationship

207
00:08:31,340 --> 00:08:33,240
‫between users and tours

208
00:08:33,240 --> 00:08:36,950
‫because again it's a user who is gonna buy a tour.

209
00:08:36,950 --> 00:08:38,810
‫But we also want to store some data

210
00:08:38,810 --> 00:08:40,920
‫about that relationship itself,

211
00:08:40,920 --> 00:08:44,450
‫so in this case about the purchase itself in our database.

212
00:08:44,450 --> 00:08:46,430
‫For example, the price or the date

213
00:08:46,430 --> 00:08:49,560
‫when the purchase happened or something like that.

214
00:08:49,560 --> 00:08:50,810
‫And, so, in cases like this,

215
00:08:50,810 --> 00:08:53,750
‫it's a good idea to create an extra dataset,

216
00:08:53,750 --> 00:08:55,920
‫which in this case is the bookings.

217
00:08:55,920 --> 00:08:56,753
‫Okay?

218
00:08:56,753 --> 00:08:58,710
‫And, so, of course there will be a relationship

219
00:08:58,710 --> 00:09:02,398
‫between tours and bookings and also users and bookings.

220
00:09:02,398 --> 00:09:06,150
‫And, again, because basically the booking connects tours

221
00:09:06,150 --> 00:09:09,763
‫with users but kind of with an intermediate step.

222
00:09:09,763 --> 00:09:12,530
‫So one tour can have many bookings,

223
00:09:12,530 --> 00:09:15,760
‫but one booking can only belong to one tour.

224
00:09:15,760 --> 00:09:17,350
‫And the same thing with users.

225
00:09:17,350 --> 00:09:19,870
‫So one user can book many tours,

226
00:09:19,870 --> 00:09:23,610
‫but one booking can only belong to one of the users.

227
00:09:23,610 --> 00:09:26,380
‫And, so, of course we have a one-to-many relationship

228
00:09:26,380 --> 00:09:29,080
‫in both cases, and also in both cases,

229
00:09:29,080 --> 00:09:31,140
‫we're gonna use parent referencing.

230
00:09:31,140 --> 00:09:33,610
‫And, so, that means that on each booking

231
00:09:33,610 --> 00:09:37,640
‫we're gonna keep an ID of both the tour that was purchased

232
00:09:37,640 --> 00:09:40,270
‫and also of the user who actually purchased the tour.

233
00:09:40,270 --> 00:09:41,103
‫Okay?

234
00:09:41,103 --> 00:09:42,930
‫And, so, in this case, I'm doing it this way

235
00:09:42,930 --> 00:09:46,140
‫because basically I don't want to pollute the tour data

236
00:09:46,140 --> 00:09:49,510
‫with information about who actually bought the tour.

237
00:09:49,510 --> 00:09:50,343
‫Right?

238
00:09:50,343 --> 00:09:53,157
‫It wouldn't be really relevant to the tour data itself.

239
00:09:53,157 --> 00:09:55,070
‫And the same thing with users.

240
00:09:55,070 --> 00:09:58,370
‫So we also don't want to pollute the users object

241
00:09:58,370 --> 00:10:00,740
‫with all of the bookings that they did.

242
00:10:00,740 --> 00:10:01,573
‫All right?

243
00:10:01,573 --> 00:10:03,000
‫And, so, instead again

244
00:10:03,000 --> 00:10:05,770
‫we're gonna create an intermediate object

245
00:10:05,770 --> 00:10:08,450
‫or an intermediate dataset that's going to stand

246
00:10:08,450 --> 00:10:12,520
‫between users and tours whenever they create a new purchase.

247
00:10:12,520 --> 00:10:13,353
‫Right?

248
00:10:13,353 --> 00:10:14,590
‫Make sense?

249
00:10:14,590 --> 00:10:17,520
‫And that's actually it for our data model.

250
00:10:17,520 --> 00:10:21,370
‫And of course, this now looks kind of abstract,

251
00:10:21,370 --> 00:10:23,150
‫but once we start implementing it,

252
00:10:23,150 --> 00:10:24,660
‫it's gonna be very helpful

253
00:10:24,660 --> 00:10:28,730
‫to have all of our ideas organized into something like this.

254
00:10:28,730 --> 00:10:31,310
‫So whenever this data model that we're gonna implement

255
00:10:31,310 --> 00:10:34,560
‫throughout this section is becoming a bit confusing to you,

256
00:10:34,560 --> 00:10:36,970
‫then just reference back to this slide.

257
00:10:36,970 --> 00:10:39,080
‫Or you can maybe even print it out

258
00:10:39,080 --> 00:10:40,980
‫if that makes it easier for you.

259
00:10:40,980 --> 00:10:43,960
‫So this is our data model in theory.

260
00:10:43,960 --> 00:10:46,080
‫And now throughout the rest of the course,

261
00:10:46,080 --> 00:10:48,870
‫I will give you the tools to actually model the data

262
00:10:48,870 --> 00:10:50,543
‫using the Mongoose Library.

