1
00:00:03,710 --> 00:00:08,700
Let me give you a quick overview of MongoDB.

2
00:00:08,700 --> 00:00:10,875
Why is MongoDB interesting?

3
00:00:10,875 --> 00:00:13,165
How is it useful for our application?

4
00:00:13,165 --> 00:00:16,290
And what are some of the salient features of MongoDB in

5
00:00:16,290 --> 00:00:19,845
contrast to traditional SQL databases?

6
00:00:19,845 --> 00:00:23,190
So this will not be an entire treatise on databases.

7
00:00:23,190 --> 00:00:25,890
I assume that you have sufficient knowledge of databases.

8
00:00:25,890 --> 00:00:31,050
So what I introduce about MongoDB should be easy for you to follow.

9
00:00:31,050 --> 00:00:34,695
From your prior knowledge of databases,

10
00:00:34,695 --> 00:00:40,235
I assume that you already understand that databases are used to store structured data

11
00:00:40,235 --> 00:00:42,515
and also enable you to perform

12
00:00:42,515 --> 00:00:46,455
various operations on the data, including querying the data,

13
00:00:46,455 --> 00:00:48,740
inserting records into the database,

14
00:00:48,740 --> 00:00:53,870
updating an existing record in the database or deleting a record from the database.

15
00:00:53,870 --> 00:00:58,795
These are the typical CRUD operations that are supported on databases.
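The four CRUD operations just listed can be sketched with a toy in-memory store. This is only an illustration, not how any real database works; the names `records`, `create`, `read`, `update` and `delete` are made up for this example.

```python
# Toy in-memory "table" to illustrate create, read, update and delete.
# All names here are hypothetical and chosen only for this sketch.
records = {}

def create(record_id, data):
    """Insert a new record under the given id."""
    records[record_id] = dict(data)

def read(record_id):
    """Query a record by id; returns None if it does not exist."""
    return records.get(record_id)

def update(record_id, changes):
    """Modify fields of an existing record."""
    records[record_id].update(changes)

def delete(record_id):
    """Remove a record; does nothing if the id is unknown."""
    records.pop(record_id, None)

create(1, {"name": "Ada", "city": "London"})
update(1, {"city": "Paris"})
found = read(1)
delete(1)
```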

16
00:00:58,795 --> 00:01:02,870
Structured Query Language or SQL based databases have been

17
00:01:02,870 --> 00:01:07,330
very popular for a long time as a means of storing data.

18
00:01:07,330 --> 00:01:13,760
MySQL is one example of an SQL-based database.

19
00:01:13,760 --> 00:01:16,850
They have been very effective in storing

20
00:01:16,850 --> 00:01:20,230
data and then addressing many of the needs of applications.

21
00:01:20,230 --> 00:01:27,650
Indeed, many websites already use SQL databases as the backend for storing data. Given

22
00:01:27,650 --> 00:01:30,630
that, why are NoSQL databases

23
00:01:30,630 --> 00:01:35,610
important? With new kinds of applications that are coming online,

24
00:01:35,610 --> 00:01:37,850
there is an increasing demand for

25
00:01:37,850 --> 00:01:43,170
new features, not all of which SQL-based databases can address.

26
00:01:43,170 --> 00:01:46,400
So this is where NoSQL, or "not only

27
00:01:46,400 --> 00:01:51,485
SQL", databases are gaining a lot of ground,

28
00:01:51,485 --> 00:01:54,720
MongoDB being one example of that.

29
00:01:54,720 --> 00:01:58,745
So the NoSQL databases are designed to

30
00:01:58,745 --> 00:02:03,305
address some of the shortcomings of SQL-based databases.

31
00:02:03,305 --> 00:02:08,935
The NoSQL databases themselves can be classified into four different categories.

32
00:02:08,935 --> 00:02:13,315
We have document-based databases like MongoDB,

33
00:02:13,315 --> 00:02:17,820
we have the simpler key-value-based databases like Redis,

34
00:02:17,820 --> 00:02:24,710
column-family-based databases like Cassandra and then the newer graph databases like

35
00:02:24,710 --> 00:02:32,210
Neo4j, and indeed there are more now in the market than these examples that I have given.

36
00:02:32,210 --> 00:02:34,370
But of course in this course,

37
00:02:34,370 --> 00:02:40,050
we will be concentrating primarily on document-based databases, MongoDB in particular.

38
00:02:40,050 --> 00:02:44,855
So I will review more about MongoDB in the rest of this lecture.

39
00:02:44,855 --> 00:02:48,095
Document databases, as the name implies,

40
00:02:48,095 --> 00:02:50,200
are built around documents.

41
00:02:50,200 --> 00:02:56,530
A document is a self-contained unit of information and can be in many different formats,

42
00:02:56,530 --> 00:03:03,425
JSON being one of the most popular formats for storing documents in a document database.

43
00:03:03,425 --> 00:03:07,535
As an example, a JSON document is shown here and

44
00:03:07,535 --> 00:03:12,215
this would be something that is stored in a typical document database.

45
00:03:12,215 --> 00:03:15,730
Documents themselves can be organized into collections.

46
00:03:15,730 --> 00:03:20,830
So a collection is a group of documents and in turn,

47
00:03:20,830 --> 00:03:26,205
the database itself can be considered as a set of collections.

48
00:03:26,205 --> 00:03:31,970
So these terms, "documents", "collections" and "the database", will occur

49
00:03:31,970 --> 00:03:39,290
frequently when we discuss document databases and MongoDB in particular.
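The relationship between documents, collections and the database can be sketched with plain Python data structures. This is a rough mental model only, not how MongoDB stores data; the collection names, field names and the `find` helper are all invented for this example.

```python
import json

# Hypothetical sample data: one database holding two collections,
# each collection holding JSON-like documents.
database = {
    "dishes": [
        {"name": "Pizza", "price": 9.5, "tags": ["italian", "baked"]},
        {"name": "Dosa", "price": 4.0, "tags": ["indian"]},
    ],
    "comments": [
        {"dish": "Pizza", "rating": 5, "text": "Great crust"},
    ],
}

def find(collection_name, **criteria):
    """Return the documents in a collection whose fields equal the criteria."""
    return [
        doc for doc in database[collection_name]
        if all(doc.get(key) == value for key, value in criteria.items())
    ]

# Each document is self-contained, so it prints as a complete JSON object.
print(json.dumps(find("dishes", name="Dosa")[0], indent=2))
```

Note that documents in the same collection need not share the same fields, which is one way this model differs from rows in a table.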

50
00:03:39,290 --> 00:03:42,910
Why are NoSQL databases of interest to us?

51
00:03:42,910 --> 00:03:45,890
In particular, scalability is one of

52
00:03:45,890 --> 00:03:49,560
the reasons why NoSQL databases have done very well.

53
00:03:49,560 --> 00:03:52,015
Now in terms of scalability,

54
00:03:52,015 --> 00:03:53,630
when we look at the two requirements,

55
00:03:53,630 --> 00:03:56,990
availability and consistency of the databases, typically,

56
00:03:56,990 --> 00:04:02,390
SQL databases find it very difficult to meet both requirements simultaneously.

57
00:04:02,390 --> 00:04:05,239
So there is a tradeoff between availability and consistency.

58
00:04:05,239 --> 00:04:08,690
So this is where NoSQL databases have been a lot

59
00:04:08,690 --> 00:04:12,175
more successful at meeting both the requirements.

60
00:04:12,175 --> 00:04:14,705
This is where the third aspect highlighted here,

61
00:04:14,705 --> 00:04:17,465
partition tolerance, also comes into effect.

62
00:04:17,465 --> 00:04:23,660
Now partitioning a SQL database and then distributing it is not as straightforward.

63
00:04:23,660 --> 00:04:29,239
Whereas a NoSQL database is a lot more amenable

64
00:04:29,239 --> 00:04:34,755
to being subdivided and then distributed across multiple servers.

65
00:04:34,755 --> 00:04:40,990
The second aspect of why NoSQL databases have been popular is ease of deployment.

66
00:04:40,990 --> 00:04:43,165
When you use an SQL database,

67
00:04:43,165 --> 00:04:48,200
there is a need for matching the records in your SQL database back

68
00:04:48,200 --> 00:04:53,410
to objects in your native language like Java or JavaScript and so on.

69
00:04:53,410 --> 00:04:56,810
So there is a need for object relational mapping and this is

70
00:04:56,810 --> 00:05:00,920
where an intermediate gateway needs to fill in this requirement.

71
00:05:00,920 --> 00:05:07,950
With a NoSQL database like a document-based database storing data in the form of JSON,

72
00:05:07,950 --> 00:05:12,200
the mapping becomes quite straightforward and that is one of the reasons why

73
00:05:12,200 --> 00:05:17,900
NoSQL databases have been very popular in the web development area.
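To see why the mapping is straightforward, here is a minimal sketch of a JSON document turning into a native object and back with no mapping layer in between. The `Dish` class and its fields are assumptions made up for this example.

```python
import json
from dataclasses import dataclass, asdict

@dataclass
class Dish:
    """Hypothetical application object; the fields mirror the document's keys."""
    name: str
    price: float

# A document as it might come back from a document database, as JSON text.
stored = '{"name": "Pizza", "price": 9.5}'

# JSON keys line up with the object's fields, so no column-to-field
# mapping layer is needed.
dish = Dish(**json.loads(stored))

# Going the other way is equally direct.
round_trip = json.dumps(asdict(dish))
```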

74
00:05:17,900 --> 00:05:20,680
Coming to MongoDB in particular,

75
00:05:20,680 --> 00:05:23,820
MongoDB is a document database.

76
00:05:23,820 --> 00:05:27,150
The server itself can support multiple databases.

77
00:05:27,150 --> 00:05:31,790
A database in particular is a set of collections

78
00:05:31,790 --> 00:05:36,620
and the collection itself as we discussed earlier is a set of documents.

79
00:05:36,620 --> 00:05:41,705
So the document becomes the unit of information in case of MongoDB.

80
00:05:41,705 --> 00:05:46,825
The document in MongoDB is nothing but a JSON document.

81
00:05:46,825 --> 00:05:53,470
In fact, MongoDB stores the document in a more compact form called the BSON format.

82
00:05:53,470 --> 00:05:56,495
We'll talk about that in the next slide.

83
00:05:56,495 --> 00:06:00,175
While MongoDB is a document-based database,

84
00:06:00,175 --> 00:06:04,160
it stores the JSON documents in a compact form

85
00:06:04,160 --> 00:06:09,125
called the BSON format, or the binary JSON format.

86
00:06:09,125 --> 00:06:12,920
Now this supports a length prefix on each value so

87
00:06:12,920 --> 00:06:16,870
that skipping over a field becomes a lot easier.
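The idea of a length prefix can be sketched with a toy binary format. This is NOT the real BSON layout, which differs in its details; the field layout and the `encode` and `lookup` helpers below are assumptions invented for this illustration.

```python
import struct

def encode(fields):
    """Encode {name: bytes} as: name length, name, value length, value."""
    out = bytearray()
    for name, value in fields.items():
        key = name.encode("utf-8")
        out += struct.pack("<B", len(key)) + key      # 1-byte name length
        out += struct.pack("<I", len(value)) + value  # 4-byte value length
    return bytes(out)

def lookup(blob, wanted):
    """Find a field by name, jumping over the values of other fields."""
    pos = 0
    while pos < len(blob):
        (key_len,) = struct.unpack_from("<B", blob, pos)
        pos += 1
        key = blob[pos:pos + key_len].decode("utf-8")
        pos += key_len
        (value_len,) = struct.unpack_from("<I", blob, pos)
        pos += 4
        if key == wanted:
            return blob[pos:pos + value_len]
        pos += value_len  # the prefix tells us how far to jump
    return None

# The large "bio" value is skipped in one step when looking for "name".
blob = encode({"bio": b"x" * 1000, "name": b"Ada"})
name = lookup(blob, "name")
```

Without the length prefix, a reader would have to scan the value itself to find where it ends, as a parser for textual JSON must.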

88
00:06:16,870 --> 00:06:23,365
So as you see, MongoDB supports more features than a simple document database.

89
00:06:23,365 --> 00:06:27,835
The information about the type of a field value is also stored.

90
00:06:27,835 --> 00:06:31,280
And in addition, within the JSON document,

91
00:06:31,280 --> 00:06:35,405
additional primitive types are stored which are useful

92
00:06:35,405 --> 00:06:39,860
when you are performing operations on the database.

93
00:06:39,860 --> 00:06:43,190
Things like the UTC date format,

94
00:06:43,190 --> 00:06:48,920
it also supports raw binary and also uses an object ID format

95
00:06:48,920 --> 00:06:54,790
for storing the ID of each document in the database if you choose to.

96
00:06:54,790 --> 00:06:58,745
Let's talk about that in a bit more detail in the next slide.

97
00:06:58,745 --> 00:07:02,330
Let's talk about the MongoDB object ID.

98
00:07:02,330 --> 00:07:07,000
Every document in a MongoDB database must have an ID field,

99
00:07:07,000 --> 00:07:08,985
the _id field,

100
00:07:08,985 --> 00:07:14,055
which acts as the primary key for the document.

101
00:07:14,055 --> 00:07:17,465
And this field is unique for each document.

102
00:07:17,465 --> 00:07:20,810
The ID field itself can be used in

103
00:07:20,810 --> 00:07:25,955
many formats and one particular format that MongoDB automatically

104
00:07:25,955 --> 00:07:30,020
assigns in case you don't choose to use your own ID field

105
00:07:30,020 --> 00:07:35,350
is the object ID that is created by default by MongoDB.

106
00:07:35,350 --> 00:07:37,550
So the object ID itself is

107
00:07:37,550 --> 00:07:43,660
a structured piece of information but is stored as the ID of the document.

108
00:07:43,660 --> 00:07:47,825
As an example, the ID field that is automatically

109
00:07:47,825 --> 00:07:52,390
assigned by Mongo in case you don't specify an ID field,

110
00:07:52,390 --> 00:07:55,960
contains the object ID in the form of a long string.

111
00:07:55,960 --> 00:08:00,605
Now this string has a specific format which

112
00:08:00,605 --> 00:08:06,530
enables it to store a number of pieces of information within the object ID.

113
00:08:06,530 --> 00:08:11,975
Let's look at the structure of the object ID itself in the next slide.

114
00:08:11,975 --> 00:08:16,325
As I mentioned, the object ID field itself is

115
00:08:16,325 --> 00:08:22,635
a 12 byte field which stores information in a specific format.

116
00:08:22,635 --> 00:08:26,445
The first four bytes hold a timestamp,

117
00:08:26,445 --> 00:08:31,760
the typical Unix timestamp in the resolution of a second.

118
00:08:31,760 --> 00:08:34,340
So this is stored in the first four bytes.

119
00:08:34,340 --> 00:08:37,580
Then the next three bytes store the machine ID,

120
00:08:37,580 --> 00:08:40,490
the machine on which the Mongo server is running

121
00:08:40,490 --> 00:08:43,910
and the next two bytes are the process ID,

122
00:08:43,910 --> 00:08:47,000
the specific Mongo process which has created

123
00:08:47,000 --> 00:08:50,674
this document, and then the last field, the remaining three bytes, is an increment.

124
00:08:50,674 --> 00:08:52,490
Now as you understand,

125
00:08:52,490 --> 00:08:56,390
the timestamp field itself is at the resolution of a second.

126
00:08:56,390 --> 00:09:00,110
So if you have multiple documents that are stored within the same second,

127
00:09:00,110 --> 00:09:03,735
then the increment field will distinguish among the documents.

128
00:09:03,735 --> 00:09:06,500
The increment field is a self-incrementing field.

129
00:09:06,500 --> 00:09:11,750
So each new document created within a second will get a new increment value.

130
00:09:11,750 --> 00:09:14,150
So combined together with these two,

131
00:09:14,150 --> 00:09:16,655
you can easily distinguish between

132
00:09:16,655 --> 00:09:21,550
different documents that are stored within your document database.

133
00:09:21,550 --> 00:09:28,500
So this enables you to clearly give a unique ID to each document.

134
00:09:28,500 --> 00:09:30,960
Not only that, given an ID,

135
00:09:30,960 --> 00:09:34,050
you can easily retrieve information from this ID.

136
00:09:34,050 --> 00:09:37,460
So for example, you can get hold of the ObjectID and then call

137
00:09:37,460 --> 00:09:40,880
the getTimestamp method of the object ID and

138
00:09:40,880 --> 00:09:44,655
this will return the timestamp in the ISO date format.

139
00:09:44,655 --> 00:09:49,595
So that will enable you to identify when this document has been created.
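The 12-byte layout described in this lecture can be decoded by hand, as in the sketch below. It assumes the layout exactly as presented here (4-byte timestamp, 3-byte machine ID, 2-byte process ID, 3-byte increment); newer MongoDB versions replace the machine and process bytes with a random value, so only the timestamp and increment parts carry over. The function name `parse_object_id` and the sample ID are made up for this example.

```python
from datetime import datetime, timezone

def parse_object_id(hex_id):
    """Split a 24-character hex ObjectId into the fields described above."""
    raw = bytes.fromhex(hex_id)
    if len(raw) != 12:
        raise ValueError("an ObjectId is 12 bytes (24 hex characters)")
    seconds = int.from_bytes(raw[0:4], "big")  # Unix time, one-second resolution
    return {
        "timestamp": datetime.fromtimestamp(seconds, tz=timezone.utc),
        "machine": raw[4:7].hex(),                      # 3 bytes
        "process": int.from_bytes(raw[7:9], "big"),     # 2 bytes
        "increment": int.from_bytes(raw[9:12], "big"),  # 3 bytes
    }

# Hypothetical sample ID, used only to exercise the function.
parts = parse_object_id("507f1f77bcf86cd799439011")
print(parts["timestamp"].isoformat())
```

The timestamp extracted here is the same piece of information that the getTimestamp method mentioned above reads out of the ID.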

140
00:09:49,595 --> 00:09:52,750
With this quick understanding of MongoDB,

141
00:09:52,750 --> 00:09:58,700
let's proceed on to the exercise where we will first install MongoDB on our computer

142
00:09:58,700 --> 00:10:04,970
and thereafter interact with the MongoDB database using the Mongo REPL,

143
00:10:04,970 --> 00:10:09,365
the read-evaluate-print loop that Mongo supports.

144
00:10:09,365 --> 00:10:12,695
Now thereafter, we will look at how we can access

145
00:10:12,695 --> 00:10:19,830
the Mongo server from within our Node application in the next lesson.