1
00:00:02,290 --> 00:00:04,590
So let's explore these two kinds

2
00:00:04,590 --> 00:00:06,800
of database management systems,

3
00:00:06,800 --> 00:00:10,020
and let's start with the SQL databases

4
00:00:10,020 --> 00:00:13,223
with relational database management systems.

5
00:00:14,380 --> 00:00:17,440
Now, how do SQL databases work?

6
00:00:17,440 --> 00:00:20,090
What's the idea behind them?

7
00:00:20,090 --> 00:00:21,810
Now, in SQL databases,

8
00:00:21,810 --> 00:00:25,550
you typically store so-called normalized data

9
00:00:25,550 --> 00:00:27,720
across multiple tables.

10
00:00:27,720 --> 00:00:30,360
And I'll come back to what normalized data means.

11
00:00:30,360 --> 00:00:32,790
For the moment, let me give you an example.

12
00:00:32,790 --> 00:00:34,700
Let's say we're building a website

13
00:00:34,700 --> 00:00:39,700
where we need airports and flights between those airports.

14
00:00:41,190 --> 00:00:44,680
If we opt for a SQL database,

15
00:00:44,680 --> 00:00:48,370
we could create two tables in that database.

16
00:00:48,370 --> 00:00:51,150
One for the airports and one for the flights.

17
00:00:51,150 --> 00:00:53,050
And then in every table,

18
00:00:53,050 --> 00:00:56,090
we would have columns and rows.

19
00:00:56,090 --> 00:00:58,990
For airports, we could have three columns,

20
00:00:58,990 --> 00:01:02,860
with the idea of the airport, the city and the country.

21
00:01:02,860 --> 00:01:05,129
We could also have more columns.

22
00:01:05,129 --> 00:01:07,493
You can have as many columns as you want.

23
00:01:08,490 --> 00:01:11,480
Now for flights, we could also have three columns.

24
00:01:11,480 --> 00:01:13,860
We could also have more or less.

25
00:01:13,860 --> 00:01:16,650
You don't need to have the same amount of columns

26
00:01:16,650 --> 00:01:18,880
into the two tables because the two tables

27
00:01:18,880 --> 00:01:21,530
will be totally independent from each other.

28
00:01:21,530 --> 00:01:22,750
And then in flights,

29
00:01:22,750 --> 00:01:25,370
we also might have IDs for the flights,

30
00:01:25,370 --> 00:01:28,810
and then we have our start and destination airports,

31
00:01:28,810 --> 00:01:32,593
where we pick up the IDs from our airports.

32
00:01:33,500 --> 00:01:34,750
And that's important.

33
00:01:34,750 --> 00:01:38,700
We kind of relate these two tables with each other

34
00:01:38,700 --> 00:01:42,370
by using the ID from one table as a value

35
00:01:42,370 --> 00:01:45,040
in another column in another table.

36
00:01:45,040 --> 00:01:48,280
So the IDs of our airports like JFK,

37
00:01:48,280 --> 00:01:51,115
are used in the start and destination fields

38
00:01:51,115 --> 00:01:53,400
of our flights table.

39
00:01:53,400 --> 00:01:57,743
And that's a very common thing to do in SQL databases.

40
00:01:58,580 --> 00:02:01,080
Now, the data here is normalized

41
00:02:01,080 --> 00:02:05,360
because we don't have any nested data in any table.

42
00:02:05,360 --> 00:02:10,360
Every column in our tables holds only one value per row.

43
00:02:10,949 --> 00:02:15,290
Start does not hold two airports, but just one airport.

44
00:02:15,290 --> 00:02:17,270
City in our airports table,

45
00:02:17,270 --> 00:02:20,690
does not hold multiple cities, but one city.

46
00:02:20,690 --> 00:02:22,500
And therefore, here in this example,

47
00:02:22,500 --> 00:02:27,013
we would have normalized data split across multiple tables.

48
00:02:28,120 --> 00:02:31,490
Now, you would create such tables in your database

49
00:02:31,490 --> 00:02:34,780
with help of some database management software

50
00:02:34,780 --> 00:02:37,633
or by running such SQL queries.

51
00:02:38,520 --> 00:02:43,440
And then, the database would store that data and the schema

52
00:02:43,440 --> 00:02:47,560
of these tables in some files on your system,

53
00:02:47,560 --> 00:02:49,900
but it would do that for you.

54
00:02:49,900 --> 00:02:53,260
You are not involved in that file storing.

55
00:02:53,260 --> 00:02:55,020
You don't have to care about that.

56
00:02:55,020 --> 00:02:55,853
You instead,

57
00:02:55,853 --> 00:02:58,300
would only interact with your database

58
00:02:58,300 --> 00:03:00,183
and with those tables.

59
00:03:01,450 --> 00:03:03,980
Now, one important thing to understand

60
00:03:03,980 --> 00:03:05,580
about SQL databases,

61
00:03:05,580 --> 00:03:07,190
is therefore indeed,

62
00:03:07,190 --> 00:03:10,450
that you do have clearly defined schemas

63
00:03:10,450 --> 00:03:12,960
and also data types.

64
00:03:12,960 --> 00:03:16,470
Before you start saving or fetching any data,

65
00:03:16,470 --> 00:03:18,760
you set up your database tables

66
00:03:18,760 --> 00:03:21,630
and you come up with a schema for every table

67
00:03:21,630 --> 00:03:24,460
and you define which kind of data should be stored

68
00:03:24,460 --> 00:03:26,990
in every field of that database.

69
00:03:26,990 --> 00:03:27,823
Now for example,

70
00:03:27,823 --> 00:03:29,490
that the idea of the airports

71
00:03:29,490 --> 00:03:31,610
should be some unique string,

72
00:03:31,610 --> 00:03:34,370
that city and country holds some text data

73
00:03:34,370 --> 00:03:36,100
and they offer strings,

74
00:03:36,100 --> 00:03:39,330
that the IDs of our flights could be integers

75
00:03:39,330 --> 00:03:40,730
that should be unique,

76
00:03:40,730 --> 00:03:43,280
and that we then again, have more strings in there.

77
00:03:44,180 --> 00:03:47,670
The uniqueness is also an important feature here

78
00:03:47,670 --> 00:03:52,670
because that ensures that each row has a unique identifier,

79
00:03:53,080 --> 00:03:55,120
which is another very important thing

80
00:03:55,120 --> 00:03:58,180
in relational databases since you need

81
00:03:58,180 --> 00:04:01,850
to be able to establish relations between tables,

82
00:04:01,850 --> 00:04:05,220
with help of such unique identifiers.

83
00:04:05,220 --> 00:04:06,840
Without unique identifiers,

84
00:04:06,840 --> 00:04:09,320
you could not have clear relations

85
00:04:09,320 --> 00:04:12,793
between different pieces of data in different tables.

86
00:04:14,090 --> 00:04:16,130
And that brings us to the next point.

87
00:04:16,130 --> 00:04:18,130
With SQL databases,

88
00:04:18,130 --> 00:04:21,060
you can of course, then query your data,

89
00:04:21,060 --> 00:04:25,150
and especially also your related data.

90
00:04:25,150 --> 00:04:26,440
You could for example,

91
00:04:26,440 --> 00:04:29,730
run a simple query where you get all the flights

92
00:04:29,730 --> 00:04:31,710
that start in Munich

93
00:04:31,710 --> 00:04:34,070
and you could then go to the flights table

94
00:04:34,070 --> 00:04:36,370
and query on the start column

95
00:04:36,370 --> 00:04:39,760
and search for the MUC identifier

96
00:04:39,760 --> 00:04:44,040
and get all the entries where you have that identifier.

97
00:04:44,040 --> 00:04:48,250
And then you could return the start value and the ID

98
00:04:48,250 --> 00:04:52,043
and the destination value that belongs to this identifier.

99
00:04:53,020 --> 00:04:55,670
This would be a relatively simple query.

100
00:04:55,670 --> 00:04:58,850
You can also run queries across multiple tables,

101
00:04:58,850 --> 00:05:00,950
and indeed that's one of the main ideas

102
00:05:00,950 --> 00:05:03,810
behind relational databases.

103
00:05:03,810 --> 00:05:04,960
You can for example,

104
00:05:04,960 --> 00:05:08,670
get all the flights data where flights start in Munich,

105
00:05:08,670 --> 00:05:11,470
but then you don't just want to get the flights data,

106
00:05:11,470 --> 00:05:15,710
but also all the data for that start airport.

107
00:05:15,710 --> 00:05:17,890
So then, you would use that relation

108
00:05:17,890 --> 00:05:19,850
and that unique identifier,

109
00:05:19,850 --> 00:05:22,530
which is so important as I mentioned,

110
00:05:22,530 --> 00:05:24,960
to then also go to the airport's table,

111
00:05:24,960 --> 00:05:29,680
look for the MUC identifier data in the ID column

112
00:05:29,680 --> 00:05:34,020
and fetch all the data related to that airport.

113
00:05:34,020 --> 00:05:37,800
And then one query could return data about the flight,

114
00:05:37,800 --> 00:05:40,640
so the ID, start and destination,

115
00:05:40,640 --> 00:05:42,870
and also in the same query,

116
00:05:42,870 --> 00:05:44,860
data about the starting airport,

117
00:05:44,860 --> 00:05:46,633
like the city and the country.

118
00:05:47,840 --> 00:05:49,980
That's the idea behind SQL.

119
00:05:49,980 --> 00:05:52,430
And of course, it still might be a bit abstract

120
00:05:52,430 --> 00:05:54,530
with these slides only,

121
00:05:54,530 --> 00:05:56,430
but in the very next section,

122
00:05:56,430 --> 00:06:00,080
we are going to dive deeper into SQL databases,

123
00:06:00,080 --> 00:06:03,890
and then we're going to play with some concrete examples

124
00:06:03,890 --> 00:06:05,350
in a real database.

125
00:06:05,350 --> 00:06:08,690
And then this will definitely become very clear.

126
00:06:08,690 --> 00:06:11,490
So, these were SQL databases

127
00:06:11,490 --> 00:06:14,080
and the key characteristics really are,

128
00:06:14,080 --> 00:06:15,960
that you have multiple tables,

129
00:06:15,960 --> 00:06:18,400
possibly a lot of tables,

130
00:06:18,400 --> 00:06:21,850
and that you split your data across these tables,

131
00:06:21,850 --> 00:06:25,400
that every table has a clearly defined schema

132
00:06:25,400 --> 00:06:29,370
with clearly defined data types for the different fields,

133
00:06:29,370 --> 00:06:31,440
and that you can then run queries

134
00:06:31,440 --> 00:06:34,830
on a single table or on multiple tables

135
00:06:34,830 --> 00:06:36,920
by using these relations,

136
00:06:36,920 --> 00:06:38,440
which under the hood,

137
00:06:38,440 --> 00:06:41,090
utilize unique identifiers,

138
00:06:41,090 --> 00:06:43,283
which you have in every table.

