WEBVTT

00:04.400 --> 00:09.440
In order to better understand the steps needed, I have listed them in point format, so we are all

00:09.440 --> 00:13.080
clear on what needs to be done. With regard to the first point,

00:13.280 --> 00:15.840
the tables are what will make up the data warehouse.

00:16.000 --> 00:21.360
In order to group these tables, we will define schemas, which are nothing more than a way to logically

00:21.360 --> 00:22.360
group objects.

00:22.680 --> 00:28.280
In our case, these objects would be mostly tables, but you can have other objects like views, stored

00:28.280 --> 00:29.600
procedures and more.

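NOTE
The idea of schemas as logical groups of tables could be sketched as below.
This is a minimal illustration, not the course's actual code: "staging" and
"core" are the schema names this lesson uses for the raw and refined layers,
while the "items" table and its columns are hypothetical placeholders.
```python
# Sketch only: builds DDL strings, it does not connect to a database.
# "staging" and "core" are the schema names used in this lesson;
# the "items" table and its columns are hypothetical placeholders.
SCHEMAS = ("staging", "core")
COLUMNS = {"item_id": "TEXT PRIMARY KEY", "payload": "TEXT"}
#
def create_schema_sql(schema):
    """Return the statement that creates a schema if it is missing."""
    return f"CREATE SCHEMA IF NOT EXISTS {schema};"
#
def create_table_sql(schema, table, columns):
    """Return the statement that creates schema.table with the given columns."""
    column_list = ", ".join(f"{name} {sql_type}" for name, sql_type in columns.items())
    return f"CREATE TABLE IF NOT EXISTS {schema}.{table} ({column_list});"
#
# One schema statement per layer, then the same table inside each schema.
ddl_statements = [create_schema_sql(schema) for schema in SCHEMAS]
ddl_statements += [create_table_sql(schema, "items", COLUMNS) for schema in SCHEMAS]
for statement in ddl_statements:
    print(statement)
```
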
00:29.800 --> 00:31.600
Moving on to the second point.

00:31.640 --> 00:34.160
When you create the tables, they will contain no data.

00:34.160 --> 00:40.840
So we need to define functions that will insert, update and, if needed, delete data from these tables.

00:41.360 --> 00:44.160
To do this, we will create functions in Python.

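NOTE
One possible shape for the insert, update and delete functions is sketched
below. It only builds parameterized statements and does not execute them.
The function names, the "staging.items" table and its columns are
hypothetical, and the %s placeholder style assumes a driver such as psycopg2.
```python
# Sketch only: returns (sql, params) pairs instead of executing them.
# The %s placeholders follow the style used by psycopg2; the table and
# column names are hypothetical placeholders.
def insert_row(table, row):
    """Build an INSERT for one row given as a dict of column to value."""
    columns = ", ".join(row)
    placeholders = ", ".join(["%s"] * len(row))
    return f"INSERT INTO {table} ({columns}) VALUES ({placeholders});", tuple(row.values())
#
def update_row(table, key_column, key_value, changes):
    """Build an UPDATE that sets the changed columns on the row matching the key."""
    assignments = ", ".join(f"{column} = %s" for column in changes)
    sql = f"UPDATE {table} SET {assignments} WHERE {key_column} = %s;"
    return sql, tuple(changes.values()) + (key_value,)
#
def delete_row(table, key_column, key_value):
    """Build a DELETE for the row matching the key."""
    return f"DELETE FROM {table} WHERE {key_column} = %s;", (key_value,)
#
insert_sql, insert_params = insert_row("staging.items", {"item_id": "a1", "payload": "raw text"})
```
With a DB-API driver, each pair would typically be passed on as
cursor.execute(sql, params). Table and column names are interpolated
directly here, so they should come from trusted code, never from user input.
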
00:44.240 --> 00:50.480
Finally, the third point: we need a way to transform the raw data that we just inserted into the raw

00:50.520 --> 00:55.160
layer and load it into the refined layer, where we will have our refined data.

00:55.600 --> 00:57.800
For this, we will also use Python.

00:58.320 --> 01:04.680
If we had to represent the data flow from the data source to the refined layer of the data warehouse, it

01:04.680 --> 01:05.800
would look like this.

01:06.000 --> 01:09.120
We have the data source, which in our case is the API.

01:09.520 --> 01:14.000
We extract and load the data into the raw layer of our Postgres data warehouse.

01:14.240 --> 01:17.440
The data in this layer will be under the staging schema.

01:17.480 --> 01:21.920
Then we perform transformations to get the data from the raw to the refined layer.

01:22.600 --> 01:25.360
The refined layer will be represented by the core schema.

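NOTE
The step from the raw (staging) layer to the refined (core) layer could look
roughly like the sketch below. Plain dicts stand in for database rows, and
the column names and cleaning rules (trimming, type casting, dropping
duplicates) are hypothetical examples, not the course's actual transformations.
```python
# Sketch only: rows are plain dicts standing in for database rows.
# The column names and the cleaning rules are hypothetical examples.
def transform_row(raw_row):
    """Turn one raw (staging) row into a refined (core) row."""
    return {
        "item_id": raw_row["item_id"].strip(),
        "title": raw_row["title"].strip().title(),
        "view_count": int(raw_row["view_count"]),
    }
#
def transform(raw_rows):
    """Refine every raw row, keeping only the first row seen per item_id."""
    refined = {}
    for raw_row in raw_rows:
        row = transform_row(raw_row)
        refined.setdefault(row["item_id"], row)
    return list(refined.values())
#
staging_rows = [
    {"item_id": " a1 ", "title": "first item", "view_count": "10"},
    {"item_id": "a1", "title": "first item", "view_count": "10"},
    {"item_id": "b2", "title": "second item", "view_count": "7"},
]
core_rows = transform(staging_rows)
```
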
01:25.400 --> 01:30.680
In reality, as a further refined layer, you will also have what is known as the data mart or presentation

01:30.680 --> 01:31.120
layer.

01:31.880 --> 01:36.960
The way you structure this data mart layer is company specific, but you could have, for example,

01:37.000 --> 01:41.120
a data mart for each department like marketing, sales and finance.

01:41.800 --> 01:46.640
In this course, we will focus only on the layers up to the refined layer.

01:47.080 --> 01:51.440
That's it for now, and I will see you next, when we set up the connection to the data warehouse.
