WEBVTT

00:04.360 --> 00:10.000
Integration testing tests how different parts of the system, which in our case is the ELT pipeline,

00:10.040 --> 00:10.840
work together.

00:11.240 --> 00:17.400
An example of this would be the part where we transform the data from staging to the core layer and

00:17.400 --> 00:19.160
upload it to the Postgres table.

00:19.440 --> 00:26.080
So here we would want to test that the transformed data, which we create using the Python functions, is

00:26.080 --> 00:28.440
uploaded correctly to the Postgres database.

00:28.600 --> 00:34.440
As we said in the unit test lecture, we will aim to use real credentials for integration testing.

00:34.840 --> 00:41.560
And we can start off with what we tested in the unit tests, the Airflow variables and database connection,

00:41.800 --> 00:43.720
but using real credentials.

00:44.120 --> 00:47.720
Let's start with the variables part by first defining a function,

00:48.000 --> 00:52.680
get_airflow_variable, in the conftest script,

00:52.840 --> 00:54.720
since this function will be reused.

00:59.760 --> 01:05.040
This function will return the value of the variables from the environment by specifying the variable

01:05.040 --> 01:06.240
name as an argument.

01:09.560 --> 01:16.480
We then construct the full environment variable name by defining a variable env_var and setting

01:16.480 --> 01:18.680
it to the Airflow variable name.

01:20.040 --> 01:26.640
So what this piece of code over here does is take the variable name string that was passed in and make sure it is

01:26.640 --> 01:27.480
all in caps.

01:27.640 --> 01:31.600
And finally we are retrieving the value from the environment.
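
NOTE The helper described above could look roughly like the sketch below. It assumes the variables are exposed through Airflow's AIRFLOW_VAR_ environment prefix; the exact prefix and the demo variable name are assumptions, not taken from the lecture's code.

```python
import os


def get_airflow_variable(variable_name):
    """Return the value of an Airflow variable read from the environment."""
    # Assumption: "api_key" is looked up as AIRFLOW_VAR_API_KEY.
    env_var = f"AIRFLOW_VAR_{variable_name.upper()}"
    # os.getenv returns None when the variable is not set.
    return os.getenv(env_var)


if __name__ == "__main__":
    os.environ["AIRFLOW_VAR_DEMO_KEY"] = "demo-value"  # placeholder value
    print(get_airflow_variable("demo_key"))
```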

01:34.840 --> 01:41.320
Now to use this helper function as a fixture what we need to do is to wrap it inside another function

01:41.720 --> 01:44.480
using the pytest fixture decorator.

01:45.920 --> 01:53.400
So the function we just defined will sit inside this outer function, airflow_variable, which

01:53.400 --> 01:56.680
is decorated with the pytest fixture decorator.

01:57.160 --> 01:59.080
Finally, we simply return it.

01:59.800 --> 02:00.640
So return.
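
NOTE A minimal sketch of the wrapping step described above, assuming the same AIRFLOW_VAR_ naming scheme. The outer function is the fixture and it returns the inner helper itself, so a test can call it with any variable name.

```python
import os

import pytest


@pytest.fixture
def airflow_variable():
    """Fixture that hands the lookup helper to any test requesting it."""

    def get_airflow_variable(variable_name):
        # Assumed naming scheme: AIRFLOW_VAR_ prefix plus upper-cased name.
        env_var = f"AIRFLOW_VAR_{variable_name.upper()}"
        return os.getenv(env_var)

    # Return the function, not a value, so tests choose the name to look up.
    return get_airflow_variable
```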

02:03.520 --> 02:06.280
We can now create a test that uses these variables.

02:06.760 --> 02:12.680
A simple test would be to use them as input variables for the API connection, to see if we get an okay

02:12.680 --> 02:13.960
response or not.

02:14.520 --> 02:23.560
So let's first create another script, which we will call integration_test

02:24.480 --> 02:24.920
.py.

02:25.720 --> 02:28.000
And here we will build a function.

02:28.080 --> 02:30.160
We will call it as follows.

02:31.680 --> 02:36.160
Here we need to make sure that we pass the airflow_variable fixture as an argument.

02:36.840 --> 02:42.240
We can then get the API key and the channel handle using the airflow_variable fixture which we just

02:42.240 --> 02:42.720
built.

02:44.880 --> 02:50.560
And as you can see here, we are using it for both the API key and the channel handle.

02:50.720 --> 02:53.040
Now we can define the URL.

02:53.440 --> 02:59.480
So this URL is actually from the very first function which we built in this course, which was the

02:59.480 --> 03:01.160
get_playlist_id function.

03:01.360 --> 03:07.320
Now to make the API request it would be better to use the try and except clauses, since we are connecting

03:07.360 --> 03:14.000
to the actual, real API, where there are more potential fail points than when using a mock connection,

03:14.200 --> 03:18.720
such as network issues, timeouts, authentication failures, to name a few.

03:18.880 --> 03:20.880
So let's define the try except block.

03:25.000 --> 03:29.120
And here we are making the API request using the requests library.

03:29.480 --> 03:32.400
So to do that we actually need to import the requests module.

03:34.560 --> 03:37.680
Another thing that we will need is the pytest module.

03:38.600 --> 03:39.880
Let's import pytest.

03:40.960 --> 03:47.200
What is left is to assert that the response status code is a successful one, which we know is 200.

03:47.840 --> 03:49.560
So let's just assert this right now.

03:54.840 --> 04:00.840
So to give a brief summary, what we're doing here is we're checking if the response that we get from

04:00.840 --> 04:08.400
the URL where we are using the real channel handle and the API key will give a successful response.

04:08.920 --> 04:16.560
If the request fails, then we call in the except block the pytest.fail function, which causes pytest to

04:16.640 --> 04:22.280
mark the test as failed, and we can also specify a failure reason in the pytest output.

04:23.080 --> 04:24.400
So that's it for this function.

04:24.400 --> 04:26.160
And now we can actually test it out.

04:26.160 --> 04:30.360
So we can use the same structure of the pytest

04:30.400 --> 04:31.840
testing syntax.

04:31.840 --> 04:34.480
What we need to do is to change the actual function.

04:34.840 --> 04:37.880
And here we replace it with the functions we just built.

04:40.800 --> 04:43.600
So actually I'm seeing that the tests were deselected.

04:44.280 --> 04:45.400
None were actually run.

04:45.800 --> 04:53.800
The reason is that we are no longer using the unit test script, but are instead using

04:55.840 --> 04:57.360
integration_test.py.

04:57.800 --> 04:59.360
So this should return the result now.

05:02.440 --> 05:04.360
And in fact it did.

05:04.400 --> 05:06.400
And the test has successfully passed.

05:06.720 --> 05:12.160
So what we're saying here is that the response was actually a 200 response code.

05:12.840 --> 05:13.040
Okay.

05:13.080 --> 05:18.160
So at this point we have just used the real API key and channel handle variables that will be used

05:18.160 --> 05:20.840
in the actual deployment of this ELT pipeline.

05:20.880 --> 05:26.000
Another test that we can do is with the connection to the database that will store the ELT data.

05:26.320 --> 05:32.840
Since this connection can be used by other functions, we again go to the conftest script and start

05:32.840 --> 05:36.840
writing the fixture that will contain the connection with the real credentials.

05:37.120 --> 05:43.000
As we have seen so far, we use the pytest fixture decorator and define an appropriate function name.

05:45.400 --> 05:47.760
In this case it will be real_postgres_connection.

05:49.720 --> 05:56.000
Then, as we have seen already, we retrieve the connection details to the Postgres database from the

05:56.000 --> 06:04.880
Airflow environment variables using the os.getenv function, where we have the following variables.

06:05.560 --> 06:11.130
Next we establish the connection to the Postgres database using the psycopg2 library.

06:11.810 --> 06:17.610
So let's go to the very top and import the library here.

06:20.370 --> 06:26.210
And since this is a real connection, we use the try and except block again and start passing the connection

06:26.210 --> 06:29.970
parameters, same as we have done in the database section of this course.

06:30.170 --> 06:32.170
We define a connection variable.

06:32.170 --> 06:34.730
Let's name it and set it to the value of None.

06:36.410 --> 06:42.290
And since this is a real connection, we use a try and except block and start passing the connection

06:42.290 --> 06:43.090
parameters.

06:44.330 --> 06:49.370
We can't forget to provide the connection to the test, so we yield the connection.

06:50.250 --> 06:52.530
Then we handle the errors in the except block.

06:54.410 --> 06:57.690
And finally we ensure that the connection is always closed.

06:58.090 --> 07:07.090
So in the finally block: if connection is defined, connection.close(). We can then come back to the integration test script

07:07.090 --> 07:12.250
and start writing our main function to verify the Postgres connection using the real credentials.
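
NOTE The connection fixture described above might be sketched as follows. The environment variable names are hypothetical, since the transcript does not list them. Unlike the lecture, which imports psycopg2 at the top of the file, this sketch loads it with pytest.importorskip so the test is skipped rather than errored when the driver is not installed.

```python
import os

import pytest


@pytest.fixture
def real_postgres_connection():
    # Deviation from the lecture: lazy import, skip if psycopg2 is missing.
    psycopg2 = pytest.importorskip("psycopg2")

    # Hypothetical variable names; use the ones defined for your deployment.
    dbname = os.getenv("ELT_DATABASE_NAME")
    user = os.getenv("ELT_DATABASE_USERNAME")
    password = os.getenv("ELT_DATABASE_PASSWORD")
    host = os.getenv("POSTGRES_CONN_HOST")
    port = os.getenv("POSTGRES_CONN_PORT")

    connection = None
    try:
        connection = psycopg2.connect(
            dbname=dbname, user=user, password=password, host=host, port=port
        )
        # Hand the live connection to the test.
        yield connection
    except psycopg2.Error as e:
        pytest.fail(f"Failed to connect to the database: {e}")
    finally:
        # Always close the connection if one was opened.
        if connection is not None:
            connection.close()
```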

07:12.490 --> 07:18.650
Same as we have done with the other functions, we define the function that uses the fixture from conftest.

07:18.690 --> 07:23.810
We then define the cursor variable as None, which will help with the closing logic in the finally block.

07:27.610 --> 07:34.210
Using the try, except, and finally blocks, we can start off with the try block and define the cursor

07:34.210 --> 07:39.930
using the following code. To verify that both the database connection and the SQL execution pipeline

07:39.930 --> 07:41.090
are working fine,

07:41.130 --> 07:46.850
we use a basic SELECT 1 SQL statement that selects the constant value 1.

07:49.290 --> 07:55.530
So when we come to fetch the first row of the result set using the fetchone method, it should return

07:55.530 --> 07:56.530
the value of 1.

07:58.610 --> 08:06.090
We then make this assertion by writing assert result[0] == 1.

08:06.530 --> 08:11.970
We have covered this already, but to be sure everyone is on the same page, we specify the first element

08:11.970 --> 08:14.290
in the result, which is this one here,

08:14.290 --> 08:20.450
because the result is a tuple when returned by the cursor.fetchone method.

08:20.930 --> 08:23.170
So it would return a tuple as follows.

08:24.650 --> 08:30.730
So in summary, to access the first and only element, we are using indexing.
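
NOTE The tuple behavior described above can be seen in a small standalone snippet. Here sqlite3 from the standard library stands in for Postgres so it runs without a server; that swap is an assumption of this sketch. A default psycopg2 cursor likewise returns each row as a tuple.

```python
import sqlite3

# In-memory SQLite database used only as a stand-in for Postgres.
connection = sqlite3.connect(":memory:")
cursor = connection.cursor()
cursor.execute("SELECT 1")

result = cursor.fetchone()  # one row, returned as a tuple
first_value = result[0]     # index 0 picks the only column

print(result)       # (1,)
print(first_value)  # 1

cursor.close()
connection.close()
```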

08:31.250 --> 08:32.810
So here we can remove this.

08:32.850 --> 08:36.010
And also here we terminate the try block.

08:36.330 --> 08:38.810
Next we can move on to the except block.

08:38.810 --> 08:42.970
But first we will need to do an import of the psycopg2 module.

08:47.010 --> 08:52.330
So coming back to the except block we state what happens when there is an error, which we do by defining

08:52.330 --> 08:58.610
the psycopg2.Error class and passing the exception to the variable e.

08:59.090 --> 09:06.490
We use the pytest.fail function, which will mark the test as failed and return the text in the quotes with

09:06.490 --> 09:07.290
the exception.

09:09.930 --> 09:14.410
Finally, we use the finally block similar to what we did for the connection.

09:14.410 --> 09:16.010
This time to close the cursor.

09:23.490 --> 09:28.370
Like that, the function is complete and now we can run the pytest test.

09:34.450 --> 09:37.890
As you can see, the test has outputted a pass.

09:37.890 --> 09:45.130
So this means that the credentials that we are using to connect to the database are working as expected.

09:45.130 --> 09:53.250
And the SELECT 1 statement, the SQL SELECT statement, indeed ran and we get the result as expected.

09:53.290 --> 09:56.610
With that, we have concluded our integration tests.

09:56.770 --> 10:01.850
There is much more you can do, and I invite you to create integration tests of your own to get more

10:01.850 --> 10:02.490
practice.

10:02.890 --> 10:05.970
Next up we will look into the end-to-end tests.
