WEBVTT

00:07.430 --> 00:12.500
I welcome you back again to another lecture in PostgreSQL, and in this video we'll go ahead and learn

00:12.500 --> 00:15.350
how to generate a large data set.

00:15.590 --> 00:20.090
So over here we go ahead and generate a large data set.

00:20.120 --> 00:23.480
Because right here you can see we have just four rows.

00:23.480 --> 00:26.570
And I want to create something like 1000 rows.

00:26.840 --> 00:29.240
I know you may be like okay no problem.

00:29.240 --> 00:31.940
I'm going to take you through a simple step on how to do that.

00:31.970 --> 00:32.720
Yes.

00:32.750 --> 00:33.380
All right.

00:33.410 --> 00:35.090
But it's going to be just a mockup.

00:35.780 --> 00:36.050
Right.

00:36.050 --> 00:39.170
So it's going to be just a mockup for this course.

00:39.200 --> 00:42.050
Now let's go ahead and move over to our website.

00:42.050 --> 00:46.160
And that website is called Mockaroo.

00:46.190 --> 00:53.240
So Mockaroo is used by data analysts to experiment and play around.

00:53.240 --> 00:57.620
And it is going to be a very good introduction in this course.

00:57.620 --> 00:59.450
So it's absolutely free.

00:59.480 --> 01:03.560
Just so you know, it has a paid plan in case you want to generate more than 1000 rows,

01:03.560 --> 01:06.260
and that is when you're going to pay for it.

01:06.290 --> 01:09.440
But if you're going to generate no more than 1000 rows,

01:09.440 --> 01:11.090
it's absolutely free.

01:11.180 --> 01:17.990
Now we have some of the fields, namely the ID, first name, last name, email, gender, IP address.

01:17.990 --> 01:20.420
So these are by default right in here.

01:21.080 --> 01:24.440
I want to remove the ID because we're going to auto increment.

01:24.440 --> 01:27.170
And we don't need to input that by ourselves.

01:27.170 --> 01:29.120
And the first name is going to be there.

01:29.150 --> 01:30.230
The last name will be there.

01:30.260 --> 01:31.490
The email will be there.

01:31.520 --> 01:34.970
Gender is something we've never used, but I think it's a very nice option.

01:35.000 --> 01:40.760
We'll go ahead and leave the gender so that we can be able to use it so that every student is either

01:40.760 --> 01:41.960
male or female.

01:42.320 --> 01:42.980
All right.

01:42.980 --> 01:44.990
So you're male or you're female.

01:44.990 --> 01:46.100
So it's a very nice option.

01:46.100 --> 01:48.080
We're going to leave that field.

01:48.110 --> 01:50.060
But we don't need IP address.

01:50.060 --> 01:54.620
Then we are going to include the enrol date.

01:54.620 --> 01:56.990
So I'm going to put this in lower case.

01:56.990 --> 02:01.940
So we're going to put in enrol underscore date.

02:01.940 --> 02:08.300
And I'll go over to this, and I'll go over to date and check out date.

02:08.330 --> 02:11.210
Now we have what's called the.

02:12.020 --> 02:12.680
Dates.

02:12.680 --> 02:15.170
So if you check over here, we have a date.

02:15.170 --> 02:17.870
And that is what I'm going to select.

02:17.870 --> 02:23.120
And over here we can select the year we want.

02:23.210 --> 02:26.720
And let's go ahead and say maybe um.

02:28.820 --> 02:34.040
We can start from any, any year we want or any time we want this to start from.

02:34.040 --> 02:36.290
I can choose any one.

02:36.290 --> 02:40.760
So I just want this to move to something ahead of me.

02:40.790 --> 02:41.480
All right.

02:41.480 --> 02:44.540
So that, uh, it's going to always look good.

02:44.540 --> 02:46.520
So I'm going to start from 2030.

02:46.550 --> 02:47.240
All right.

02:47.240 --> 02:55.250
So from 2030, uh, I need to always select this first.

02:56.090 --> 02:57.920
So I've changed to 2030.

02:57.920 --> 03:00.800
And I can also change this to maybe 2050.

03:00.800 --> 03:02.960
And, uh, this is just a mockup.

03:02.960 --> 03:03.860
No problem.

03:03.860 --> 03:05.960
And you're going to choose a date format.

03:05.990 --> 03:09.530
I want the year to come first, then the month, and then the day.

03:09.560 --> 03:10.730
That is what I want.

03:10.760 --> 03:14.450
And I'm not going to leave any of them blank.

03:14.450 --> 03:16.310
So it's going to be 0% for blank.

03:16.340 --> 03:22.760
Then for the email address, we're going to make about 30% of them blank, because I don't want

03:22.760 --> 03:26.120
every student to have an email address; there will be students that don't have an email address.

03:26.120 --> 03:26.990
No problem.

03:27.110 --> 03:30.620
This is just for, uh, experiment.

03:30.620 --> 03:33.410
Then I'm going to add maybe the country of the students.

03:33.410 --> 03:36.620
Maybe the students are coming from different countries; it's an international school.

03:36.620 --> 03:42.140
So I'll go ahead and say country, and I'll go over to date and I'm going to select country.

03:42.170 --> 03:44.840
Go down here and let's select country.

03:44.840 --> 03:47.330
And I'm not going to restrict any country.

03:47.330 --> 03:49.220
It's just going to be the way they are.

03:49.610 --> 03:55.310
I am going to go ahead and make this to be 2014 to 2015.

03:55.310 --> 03:58.580
And the date format is the year, month and day.

03:58.580 --> 04:02.240
And the email address 30% are going to be blank.

04:02.270 --> 04:02.510
While

04:02.540 --> 04:04.430
70% are going to be filled.

04:04.430 --> 04:10.370
And over here, the rows are still going to be 1000 rows, because that is going to enable me

04:10.370 --> 04:11.300
to download for free.

04:11.300 --> 04:14.090
And the format is going to be SQL.

04:14.090 --> 04:17.510
So you can see you have a CSV, JSON and so on.

04:17.540 --> 04:19.370
I'm going to select CSV.

04:19.700 --> 04:22.040
Uh I want to select the SQL.

04:22.040 --> 04:26.240
And then I'll go over here and change this to students.

04:26.240 --> 04:33.650
And then I'm going to include the create table so that I can always work with this and make some adjustments.

04:33.650 --> 04:37.220
And then I'll go ahead and hit on Create Data.

04:37.370 --> 04:40.010
And once you click on Create Data it will download.

04:40.010 --> 04:43.190
And you can see we have SQL here.

04:43.190 --> 04:45.260
That is students.sql.

04:45.410 --> 04:46.220
All right.

04:46.250 --> 04:53.270
Now over here you can see we have a students table right in here.

04:53.270 --> 04:55.670
And now I want to open up this.

04:55.700 --> 05:01.850
I can use any editor or maybe Atom or Visual Studio or Sublime Text.

05:01.880 --> 05:06.320
I'll go ahead and open Visual Studio, and this is what we have.

05:06.350 --> 05:09.920
Then let's go ahead and move over to file and open this file.

05:09.920 --> 05:11.690
And I'm going to double click on that.

05:11.690 --> 05:19.100
And this is what I have. Now I have the first name, last name, email, gender, date and country.

05:19.130 --> 05:31.460
And for the ID, which is a student ID, I'm going to say, uh, student underscore ID, and this is going

05:31.490 --> 05:36.380
to be BIGSERIAL, just like we always know.

05:36.380 --> 05:39.680
So this is our BIGSERIAL.

05:39.680 --> 05:42.260
And that is going to be NOT

05:42.260 --> 05:42.830
NULL.

05:42.830 --> 05:43.910
And uh.

05:45.920 --> 05:46.940
Primary key.

05:47.420 --> 05:51.920
Now we go ahead and add NOT NULL to every other one.

05:51.920 --> 05:53.630
So I'm going to copy this NOT NULL.

05:53.630 --> 05:56.270
And I'm going to move under the first name.

05:56.270 --> 05:58.250
And I'm going to say NOT NULL.

05:58.250 --> 06:00.710
And I'm going to go over here: NOT NULL.

06:00.710 --> 06:08.210
Then the email, I'm not going to include that because some of them will be blank, and gender is NOT NULL.

06:08.240 --> 06:16.820
Then the country, and, uh, I'll go ahead and leave the country to also be NOT NULL.

06:16.850 --> 06:23.300
Then for the date, all I need to do is I'm going to go over to date and I'm going to say NOT NULL.

06:23.330 --> 06:24.890
I hope that is cool.

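NOTE A minimal sketch of what the edited CREATE TABLE statement might look like after these changes. The column names, the VARCHAR lengths and the date column name (enrol_date) are assumptions based on the narration; the generated file is not shown in full here.
```sql
-- Hypothetical reconstruction; adjust names and types to match your own generated file.
CREATE TABLE students (
    student_id BIGSERIAL NOT NULL PRIMARY KEY,  -- auto-incrementing ID added by hand
    first_name VARCHAR(50) NOT NULL,
    last_name  VARCHAR(50) NOT NULL,
    email      VARCHAR(50),                     -- left nullable: about 30% of rows are blank
    gender     VARCHAR(50) NOT NULL,
    enrol_date DATE NOT NULL,
    country    VARCHAR(50) NOT NULL
);
```
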
06:25.040 --> 06:26.060
All right.

06:26.300 --> 06:32.000
So now if I have done this, all I need to do is to go ahead and just save this.

06:32.180 --> 06:33.680
Try to save your code.

06:33.710 --> 06:34.190
All right.

06:34.280 --> 06:35.870
And save this.

06:35.870 --> 06:38.720
And I'm going to minimize this again.

06:38.720 --> 06:40.430
Then I'll get right in here.

06:40.430 --> 06:43.970
And I'm going to import this file.

06:43.970 --> 06:48.860
So to import this file, I'll go ahead and use backslash i.

06:49.040 --> 06:55.010
So that is backslash i, and you then go ahead and copy the path.

06:55.040 --> 06:59.300
So if I copy the path now I can be able to use that.

06:59.330 --> 06:59.870
All right.

06:59.870 --> 07:04.130
So if I go ahead and copy this path, I'll go ahead and copy the path of this.

07:04.130 --> 07:09.260
And now that is copied, I'll go ahead and paste it in Notepad.

07:09.260 --> 07:10.850
And edit it.

07:11.510 --> 07:12.080
All right.

07:12.080 --> 07:13.580
So I'll just copy the path.

07:13.580 --> 07:16.850
And now all I have to do is use a memoir.

07:16.850 --> 07:24.470
And then what I just did now is I changed the backslash to a forward slash, because

07:24.470 --> 07:27.200
that is the only way I can be able to use it.

07:27.200 --> 07:33.170
And now I'm going to copy that and get back to my SQL shell.

07:33.170 --> 07:39.380
And I'm going to paste this right in here, so you can see backslash i, then every one should be a

07:39.380 --> 07:40.100
forward slash.

07:40.100 --> 07:44.330
You remove the C that is in front, and this is what you're going to have.

07:44.360 --> 07:47.300
Then you go ahead and hit enter.

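NOTE A sketch of the import command as typed in the psql shell. The path below is a hypothetical placeholder, not the one used in the video; the point is that every separator is a forward slash.
```sql
-- Placeholder path (an assumption); replace it with the location of your own students.sql file.
\i /Users/yourname/Downloads/students.sql
```
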
07:47.600 --> 07:48.560
Wow.

07:49.160 --> 07:52.370
You can see that it is being loaded.

07:52.370 --> 07:57.770
But there is an error which I'm seeing somewhere and we're going to address that error.

07:58.640 --> 08:10.280
So that is being put in and, uh, it's done, but there is an error. It says the column gender of relation students

08:10.280 --> 08:11.480
does not exist.

08:11.480 --> 08:15.710
And uh, we can see we don't have any gender column.

08:15.710 --> 08:21.680
Then what it means is that this table we have right in here has only first name, last name, enrol

08:21.710 --> 08:27.410
date, email and age. Instead, I'm going to drop this table.

08:27.440 --> 08:32.330
All right, so if I drop this table, we go ahead and create a new table.

08:32.870 --> 08:34.700
So I'll go right in here.

08:34.700 --> 08:38.060
You can see below I still have option down here to create my table.

08:38.060 --> 08:43.430
So what I need to do is first of all drop table.

08:43.430 --> 08:48.680
And the table name is students.

08:48.740 --> 08:52.250
And I have to make that lowercase: students.

08:52.250 --> 08:56.270
And I'll put a semicolon at the end and hit enter.

08:56.270 --> 09:00.470
And you can see that the table is now dropped.

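NOTE A short sketch of the drop step, assuming the old table is named students as in the narration. The IF EXISTS variant is an addition not used in the video; it avoids an error when the table is already gone.
```sql
-- Remove the old four-row table so the generated file can create it afresh.
DROP TABLE students;
-- Optional variant (not from the lecture): does not fail if the table is absent.
DROP TABLE IF EXISTS students;
```
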
09:00.500 --> 09:03.170
Now I'll go back and put my backslash

09:03.200 --> 09:05.990
i right in here.

09:05.990 --> 09:13.850
And then I'm going to paste that which I copied, and I'll go ahead and hit enter.

09:13.850 --> 09:19.340
And you can see that is being inserted and it has inserted everything.

09:19.340 --> 09:27.080
Now if I want to see what I've just done, all I need to do is I can go over here or here.

09:27.080 --> 09:29.780
And if I execute this, let's go ahead and check it out.

09:29.780 --> 09:33.710
And you can see I have this, and it looks good right in here.

09:33.740 --> 09:39.050
And I'm dragging this down, I'm dragging it up so that you can be able to see everything.

09:39.080 --> 09:47.180
I hope you can see we have this file, and that this is up to 1000 if I keep clicking to the end of

09:47.180 --> 09:47.870
this.

09:47.960 --> 09:48.620
All right.

09:48.620 --> 09:52.070
So you can see that is cool right now.

09:52.070 --> 09:56.030
Let's go ahead and uh copy this okay I just copied them.

09:56.030 --> 09:57.020
Go ahead and copy this.

09:57.020 --> 10:00.470
So I will also check that on my SQL shell.

10:00.470 --> 10:02.030
So I'm going to paste that right in here.

10:02.030 --> 10:03.770
And I'm going to hit enter.

10:03.770 --> 10:08.840
And uh I'm going to drag this open so that we can see everything.

10:08.870 --> 10:09.200
Okay.

10:09.230 --> 10:10.880
Let me go ahead and enlarge this.

10:12.500 --> 10:21.260
And then it says this. Okay, if I hit enter, it is going to open up every one of them one by one.

10:21.290 --> 10:21.980
All right.

10:21.980 --> 10:28.640
So the more you hit enter, you can see I'm seeing everything and it's going down.

10:28.640 --> 10:32.600
So for some of you, it might just open up all of them immediately.

10:32.600 --> 10:38.930
But when you click on More, it keeps going down and it's showing you all the rows that you have,

10:38.930 --> 10:44.270
so I can be able to see everything, everything, everything.

10:44.270 --> 10:45.830
So it's going down.

10:45.860 --> 10:56.780
I'm checking out these and we are still on 700 and it's going down to 800 and it's moving down and very

10:56.780 --> 10:59.120
soon it will get to 1000.

11:00.740 --> 11:01.190
All right.

11:01.190 --> 11:02.420
So we are almost there.

11:02.420 --> 11:05.510
And now we have 1000 rows here.

11:05.510 --> 11:06.770
So that is it.

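NOTE Instead of paging through every row, a few queries can confirm the import. This is a sketch that assumes the table is named students and has a student_id column, as described above; the 30% figure holds only if blank emails were exported as NULL.
```sql
-- Count the imported rows; 1000 is expected if every INSERT succeeded.
SELECT COUNT(*) FROM students;
-- Peek at the first few rows rather than scrolling through all of them.
SELECT * FROM students ORDER BY student_id LIMIT 5;
-- Count students without an email address (roughly 30% under the stated assumption).
SELECT COUNT(*) FROM students WHERE email IS NULL;
```
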
11:06.800 --> 11:11.450
Well, you are now able to create and, uh, generate a large data set for this.

11:11.450 --> 11:12.830
And that is very interesting.

11:12.830 --> 11:15.920
So in the next video lecture, we'll go ahead and kick off and start again.

11:15.920 --> 11:18.320
So thank you for getting to this point.

11:18.320 --> 11:21.350
And that is a very big, big achievement.

11:21.380 --> 11:23.750
See you in the next video lecture.
