WEBVTT

00:00.800 --> 00:06.570
The left side is my editor, Visual Studio Code, which I use to edit the code.

00:06.590 --> 00:13.100
And on the right side, we have the browser where I navigate around in the page that I want to scrape.

00:13.220 --> 00:18.920
So this enables me to look around in the elements, select elements, find out how to select them,

00:18.920 --> 00:20.150
and then write the code

00:20.180 --> 00:21.620
on the other side.

00:23.240 --> 00:26.660
So first let's get the request module.

00:27.790 --> 00:28.660
We say require

00:28.690 --> 00:30.220
'request-promise'.

00:31.120 --> 00:38.320
And this enables us to download pages, from which we can then use Cheerio to select different elements.

00:39.130 --> 00:40.660
Just like jQuery.

00:41.170 --> 00:46.870
Okay, so let's create our main function for the web scraper, which is going to be an async function.

00:47.440 --> 00:54.190
And if you're not familiar with async, it's a feature in the new JavaScript ES7, which enables us

00:54.190 --> 00:56.920
to use keywords such as await.

00:57.940 --> 01:00.120
And I'll explain that later.

01:00.130 --> 01:02.890
Now we'll make the call to the function down here.

01:03.790 --> 01:11.650
So it enables us to use features such as await, so we can say await for every asynchronous call we make.

01:13.630 --> 01:16.030
So let me show you in practice how this works.

01:16.030 --> 01:20.080
We will get the HTML from the page first.

01:20.230 --> 01:30.010
So we say request.get, and this downloads basically any URL we pass into request. So we get the Craigslist

01:30.010 --> 01:30.910
page here

01:30.910 --> 01:35.260
with all the jobs. I will make a URL variable.

01:38.010 --> 01:41.040
I'm going to paste this URL down in the description as well.

01:41.040 --> 01:42.990
So if you want to get it,

01:43.020 --> 01:47.970
you can look in the description or the resources for the Udemy lecture.

01:49.630 --> 01:58.120
And then we can use the await keyword to wait for request to finish this request, and then we can do

01:58.120 --> 02:00.430
something with the result.

02:04.580 --> 02:11.990
The old-fashioned way to do it would be using something like request.get(url), and then .then,

02:12.410 --> 02:15.020
and then do something with the result.

02:15.170 --> 02:17.660
And maybe we have a catch clause here.

02:18.050 --> 02:25.040
But I think that using the await keyword in the new version of JavaScript is a lot cleaner than having

02:25.040 --> 02:27.920
the chained then clauses and so on.

02:29.320 --> 02:33.910
So that was just a short intro to await, in case you're not used to that.

02:34.540 --> 02:42.040
Then I'm also going to paste in a try/catch clause so we catch any errors we have in the code and we

02:42.040 --> 02:44.350
can print them out to our console.

02:52.920 --> 02:59.040
So now let's try and just see in the console what request is actually getting from the Craigslist page

02:59.040 --> 03:01.470
so you can see it in practice.

03:01.740 --> 03:09.330
We'll write console.log(result), and then we'll run node index.js, which has our web scraper.

03:09.870 --> 03:16.620
And then you can see inside of the console that it's actually just the basic HTML page you're getting

03:16.620 --> 03:18.240
in a string format.

03:18.630 --> 03:23.010
Now we can't really do so much with this page as it is right now.

03:23.010 --> 03:25.110
That's what we need Cheerio for.