WEBVTT

00:00.620 --> 00:01.370
Okay, guys.

00:01.370 --> 00:08.600
So I'm really sorry that we haven't been getting into coding yet, but I promise in this section we're

00:08.600 --> 00:11.840
actually going to go and write our scraper itself.

00:11.960 --> 00:15.680
So as you can see, we have the homepage URL here.

00:15.680 --> 00:19.310
If you didn't get it, look in the resources.

00:19.340 --> 00:24.170
There's a link for this URL so you don't have to type it all by hand.

00:24.170 --> 00:25.550
It's pretty long.

00:26.150 --> 00:33.380
And that's the URL we get to, which I showed you in the previous sections.

00:34.400 --> 00:38.930
Okay guys, so let's write a function as usual.

00:39.500 --> 00:46.370
Let's call it scrapeHomesInIndexPage.

00:46.580 --> 00:48.890
It takes in a URL.

00:50.080 --> 00:53.970
And then we create a browser using Puppeteer.

00:53.980 --> 01:04.780
So await puppeteer.launch, and then we pass in an option to make the browser not headless

01:04.780 --> 01:12.400
so we can see the actual Chrome browser running while we are programming this and debugging it and so

01:12.400 --> 01:14.230
on so we can see what's going on.

01:15.010 --> 01:18.040
Okay, so then we create a page.

01:18.460 --> 01:22.540
So we say await browser.newPage.

01:23.380 --> 01:27.850
And then we can go and look at the URL.

01:27.850 --> 01:32.920
So we can go and say page.goto and pass in the URL.

01:33.130 --> 01:41.020
Again, use await, and that should open the page that we pass in to it, like this.
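
NOTE
A minimal sketch of the function as described so far, assuming the puppeteer package is installed in the project. The optional launcher parameter and the returned object are not from the lecture; they are assumptions added so the function can be tried with a stand-in object and so the caller can close the browser later.
```javascript
// Sketch of the function described in the lecture, not the author's exact code.
// Assumption: `launcher` is a hypothetical extra parameter. It defaults to the
// real puppeteer package, which is only loaded when no stand-in is passed.
async function scrapeHomesInIndexPage(url, launcher = require("puppeteer")) {
  // headless: false opens a visible Chrome window, which helps while debugging.
  const browser = await launcher.launch({ headless: false });
  // Create a new tab in that browser.
  const page = await browser.newPage();
  // Navigate the tab to the URL and wait for the navigation to finish.
  await page.goto(url);
  // Assumption: returning both objects is an addition for later steps.
  return { browser, page };
}
```
To try it, call scrapeHomesInIndexPage with the homepage URL from the resources at the bottom of index.js, then run node index.js.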

01:42.040 --> 01:52.300
So let's go and see if it works. We can call the scrape homes function, pass in the URL, and then we should see

01:52.300 --> 01:56.260
the browser window open up and open this page.

01:56.770 --> 02:00.520
So let's go and run node index.js.

02:05.360 --> 02:12.090
And we see the page is opening up and you notice the window has a smaller size.

02:12.110 --> 02:13.250
That's okay.

02:13.250 --> 02:18.260
We don't mind that because it makes us get the bar down here below.

02:18.590 --> 02:21.620
So yeah, that's how you open the page.

02:21.620 --> 02:27.860
And now we just need to execute some JavaScript inside so we can select all the different elements in

02:27.860 --> 02:30.920
here and get the URLs of the homes.
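
NOTE
One possible shape for that next step, sketched under assumptions. Puppeteer's page.evaluate runs a function inside the opened page, so it can query the DOM and return plain values. The helper name collectHomeUrls and the selector a.home-link are hypothetical placeholders; the real selector depends on the site's markup.
```javascript
// Sketch: collect link URLs from a page that is already open in Puppeteer.
// Assumption: "a.home-link" is a placeholder, not the site's real selector.
async function collectHomeUrls(page, selector = "a.home-link") {
  // The callback runs in the browser context, so `document` is the page's DOM.
  // Extra arguments to evaluate are serialized and passed to the callback.
  return page.evaluate(
    (sel) => Array.from(document.querySelectorAll(sel), (a) => a.href),
    selector
  );
}
```
It would be called with the page object created by browser.newPage, after page.goto has finished.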
