WEBVTT

00:00.380 --> 00:02.270
Okay, so now you go.

00:02.300 --> 00:04.220
That's cool, Stefan.

00:04.220 --> 00:04.940
That's cool.

00:04.940 --> 00:11.290
But how do I use this with a web scraper instead of just adding two numbers together?

00:11.360 --> 00:18.560
So let's see how we could do that. We have a parser here that's going to parse our HTML.

00:18.560 --> 00:27.770
And then let's also have a, let's call it getter.js, which is going to get the HTML itself from

00:27.770 --> 00:28.460
the site.

00:28.460 --> 00:32.780
So in here we import request

00:34.210 --> 00:36.100
from request-promise.

00:39.750 --> 00:47.100
And then we have the good old, let's call it getHTML, with a URL here.

00:47.520 --> 00:51.690
And in here we get the HTML.

00:52.080 --> 00:55.980
So we say await request

00:56.480 --> 01:00.380
.get(url), and, um.
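
NOTE
A minimal sketch of the function described so far, not necessarily the exact code on screen.
The name getHTML and its casing are guesses from the audio, CommonJS require stands in for the
import, and the second parameter `client` is a hypothetical addition so that a stub can replace
request-promise in a test. The return statement is the fix that comes up later in the video.
```javascript
// Sketch only: `client` is a hypothetical parameter. It defaults to request-promise,
// and that package is only loaded when no client is passed in.
async function getHTML(url, client = require("request-promise")) {
  // request-promise exposes .get(url), which resolves with the response body.
  const html = await client.get(url);
  return html;
}
```
Passing a stub object with its own async get method keeps the function testable offline.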

01:00.470 --> 01:00.800
Okay.

01:00.800 --> 01:06.440
Okay, then let's also have a function here for saving the HTML to a file because we're going to use

01:06.440 --> 01:08.140
that for our test.

01:08.150 --> 01:19.280
So let's say function saveHTMLToFile. Here we take the HTML in, and then we also need to import

01:20.070 --> 01:21.030
fs.

01:23.680 --> 01:26.410
So we can save the HTML to our file.

01:27.770 --> 01:29.600
So let's see here.

01:29.600 --> 01:30.580
So we say

01:30.590 --> 01:40.450
fs.writeFileSync, and let's just call it test.html, and let's pass in the HTML here.

01:40.460 --> 01:48.920
So now we save an HTML file, and we can save that and use it inside of our test for the parser

01:48.920 --> 01:49.700
itself.
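
NOTE
The save step might look roughly like the sketch below. The function name casing is a guess
from the audio, and the `filePath` parameter is a hypothetical addition (the video hard-codes
the file name); its default mirrors the test.html name mentioned above.
```javascript
const fs = require("fs");
// Writes the HTML string to disk synchronously. `filePath` is a hypothetical
// parameter; the video passes the file name directly to fs.writeFileSync.
function saveHTMLToFile(html, filePath = "./test.html") {
  fs.writeFileSync(filePath, html);
}
```
Calling saveHTMLToFile(html) with no second argument writes test.html in the current directory.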

01:50.630 --> 01:56.360
And then inside of our main function we can call this getter.

01:56.570 --> 02:00.230
Actually, I can just call the getter from inside the file here.

02:00.380 --> 02:05.690
So let's see, we can say html = getHTML.

02:07.130 --> 02:10.250
So now I need to go on to Craigslist.

02:11.040 --> 02:18.540
Let's see, we go to a site that we want to scrape and just get the HTML here so we can go into, let's

02:18.540 --> 02:21.810
see, apartments, housing.

02:22.140 --> 02:24.300
Let's take another kind of site.

02:24.300 --> 02:29.430
We have like a list here, like I have with the software development jobs.

02:29.430 --> 02:35.970
It's going to be the same thing, except we're doing it in a test-driven way instead.

02:36.330 --> 02:37.740
So here we go.

02:37.740 --> 02:41.970
And we take this URL here for the section we have.

02:41.970 --> 02:45.570
This is musicians in this case.

02:45.570 --> 02:56.820
So we say getHTML, paste in the URL here, and then let's save the file, or save the HTML to our file.

02:58.020 --> 03:07.200
Um, actually, I need to put this inside an async function as well because getHTML, um, is also

03:07.200 --> 03:09.180
an async function.

03:09.990 --> 03:14.290
So I need to put that inside another async function.

03:14.290 --> 03:17.170
I'm just going to paste that in here.

03:18.800 --> 03:22.670
And then we can call main down here in the

03:22.920 --> 03:23.900
getter.js file.

03:25.480 --> 03:31.720
Now time for the moment of truth, which is to run the getter.js file.

03:33.840 --> 03:35.360
And let's see.

03:35.370 --> 03:44.040
So we get a Promise object here, which is because I don't have await in front of getHTML.

03:44.580 --> 03:55.170
So I run it again, and this time I'm getting undefined for the HTML because I forgot to return the HTML

03:55.200 --> 03:58.560
up in the get function up here.
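
NOTE
The two mistakes just described can be reproduced offline with the sketch below. Everything
here is illustrative: `fakeRequest` is a hypothetical stand-in for request-promise so nothing
hits Craigslist, the example.org URL is a placeholder, and getHTMLNoReturn exists only to show
the bug.
```javascript
// Hypothetical stand-in for request-promise, so the sketch runs without a network call.
const fakeRequest = { get: async (url) => "<html><body>" + url + "</body></html>" };
// Second mistake from the video: the HTML is fetched but never returned.
async function getHTMLNoReturn(url) {
  const html = await fakeRequest.get(url);
}
// Fixed version: return the fetched HTML to the caller.
async function getHTML(url) {
  const html = await fakeRequest.get(url);
  return html;
}
async function main() {
  const url = "https://example.org/musicians"; // placeholder URL
  const pending = getHTML(url); // first mistake: no await, so this is a Promise
  const missing = await getHTMLNoReturn(url); // awaited, but resolves to undefined
  const html = await getHTML(url); // awaited and returned: the HTML string
  return { pending, missing, html };
}
```
Logging the three values from main shows a Promise, then undefined, then the HTML string.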

03:59.100 --> 04:02.370
So here we have hopefully the full code.

04:02.640 --> 04:07.830
Now let's try and run it one final time, and it should be working now.

04:07.830 --> 04:14.700
So here we have all of the San Francisco Bay Area musicians from Craigslist.

04:14.730 --> 04:22.240
So we have some nice HTML here, which is the same, well, which is what request would get.

04:22.260 --> 04:29.970
And now we're going to save it and use it inside of the test and basically make our web scraper this

04:29.970 --> 04:34.000
way without calling the Craigslist site all of the time.
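
NOTE
One way a test could pick the saved page back up is sketched below. The helper name
loadSavedHTML and its `filePath` parameter are hypothetical, and no particular test framework
is assumed; the point is only that the test reads the file instead of requesting the site.
```javascript
const fs = require("fs");
// Reads previously saved HTML back from disk so a test never touches the network.
// `loadSavedHTML` and `filePath` are hypothetical names, not taken from the video.
function loadSavedHTML(filePath = "./test.html") {
  return fs.readFileSync(filePath, "utf8");
}
```
A parser test can then call loadSavedHTML() once and hand the resulting string to the parser.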
