WEBVTT

00:02.030 --> 00:08.870
So now let's take the new selector, the new each loop we built inside of the Chrome console, and test

00:08.870 --> 00:09.410
it out.

00:09.410 --> 00:12.530
And let's see how it looks inside of NodeJS.

00:13.690 --> 00:17.740
So I'm just going to go ahead and paste it inside of NodeJS like this.

00:17.740 --> 00:21.910
And let's take a look at what the console log looks like when we run it.

00:23.080 --> 00:28.060
And we can see that all of the job URLs are being displayed just fine.

00:29.900 --> 00:38.210
Now, maybe you noticed earlier that if we did a dot text on the A elements, we also display all

00:38.210 --> 00:43.610
the titles, because all of the titles are actually A elements as well.

00:44.660 --> 00:51.710
This means that we could perhaps tidy up our code inside of NodeJS a little bit, because right now

00:51.710 --> 00:56.870
we have two different groups, and it kind of gets a little annoying sometimes

00:56.870 --> 01:02.300
if you have two different groups and have to merge your elements together with the data properties.
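
NOTE
A minimal sketch of why two separate groups get annoying, in plain JavaScript without cheerio. The titles and urls arrays and their values are made-up stand-ins for the two groups of scraped results, not data from the video. The two lists have to be stitched back together by index, which only works while both stay the same length and in the same order.
```javascript
// Hypothetical results from two separate loops over the same page.
const titles = ["Backend Developer", "QA Engineer"];
const urls = ["https://example.org/jobs/1", "https://example.org/jobs/2"];
// Merge by position: entry i of titles is assumed to belong to entry i of urls.
const merged = titles.map((title, index) => ({ title, url: urls[index] }));
console.log(merged);
```
Reading both values inside a single loop avoids this merging step.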

01:02.690 --> 01:10.370
So what we can do instead is just have this one loop for the A elements, and just get

01:10.370 --> 01:12.470
the titles inside here as well.

01:13.320 --> 01:15.150
So that looks something like this.

01:15.180 --> 01:20.670
We are going to have a curly brace here because we're going to have several statements inside of this,

01:20.850 --> 01:21.990
each loop.

01:22.470 --> 01:24.450
So here we have the title.

01:24.450 --> 01:27.420
And then let's also get the... I'm sorry.

01:27.480 --> 01:28.590
Here we have the URL.

01:28.590 --> 01:31.740
And then let's also get the title.

01:32.560 --> 01:36.010
So I'm just going to paste the console statement down here.

01:36.010 --> 01:39.940
And let's delete this each loop we have here.

01:41.060 --> 01:43.010
So it now looks something like this.

01:43.010 --> 01:45.830
Let's test it out again and see how it looks.

01:46.280 --> 01:53.480
And we can see it prints out just fine: first the title of the element and then afterwards the URL,

01:53.480 --> 01:54.890
just like we wanted.

01:55.310 --> 02:01.700
Actually, it prints the URL first and then the title, but well, that's the devil in the details, right?

02:03.260 --> 02:09.410
So now let's put this data into an object instead of just outputting it on our console log.

02:19.230 --> 02:22.410
Dot text for the title and URL.

02:29.330 --> 02:30.710
And there we have the URL.

02:31.430 --> 02:34.910
And then let's just make an object down here.

02:36.640 --> 02:38.620
Actually, we can say return.

02:42.920 --> 02:44.570
Title and URL.

02:44.570 --> 02:51.410
So we return an object from this each loop, and then we make it into a dot map instead.

02:51.410 --> 02:54.800
So it returns an array of objects instead.

02:55.010 --> 02:57.650
So we change this dot each to a dot map.

02:57.800 --> 03:01.850
And let's call it jobs over here.

03:02.360 --> 03:07.400
And then let's see what we get if we do a console log of jobs.

03:11.370 --> 03:17.010
So you can see here the output in my terminal looks kind of weird.

03:17.010 --> 03:23.760
It doesn't just list the titles and the URLs, and that is because the dot map inside of cheerio

03:23.760 --> 03:28.560
returns a cheerio object, much like a jQuery object, that wraps the results.

03:28.560 --> 03:34.260
It doesn't directly output an array of the objects we return here, but it's quite easy to get it.

03:34.260 --> 03:39.780
To do that, you just say dot get after this map loop here.

03:40.140 --> 03:48.510
So now if I do that and we do a console log, or if we run the code, we can see we now get all of the

03:48.510 --> 03:50.820
titles and the URLs just fine.

03:51.630 --> 03:53.370
So this is pretty cool.

03:53.370 --> 03:53.790
By now

03:53.790 --> 04:01.290
we've got all the job URLs and the job titles, but what about all the other pages we have on this site?

04:01.320 --> 04:07.590
We only get the jobs for the first page right now, but we also want to get them for all the other pages on Craigslist.

04:08.070 --> 04:09.930
So that's what we're going to take a look at.

04:09.930 --> 04:15.120
In the next section, I'm going to show you how we can scrape all of the pages

04:15.120 --> 04:18.810
to get all of the jobs and the titles and so forth.
