WEBVTT

00:00.930 --> 00:06.270
To make it a little clearer what we're trying to do in this project: we're going

00:06.270 --> 00:13.410
to get the job title from each of the jobs on this website. We're going to scrape the job title,

00:13.410 --> 00:19.740
the URL, and the description of each job.

00:19.890 --> 00:24.450
And we're going to do that on every page that we can access on this website.

00:24.450 --> 00:30.840
So we're going to scrape through all of these pages and get all of the jobs and the descriptions.

00:31.500 --> 00:37.770
So let's go ahead and initialize the project and install some packages we'll be using.

00:37.980 --> 00:43.830
So I'm going to go into the terminal and make an empty folder.

00:43.830 --> 00:46.350
Let's call it Praxis Scraper.

00:47.940 --> 00:51.210
And then let's open it up inside of VS Code.

00:53.440 --> 00:55.390
So here I have my empty folder.

00:55.390 --> 00:57.760
There's absolutely nothing inside here.

00:58.000 --> 01:01.720
And let's go ahead and install the packages we'll be using.

01:05.650 --> 01:10.930
Inside the terminal, type in npm i cheerio axios.

01:10.930 --> 01:12.220
So Cheerio

01:12.250 --> 01:17.860
we have used before for selecting elements with CSS selectors and getting their content from the HTML.

01:17.890 --> 01:23.830
Axios is a new package we're starting to use now. It is another HTTP request client

01:24.010 --> 01:29.230
library, just like the request package we used before.

01:29.410 --> 01:32.530
Axios is also very popular in React frontends.

01:32.530 --> 01:36.820
So if you've worked on a React frontend before, you've probably touched on Axios.

01:36.820 --> 01:39.130
So you'll feel right at home with Axios.

01:39.130 --> 01:45.190
And if you haven't used Axios before, don't worry. It's very easy to start using, and we're going

01:45.190 --> 01:48.460
to be using it to get the HTML from the website.

01:48.970 --> 01:51.040
So now on to the next section.

01:51.040 --> 01:56.620
That's actually where we're going to get the first index page and already start scraping some jobs from

01:56.620 --> 01:57.880
the Praxis website.

01:57.880 --> 02:00.160
So I'll see you in the next section.
