WEBVTT

00:00.080 --> 00:06.770
So a lot of my students were banned or blocked by Craigslist when they were trying to follow my older

00:06.770 --> 00:09.890
request tutorial using the Node.js request library.

00:10.100 --> 00:16.640
So it seems that Craigslist has gotten too many requests, maybe from my students, and they are

00:16.640 --> 00:22.460
now actively trying to block any sort of scraping attempt using Node.js request.

00:23.790 --> 00:30.240
But don't worry, if you got blocked, you can still access the site using just a regular browser such

00:30.240 --> 00:32.010
as Chrome or Firefox.

00:32.220 --> 00:34.080
So that's what we're going to do now.

00:34.080 --> 00:38.310
We're going to scrape Craigslist using an automated browser instead.

00:38.340 --> 00:44.190
We're going to control the browser by using a tool called Puppeteer.

00:45.180 --> 00:51.870
So this is a great example of when you want to switch over to Puppeteer instead of using request.

00:51.900 --> 00:58.620
Maybe the site doesn't allow you to use request to scrape it, and you need to switch over to Puppeteer.

00:59.550 --> 01:06.810
It's going to use more resources compared to request, but it still gets the job done and you get the

01:06.810 --> 01:07.980
data that you need.

01:09.960 --> 01:16.710
I'm also going to be showing you how to limit your requests so you avoid Craigslist banning you or blocking

01:16.710 --> 01:17.400
your IP

01:17.430 --> 01:21.870
if you make too many requests, even in the Chromium browser.

01:21.990 --> 01:28.170
So I hope you have fun doing this project with me and I will be seeing you in the next section.