WEBVTT

00:01.020 --> 00:06.300
Hi, this particular project is related to the automobile domain.

00:06.600 --> 00:10.390
So this one is a used car price prediction problem.

00:10.410 --> 00:13.200
So again, we will be dealing with a regression problem.

00:14.070 --> 00:17.010
This particular dataset contains around

00:19.440 --> 00:28.200
four lakh twenty-three thousand rows of data, and this data contains twenty-five features of various

00:28.200 --> 00:30.660
used cars, including their prices.

00:31.740 --> 00:36.490
Your task is to predict the price of the used cars in the future.

00:37.080 --> 00:39.530
So let us go ahead.

00:39.940 --> 00:44.730
Here, you can see that this particular price column is the target that you need to predict.

00:45.900 --> 00:53.710
Next is the ID column, which again is not of much use to you.

00:54.090 --> 01:01.350
And then we have this URL, which contains the links for these cars.

01:02.400 --> 01:05.070
You might not really be interested in these

01:05.070 --> 01:05.880
URLs

01:05.880 --> 01:09.180
as well. Then we have various regions.

01:10.570 --> 01:20.230
Region URLs as well. Then we have the year in which the car was built, then the manufacturer of the car,

01:20.270 --> 01:27.480
the model, the condition, the number of cylinders, the fuel, the odometer, and different details about

01:27.480 --> 01:27.780
that.

01:28.380 --> 01:32.490
Here we have different details which are present.

01:32.820 --> 01:39.420
Then the image URL, again, might not be of much use, but there is a lot of information present

01:39.420 --> 01:41.870
in this description column.

01:42.330 --> 01:52.050
So what you can do is you can try including this particular information by applying a little bit of

01:52.050 --> 01:58.230
NLP on top of it, creating various features using this particular description which is given here,

01:59.340 --> 02:05.550
and then you can use it to predict the prices. Again,

02:05.550 --> 02:07.000
you already know the drill.

02:07.320 --> 02:09.900
The task here is to find out the prices.

02:10.060 --> 02:15.750
So the first thing which you will be doing is you will get rid of the unwanted columns, find out the

02:15.750 --> 02:24.180
important columns, create features from categorical columns to convert them into numerical columns.

02:24.450 --> 02:28.430
Then you will have to convert this particular text again.

02:28.770 --> 02:36.870
So you might think about converting this text into, again, different count vectors or TF-IDF vectors

02:37.050 --> 02:43.530
and then use that. A solution has also been given for this particular problem.
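
NOTE
A minimal sketch of the steps described above, using only the Python standard
library: drop unwanted columns, one-hot encode a categorical column, and turn
the description text into count vectors and TF-IDF vectors. The two sample
rows, the choice of columns, and all function names are illustrative
assumptions, not taken from the dataset or from the provided solution. In
practice a library such as scikit-learn would likely be used instead of this
hand-rolled version, and its TF-IDF weighting may differ in detail.
```python
import math
from collections import Counter
# Two made-up rows standing in for the real data (values are illustrative only).
rows = [
    {"id": 1, "url": "https://example.com/a", "price": 5000, "fuel": "gas",
     "description": "clean title low miles"},
    {"id": 2, "url": "https://example.com/b", "price": 12000, "fuel": "diesel",
     "description": "clean truck new tires low price"},
]
def drop_columns(rows, unwanted):
    """Return copies of the rows without the unwanted columns."""
    return [{k: v for k, v in r.items() if k not in unwanted} for r in rows]
def one_hot(rows, column):
    """Replace a categorical column with one 0/1 column per category."""
    categories = sorted({r[column] for r in rows})
    encoded = []
    for r in rows:
        new = {k: v for k, v in r.items() if k != column}
        for c in categories:
            new[f"{column}_{c}"] = 1 if r[column] == c else 0
        encoded.append(new)
    return encoded
def count_vectors(texts):
    """Return the sorted vocabulary and one word-count vector per text."""
    tokenized = [t.lower().split() for t in texts]
    vocab = sorted({w for toks in tokenized for w in toks})
    vectors = []
    for toks in tokenized:
        counts = Counter(toks)
        vectors.append([counts[w] for w in vocab])
    return vocab, vectors
def tfidf_vectors(texts):
    """Weight each count by a smoothed inverse document frequency."""
    vocab, counts = count_vectors(texts)
    n = len(texts)
    # Document frequency: in how many texts each vocabulary word appears.
    df = [sum(1 for vec in counts if vec[j] > 0) for j in range(len(vocab))]
    idf = [math.log((1 + n) / (1 + d)) + 1 for d in df]
    return vocab, [[c * w for c, w in zip(vec, idf)] for vec in counts]
# Drop the ID and URL columns, then encode the categorical fuel column.
cleaned = one_hot(drop_columns(rows, {"id", "url"}), "fuel")
# Build text features from the description column.
descriptions = [r["description"] for r in rows]
vocab, counts = count_vectors(descriptions)
_, tfidf = tfidf_vectors(descriptions)
```
Words that appear in every description, such as "clean" here, get a lower
TF-IDF weight than words that appear in only one, such as "miles".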

02:43.890 --> 02:51.240
But I would suggest you work on this first and then go to the project in case you need that.

02:51.810 --> 02:53.100
So thank you.

02:53.490 --> 02:57.960
And I will let you know about the next project once you're done with this one.
