1
00:00:00,050 --> 00:00:06,290
Hi there and welcome to this new session in which we shall be building a linear regression model for

2
00:00:06,290 --> 00:00:08,570
predicting the price of second-hand cars.

3
00:00:08,570 --> 00:00:14,990
In this section, we shall make use of different features, like the number of years the car has spent

4
00:00:14,990 --> 00:00:21,650
on the road, number of kilometers covered, the car rating, the car's condition, the current state

5
00:00:21,650 --> 00:00:26,180
of the economy, the top speed, the horsepower and the torque,

6
00:00:26,180 --> 00:00:34,010
in order to predict the current price of that car. By the end of the section, we shall have built our model

7
00:00:34,010 --> 00:00:42,260
such that the model's predicted prices are very close to the actual prices.

8
00:00:42,260 --> 00:00:47,990
As you can see in this plot, we have the predicted prices in orange and the actual prices in blue.

9
00:00:48,020 --> 00:00:55,790
That said, in this section and in other machine learning projects we shall be working on, we will

10
00:00:55,790 --> 00:01:00,020
make use of this machine learning development life cycle.

11
00:01:00,020 --> 00:01:03,530
So we will start by defining the task.

12
00:01:03,530 --> 00:01:09,890
Once the task is defined, we will try to understand and prepare the data.

13
00:01:10,010 --> 00:01:14,810
Then once this is prepared, we will dive into building our model.

14
00:01:14,900 --> 00:01:20,240
After building the model, we shall come up with a way of measuring the error

15
00:01:20,420 --> 00:01:22,610
the model is going to make.

16
00:01:22,610 --> 00:01:28,460
And then from here we'll dive into training our model and optimizing it.

17
00:01:28,700 --> 00:01:35,180
Once the model is trained, we shall measure the model's performance. Based on the model's performance,

18
00:01:35,180 --> 00:01:41,780
we shall then carry out corrective measures in order to improve that model.

19
00:01:41,900 --> 00:01:48,500
So now, if we consider a simplified version of our task, here we have only a single feature.

20
00:01:48,500 --> 00:01:50,960
We have just this one feature right here.

21
00:01:50,960 --> 00:01:55,310
And then we have our price in K dollars.

22
00:01:55,310 --> 00:02:02,390
Then we will train our model with these known input-output pairs, which we have right here.

23
00:02:02,420 --> 00:02:06,320
Now our input, in this case, is just the horsepower.

24
00:02:06,320 --> 00:02:11,750
But as we have already seen in the notebook, we actually have many more inputs.

25
00:02:11,750 --> 00:02:16,070
So we have years, kilometers, ratings, up to the torque.

26
00:02:16,070 --> 00:02:21,740
So just for simplicity, we've decided to select only one input feature.

27
00:02:21,740 --> 00:02:23,960
So here we have our inputs.

28
00:02:23,960 --> 00:02:24,770
That's it.

29
00:02:24,770 --> 00:02:28,520
And then we have our output which is the current price.

30
00:02:28,520 --> 00:02:35,180
So, getting back to our board, what we were saying is we are going to train our model on these

31
00:02:35,180 --> 00:02:36,080
input-output pairs.

32
00:02:36,080 --> 00:02:40,460
That's the horsepower and its corresponding price: horsepower and price.

33
00:02:40,460 --> 00:02:47,900
And then once we're done with the training, we shall then be able to predict the price of a car when

34
00:02:47,900 --> 00:02:50,000
given only the horsepower.

35
00:02:50,000 --> 00:02:56,000
So when given a value of 80, we should be able to predict that the price is, let's say, for

36
00:02:56,000 --> 00:03:00,020
example, x K dollars.

37
00:03:00,020 --> 00:03:08,210
Now we could represent this table right here by a plot like this one to the right where we would have

38
00:03:08,210 --> 00:03:13,340
on the x-axis the horsepower and on the y-axis the price in K dollars.

39
00:03:13,340 --> 00:03:21,380
And the idea now will be to come up with a mathematical function like, say, a straight line,

40
00:03:21,380 --> 00:03:29,780
y = mx + c, where the values of m and c are going to be chosen

41
00:03:29,780 --> 00:03:36,380
after we've trained the model on the different input-output pairs (x, y).

42
00:03:36,380 --> 00:03:44,540
So after training on (x, y), the model will learn to choose the best m and c that suit this

43
00:03:44,540 --> 00:03:52,340
data set such that if we have a new point, let's say if we have some new points around here, we shall

44
00:03:52,340 --> 00:03:56,930
be able to estimate the price based on the horsepower.

45
00:03:56,930 --> 00:04:04,430
And so now, if we come up with this function, with the values of m and c obtained after training

46
00:04:04,430 --> 00:04:07,760
the model, and we get a new input like this one,

47
00:04:07,760 --> 00:04:15,710
we shall make use of this function to obtain the price of the car.
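
The single-feature idea described above, choosing m and c for y = mx + c from known horsepower and price pairs and then estimating the price for a new horsepower, can be sketched roughly as follows. This is only an illustration under stated assumptions: the data values are hypothetical and not taken from the session's table, the names `fit_line` and `predict` are made up for this sketch, and the fit uses the closed-form least-squares solution rather than whatever training procedure the session goes on to use.

```python
def fit_line(xs, ys):
    """Closed-form least-squares fit of y = m*x + c for a single input feature."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope: covariance of x and y divided by the variance of x.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    m = sxy / sxx
    # Intercept: the fitted line passes through the point of means.
    c = mean_y - m * mean_x
    return m, c


def predict(m, c, x):
    """Estimate the output for a new input using the fitted line."""
    return m * x + c


# Hypothetical input-output pairs (assumption, not the session's data):
# horsepower as the input, price in K dollars as the output.
horsepower = [60.0, 100.0, 120.0, 140.0]
price_k = [8.0, 12.0, 14.0, 16.0]

m, c = fit_line(horsepower, price_k)

# Estimate the price for a new car given only its horsepower.
estimate = predict(m, c, 80.0)
print(f"m = {m:.3f}, c = {c:.3f}, estimated price = {estimate:.2f} K dollars")
```

With several input features (years, kilometers, rating and so on), the same idea would extend to one coefficient per feature plus an intercept.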
