WEBVTT

00:00.990 --> 00:03.270
Now let's discuss regression.

00:04.470 --> 00:11.760
So let us assume that this is the data which we have in hand, and we have only one input variable, which

00:11.760 --> 00:13.500
is the education.

00:15.810 --> 00:20.580
And based on this education, we want to predict the income of the person.

00:22.770 --> 00:31.380
Now, we can predict this by creating a linear regression, and what linear regression suggests is that

00:31.620 --> 00:40.770
we will be drawing a particular line such that the line will be able to map all the values if possible.

00:42.730 --> 00:46.710
So let us see what lines we can create here.

00:48.120 --> 00:51.030
There could be a line created like this.

00:52.860 --> 00:54.420
There could be another line

00:55.400 --> 00:57.470
created like this.

00:58.970 --> 01:01.680
And there could be another line created

01:03.410 --> 01:04.250
like this.

01:08.340 --> 01:15.180
Now, each of these lines will be able to capture some of the data points, but none of the lines is

01:15.180 --> 01:17.970
actually capturing most of the data.

01:18.930 --> 01:25.430
So we have to find out which line will be able to capture maximum data points.

01:26.620 --> 01:30.560
So we will be finding the error which is present here.

01:31.330 --> 01:37.150
Now let us see how we will find out whether the values are correctly predicted or not.

01:37.900 --> 01:44.770
So for finding out if the values are predicted correctly or not, we will be dropping a perpendicular

01:44.920 --> 01:49.380
to each data point to find out the distance or the error value.

01:50.930 --> 01:54.350
So let us consider that this is the line.

01:55.980 --> 02:00.060
which we have actually created for this data.

02:03.080 --> 02:05.360
So when we are dropping the perpendiculars,

02:11.100 --> 02:13.080
The distances will be this.

02:15.880 --> 02:16.510
This.

02:18.100 --> 02:18.730
This.

02:20.500 --> 02:21.160
This.

02:52.020 --> 02:52.650
And this.

02:54.830 --> 03:03.950
Now, let us compare the distances. These distances, when compared to the distances from this particular

03:03.950 --> 03:04.300
line.

03:07.120 --> 03:14.570
will be a lot more. So the distances for this particular line will be larger than for the earlier one.

03:14.590 --> 03:17.600
So let me drop distances to this particular line.

03:17.620 --> 03:21.070
So if we drop the distances here, the distance would be less.

03:22.010 --> 03:26.630
Here the distance would be less; here the distance would be less.

03:29.280 --> 03:30.180
This is the same.

03:32.860 --> 03:38.680
Now, after this point, all the distances will be more than the previous distance.

03:42.120 --> 03:51.630
So you can see that in comparison with these two lines, this first line actually provides a better

03:52.050 --> 03:52.550
result.
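
NOTE
Editor's sketch of the comparison described above, in Python. The data points and the
two candidate lines are made-up values, not the ones on the slide. It measures the
vertical distance from each point to the line, which is the usual convention in linear
regression; that is an assumption here, since the lecture speaks of dropping a perpendicular.
```python
# Hypothetical (education_years, income) points; illustrative values only.
points = [(1, 12), (2, 19), (3, 32), (4, 41), (5, 48)]
def total_distance(slope, intercept, data):
    """Sum of absolute vertical distances between each point and the line."""
    return sum(abs(y - (slope * x + intercept)) for x, y in data)
# Two hypothetical candidate lines, each given as (slope, intercept).
line_a = (10, 0)
line_b = (5, 10)
cost_a = total_distance(*line_a, points)
cost_b = total_distance(*line_b, points)
# The line with the smaller total distance fits these points better.
better = "line_a" if cost_a < cost_b else "line_b"
print(cost_a, cost_b, better)
```
With these made-up numbers the first line has the smaller total distance, so it would be preferred.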

03:54.830 --> 04:01.850
So we will be creating different lines and we will be comparing the distances for these lines.

04:02.450 --> 04:08.840
So how we will be creating these lines and how we will be reducing the distance is the next thing in

04:08.840 --> 04:09.080
line.

04:10.100 --> 04:16.580
So let us try to solve this and see how we will actually create the equation of

04:16.580 --> 04:20.090
this line and how will we create this model?

04:25.100 --> 04:30.950
So what are the different types of equations which we have? There are linear equations, quadratic

04:30.950 --> 04:34.400
equations, cubic equations. So a linear equation

04:35.510 --> 04:37.250
will look something like this.

04:40.090 --> 04:49.270
And this equation will be something like y is equal to b1 x1 plus b2 x2 plus b3 x3

04:49.270 --> 04:51.250
plus b4 x4 and so on.

04:52.230 --> 04:55.580
And in the end, one constant, b0, will be added to it.

04:56.800 --> 05:04.570
Because the original formula of a line is y is equal to mx plus c, which is the original equation

05:04.570 --> 05:07.510
for a line, where c is the

05:08.460 --> 05:11.690
intercept and m is the slope.

05:13.340 --> 05:19.580
Similarly, we will be creating this particular equation because here we have multiple X values, let

05:19.580 --> 05:21.230
us get back to the data which we have.

05:22.510 --> 05:29.860
So this is the data which we have. So the age, amount, salary, dependents, sex and children will be the different

05:29.860 --> 05:33.310
x values, that is x1, x2, x3, x4, x5.

05:33.550 --> 05:37.390
And the interest rate will be the value which we are predicting.

05:37.870 --> 05:43.750
So the equation will be some coefficient multiplied with the age, another coefficient multiplied with

05:43.750 --> 05:50.200
the amount, another coefficient multiplied with salary, another coefficient multiplied with dependents, another coefficient

05:50.200 --> 05:54.970
multiplied by sex, another coefficient multiplied by children, and the sum of all of these

05:56.580 --> 06:03.420
should be equal to the value of the interest rate. Now, out of this, the only thing which we need to

06:03.420 --> 06:10.410
find out is the value of the coefficients, because the y value

06:11.780 --> 06:17.990
and the x1, x2, x3 values are something which we already have. So the only thing which we have

06:17.990 --> 06:23.050
to formulate is the value of b1, b2, b3, b4 and so on.
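
NOTE
Editor's sketch of the linear equation described above, in Python. The feature row,
the coefficients b1..b6 and the constant b0 are all hypothetical values chosen for
illustration; in practice the coefficients are what the model has to learn.
```python
def predict(coefficients, intercept, features):
    """Return intercept + b1*x1 + b2*x2 + ... for one row of feature values."""
    if len(coefficients) != len(features):
        raise ValueError("need one coefficient per feature")
    return intercept + sum(b * x for b, x in zip(coefficients, features))
# Hypothetical row: age, amount (thousands), salary (thousands), dependents, sex (0/1), children.
row = [30, 8, 4, 2, 1, 1]
# Hypothetical coefficients b1..b6 and constant b0; a real model would learn these.
b = [0.125, 0.5, -0.25, 0.5, 0.0, 0.25]
b0 = 2.0
interest_rate = predict(b, b0, row)
print(interest_rate)
```
The x values are given by the data, so only b0 and the b list change while the model is being fitted.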

06:24.790 --> 06:31.060
Now, while we are creating this equation, we have to make sure that the predictions which we are

06:31.060 --> 06:33.470
making are good predictions.

06:33.850 --> 06:36.830
So what are the different features of a good prediction?

06:37.270 --> 06:41.040
So a good prediction model will give accurate results.

06:41.650 --> 06:45.050
So a good model will give accurate results.

06:45.220 --> 06:51.250
So a model which will build this particular equation will be a better model in comparison to the model

06:51.250 --> 06:53.110
which will give this particular equation.

06:55.460 --> 06:59.940
Next is the accuracy should not be limited to just one data point.

07:00.440 --> 07:07.700
So let us say we are looking at a particular city for which we are trying to find out how much water

07:07.700 --> 07:10.160
or how much food is needed in that particular city.

07:11.090 --> 07:15.040
And similarly, we are looking for multiple cities in the country.

07:16.000 --> 07:26.470
So now our task is to find out the amount of food required as accurately as possible so that no specific

07:26.470 --> 07:33.010
city has extra food and no specific city has a lesser amount of food.

07:34.550 --> 07:41.870
So we're going to have to make sure that all the cities have sufficient amount of food available so

07:41.870 --> 07:43.780
that no one dies of hunger.

07:46.400 --> 07:54.740
We cannot do something like we predict one requirement for one particular city correctly, and the predictions

07:54.740 --> 08:02.240
are very wrong or very inaccurate for the other cities, because in that case, other cities, the people

08:02.240 --> 08:04.530
who are living in other cities will die of hunger.

08:05.210 --> 08:05.550
Right.

08:05.750 --> 08:15.110
So the predictions which we are making have to be in sync and accurate or good enough for all the data

08:15.110 --> 08:18.890
points or at least for the maximum number of data points.

08:20.830 --> 08:27.910
So now the model which we will be creating, or the predictions

08:27.910 --> 08:35.350
which we will be making, should be performing equally well on the testing data as on the training data.

08:35.780 --> 08:36.280
Now.

08:37.650 --> 08:44.280
As of now, we don't know what testing data is and what training data is, so let us try to understand

08:44.280 --> 08:44.940
that a little.

08:46.830 --> 08:53.310
So let's take the example of preparing for examination again.

08:54.930 --> 09:02.610
Now for an examination we are preparing from a particular information or from a particular book.

09:03.730 --> 09:12.010
Now, this particular book is the information source material which we have. Now, what happens is the

09:12.010 --> 09:16.200
book will have certain questions at the end of each and every chapter.

09:17.310 --> 09:24.540
So when we are learning from this particular book, we will be practicing on those practice questions,

09:24.540 --> 09:28.200
which we will be having at the end of each and every chapter.

09:29.560 --> 09:37.930
So when we will be doing self evaluation, while we will be training ourselves to give the exam, we

09:37.930 --> 09:44.290
will be self evaluating ourselves on the basis of these practice questions which are present at the

09:44.290 --> 09:45.190
back of the book.

09:46.420 --> 09:51.470
So we will be able to perform really well if we are really studying hard.

09:51.490 --> 09:56.070
We will be able to perform really well on the training data.

09:56.080 --> 10:01.720
that is, the evaluation questions or the practice questions at the back of the book.

10:02.940 --> 10:11.700
So now the testing data is basically the actual exam, the exam which we will be giving the day after

10:11.700 --> 10:20.250
tomorrow. There we will have certain questions which will not be present in the book, but which will be based

10:20.250 --> 10:22.780
on the concepts which are present in the book.

10:23.430 --> 10:29.070
So when we are learning from the data and we are learning from the information which is provided to

10:29.070 --> 10:35.630
us, we want to perform well on the training data or on the practice questions also.

10:35.790 --> 10:43.400
And we are performing well on the practice questions, but we also want to perform well on the final examination

10:43.420 --> 10:43.650
too.

10:45.110 --> 10:51.620
So this is what it means, that it should perform equally well on the training data, that is, the practice

10:51.620 --> 10:58.430
questions, and on the testing data, which is the final evaluation. So that is a feature of

10:59.510 --> 11:00.310
a good model.

11:02.450 --> 11:10.340
Now, the next thing is that a simpler model is a better model; complex models tend to overfit.

11:11.090 --> 11:14.720
Now let's ignore what overfitting means.

11:15.700 --> 11:19.310
OK, let us ignore what overfitting means for now.

11:19.500 --> 11:19.790
Fine.

11:19.820 --> 11:26.130
Just try to understand that we want to make our model as simple as possible.

11:27.380 --> 11:33.710
And what a complex model does, we will look at in some time. So, so far:

11:33.740 --> 11:37.670
Now the model should give accurate results.

11:38.000 --> 11:41.680
The accuracy should not be limited to just one data point.

11:41.960 --> 11:47.410
It should perform equally well on the testing data as on the training data.

11:47.630 --> 11:52.640
And a simpler model is a better model than a complex model.

12:05.780 --> 12:13.880
Now, let us see what regression is. So till now, what we have discussed is that regression is when we

12:13.880 --> 12:16.820
are trying to predict a continuous value.

12:17.990 --> 12:25.190
So regression takes a group of random variables, thought to be predicting y,

12:26.610 --> 12:30.570
and tries to find a mathematical relationship between them.

12:32.260 --> 12:39.610
So this relationship is typically in the form of a straight line that best approximates all the individual

12:39.610 --> 12:40.330
data points.

12:43.050 --> 12:49.650
So what this means is that I have certain data regarding education and

12:49.650 --> 12:57.420
income, so I want to set up a relationship between the education and the income of the person so that

12:57.600 --> 13:05.130
the next time someone tells me that this is their education, then I can easily guess what

13:05.130 --> 13:06.110
their income would be.

13:09.680 --> 13:17.560
So what is linear regression? Linear regression is used for continuous target values like age,

13:17.810 --> 13:21.290
sales, interest rate or house prices.

13:22.490 --> 13:29.930
So, let us say, we can have data related to the number of bedrooms, the number of bathrooms,

13:30.170 --> 13:37.400
the floor of the building and the location of the building, and based on that, we

13:37.400 --> 13:40.580
will try to find out what should be the price of the house.

13:41.150 --> 13:49.280
Similarly, we can get details about a person's height and weight and try to predict their age.

13:50.240 --> 13:53.180
So these are different things which we can predict.

13:53.390 --> 13:59.450
So when we are trying to predict a continuous value, then it is called a linear regression problem.

14:01.270 --> 14:08.650
Now, what is the expectation? The expectation is to learn the relationship between the independent

14:08.650 --> 14:19.210
input data and the dependent target values. The independent input data is, let us say, for the

14:19.540 --> 14:20.610
loan problem.

14:20.890 --> 14:29.290
The independent data could be the age of the person, the gender of the person, the amount which the

14:29.350 --> 14:33.410
user is requesting the loan for, the number of dependents.

14:33.550 --> 14:35.020
So all of these will be the

14:36.230 --> 14:41.930
independent input data, and the dependent target value will be the interest rate, which we are trying

14:41.930 --> 14:42.470
to predict.

14:44.240 --> 14:52.860
Now, for finding this out, we will calculate the slope and the coefficients to create a linear equation which

14:52.860 --> 14:55.150
could align to the given data properly.

14:55.850 --> 14:59.230
So we want to create a linear equation.

14:59.420 --> 15:03.860
So the equation consists of the slope and the

15:07.510 --> 15:09.670
coefficients of the values.

15:11.920 --> 15:14.230
So that is what we are trying to predict here.

15:15.210 --> 15:18.390
And in this entire thing, we want to minimize the cost.

15:19.150 --> 15:26.730
Now, let us try to ask: what exactly is a predictive model? We have discussed what the

15:26.730 --> 15:27.420
equation is.

15:27.630 --> 15:32.680
But our main concern here is to know what cost is.

15:32.880 --> 15:35.040
We want to understand what the cost is.

15:35.400 --> 15:38.230
So how will we understand what cost is?

15:38.820 --> 15:44.220
So to understand the cost, we have to understand what a predictive model is.

15:45.630 --> 15:55.890
So for us, a predictive model is just like the application or just like a box in which we provide our

15:55.970 --> 15:56.820
training data.

15:58.090 --> 16:05.050
So we will provide the training data, which is the independent input data and the dependent target

16:05.080 --> 16:13.720
values, and we will provide it to this box, which we have, and then this box will try to decrypt

16:13.930 --> 16:18.610
the relationship between these independent and dependent variables.

16:20.270 --> 16:26.830
So it will try to establish a relationship between these, and it will create a function internally,

16:26.960 --> 16:34.130
so that it will apply that function or it will apply that equation on top of these X values.

16:34.400 --> 16:38.570
And using this equation, it will be able to find out the y value.

16:39.920 --> 16:48.380
So when we are training the model or when we are letting the machine learn from the data, we are basically

16:48.380 --> 16:55.760
providing the input data and the output data, the input x values and the output y values, to the machine.

16:55.970 --> 17:03.160
And the machine will try to find out the patterns and the relationship from this data by itself, and create

17:03.180 --> 17:04.220
the equation internally.

17:06.490 --> 17:15.250
And then once this equation has been created, then what happens is whenever I have any other problem

17:15.250 --> 17:22.300
in hand, whenever I have any other input data, again, I can just provide this input data to the machine

17:22.450 --> 17:30.010
and it will apply the formula on this data and automatically provide me the output value as a prediction.

17:34.450 --> 17:42.580
Now, when it is providing me this output data as a prediction, this output data, which is the y-hat

17:42.580 --> 17:48.070
value, will not be exactly the same as the value which is expected.

17:48.980 --> 17:54.500
OK, the value which we have, the value which we are predicting, will not be

17:55.480 --> 17:59.590
exactly the same as the value which is in the real life scenario.

18:01.150 --> 18:09.730
So what do we do about this gap, this difference between the original value and what we have predicted?

18:10.060 --> 18:14.180
So, for example, we are trying to predict the height of a person.

18:15.200 --> 18:21.650
And so when we are creating a model, we will have given some detail about the person and the height

18:21.650 --> 18:23.330
of the person to the model.

18:23.990 --> 18:31.250
So now the model will try to analyze the characteristics of the person and it will try to find

18:31.250 --> 18:36.340
out the relationship between the characteristics and the height of the person.

18:37.490 --> 18:42.850
So it will try to do that by creating an equation internally.

18:43.700 --> 18:51.140
Now, while it is creating the equation internally, it means that it is fitting the equation to the data

18:51.140 --> 18:52.430
which it already has.

18:53.120 --> 18:56.210
So it will try to check for the data again.

18:56.420 --> 18:59.800
So at one moment it will learn from the data.

19:00.620 --> 19:07.850
At one moment it will learn from this data and this value of the interest rate, and later it will try

19:07.850 --> 19:10.660
to predict this interest rate.

19:11.090 --> 19:13.210
So it will try to practice again.

19:13.850 --> 19:20.300
You remember the example of the book? So it will try to evaluate itself based on the questions at

19:20.300 --> 19:22.040
the back of the chapter.

19:23.140 --> 19:29.450
So this is checking from the back of the chapter. So we try to find out what the exact answer to

19:29.480 --> 19:36.390
the question is and what answer the machine has given to us. And the difference between the exact answer,

19:36.400 --> 19:43.120
the actual answer and the answer, which the machine has given, is called the error.

19:44.260 --> 19:49.860
It is the error value between the actual answer and what the machine has predicted.

19:51.030 --> 19:59.970
And when we add up the errors for all the observations, because we will not have just one observation,

19:59.970 --> 20:04.770
right? We will not have just one practice question; there will be multiple practice questions.

20:05.400 --> 20:12.060
So then we will be evaluating on all the practice questions, the error value for all the questions

20:12.060 --> 20:16.310
will be summed up, and that is the cost of the model.

20:20.170 --> 20:28.090
That is called the cost of the model, which is the sum of all the error values, and

20:30.050 --> 20:40.040
not just one particular error. Our target is to reduce the entire cost. The entire cost

20:40.040 --> 20:48.890
might reduce by reducing one value also, or by a cumulative reduction in all the errors.

20:49.730 --> 20:57.890
So our target is to reduce the cumulative error instead of reducing the error for just one value.

20:57.890 --> 21:05.360
Because if we are reducing the error for just one value, in that case the error for other values might

21:05.360 --> 21:05.940
increase.

21:06.230 --> 21:07.570
So we don't want that.

21:07.850 --> 21:10.490
We want to reduce the overall error.

21:16.290 --> 21:25.580
So let us see. So we have this: y is equal to f of x. So we have different x values.

21:25.980 --> 21:34.780
So out of those different x values, we want to find out a function which will give y as the output.

21:35.670 --> 21:41.710
So this function is actually the linear equation which we will be forming.

21:42.300 --> 21:51.480
So the equation will be b0 plus b1 x1 plus b2 x2 plus b3 x3

21:51.750 --> 21:54.120
plus b4 x4 and so on.

21:56.970 --> 22:05.010
So y is actually the target value; for example, it is the price of the flight ticket or the height

22:05.010 --> 22:11.100
of the person or the interest rate, which we are trying to find out, which is the dependent value.

22:13.450 --> 22:22.510
And x1, x2, x3, x4: these are the independent values, or the features, or the attributes,

22:22.900 --> 22:28.900
which are the values which we provide so that the relationship between these values and this value can

22:28.900 --> 22:29.440
be found.

22:29.830 --> 22:31.180
So what are these values?

22:31.180 --> 22:36.510
These values could be the season, the origin, the destination, the nearest holiday date.

22:36.780 --> 22:43.480
So these are the kinds of things which actually impact the target value.

22:46.090 --> 22:55.140
And when we have the training data, the original y and x values are already present with us.

22:55.390 --> 23:01.580
So we already know what the y and x values are; hence the y and x values are constant.

23:04.070 --> 23:11.120
And then we try to set the various b values so as to bring the predictions closer to the

23:11.120 --> 23:12.340
actual value.

23:14.310 --> 23:17.730
Now, the value which we predict is y-hat.

23:18.540 --> 23:24.120
So when we subtract y-hat from the actual value, we get the error value.

23:25.140 --> 23:30.930
So we have taken the value which we have predicted and subtracted it from the actual value,

23:30.960 --> 23:32.570
and this is the error value.

23:35.210 --> 23:42.290
So in this equation, we already know what y is, we already know what x1, x2, x3 are; we only

23:42.290 --> 23:44.690
need to find out b0, b1 and so on;

23:44.900 --> 23:46.490
that is, all the b values.

23:51.770 --> 23:59.660
So what is the process? The process will be to convert all the variables into numerical ones. Whether they

23:59.660 --> 24:09.360
are ordinal or whether they are categorical, we will have to convert all these variables into numerical ones.

24:10.190 --> 24:16.720
And after that, the categorical variables should be converted into dummy variables.

24:17.420 --> 24:20.190
We have to remove the outliers in the data.

24:20.450 --> 24:26.480
And in case there are any missing values, we will have to impute those missing values by

24:28.660 --> 24:33.020
placing the mean value or the median value in their place.
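
NOTE
Editor's sketch of the dummy-variable step mentioned above, in plain Python. The
column values are hypothetical and the helper name to_dummies is made up for this
illustration; it is not a library function.
```python
def to_dummies(values):
    """Turn one categorical column into 0/1 dummy columns, one per category."""
    categories = sorted(set(values))
    return {c: [1 if v == c else 0 for v in values] for c in categories}
# Hypothetical 'sex' column from the loan data.
sex = ["male", "female", "female", "male"]
dummies = to_dummies(sex)
print(dummies)
```
In practice one of the dummy columns is often dropped, since its value is implied by the others.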

24:34.800 --> 24:41.460
Once we have done this, the transformations made to the training data should also be reflected in the testing

24:41.460 --> 24:41.860
data.

24:42.360 --> 24:48.840
So whatever transformation we will be doing on the training data, the same transformation we will be

24:48.840 --> 24:50.170
doing on the testing data.

24:52.460 --> 25:00.020
Because the machine does not really know what the interest rate is or what the age of a person is,

25:01.070 --> 25:09.770
the machine will not understand that; the machine will only understand what the numbers are.

25:10.920 --> 25:19.730
So it will simply apply the same operations on the numbers which it sees in the training data and

25:19.740 --> 25:22.380
the ones which it sees in the testing data.

25:23.190 --> 25:29.150
So that is the reason why the transformations which are made in the training data should be made

25:29.160 --> 25:37.530
in the testing data so that the formula which the machine has created should be applicable on both the

25:37.950 --> 25:38.390
data sets.
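
NOTE
Editor's sketch of applying the same transformation to training and testing data,
using mean imputation as the example. The age columns are hypothetical, and the
helper names fit_mean and impute are made up for this illustration.
```python
from statistics import mean
def fit_mean(column):
    """Learn the fill value from the training column, ignoring missing entries."""
    return mean(v for v in column if v is not None)
def impute(column, fill_value):
    """Replace missing entries (None) with the given fill value."""
    return [fill_value if v is None else v for v in column]
# Hypothetical 'age' columns; None marks a missing value.
train_age = [20, None, 40, 30]
test_age = [None, 50]
# The fill value is learned from the training data only.
fill = fit_mean(train_age)
train_clean = impute(train_age, fill)
# The same learned transformation is reused on the testing data.
test_clean = impute(test_age, fill)
print(train_clean, test_clean)
```
The design choice here is that nothing is re-learned from the testing data, so the model's formula sees numbers prepared the same way in both sets.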

25:43.670 --> 25:55.190
Metrics are the types of measures which are used to evaluate the model. So the different types of metrics

25:55.190 --> 25:59.290
are the mean squared error and the mean absolute error.

26:01.280 --> 26:02.810
So the mean squared error

26:02.810 --> 26:07.850
is the mean of the squared errors, and the mean absolute error

26:08.900 --> 26:12.160
is the mean of the absolute errors.

26:12.920 --> 26:17.280
And why have we applied the square and the absolute value?

26:17.660 --> 26:21.740
This is because there is no direction in the error.

26:23.030 --> 26:29.090
We will understand this in a while, but for now there is no specific direction in the error.

26:29.100 --> 26:35.930
That is why we use the mean squared and the mean absolute error. And the goodness of the predictions would depend

26:35.930 --> 26:38.690
on the scale of the target values.

26:38.930 --> 26:45.440
So if we are trying to find out how good the values are, how good our predictions are,

26:46.040 --> 26:54.260
that is not something which will be decided on our own basis, but it will be decided on the scale

26:54.260 --> 26:54.980
of the data.

26:56.130 --> 27:03.210
So let's say we are working with some chemicals, we are working in a chemical industry and we have

27:03.210 --> 27:05.340
to create some products based on that.

27:06.990 --> 27:14.850
So when we are working in a chemical industry and we are creating a particular kind of compound or some

27:15.180 --> 27:17.880
specific kind of chemical.

27:18.120 --> 27:27.570
So in that case, even a small milligram or a small amount of the chemical can change the composition

27:27.570 --> 27:39.390
of what we are making. Whereas when we are looking at, let us say, the salt packaging industry, there, if we

27:39.390 --> 27:48.060
are dealing with gallons or we are dealing with tons of salt, then a small milligram of salt will not

27:48.060 --> 27:48.960
make a difference.

27:49.800 --> 27:53.950
So you see, it is depending on the target value.

27:53.970 --> 27:56.280
What exactly are you dealing with?

27:56.550 --> 28:00.080
The scale of the prediction will change.

28:00.900 --> 28:07.440
So if the prediction value is off by even one milligram in the chemical industry, it will cause

28:07.440 --> 28:08.460
a lot of issues.

28:08.910 --> 28:17.970
But if the prediction is off by five grams or 10 grams in the packaging industry, then it

28:17.970 --> 28:19.730
will not make much difference.

28:21.620 --> 28:24.860
So this is what goodness of prediction means.

28:27.600 --> 28:35.760
So now let's say we have the actual prices as 1,000, 980, 950, 1,020 and

28:35.760 --> 28:41.430
1,050, and we have predicted the prices using the formula.

28:41.610 --> 28:43.980
And these are the predicted prices, which we have.

28:46.000 --> 28:53.880
Now, with these predicted prices, you can see the differences. The difference between these

28:54.250 --> 28:54.820
will be.

28:55.970 --> 28:58.460
Here, the difference is this.

28:58.490 --> 29:02.650
Here, the difference is this; here, the difference is minus hundred.

29:03.020 --> 29:11.860
So now if I have these errors, if I add these errors, the sum comes out to be only 10.

29:15.120 --> 29:19.710
But that does not mean that the error is of only 10.

29:21.250 --> 29:29.320
The error is actually of 371, but because these negative signs are canceling things out, the error

29:30.400 --> 29:38.020
appears small. So this is the reason why we need to consider the absolute error. The absolute

29:38.020 --> 29:46.450
error will actually take into consideration all the error values, whether it is a negative or a positive

29:46.450 --> 29:46.600
error.
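
NOTE
Editor's sketch of why signed errors can hide the real error. The actual prices are
the ones read out in the lecture; the predicted prices are hypothetical, chosen only
so that the signed errors sum to 10, and are not the values on the slide.
```python
actual = [1000, 980, 950, 1020, 1050]
# Hypothetical predictions; illustrative values only.
predicted = [900, 1060, 860, 1120, 1050]
# Error for each observation: actual value minus predicted value.
errors = [a - p for a, p in zip(actual, predicted)]
# Positive and negative errors cancel each other in a plain sum.
signed_sum = sum(errors)
# Taking absolute values or squares removes the sign before adding.
absolute_sum = sum(abs(e) for e in errors)
mae = absolute_sum / len(errors)
mse = sum(e * e for e in errors) / len(errors)
print(signed_sum, absolute_sum, mae, mse)
```
With these made-up predictions the signed sum is 10 while the absolute sum is 370, which is the cancellation effect described above.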

29:48.780 --> 29:53.710
So the main focus in predictive analysis is to reduce the overall error.

29:54.900 --> 29:59.420
Now, what is the overall error? The predicted value is y-hat.

30:00.560 --> 30:07.370
This is the value which is generated from the function which we have applied. So the error value

30:07.370 --> 30:14.360
will be y minus y-hat, and the sum of all the error values will be the cost of the

30:15.540 --> 30:16.860
model which we have built.

30:23.310 --> 30:27.660
Now, let us have a look at this. So this is the input value.

30:31.050 --> 30:39.510
This is the model which we have generated. So this input value, when input into the machine learning model,

30:39.840 --> 30:48.270
should ideally have generated the value y, but because our model is not perfect, it will

30:48.270 --> 30:49.350
generate a value,

30:49.350 --> 30:49.710
y-hat.

30:51.900 --> 30:55.590
So this is the actual output and this is the predicted.

30:56.160 --> 30:57.670
This is what we have predicted.

30:57.960 --> 31:01.140
So the error value will be the actual value

31:02.640 --> 31:09.770
and predicted value's difference. So we will find out the difference between the predicted value and the actual

31:09.770 --> 31:11.740
value, and that is the error value.

31:13.170 --> 31:21.900
And then we will add these error values after applying the absolute function or by squaring these values

31:21.900 --> 31:25.620
so that the signs do not actually cancel out the error values.

31:27.690 --> 31:29.580
So here we have a plot.

31:30.900 --> 31:37.140
This plot is between height and weight, so you can see there is a positive relationship between height

31:37.140 --> 31:40.710
and weight, so as the height increases, the weight increases.

31:41.040 --> 31:43.980
So we have the equation of a line

31:45.290 --> 31:46.640
between height and weight.

31:48.250 --> 31:54.970
And the point where the line intersects the y-axis is called the intercept.

31:56.870 --> 32:04.550
And the slope of the equation is the change in y divided by the change in x.

32:04.670 --> 32:06.820
This is called the slope of the equation.

32:07.910 --> 32:12.620
So the equation will be y is equal to

32:14.750 --> 32:25.250
the slope into x plus the b0. So this is the equation of the line which we have, that is,

32:25.250 --> 32:35.840
y is equal to mx plus c, where m is the slope, x is the x value, and c is the intercept, where

32:35.840 --> 32:38.050
the line intersects the y-axis.
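
NOTE
Editor's sketch of the slope and intercept described above, in Python. The two
(height, weight) points are hypothetical, and the helper name line_through is made
up for this illustration.
```python
def line_through(p1, p2):
    """Return (slope, intercept) of the straight line through two points."""
    (x1, y1), (x2, y2) = p1, p2
    if x1 == x2:
        raise ValueError("vertical line: slope is undefined")
    # Slope: change in y divided by change in x.
    slope = (y2 - y1) / (x2 - x1)
    # Intercept: the value of y where the line crosses the y-axis (x = 0).
    intercept = y1 - slope * x1
    return slope, intercept
# Hypothetical (height_cm, weight_kg) points lying on the line.
m, c = line_through((150, 50), (170, 60))
# Using y = m*x + c to read off the weight for a height of 160 cm.
weight_at_160 = m * 160 + c
print(m, c, weight_at_160)
```
Here m plays the role of the slope and c the intercept in y = mx + c.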

32:43.540 --> 32:48.460
In the next section, we will discuss the cost function.