WEBVTT

00:01.810 --> 00:08.590
In this session, we will discuss decision trees. So far we have discussed the linear

00:08.590 --> 00:17.110
models. We have seen that linear models try to create the equation of a straight line and try to

00:17.110 --> 00:19.180
fit the data with it.

00:20.100 --> 00:28.200
In the case of decision trees, we create a sequence of decision nodes, or a tree-like structure, where

00:28.350 --> 00:36.120
we make decisions at different points, and at the end of all the decisions we get a clear

00:36.120 --> 00:36.610
winner:

00:36.810 --> 00:38.990
what the target value should be.

00:41.030 --> 00:42.620
So let us have a look at that.

00:45.180 --> 00:50.100
So in this particular tree, we will have different

00:51.690 --> 01:02.820
values. We have the dependent variable and the independent variables. In this kind of tree,

01:03.030 --> 01:08.040
the dependent variable is the data which is present inside the tree.

01:09.870 --> 01:20.100
That is our data, which tells us whether a child should play or not play, and the independent

01:20.100 --> 01:26.250
variables, also called the features or attributes, become the decision points.

01:27.580 --> 01:36.670
Each decision point is created from the feature columns. In this particular data, the feature

01:36.670 --> 01:43.420
columns contain the outlook of the day, the humidity of the day, and whether the day is windy or not.

01:45.410 --> 01:52.820
So this kind of structure is generated, and based on this structure, at the end of the tree

01:52.820 --> 01:58.190
we will find out whether a child should play or not play.

01:59.680 --> 02:02.840
So let us have a look at this thing.

02:05.210 --> 02:09.280
These are the types of nodes which we have in the case of decision trees.

02:10.100 --> 02:19.050
So the first node is the root node. The root node represents the entire population or the sample,

02:19.610 --> 02:24.350
and this further gets divided into two or more homogeneous sets.

02:26.440 --> 02:35.080
So here we have this entire tree, and the place where the tree begins is called the root node.

02:36.110 --> 02:47.960
The root node contains all the data, that is, all 14 rows of data which we have, and each decision which

02:47.960 --> 02:53.340
we make divides these 14 rows of data into smaller chunks.

02:54.260 --> 03:03.440
So these 14 rows of data, on the basis of the decision node, that is, outlook, are subdivided into five

03:03.440 --> 03:09.830
rows of data here, four rows of data in this section and another five rows of data in this section.

03:12.120 --> 03:19.530
Now, this structure, the tree structure where we have one node at the top and then a node at the

03:19.530 --> 03:26.760
bottom, is called a parent-child relationship, where the node above is called the parent node

03:26.850 --> 03:30.840
and the node which is created from it is called the child.

03:31.560 --> 03:39.060
Similarly, this is a parent node and the node which is created from it is called the child.

03:40.600 --> 03:47.050
And another example of parent and child would be this as the parent and this as the child.

03:48.130 --> 03:55.840
So let us look now at what splitting is. Splitting is the process of dividing a node into two or more

03:55.840 --> 03:56.530
sub-nodes.

03:58.260 --> 04:01.960
So here we have the first node where we have 14 rows of data.

04:02.340 --> 04:05.830
This node is subdivided into three nodes.

04:06.630 --> 04:12.540
So this node is divided on the basis of one rule, that is, one condition.

04:12.720 --> 04:13.980
So what is the condition?

04:14.160 --> 04:16.350
The condition is what is the outlook?

04:16.650 --> 04:20.820
Is the outlook sunny or is it overcast or is it rainy?

04:22.670 --> 04:32.960
So based on this decision, the root node has been subdivided. This subdivision of a node on

04:32.960 --> 04:37.340
the basis of a condition is called splitting of the node.

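NOTE
A minimal sketch of the split described above, assuming 14 illustrative rows that carry only the outlook and the target value. The row values and the helper name split_by are assumptions made for illustration and are not taken from the slide.
```python
from collections import defaultdict
# Hypothetical rows: (outlook, target). 14 rows in total, as in the root node.
rows = (
    [("sunny", "play")] * 2 + [("sunny", "dont_play")] * 3
    + [("overcast", "play")] * 4
    + [("rainy", "play")] * 3 + [("rainy", "dont_play")] * 2
)
def split_by(rows, column_index):
    """Divide a node's rows into sub-nodes, one per value of the chosen column."""
    sub_nodes = defaultdict(list)
    for row in rows:
        sub_nodes[row[column_index]].append(row)
    return dict(sub_nodes)
# The condition is: what is the outlook?
sub_nodes = split_by(rows, 0)
sizes = {outlook: len(node) for outlook, node in sub_nodes.items()}
print(sizes)
```
Every row of the parent node lands in exactly one sub-node, so the sub-node sizes add up to the size of the parent.
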
04:41.140 --> 04:44.020
Now, the next is the decision node.

04:44.830 --> 04:51.790
When a sub-node splits into further sub-nodes, then it is called a decision node.

04:52.970 --> 04:55.080
So this is the root node.

04:55.790 --> 05:04.240
Now we have another node, which is a sub-node. This sub-node is again divided into different child nodes.

05:04.610 --> 05:09.050
So this is subdivided into different nodes based on one condition.

05:09.320 --> 05:17.330
Now here the condition is humidity: either the humidity is less than or equal to 70 or the humidity

05:17.330 --> 05:18.710
is greater than 70.

05:20.080 --> 05:28.560
In this particular case, this sub-node is subdivided into two sub-nodes where the condition is windy;

05:28.800 --> 05:31.400
the decision which is being made is windy.

05:31.660 --> 05:36.490
So these two nodes are known as the decision nodes.

05:38.730 --> 05:48.690
Next is the leaf node or terminal node. A node that does not split is called a leaf or terminal node, that

05:48.690 --> 05:56.320
is, the nodes at the very end of the tree are known as leaf nodes or terminal nodes.

05:58.830 --> 06:05.980
Now, what is a branch or a subtree? A subsection of the entire tree is called a branch or a subtree.

06:06.780 --> 06:08.530
So this is a subsection.

06:08.790 --> 06:11.190
Again, this one is another subsection.

06:11.340 --> 06:16.020
So both of these are called branches or subtrees.

06:19.050 --> 06:20.770
Next is parent and child.

06:21.030 --> 06:27.210
So, again, the node which is above the sub-node is called the parent node, and

06:27.210 --> 06:31.590
the node which is created from the parent node is called the child.

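NOTE
The vocabulary above (root node, decision node, leaf node, parent and child) can be sketched as a small data structure. This is only an illustration: the Node class, the leaf and predict helpers, and the per-node counts are assumptions that loosely follow the play example, not the instructor's code.
```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Optional
@dataclass
class Node:
    counts: Dict[str, int]  # target values of the rows that reached this node
    rule: Optional[Callable[[dict], str]] = None  # picks a child; None means leaf node
    children: Dict[str, "Node"] = field(default_factory=dict)
    def is_leaf(self) -> bool:
        return self.rule is None
def leaf(play, dont_play):
    """A leaf (terminal) node: it has counts but no rule and no children."""
    return Node({"play": play, "dont_play": dont_play})
# Decision nodes: sub-nodes that split further.
sunny = Node({"play": 2, "dont_play": 3},
             rule=lambda row: "le_70" if row["humidity"] <= 70 else "gt_70",
             children={"le_70": leaf(2, 0), "gt_70": leaf(0, 3)})
rainy = Node({"play": 3, "dont_play": 2},
             rule=lambda row: "windy" if row["windy"] else "calm",
             children={"windy": leaf(0, 2), "calm": leaf(3, 0)})
# Root node: holds all 14 rows and is the parent of the three outlook sub-nodes.
root = Node({"play": 9, "dont_play": 5},
            rule=lambda row: row["outlook"],
            children={"sunny": sunny, "overcast": leaf(4, 0), "rainy": rainy})
def predict(node, row):
    """Follow the rules from the root down to a leaf, then take the majority value."""
    while not node.is_leaf():
        node = node.children[node.rule(row)]
    return max(node.counts, key=node.counts.get)
print(predict(root, {"outlook": "sunny", "humidity": 65, "windy": False}))
```
Any node together with everything below it, for example sunny with its two leaves, is a branch or subtree in the sense used above.
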
06:36.370 --> 06:36.820
Now.

06:37.850 --> 06:45.320
Let us have a look at the process of decision tree generation. Till now, what we have seen is what

06:45.320 --> 06:46.850
the decision tree looks like.

06:47.480 --> 06:55.310
Now we need to learn about the process of creating a decision tree. For creating a decision

06:55.310 --> 06:55.660
tree,

06:55.850 --> 06:57.840
we need to have the root node.

06:58.070 --> 07:02.400
We need to have an initial node with all the rows of data in it.

07:03.230 --> 07:10.730
Now, after we have all these rows of data, we need to define one particular rule on the basis of

07:10.730 --> 07:14.060
which we will be splitting the nodes.

07:14.840 --> 07:22.030
And once we have selected this particular rule, then we will be doing this split.

07:22.340 --> 07:28.700
And after this particular split has been made, then we will again evaluate all the rules which are

07:28.700 --> 07:36.260
present, and we will compare all the rules and find out the next best rule on the basis of which we will

07:36.260 --> 07:37.430
be making the split.

07:39.260 --> 07:46.850
Now, how we will generate these rules and how we will compare these rules is a secondary thing and

07:46.850 --> 07:50.640
we will discuss that in some time.

07:51.170 --> 07:55.780
Now, let us have a look at how we will make the decision.

07:56.700 --> 08:01.970
So we will have certain rules on the basis of which we will be subdividing.

08:02.760 --> 08:07.170
So at the final stage, we have this leaf node.

08:07.350 --> 08:13.950
So based on the majority values of the leaf node, we will decide if the child should play or not play.

08:14.280 --> 08:18.540
So here we have two values for play and zero values for don't play.

08:19.020 --> 08:26.580
So what we will do is decide that the child should play, because the majority is with play.

08:28.550 --> 08:35.180
So let us look at the process. The trees are generated by selecting the best rule possible at every

08:35.180 --> 08:35.490
split.

08:35.750 --> 08:42.450
So we will have a lot of rules, we will create a lot of rules, and we will have to select the best

08:42.470 --> 08:44.750
rule to make the split at every point.

08:45.620 --> 08:51.950
Now, there are two types of problems: one is classification, and the other is regression.

08:53.230 --> 09:00.280
Classification is where we want to find out a particular class, for example, if a loan should be approved

09:00.280 --> 09:09.040
or not, if someone will be defaulting on the loan or not, if a person should play or not, if I should

09:09.040 --> 09:10.420
be cooking today or not.

09:11.980 --> 09:14.950
So these would be classification problems.

09:15.190 --> 09:16.930
Now, what is a regression problem?

09:17.110 --> 09:20.810
A regression problem will be what is the height of the person?

09:20.980 --> 09:22.700
What is the income of the person?

09:22.900 --> 09:26.410
What should be the interest rate for a particular loan?

09:26.590 --> 09:31.510
So all of these, where the values are continuous in nature, would be regression problems.

09:31.900 --> 09:37.770
And a decision tree will help us find solutions to classification problems

09:37.780 --> 09:39.890
and to regression problems as well.

09:40.210 --> 09:43.780
We can solve both types of problems using a decision tree.

09:45.450 --> 09:47.270
Now, how will we do that?

09:47.730 --> 09:55.620
So in the case of classification, we can find out either the probabilities or the hard class. If we want

09:55.620 --> 10:01.830
to find out the probabilities, then we can find out the probability by obtaining the proportion of the

10:01.830 --> 10:03.150
values at the terminal node.

10:03.450 --> 10:07.460
So we have these leaf nodes, and we can find out the probabilities.

10:08.070 --> 10:11.670
So here the probability of playing would be one. Here,

10:11.670 --> 10:14.850
the probability of playing would be zero. Here,

10:14.850 --> 10:18.600
the probability of not playing would be one. Here,

10:18.600 --> 10:21.300
the probability of not playing would be zero.

10:21.600 --> 10:25.320
So this is how we can find out the probability of the classes.

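NOTE
A small sketch of the idea above: the class probability at a leaf is the proportion of rows in that leaf carrying each target value. The function name class_probabilities and the mixed leaf are assumptions added for illustration.
```python
from collections import Counter
def class_probabilities(leaf_values):
    """Proportion of each target value among the rows in a leaf node."""
    counts = Counter(leaf_values)
    total = len(leaf_values)
    return {label: count / total for label, count in counts.items()}
pure_leaf = ["play", "play"]
mixed_leaf = ["play", "play", "play", "dont_play"]  # hypothetical impure leaf
print(class_probabilities(pure_leaf))
print(class_probabilities(mixed_leaf))
```
A value that never appears in the leaf is simply absent from the result, which corresponds to a probability of zero.
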
10:25.710 --> 10:28.710
Now, how can we find out the hard class?

10:28.830 --> 10:34.820
If you want to find out the actual hard class, what we can do is select on the basis of

10:34.820 --> 10:35.430
the majority.

10:36.000 --> 10:39.780
So here we have two votes for playing, zero votes for not playing.

10:39.780 --> 10:40.770
So we would play.

10:41.280 --> 10:44.790
Here we have zero votes for playing and three votes for not playing.

10:44.790 --> 10:47.070
So the majority is for don't play.

10:48.810 --> 10:56.180
The next option is applying a cutoff on the probability score. So when we are finding out the probability value,

10:56.310 --> 10:58.080
we will apply a cutoff.

10:58.230 --> 11:02.410
Let's say we can have a cutoff of, say, 60 percent.

11:02.580 --> 11:07.050
So if the probability is at least 0.6, then go out to play.

11:07.050 --> 11:10.880
If the probability is less than 0.6, then don't play.

11:11.010 --> 11:12.690
So we can do something like that.

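NOTE
The two ways of getting a hard class described above can be sketched as follows. The function names are hypothetical, and the 0.6 cutoff is the example value from the talk rather than a recommended setting.
```python
def hard_class_by_majority(counts):
    """Pick the target value with the most votes in the leaf node."""
    return max(counts, key=counts.get)
def hard_class_by_cutoff(p_play, cutoff=0.6):
    """Pick 'play' when its probability reaches the cutoff, otherwise 'dont_play'."""
    return "play" if p_play >= cutoff else "dont_play"
print(hard_class_by_majority({"play": 2, "dont_play": 0}))
print(hard_class_by_majority({"play": 0, "dont_play": 3}))
print(hard_class_by_cutoff(0.55))
```
The two approaches can disagree: a probability of 0.55 is a majority for play, but it falls below a 0.6 cutoff.
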
11:14.310 --> 11:21.540
Now, in the case of regression, what we do is make the decision, and it is

11:21.780 --> 11:26.550
dependent on the average of the target values at the terminal node or leaf.

11:27.120 --> 11:32.100
Now, in the case of a classification problem, we have values such as play

11:32.100 --> 11:35.130
or don't play present in the leaf.

11:37.670 --> 11:46.400
But in the case of a regression problem, in the leaf node we will have continuous values.

11:47.510 --> 11:54.710
We will have values of, let's say, 10 rows of data or 20 rows of data, which will have different

11:54.710 --> 11:59.680
interest rate values or different salary values or different height values.

11:59.960 --> 12:03.720
Now, based on all of those different values which are present here.

12:04.010 --> 12:11.720
We will be taking an average of all of those values and that average will actually be the result of

12:11.720 --> 12:12.890
this decision tree.

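NOTE
A minimal sketch of the regression case above: the prediction at a leaf is the average of the target values of the rows that ended up in it. The function name and the ten interest rate values are made up for illustration.
```python
from statistics import mean
def leaf_prediction(target_values):
    """Regression leaf: predict the average of the target values in the leaf."""
    return mean(target_values)
# Hypothetical leaf holding 10 rows of interest rates (in percent).
interest_rates = [7.5, 8.0, 8.5, 9.0, 7.0, 8.0, 9.5, 8.5, 7.5, 8.5]
print(leaf_prediction(interest_rates))
```
Every row that reaches this leaf gets the same predicted value, whatever its own features are.
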
12:17.480 --> 12:25.790
Now we need to create a decision tree and we will be creating this decision tree using a lot of splitting,

12:25.790 --> 12:28.020
which we will be doing. Now,

12:28.250 --> 12:36.650
we cannot keep doing this splitting endlessly, because if we keep on splitting the decision tree,

12:36.860 --> 12:40.920
then at the end we will have a very huge decision tree,

12:41.210 --> 12:51.320
with every row of data in a different leaf node. Each data row will convert into one

12:51.320 --> 12:59.030
leaf node and the tree will be very large in structure and also it will become very complex in nature.

12:59.540 --> 13:06.110
And the first guideline which we have for machine learning is: the simpler, the better.

13:06.740 --> 13:16.190
We will always decide on going towards a simpler model in comparison to a complex model because a complex

13:16.190 --> 13:18.310
model tends to overfit.

13:18.950 --> 13:26.030
This is the reason why we will choose a simpler model and hence we will have to stop splitting at one

13:26.030 --> 13:26.800
point of time.

13:28.260 --> 13:34.650
Now, when will we stop this splitting, and how will we decide when we should stop?

13:34.650 --> 13:42.180
That is based on a few criteria and a set of conditions which we decide in advance.

13:42.810 --> 13:44.430
So what are those conditions?

13:44.620 --> 13:51.080
We will have a look at them once we actually know how we will start creating the decision tree.

13:54.310 --> 13:58.510
Now, let us have a look at rule creation.

14:00.690 --> 14:08.220
So for creating rules for a numeric variable, we will split at random positions.

14:10.640 --> 14:17.490
So we will have to split at random positions. Let's say we have a numeric variable, age.

14:18.170 --> 14:21.720
Now this numeric variable, age, can have any value,

14:21.920 --> 14:24.320
say 5, 15, 24,

14:25.100 --> 14:27.480
29, 56, 100.

14:27.500 --> 14:29.300
Anything could be a value of age.

14:29.810 --> 14:35.900
Now there is no specific range or any specific interval on the basis of which we will split.

14:36.950 --> 14:43.150
Now here you can see the rules are like: is the humidity less than or equal to 70, or greater than 70?

14:43.430 --> 14:50.780
So any value can be converted into a rule, and the rule will be created randomly,

14:51.500 --> 14:52.850
so it is not fixed.

14:53.270 --> 14:59.630
It is not that we will have a rule at zero, then another rule at five, then the next rule at ten, then 15, then

14:59.630 --> 15:00.700
20, then 25.

15:01.100 --> 15:03.050
No, that is not a criterion.

15:03.200 --> 15:11.730
We can have one rule at age greater than 10, another rule at age less than 15, and the next at 16.

15:12.410 --> 15:20.210
So there is no specific criterion and no specific range or interval which is used in creating a rule

15:20.360 --> 15:22.700
out of the numeric variables.

15:23.300 --> 15:26.840
Any value can be used for generating this rule.

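NOTE
The talk only says that cut points for a numeric variable do not sit on a fixed grid. One common way to pick candidates, shown here as an assumption rather than as the method used in this course, is to take the midpoints between consecutive distinct observed values. The function names are hypothetical.
```python
def candidate_thresholds(values):
    """Midpoints between consecutive distinct values: one possible set of cut points."""
    distinct = sorted(set(values))
    return [(low + high) / 2 for low, high in zip(distinct, distinct[1:])]
def apply_rule(values, threshold):
    """Split values by the rule 'value <= threshold'."""
    left = [v for v in values if v <= threshold]
    right = [v for v in values if v > threshold]
    return left, right
ages = [5, 15, 24, 29, 56, 100]
thresholds = candidate_thresholds(ages)
print(thresholds)
print(apply_rule(ages, thresholds[1]))
```
The candidates depend on the data itself, so the gaps between them are uneven, which matches the point that rules are not created at fixed intervals.
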
15:28.780 --> 15:35.770
Now, let us create a rule for a categorical variable. We have seen it for a numeric variable, but the next rule

15:35.770 --> 15:38.650
will be generated on the categorical variable.

15:38.680 --> 15:39.850
How do we create the rule?

15:39.850 --> 15:45.890
On a categorical variable, the rule will be created by splitting on a value greater than 0.5.

15:46.210 --> 15:54.970
So let's say we have one categorical column as gender, another categorical column as city, which has values,

15:54.970 --> 15:58.250
let's say New York, Paris and Moscow.

15:58.720 --> 16:07.560
Now we can convert the categorical column into a numerical column by dummy variable creation or by one-hot encoding.

16:07.930 --> 16:11.680
So the gender column has been converted into the male column.

16:11.680 --> 16:18.250
Or we could have created a female column. And if the value is zero, then we are saying it is a female,

16:18.400 --> 16:21.510
and if the value is one, we are saying that it is a male.

16:22.920 --> 16:29.580
So the condition which we can evaluate here is: if the value of male is greater than 0.5,

16:30.450 --> 16:33.590
then it would mean that we are pointing towards male.

16:34.170 --> 16:37.320
Otherwise it would mean that we are pointing towards female.

16:39.410 --> 16:48.050
Similarly, we can create a rule for city, such as whether the New York column is greater than

16:48.050 --> 16:48.330
0.5.

16:48.560 --> 16:54.440
So now, again, the value will be compared with the city value. If it is one,

16:54.440 --> 16:56.980
it will mean that we are pointing towards New York.

16:57.290 --> 17:00.740
If it is Paris, then the Paris column will be one.

17:00.990 --> 17:03.470
So this is accordingly evaluated.

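NOTE
A plain-Python sketch of the categorical rule above: the city column is one-hot encoded into one 0/1 column per category, and the rule compares a column with 0.5. The one_hot helper is a hypothetical stand-in written for this example, not a library function.
```python
def one_hot(values):
    """Turn a categorical column into one 0/1 column per distinct category."""
    return {
        category: [1 if value == category else 0 for value in values]
        for category in sorted(set(values))
    }
cities = ["New York", "Paris", "Moscow", "New York"]
encoded = one_hot(cities)
# Rule on the encoded column: 'New York > 0.5' is true only where the city is New York.
is_new_york = [flag > 0.5 for flag in encoded["New York"]]
print(encoded)
print(is_new_york)
```
Because an encoded column only holds 0 or 1, any cut point strictly between the two, such as 0.5, separates the rows in the same way.
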
17:05.670 --> 17:13.440
Now, the next thing which we will be discussing is splitting. So we will discuss splitting,

17:13.440 --> 17:19.400
how we will do the splitting, and the different rule generation for the split, in the next session.
