WEBVTT

00:01.220 --> 00:08.630
Hi. In this session, we will be working with a neural network implementation, so let us begin with

00:08.630 --> 00:09.130
the code.

00:09.500 --> 00:15.390
So here we have the import statement.

00:15.480 --> 00:21.800
So first of all, we will import pandas as pd; we can also import numpy as np.

00:26.680 --> 00:33.430
And after doing the import, we will work on the details of this particular dataset. The one which I have selected

00:33.730 --> 00:43.210
is a dataset of the spine, and this will basically help us find out the details of who should be having

00:43.210 --> 00:46.860
lower back pain and who should not be having lower back pain.

00:47.140 --> 00:49.790
So it has data related to the same.

00:50.020 --> 00:56.440
So we will import the data and read it by using the read_csv function.

00:56.680 --> 01:02.200
And data_spine is the data frame which has been created.
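
NOTE A minimal sketch of the loading step, assuming pandas. The file name is not given in the session, so a made-up inline CSV stands in for it; the column names and values are hypothetical.
```python
import io
import pandas as pd
# Hypothetical stand-in for the spine CSV file: two feature columns and a class column.
csv_text = "Col1,Col2,Class_att\n63.0,22.5,Abnormal\n39.1,10.1,Normal\n"
# read_csv accepts a path or any file-like object, so StringIO works here.
data_spine = pd.read_csv(io.StringIO(csv_text))
print(data_spine.shape)
```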

01:02.470 --> 01:04.930
Now let us have a look at the data set.

01:05.080 --> 01:10.070
So this data set has three hundred and ten rows and 14 columns.

01:10.300 --> 01:18.340
Now we don't really know the labels of the columns. It has around 12 feature columns.

01:18.520 --> 01:26.320
Apart from that, it has a class attribute, which shows whether the value is abnormal or normal, and

01:26.320 --> 01:32.580
there is a last column which is unnamed, which does not have much detail in it.

01:32.830 --> 01:37.520
And it is saying that the prediction is done by using binary classification.

01:37.780 --> 01:39.550
So now let us go further.

01:41.400 --> 01:49.440
So from this spine dataset, we will remove column one and the unnamed column 13.

01:49.770 --> 01:55.110
So column one and the unnamed column 13 will be removed.
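
NOTE A small sketch of the column removal, assuming pandas. The frame is made up, and the names "Col1" and "Unnamed: 13" are assumptions about how the two removed columns are labelled.
```python
import pandas as pd
# Made-up frame; real column names may differ from these assumed ones.
data_spine = pd.DataFrame({"Col1": [63.0, 39.1], "Col2": [22.5, 10.1], "Class_att": ["Abnormal", "Normal"], "Unnamed: 13": [None, None]})
# drop returns a new frame and leaves data_spine untouched.
df_spine = data_spine.drop(columns=["Col1", "Unnamed: 13"])
print(list(df_spine.columns))
```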

01:55.770 --> 02:00.230
Now we have the data frame df_spine.

02:00.480 --> 02:09.150
So this will have all the details from column two to column number 12, along with the class attribute,

02:09.390 --> 02:12.990
which is the class label, actually, which we want to predict.

02:13.230 --> 02:18.400
So here we want to predict if the spine is normal or abnormal.

02:19.230 --> 02:24.750
So the first thing that we will have to do is we will have to convert this into a dummy variable where

02:24.870 --> 02:28.470
we will change normal and abnormal to zero and one.

02:29.490 --> 02:34.090
So all the data which we have is numeric in nature.

02:34.320 --> 02:42.630
So all we will have to identify here is whether any specific correlation is present in this particular

02:42.630 --> 02:43.200
dataset.

02:43.470 --> 02:49.000
So for that, again, we have already stated which methods you can apply.

02:49.020 --> 02:50.970
You can use pandas profiling.

02:51.180 --> 02:55.330
You can use a VIF calculator, you can use a correlation matrix.

02:55.330 --> 03:03.720
So any of those could be used. If you want, you can also use feature importance from random

03:03.720 --> 03:04.380
forest.

03:04.500 --> 03:08.550
So there are well established methods which we have discussed.

03:08.700 --> 03:12.090
So you can use any of those methods which you want.

03:13.750 --> 03:18.110
So now we have this data set and now it has 12 columns.

03:18.430 --> 03:20.970
Now these are the headers which we have.

03:21.160 --> 03:24.810
So we already have the headers for these data.

03:24.910 --> 03:29.910
So we will apply the headers by setting df_spine.columns.

03:30.460 --> 03:35.140
So we will assign the stored headers, and then we get the header values.

03:36.280 --> 03:45.850
So it has pelvic incidence, pelvic tilt, then different terms related to biology are present.

03:45.870 --> 03:48.370
And finally, the class which we want to predict.

03:49.380 --> 03:55.710
The next thing which we will be doing is we will be checking if there is any data which is not present

03:55.710 --> 03:56.670
or is null.

03:56.910 --> 04:00.950
So we are taking the data frame,

04:00.960 --> 04:01.370
df_spine.

04:01.800 --> 04:06.630
We are checking if there are any rows where df_spine is null.

04:06.870 --> 04:13.350
And we are checking if there is any of these with axis equal to one, that is, any entry which has

04:13.620 --> 04:17.210
a null value in it, and we are getting the output.

04:17.630 --> 04:20.610
So there are no null values in this particular dataset.
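
NOTE A minimal sketch of a null check with isnull and any(axis=1), assuming pandas. The frame is made up and deliberately contains one missing value so the filter has something to return; on the session's data the result is empty.
```python
import numpy as np
import pandas as pd
# Made-up frame with one missing value in row 1.
df_spine = pd.DataFrame({"pelvic_tilt": [22.5, np.nan, 10.1], "class": ["Abnormal", "Normal", "Normal"]})
# isnull() marks missing cells; any(axis=1) flags each row that has at least one.
rows_with_null = df_spine[df_spine.isnull().any(axis=1)]
print(len(rows_with_null))
```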

04:21.850 --> 04:29.110
So now we will check the data types. You can easily see that all the data is float, so we need not

04:29.110 --> 04:30.320
do anything about it.

04:30.550 --> 04:35.080
The only thing which we need to take care of is the class.

04:35.470 --> 04:44.530
So we will import preprocessing and apply preprocessing.LabelEncoder and then transform the class

04:44.830 --> 04:45.730
column here.

04:46.750 --> 04:54.740
So we can see ten samples; we are calling sample and getting the values out of it.

04:55.000 --> 05:00.890
So this is one function which will allow us to take the values in a sample form.

05:01.150 --> 05:09.190
So here you can see the classes have been created, and you can see the classes which have been generated

05:09.190 --> 05:09.510
here.

05:10.150 --> 05:12.930
So the class values have changed to one and zero.
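
NOTE A minimal sketch of the label encoding step, assuming scikit-learn's preprocessing.LabelEncoder. The four-row frame is made up. LabelEncoder numbers the classes in sorted order, so here "Abnormal" maps to 0 and "Normal" maps to 1.
```python
import pandas as pd
from sklearn import preprocessing
# Made-up class column with the two labels mentioned in the session.
df_spine = pd.DataFrame({"class": ["Abnormal", "Normal", "Abnormal", "Normal"]})
encoder = preprocessing.LabelEncoder()
# fit_transform learns the sorted label set and replaces each label by its index.
df_spine["class"] = encoder.fit_transform(df_spine["class"])
print(list(encoder.classes_), df_spine["class"].tolist())
```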

05:14.800 --> 05:22.030
Now, let us find out the correlation values. We can simply find out the correlation using .corr

05:22.300 --> 05:23.330
on the data frame.

05:23.590 --> 05:29.250
So these are the values for correlation and we can find out the correlation values.

05:29.590 --> 05:33.530
So let us plot these correlation values.

05:33.560 --> 05:40.350
So here you can see there are no light colored correlation values, which is fine.

05:40.360 --> 05:43.230
So we don't have any highly correlated values.

05:44.350 --> 05:49.300
If we look at it like that, here we can see there are certain

05:50.560 --> 05:56.120
values which are going towards minus 0.4, which is, again, not a critical value.

05:56.260 --> 05:58.470
So we are good to go here.

05:58.480 --> 06:06.010
We don't have any highly correlated columns present, so we can go ahead without deleting any particular

06:06.010 --> 06:06.630
columns.
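
NOTE A small sketch of a correlation check with DataFrame.corr, assuming pandas and NumPy. The data is random, and the 0.8 cut-off for "highly correlated" is a hypothetical choice, not a value from the session.
```python
import numpy as np
import pandas as pd
rng = np.random.default_rng(0)
# Random stand-in data with three made-up feature columns.
df = pd.DataFrame(rng.normal(size=(50, 3)), columns=["a", "b", "c"])
corr = df.corr()  # pairwise Pearson correlation matrix
threshold = 0.8  # hypothetical cut-off for "highly correlated"
# List each pair once (r < c) whose absolute correlation exceeds the cut-off.
high_pairs = [(r, c) for r in corr.index for c in corr.columns if r < c and abs(corr.loc[r, c]) > threshold]
print(corr.round(2))
print(high_pairs)
```
Pairs that show up in high_pairs would be candidates for dropping one of the two columns.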

06:07.060 --> 06:10.720
So we are importing train_test_split now.

06:11.050 --> 06:15.100
So we have to get the X and Y data frames separated.

06:15.340 --> 06:22.260
So the X data frame will be obtained by dropping the class column from df_spine, and in Y

06:22.270 --> 06:24.760
we will keep only the class column.

06:26.460 --> 06:34.950
Now we will check the data again, so we have all the columns except for the last column, so let us

06:34.950 --> 06:40.830
check whether it is fine. You can see that the class column is missing from here.

06:40.840 --> 06:42.210
So it is perfectly fine.

06:42.480 --> 06:49.290
And in Y, the class column should be the only one present, which is rightly shown here.

06:49.680 --> 06:51.350
Now we will split the data.

06:51.360 --> 06:54.000
So we split the data using train_test_split.

06:54.210 --> 06:59.300
So we provide the X and Y data frames and the test size which we want.

06:59.460 --> 07:06.480
We want the test size to be 0.2, that is 20 percent, because we don't have enough

07:06.480 --> 07:07.430
rows of data.

07:07.620 --> 07:12.540
So we would like to have a majority share of our data for training.

07:12.900 --> 07:20.080
Now, one thing to note here is, let's say we have a very large amount of data.

07:20.430 --> 07:22.740
Let's say I have

07:23.690 --> 07:32.170
one hundred thousand rows of data; then I could have kept even a very low share of the data for testing,

07:32.400 --> 07:39.440
maybe 10 percent or maybe five percent, because I want to train my model well.

07:39.480 --> 07:40.670
So that is the target.

07:40.680 --> 07:46.100
If the data size is very huge, we can reduce the size of testing data.

07:46.320 --> 07:51.120
And again, if the data size is very small,

07:52.460 --> 07:59.360
then again, we would like to reduce the size of our testing data and keep more of the data for training

07:59.370 --> 08:07.640
purposes. If the size is medium for the dataset, you can keep a ratio of around 70, 30 or seventy

08:07.640 --> 08:07.970
five.

08:07.970 --> 08:08.560
Twenty five.

08:08.570 --> 08:10.050
That is completely up to you.

08:10.280 --> 08:18.850
So just make sure that you have enough data points to train, and enough data points to test it

08:18.860 --> 08:19.340
also.

08:21.480 --> 08:29.900
So we split the data, and after splitting, we get X_train, X_test, y_train and y_test.
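
NOTE A minimal sketch of the 80/20 split with scikit-learn's train_test_split. The arrays are made up to match the 310 rows mentioned earlier, and random_state=0 is an assumption added only to make the split repeatable.
```python
import numpy as np
from sklearn.model_selection import train_test_split
# Made-up data: 310 rows, two feature columns, alternating 0/1 labels.
X = np.arange(620).reshape(310, 2)
y = np.arange(310) % 2
# test_size=0.2 holds out roughly 20 percent of the rows for testing.
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)
print(X_train.shape, X_test.shape)
```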

08:31.020 --> 08:39.450
So now we will import the multilayer perceptron. Now this multilayer perceptron is nothing but the

08:39.450 --> 08:41.520
neural network which we will be using.

08:42.270 --> 08:49.420
It is given the name MLP because it is simply a multilayered neural network.

08:49.440 --> 08:54.470
So it is just another name which is present for a multilayered neural network, which is multilayer

08:54.480 --> 08:55.350
perceptron.

08:55.740 --> 09:01.570
So this perceptron is a type of neural network within which all nodes are connected with each other.

09:01.770 --> 09:06.800
That is why we are considering this as a multilayered Perceptron.

09:06.900 --> 09:15.410
If we were supposed to make some changes, that would be part of a more advanced topic in neural networks,

09:15.570 --> 09:22.230
so we can take that up a little later; those could be studied later, but that is out of the scope of

09:22.230 --> 09:22.890
this course.

09:23.070 --> 09:30.000
So for now, what you can understand is that MLP, that is multilayer perceptron, is nothing but

09:30.540 --> 09:33.060
a fully connected neural network.

09:35.250 --> 09:44.040
So from sklearn.metrics, we are importing the metrics, and from model_selection, we are picking up RandomizedSearchCV

09:44.040 --> 09:46.330
because I just want to try

09:46.350 --> 09:47.950
only a few combinations.

09:47.970 --> 09:48.150
You.

09:50.480 --> 09:58.790
So I'm giving the parameters. The first parameter is the learning rate, if I want to have different variants in my

09:58.790 --> 09:59.490
learning rate.

09:59.840 --> 10:07.730
So these are: if I want the learning rate to be constant, if I want my learning rate to be changing,

10:07.970 --> 10:14.000
and if I want my learning rate to be actually adaptive to how my learning is going.

10:14.000 --> 10:15.510
So I choose adaptive for that.

10:15.920 --> 10:18.340
These are the hidden layer sizes.

10:18.560 --> 10:22.070
What this means is this is a tuple which I'm giving it.

10:23.170 --> 10:31.180
So this means that the first hidden layer will have five nodes, the next hidden layer will have ten

10:31.180 --> 10:34.330
nodes and the next hidden layer will have five.

10:36.190 --> 10:40.490
So what will this look like? Let me show this to you.

10:41.080 --> 10:43.270
So it will look something like this.

10:43.360 --> 10:44.800
Apologies for the drawing.

10:45.350 --> 10:55.390
It will have 12 input points, one output point, then five nodes in layer one,

10:55.870 --> 11:00.390
ten nodes in layer two and five nodes in layer three.

11:00.550 --> 11:05.110
And all of these points will be connected to these points.

11:05.470 --> 11:11.110
All of the layer one points will be connected with the layer two points.

11:11.410 --> 11:16.300
All the layer two points will be connected with the layer three points and so on.
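
NOTE A short worked count of the connections in the fully connected network drawn here, in plain Python. It assumes the layer sizes 12, 5, 10, 5 and 1; the middle value of ten is a reading of the spoken description, not a confirmed figure.
```python
# Assumed layer sizes: 12 inputs, hidden layers of 5, 10 and 5 nodes, 1 output.
layer_sizes = [12, 5, 10, 5, 1]
# Fully connected: every node in one layer links to every node in the next layer.
weights = sum(a * b for a, b in zip(layer_sizes, layer_sizes[1:]))
# One bias per node in every layer after the input layer.
biases = sum(layer_sizes[1:])
print(weights, biases, weights + biases)
```
Under these assumptions the count is 60 + 50 + 50 + 5 = 165 weights plus 21 biases.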

11:18.090 --> 11:20.800
Now, let us go further with the training.

11:20.820 --> 11:25.650
So these are different features, so we have four values.

11:25.650 --> 11:28.320
So these are different values which we will be having.

11:28.890 --> 11:36.330
These are the activation functions which we are having, logistic and relu, and then we are just creating the

11:36.330 --> 11:43.560
object of the multilayer perceptron, and creating an object of RandomizedSearchCV in which we have provided

11:43.560 --> 11:50.430
the model name, the number of models we want to select out of these, then the cross-validation which

11:50.430 --> 11:51.390
we want to have,

11:52.890 --> 11:58.950
and the scoring method which we want to use. Then we have run the random search, so it

11:58.950 --> 12:03.320
is fitting five candidates out of all the combinations that are present here.
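
NOTE A hedged sketch of a randomized search over MLPClassifier settings with scikit-learn. The data is synthetic, and the candidate values, cv=3, max_iter=300 and random_state=0 are assumptions for illustration; only the parameter names and the five candidates follow the session.
```python
import warnings
from sklearn.datasets import make_classification
from sklearn.model_selection import RandomizedSearchCV
from sklearn.neural_network import MLPClassifier
warnings.filterwarnings("ignore")  # a small max_iter can trigger convergence warnings
# Synthetic stand-in for the spine data: 200 rows, 12 features, two classes.
X, y = make_classification(n_samples=200, n_features=12, random_state=0)
# Hypothetical search space; each key is a real MLPClassifier parameter.
param_distributions = {
    "learning_rate": ["constant", "invscaling", "adaptive"],
    "hidden_layer_sizes": [(5, 10, 5), (10,), (20, 10)],
    "alpha": [0.0001, 0.001, 0.01, 0.1],
    "activation": ["logistic", "relu"],
}
# n_iter=5 samples five combinations instead of trying the full grid.
search = RandomizedSearchCV(MLPClassifier(max_iter=300, random_state=0), param_distributions, n_iter=5, cv=3, scoring="accuracy", random_state=0)
search.fit(X, y)
print(search.best_params_, round(search.best_score_, 3))
```
One design point worth knowing: in scikit-learn the learning_rate schedule only takes effect when solver="sgd", so with the default solver those three options behave the same.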

12:03.670 --> 12:10.400
You can run a grid search also so that you get an extensive result, a better result out of this.

12:10.920 --> 12:14.390
So here you can see these are the results which I have got.

12:14.400 --> 12:15.980
Let me get the best estimator.

12:16.440 --> 12:24.360
So my best estimator has the activation value logistic, alpha value 0.1, batch size auto, beta_1

12:24.360 --> 12:25.450
0.9.

12:25.740 --> 12:27.110
And these are different values.

12:27.360 --> 12:30.860
So for this, I will simply read the report.

12:30.870 --> 12:37.740
So the best model has an accuracy level of 0.876.

12:39.130 --> 12:50.050
And for this one, the learning rate is adaptive, the hidden layer sizes are as listed, and the alpha value is 0.1,

12:50.050 --> 12:52.390
and the activation function is logistic.

12:54.620 --> 13:02.780
So you can see that this is the best performing model, and the worst one is at 0.69

13:02.780 --> 13:10.220
accuracy. So now I will simply fit the model again with these details and I will make the predictions

13:10.220 --> 13:13.700
and show you the predictions in the form of a crosstab.

13:14.820 --> 13:20.550
So this is the crosstab matrix for this, so you can see.

13:21.770 --> 13:30.260
These are the confusion matrix details, so you can see the predicted values against the actual values.

13:31.820 --> 13:36.090
And these are the different values which have been obtained.

13:36.440 --> 13:40.000
Now let us have a look at the classification report.

13:40.030 --> 13:48.650
So here the precision is zero point nine, recall is zero point seven three, zero point seven

13:48.650 --> 13:49.030
five.

13:49.490 --> 13:53.280
And the F1 score is zero point eight four, zero point six zero.

13:53.600 --> 14:00.170
Now, as you can see, the precision value for zero is zero point nine zero and for one it is zero point five

14:00.170 --> 14:00.470
five.

14:00.480 --> 14:05.380
So it is biased towards class zero.
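
NOTE A minimal sketch of the crosstab and classification report, assuming pandas and scikit-learn. The eight actual and predicted labels are made up and do not reproduce the numbers quoted in the session.
```python
import pandas as pd
from sklearn.metrics import classification_report
# Made-up labels: 0 and 1 stand for the two encoded classes.
y_actual = pd.Series([0, 0, 0, 1, 1, 1, 1, 0], name="Actual")
y_predicted = pd.Series([0, 0, 1, 1, 1, 0, 1, 0], name="Predicted")
# crosstab counts each (actual, predicted) pair, which is the confusion matrix.
confusion = pd.crosstab(y_actual, y_predicted)
print(confusion)
# The report lists precision, recall and F1 score per class.
print(classification_report(y_actual, y_predicted))
```
Comparing the per-class rows of the report is how a bias towards one class, as described here, becomes visible.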

14:05.390 --> 14:11.620
So you can change the model and train it again so that you can get a better result.

14:11.720 --> 14:17.240
So for that, probably you can use GridSearchCV so that you obtain better results from it.

14:17.630 --> 14:25.610
So this is just an implementation which is not really fine-tuned, and in the case of neural networks,

14:26.640 --> 14:34.090
You will have to try different combinations of different layers and see how it works out for you.

14:34.380 --> 14:37.280
So that is the process which we have for neural networks.

14:37.290 --> 14:42.630
You will have to try different combinations and find out the one which works the best.

14:44.450 --> 14:44.960
Thank you.