WEBVTT

00:01.230 --> 00:08.180
In the last session, we discussed k-means clustering, and we had already implemented agglomerative

00:08.200 --> 00:10.280
clustering using the Iris dataset.

00:10.650 --> 00:18.510
So let us continue and use the same dataset to implement k-means clustering and see

00:18.510 --> 00:19.680
how it is different.

00:20.310 --> 00:28.530
So for this implementation, we will import KMeans from sklearn.cluster. After this,

00:28.650 --> 00:36.110
we have imported the Iris dataset and we have already dropped the unwanted columns,

00:36.120 --> 00:39.480
that is, the ID column

00:39.480 --> 00:40.710
and the species.

00:42.480 --> 00:51.620
After that, we have scaled the data and plotted the dendrogram and linkage for the same. For the details,

00:51.630 --> 00:58.700
you can view the previous implementation of hierarchical clustering. Later on, we looked at the clusters,

00:58.710 --> 01:04.260
and we found out that the best number of clusters came out to be

01:10.550 --> 01:16.820
three, so we used three for that particular clustering algorithm.

01:17.360 --> 01:24.410
Next, we fit the clustering model, so we find the clusters for n_clusters equal

01:24.410 --> 01:24.980
to three.

01:25.310 --> 01:30.520
And then we printed the same and created a plot for the same.

01:30.890 --> 01:33.410
And we got this particular plot.

01:35.140 --> 01:41.990
Now, for finding the k-means clusters, we have imported KMeans from sklearn.cluster.

01:42.700 --> 01:49.930
Now we are again testing for n_clusters ranging from two to 20, and we are running a loop on k,

01:49.930 --> 01:52.300
which takes values between two and 20.

01:52.570 --> 01:58.750
And we are running k-means for each of these values. For running k-means, we only provide the number

01:58.750 --> 02:02.070
of clusters and the data. In this data,

02:02.200 --> 02:07.800
I have provided the sepal length and sepal width. For an actual implementation,

02:07.810 --> 02:16.270
you will use all the columns of the dataset, because that will provide better clusters and it will be

02:16.270 --> 02:17.270
more helpful.

02:17.770 --> 02:23.830
This is just for visualization purposes, and I wanted to make this a little simpler for you to

02:23.830 --> 02:24.220
understand.

02:24.550 --> 02:27.130
So I have used only these two columns.

02:27.640 --> 02:35.470
So we will apply fit, and after applying KMeans fit, I will print the silhouette scores for

02:35.470 --> 02:35.760
all.
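
NOTE
Editor's sketch of the loop described above: fit k-means for each k and print
the silhouette score. Assumptions, not confirmed by the session: Iris is loaded
from scikit-learn's load_iris, the two sepal columns are scaled with
StandardScaler, the range two to 20 is inclusive, and n_init and random_state
are fixed only to make runs repeatable. The names X, scores and best_k are
illustrative.
```python
from sklearn.cluster import KMeans
from sklearn.datasets import load_iris
from sklearn.metrics import silhouette_score
from sklearn.preprocessing import StandardScaler
# Assumption: load Iris from scikit-learn rather than from the session's file.
iris = load_iris(as_frame=True).frame
# Keep only the two sepal columns, as in the session, and scale them.
X = StandardScaler().fit_transform(iris[["sepal length (cm)", "sepal width (cm)"]])
scores = {}
for k in range(2, 21):  # k = 2, 3, ..., 20
    model = KMeans(n_clusters=k, n_init=10, random_state=0).fit(X)
    # Mean silhouette over all samples; higher means better-separated clusters.
    scores[k] = silhouette_score(X, model.labels_)
for k, score in scores.items():
    print(k, round(score, 3))
# The k with the highest silhouette score on these two columns.
best_k = max(scores, key=scores.get)
print("best k:", best_k)
```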

02:36.220 --> 02:44.140
And here you can see the silhouette scores came out to be these values, out of which the best silhouette score

02:44.140 --> 02:47.170
is for two clusters, which is zero point four.

02:47.620 --> 02:55.090
But it also shows the second best to be the silhouette score for three, which is forty three point four seven.

02:55.810 --> 03:01.000
This is because we have not considered all the columns.

03:01.000 --> 03:08.140
We have considered only two columns, which is why it is showing two clusters as the best

03:08.140 --> 03:09.160
number of clusters.

03:09.550 --> 03:14.500
So let us create the clusters with the same. Here, label

03:14.650 --> 03:21.640
one is the label which has been generated using the agglomerative clustering, and label two is the one which is generated

03:21.640 --> 03:23.080
from the k-means clustering.

03:23.590 --> 03:27.420
I have simply taken the values which were predicted,

03:27.430 --> 03:33.970
the labels which were predicted using the k-means clustering, in label two, and added it in the

03:33.970 --> 03:35.020
Iris dataset.
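
NOTE
Editor's sketch of keeping both sets of labels next to the data. Assumptions,
not confirmed by the session: both models are fit on the two scaled sepal
columns, agglomerative clustering uses three clusters and k-means uses two, and
the column names label_1 and label_2 are hypothetical stand-ins for the label
columns mentioned above.
```python
from sklearn.cluster import AgglomerativeClustering, KMeans
from sklearn.datasets import load_iris
from sklearn.preprocessing import StandardScaler
# Assumption: load Iris from scikit-learn rather than from the session's file.
iris = load_iris(as_frame=True).frame
df = iris[["sepal length (cm)", "sepal width (cm)"]].copy()
X = StandardScaler().fit_transform(df)
# label_1: agglomerative clustering with three clusters.
df["label_1"] = AgglomerativeClustering(n_clusters=3).fit_predict(X)
# label_2: k-means with two clusters.
df["label_2"] = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X)
print(df.head())
```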

03:36.870 --> 03:46.850
Here I have plotted the data, the plot for sepal width versus sepal length. And here,

03:46.860 --> 03:54.420
you can clearly see why two clusters are better in comparison to three clusters in this particular

03:54.420 --> 04:03.060
scenario, where we have considered only two columns. And the same thing is depicted here in the case of the clusters

04:03.060 --> 04:04.470
created by agglomerative clustering.