WEBVTT

00:01.200 --> 00:06.900
Now, let us discuss the third project that you need to create, which is the unsupervised clustering

00:06.900 --> 00:07.340
project.

00:08.380 --> 00:13.840
In this particular project, you will be finding out different clusters of data.

00:14.830 --> 00:19.580
So the data which we have is the market segmentation data.

00:19.990 --> 00:25.210
So in this particular case, you can use any kind of clustering algorithm.

00:25.210 --> 00:29.890
You can use k-means, you can use DBSCAN, or you can use hierarchical clustering.
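
NOTE
A minimal sketch of the three algorithms named above, assuming scikit-learn is available.
The data here is synthetic (make_blobs), not the course data set, and n_clusters, eps and
min_samples are illustrative guesses that would need tuning on real data.
```python
from sklearn.cluster import DBSCAN, AgglomerativeClustering, KMeans
from sklearn.datasets import make_blobs
from sklearn.preprocessing import StandardScaler
# Synthetic stand-in for the course data set (an assumption, not the real data).
X, _ = make_blobs(n_samples=300, centers=3, cluster_std=0.6, random_state=42)
# Scale features so distance-based algorithms treat every column equally.
X = StandardScaler().fit_transform(X)
# Hyperparameters below are illustrative guesses, not recommended values.
models = {
    "k-means": KMeans(n_clusters=3, n_init=10, random_state=42),
    "DBSCAN": DBSCAN(eps=0.3, min_samples=5),
    "hierarchical": AgglomerativeClustering(n_clusters=3),
}
labels = {name: model.fit_predict(X) for name, model in models.items()}
for name, lab in labels.items():
    # DBSCAN marks noise points with the label -1, so exclude it from the count.
    n_found = len(set(lab) - {-1})
    print(f"{name}: {n_found} clusters")
```
Note that DBSCAN chooses the number of clusters itself, while the other two need it up front.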

00:30.890 --> 00:37.520
In this, the problem statement is that interacting with strangers on social networking sites has

00:37.520 --> 00:44.690
become a rite of passage for teenagers around the world, and many millions of teenage customers use

00:44.690 --> 00:52.610
these sites and have attracted the attention of marketers struggling to find an edge in an increasingly competitive

00:52.610 --> 00:53.080
market.

00:53.270 --> 00:59.070
So they are trying to find out what is of interest for these students.

00:59.090 --> 01:05.240
They are finding out segments of students in the market and what they would like.

01:06.050 --> 01:12.200
So for this, we have this particular data set where we have the graduation year of the student, the

01:12.200 --> 01:19.430
gender of the student, age, the number of friends they have on SNS, and their likings like boyfriends,

01:19.430 --> 01:23.980
basketball, football, soccer, softball, volleyball and so on.

01:24.260 --> 01:27.530
So there are a lot of interests which these students have.

01:27.800 --> 01:35.780
And based on these, these people are trying to segment these students into different market customer

01:35.780 --> 01:36.300
segments.

01:36.890 --> 01:39.060
So this is what you have to implement.

01:39.380 --> 01:46.970
Now, the first thing which you can do is you can find out the number of clusters by implementing k-means

01:46.970 --> 01:58.430
or hierarchical clustering or DBSCAN, or you can apply t-SNE and visualize this data on two components using

01:58.430 --> 02:02.030
t-SNE and find out how many clusters are formed.
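
NOTE
A rough sketch of projecting data onto two t-SNE components, assuming scikit-learn.
The input is synthetic 10-feature data rather than the course data set, and the
perplexity value is only a common starting point, not a tuned choice.
```python
from sklearn.datasets import make_blobs
from sklearn.manifold import TSNE
from sklearn.preprocessing import StandardScaler
# Synthetic stand-in: 200 rows with 10 numeric features (an assumption).
X, _ = make_blobs(n_samples=200, n_features=10, centers=4, random_state=0)
X_scaled = StandardScaler().fit_transform(X)
# Reduce to two components so the points can be drawn on a 2-D scatter plot.
tsne = TSNE(n_components=2, perplexity=30, init="pca", random_state=0)
embedding = tsne.fit_transform(X_scaled)
# Each row of `embedding` holds the (x, y) position of one sample.
print(embedding.shape)
```
Plotting embedding[:, 0] against embedding[:, 1] then shows roughly how many groups appear.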

02:03.070 --> 02:10.210
And based on the number of clusters that you get from this t-SNE plot, you can actually decide on the number

02:10.210 --> 02:16.620
of clusters, or you can simply use the elbow method in the case of k-means clustering.
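
NOTE
A small sketch of the elbow method for k-means, assuming scikit-learn and synthetic data.
It records the inertia (within-cluster sum of squares) for each k; the range 1 to 8 is
an arbitrary choice made for illustration.
```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
# Synthetic stand-in for the course data set (an assumption).
X, _ = make_blobs(n_samples=300, centers=4, cluster_std=0.7, random_state=1)
inertias = {}
for k in range(1, 9):
    km = KMeans(n_clusters=k, n_init=10, random_state=1).fit(X)
    # inertia_ is the sum of squared distances of samples to their nearest centre.
    inertias[k] = km.inertia_
for k, value in inertias.items():
    print(k, round(value, 1))
```
The "elbow" is the k after which the printed inertia stops dropping sharply.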

02:17.350 --> 02:21.450
Now, the choice of number of clusters completely depends on you.

02:21.700 --> 02:29.430
So you need to find out the best clusters formed, and you will be using the silhouette score for that.
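
NOTE
One possible way to pick a cluster count with the silhouette score, assuming scikit-learn.
The data is synthetic and the candidate range 2 to 7 is illustrative; the same loop could
score DBSCAN or hierarchical clustering labels instead.
```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import silhouette_score
# Synthetic stand-in for the course data set (an assumption).
X, _ = make_blobs(n_samples=300, centers=4, cluster_std=0.7, random_state=1)
scores = {}
for k in range(2, 8):
    labels = KMeans(n_clusters=k, n_init=10, random_state=1).fit_predict(X)
    # Silhouette ranges from -1 to 1; higher means tighter, better separated clusters.
    scores[k] = silhouette_score(X, labels)
# Keep the cluster count with the highest silhouette score.
best_k = max(scores, key=scores.get)
print(best_k, round(scores[best_k], 3))
```
The silhouette score needs at least two clusters, which is why the loop starts at k = 2.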

02:29.680 --> 02:36.790
So you can use such an index to find out how well the clusters are formed and compare different models

02:36.790 --> 02:40.240
by comparing different implementations, and

02:41.390 --> 02:48.560
create the project. So I hope you have learned a lot from these unsupervised learning sessions

02:48.740 --> 02:55.340
and you will be able to create this project very easily as the implementations for unsupervised learning

02:55.340 --> 02:57.800
are very simple and straightforward.

02:58.130 --> 02:59.720
So thank you.

02:59.760 --> 03:07.370
And you can use these data sets and all the data sets which have been provided in different ways and use

03:07.370 --> 03:14.530
them for either classification, regression, or unsupervised learning, and practice a lot.

03:14.840 --> 03:22.580
And once you have been practicing using these datasets, there are a lot of data sets present online

03:22.580 --> 03:26.060
on Kaggle or any other data repositories.

03:26.210 --> 03:29.420
So all of these data sets are present online.

03:29.450 --> 03:35.150
So I hope you will try those and practice a lot using these.

03:36.220 --> 03:45.100
So as an ending note, I would like to say that this entire course was curated for you so that you can

03:45.100 --> 03:49.540
learn machine learning from the very beginning to the very end.

03:50.200 --> 03:55.930
You don't have to get back again to go through these documents again and again.

03:56.170 --> 03:58.080
I want you to learn all of this.

03:58.090 --> 04:04.570
and move forward, learning about new topics instead of going back again.

04:04.580 --> 04:12.190
That is why I have explained everything here from the deep mathematics behind all the algorithms to

04:12.190 --> 04:13.150
the statistics.

04:13.310 --> 04:19.150
So I hope you have gained a lot from this particular session and this course.

04:19.960 --> 04:20.710
Thank you.

04:20.960 --> 04:23.520
Have a great time learning.

04:23.890 --> 04:24.460
Thank you.
