WEBVTT

00:01.570 --> 00:05.050
Now let us talk about descriptive statistics.

00:07.790 --> 00:17.000
A descriptive statistic is a summary statistic that quantitatively describes or summarizes the data.

00:17.870 --> 00:25.130
So whenever we want to quantitatively summarize the data, we use descriptive statistics.

00:25.370 --> 00:28.130
We are not trying to infer anything from the data.

00:28.130 --> 00:30.230
We just want to summarize the data.

00:30.500 --> 00:33.080
This is all descriptive statistics.

00:34.630 --> 00:41.110
Now, descriptive statistics is also the process of using and analyzing those statistics.

00:42.120 --> 00:48.210
So in descriptive statistics, in the process of using and analyzing those statistics, we are just

00:48.210 --> 00:54.990
trying to find out a summary of the data which we have, and we are trying to find those

00:55.230 --> 00:57.360
and analyze these values.

00:57.540 --> 00:59.610
We are not trying to do much out of it.

00:59.790 --> 01:01.670
This is called descriptive statistics.

01:01.680 --> 01:04.410
We just want to find out what type of data we have.

01:08.480 --> 01:10.190
Measures of central tendency.

01:11.380 --> 01:19.030
Mean is the calculated central value of a set of numbers. To calculate the mean,

01:19.330 --> 01:26.350
we add all the numbers and then divide the sum by the count of the numbers which we have added.

01:27.830 --> 01:30.920
The next measure of central tendency is the median.

01:31.700 --> 01:41.270
Median is the middle number in a list of numbers sorted in ascending or descending order.

01:42.450 --> 01:46.740
The median is sometimes used instead of the mean

01:46.980 --> 01:52.770
when there are outliers in the sequence that might skew the average of the values.

01:54.220 --> 02:00.070
Mode is the value in the set of values that appears most often.

02:01.270 --> 02:03.790
Let us see what these are in-depth.

02:06.570 --> 02:09.360
So let us consider this odd number of values.

02:09.990 --> 02:18.390
These are sorted and we have values as one, two, three, four, five, five, six, seven, eight,

02:18.390 --> 02:19.290
nine and 12.

02:19.800 --> 02:21.870
These are an odd number of values in total.

02:22.140 --> 02:26.250
The minimum value is one, while the maximum value is 12.

02:26.250 --> 02:35.580
And the range will be the maximum value minus the minimum value, which is 12 - 1 = 11.
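
NOTE
A minimal Python sketch of the range arithmetic described in this cue. It is an
illustration added to the transcript, not something shown in the lecture. The
variable names are assumptions; the numbers are the eleven values read out earlier.
```python
# Assumed variable name; the data are the eleven sorted values from the lecture.
values = [1, 2, 3, 4, 5, 5, 6, 7, 8, 9, 12]
minimum = min(values)             # smallest value in the list
maximum = max(values)             # largest value in the list
value_range = maximum - minimum   # range = maximum - minimum
print(minimum, maximum, value_range)
```
With this list the range works out to 12 - 1 = 11, matching the figure quoted in the cue.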

02:37.260 --> 02:40.670
Now, let us go back to the definition of mean.

02:41.740 --> 02:49.210
Mean means that we add all the numbers and divide the sum by the count of the numbers added.

02:50.080 --> 02:52.990
So we add all of these values.

02:53.440 --> 03:01.870
So we add one plus two plus three plus four plus five plus five plus six plus seven plus eight plus nine

03:02.140 --> 03:02.740
plus 12.

03:04.270 --> 03:10.000
And then divide the sum by the total number of values which are present.

03:10.030 --> 03:16.510
So we have one, two, three, four, five, six, seven, eight, nine, ten, 11 values in total.

03:16.780 --> 03:19.270
So we divide the sum by 11.

03:20.450 --> 03:23.750
So here the mean is 5.63.
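
NOTE
A short Python sketch of the mean calculation walked through above, added as an
illustration rather than taken from the lecture. The variable names are assumptions.
The exact quotient is 62 / 11, roughly 5.636; the lecture quotes it as 5.63.
```python
# Assumed variable name; the data are the eleven values from the lecture.
values = [1, 2, 3, 4, 5, 5, 6, 7, 8, 9, 12]
total = sum(values)     # add all the numbers
count = len(values)     # count how many numbers were added
mean = total / count    # divide the sum by the count
print(total, count, round(mean, 3))
```
The sum is 62 and the count is 11, so the division step is 62 / 11.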

03:25.540 --> 03:29.500
Now let us find out the median. As we have discussed,

03:29.800 --> 03:32.740
the median is the middle-most value.

03:33.750 --> 03:35.390
So to find the middle-most value,

03:35.400 --> 03:35.670
we count in from both ends:

03:37.510 --> 03:44.770
2 and 9, 3 and 8, 4 and 7, 5 and 6, and 5

03:44.770 --> 03:46.390
is the middle-most value.

03:47.260 --> 03:49.210
So out of 11 values,

03:49.660 --> 03:56.050
the sixth value comes out to be the middle-most value, which is the median.
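
NOTE
A minimal Python sketch of picking the middle-most value from an odd-length list,
added as an illustration. The variable names are assumptions. Note that Python
indexes from zero, so the sixth value sits at index 5.
```python
# Assumed variable name; the data are the eleven values from the lecture.
values = [1, 2, 3, 4, 5, 5, 6, 7, 8, 9, 12]
ordered = sorted(values)            # the median needs sorted data
middle_index = len(ordered) // 2    # 11 // 2 = 5, the sixth position
median = ordered[middle_index]      # the middle-most value
print(middle_index, median)
```
This shortcut only applies when the number of values is odd.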

03:57.860 --> 04:00.380
The next value is mode.

04:01.290 --> 04:07.170
Mode, as the definition says, is the value, which appears most often.

04:08.180 --> 04:10.760
So here, out of all of these values,

04:11.090 --> 04:14.660
five appears twice, which is most often.

04:14.870 --> 04:16.190
So mode is five.
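
NOTE
One way to find the mode in Python, sketched here as an illustration and not taken
from the lecture. It uses collections.Counter from the standard library; the
variable names are assumptions.
```python
from collections import Counter
# Assumed variable name; the data are the eleven values from the lecture.
values = [1, 2, 3, 4, 5, 5, 6, 7, 8, 9, 12]
counts = Counter(values)                     # how often each value appears
mode, frequency = counts.most_common(1)[0]   # the most frequent value and its count
print(mode, frequency)
```
Here five is the only value that appears twice, so it is the mode.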

04:18.090 --> 04:27.450
Now please note that when we are calculating the median for an odd number of values, the median is the middle-most

04:27.450 --> 04:27.810
value.

04:28.830 --> 04:34.380
And the mean is the sum of all the values divided by the count of the values.

04:35.820 --> 04:37.080
All right. Now,

04:37.260 --> 04:38.610
let us come to the even number of values.

04:39.930 --> 04:44.070
Now, here we have all the values which were present in the odd values.

04:44.280 --> 04:48.050
And additionally, we have an outlier value, 100.

04:49.080 --> 04:57.090
This outlier value means that this value is something very different from the original sequence, which

04:57.090 --> 04:57.540
we have.

04:58.920 --> 05:03.690
So for these values, the minimum number is one.

05:04.530 --> 05:06.540
The maximum number is 100.

05:07.500 --> 05:10.860
The range will be 100 minus one, which is 99.

05:11.310 --> 05:16.920
Mode still remains five because five is the most frequently occurring value.

05:18.190 --> 05:24.100
Now the median is the mean of the middle two values.

05:24.340 --> 05:29.470
So we take the sum of the middle-most values, which are five and six.

05:29.680 --> 05:31.570
Because there is an even number of values,

05:31.570 --> 05:33.490
there will be two values in the middle.

05:34.150 --> 05:35.380
So five and six.

05:35.530 --> 05:42.400
So we add five and six and divide it by two to find out the average of these two values, which is the

05:42.400 --> 05:45.460
median for an even number of values.
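
NOTE
A small Python sketch that covers both cases discussed so far, added as an
illustration. The function name `median` and the variable names are hypothetical.
For an odd count it returns the middle-most value; for an even count it averages
the two middle values.
```python
def median(values):
    """Return the median of a non-empty list of numbers."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]                       # odd count: middle-most value
    return (ordered[mid - 1] + ordered[mid]) / 2  # even count: mean of the two middle values
# Assumed variable names; the twelve values are the lecture's list plus the outlier 100.
even_values = [1, 2, 3, 4, 5, 5, 6, 7, 8, 9, 12, 100]
even_median = median(even_values)
print(even_median)
```
For the twelve values the two middle values are 5 and 6, so the result is (5 + 6) / 2 = 5.5.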

05:47.120 --> 05:54.620
So one thing which we find out is that the median of an even number of values is the sum of the two middle values divided by

05:54.620 --> 05:54.920
two.

05:56.270 --> 05:59.450
While the median of an odd number of values is the middle-most number.

06:00.440 --> 06:08.750
Now, in case of finding out the mean, we saw that when we have these 11 values out of these 11 values,

06:08.750 --> 06:15.110
when we found out the mean, it came out to be 5.63, because these values are near to each other.

06:16.170 --> 06:21.660
Now, just by the presence of one outlier, the mean value has changed a lot.

06:21.930 --> 06:23.550
I have added all these values

06:23.550 --> 06:24.780
and divided by 12

06:25.800 --> 06:28.320
to find out the mean of these even values.

06:28.680 --> 06:31.080
So this comes out to be 13.5.

06:32.460 --> 06:35.460
While here, the mean was 5.63.

06:36.430 --> 06:45.780
This means that the presence of even one outlier causes a very large change in the value of the mean.

06:46.880 --> 06:56.780
So we cannot really explain the sequence using the mean, because when I say the mean of these values

06:57.050 --> 06:58.940
is 13.5.

06:59.660 --> 07:09.080
I am kind of saying that the middle value lies between 12 and 100, but that is not correct, because for the majority

07:09.080 --> 07:12.500
of the values, the central value

07:14.090 --> 07:17.420
should have been around 5.63.

07:18.570 --> 07:25.500
So this is the reason why we cannot really use this measure of central tendency to explain much.

07:25.770 --> 07:34.590
So when a value such as the mean is calculated in the presence of an outlier, it cannot really depict

07:34.830 --> 07:39.840
the measure of central tendency anymore. So in this particular situation,

07:40.080 --> 07:46.950
we will try to explain the sequence, or explain the data, using the median value.

07:48.360 --> 07:54.420
Because the median value will not be impacted by the outlier much.

07:55.260 --> 08:01.260
So here when an outlier was added, still the median did not change much.

08:02.130 --> 08:03.740
It is still 5.5,

08:05.050 --> 08:05.640
while here,

08:05.650 --> 08:06.760
the median was five.

08:07.000 --> 08:13.930
So by adding one outlier, still the median remains almost the same and did not change as much

08:13.930 --> 08:14.650
as the mean did.
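
NOTE
A brief Python sketch comparing how far the mean and the median move when the
outlier 100 is added. It is an illustration added to the transcript, using the
standard-library statistics module; the variable names are assumptions.
```python
import statistics
# Assumed variable names; the data are the lecture's eleven values and the outlier 100.
odd_values = [1, 2, 3, 4, 5, 5, 6, 7, 8, 9, 12]
with_outlier = odd_values + [100]
mean_before = statistics.mean(odd_values)        # 62 / 11
mean_after = statistics.mean(with_outlier)       # 162 / 12
median_before = statistics.median(odd_values)    # middle-most value
median_after = statistics.median(with_outlier)   # mean of the two middle values
mean_shift = mean_after - mean_before            # how far the mean moved
median_shift = median_after - median_before      # how far the median moved
print(round(mean_shift, 2), median_shift)
```
The mean moves by almost 8 while the median moves by only 0.5, which is the
contrast the lecture draws between the two measures.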

08:18.520 --> 08:20.440
Now, which value to use when?

08:21.490 --> 08:24.430
So mean is highly impacted by outliers.

08:24.700 --> 08:26.130
Just what we saw here.

08:28.310 --> 08:37.010
The median is robust against outliers, which is the reason why, when we have outliers present,

08:37.040 --> 08:40.940
instead of using the mean, we can better explain the data using the median.

08:42.720 --> 08:46.470
And mode is the value that is most likely to be sampled.

08:47.400 --> 08:54.240
So when we are taking a sample of a population, then the value of the mode will be something which

08:54.240 --> 08:57.240
will be sampled the most number of times.
