1
00:00:01,100 --> 00:00:09,260
In this video, we will discuss the bias-variance trade-off. As I told you in the test-train split

2
00:00:09,320 --> 00:00:10,370
lecture,

3
00:00:10,560 --> 00:00:19,140
our agenda is to find the model with the lowest test error. Now, fundamentally, there are three contributors

4
00:00:19,260 --> 00:00:28,230
to the expected test error. These three contributors are called variance, bias, and the variance of the error term,

5
00:00:28,710 --> 00:00:38,350
which is represented by epsilon. This third term comes from the fact that there is some inherent randomness

6
00:00:38,470 --> 00:00:48,190
in the process, and the given sample observations also do not exactly follow the intended true function.
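
Written out as a formula, these three contributors look roughly like this. This is a sketch of the standard decomposition; the symbols x_0 and y_0 for a test point and \hat{f} for the fitted function are notation assumed here, not shown in the video. Note that the bias enters as a squared term.

```latex
E\left[\left(y_0 - \hat{f}(x_0)\right)^2\right]
  = \mathrm{Var}\left(\hat{f}(x_0)\right)
  + \left[\mathrm{Bias}\left(\hat{f}(x_0)\right)\right]^2
  + \mathrm{Var}(\varepsilon)
```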

7
00:00:48,300 --> 00:00:56,010
So this is an irreducible error, and since we cannot do much about it, we will not focus on it. We will focus

8
00:00:56,010 --> 00:01:05,060
on the two other terms. Let us talk about them one by one. First, variance. Variance refers to the

9
00:01:05,060 --> 00:01:07,810
amount by which f-hat would change

10
00:01:07,970 --> 00:01:16,390
if we change our training dataset, and bias refers to that part of the error which is introduced by approximating

11
00:01:16,390 --> 00:01:26,380
a complicated real-life relationship with a simpler model. So let's look at them one by one. As I told

12
00:01:26,370 --> 00:01:32,590
you, variance refers to the amount by which the predicted function would change if I change my training

13
00:01:32,600 --> 00:01:40,570
dataset. If you remember, when we talked about simple linear regression, I told you that there is this

14
00:01:40,660 --> 00:01:47,170
true population line, shown here, which is the best line

15
00:01:47,170 --> 00:01:55,880
if we were fitting the line on the whole population. But when we are fitting it on a sample, the sample

16
00:01:55,880 --> 00:02:02,540
regression line is different from the population regression line, and if my sample data changes, the sample

17
00:02:02,540 --> 00:02:05,250
regression line also changes.

18
00:02:05,310 --> 00:02:08,750
So basically, variance is capturing the part of the error that

19
00:02:08,960 --> 00:02:17,320
is just coming from that particular sample. So if we have two models, and one of them is more flexible than

20
00:02:17,320 --> 00:02:26,020
the other, which one will have more variance? The more flexible method will be trying to fit each

21
00:02:26,020 --> 00:02:33,090
and every point. Even if I change one or two points, it will give a completely different

22
00:02:33,090 --> 00:02:36,910
predicted function to accommodate this small change.

23
00:02:36,910 --> 00:02:40,600
This means that more flexible methods have high variance.
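
To see this in numbers, here is a minimal sketch in Python, assuming NumPy is available. The sine-shaped true function, the noise level, and the two polynomial degrees are made up for illustration and are not taken from the lecture's data. It refits a straight line and a degree-9 polynomial on many resampled training sets and measures how much the fitted curve moves.

```python
import numpy as np
from numpy.polynomial import Polynomial


def true_f(x):
    # Hypothetical "true" relationship, assumed only for this illustration.
    return np.sin(2 * np.pi * x)


def spread_of_fits(degree, n_datasets=200, noise_sd=0.3, seed=0):
    """Average variance of the fitted curve across many training sets."""
    rng = np.random.default_rng(seed)
    x_train = np.linspace(0, 1, 30)        # inputs held fixed; only the noise is redrawn
    x_eval = np.linspace(0.02, 0.98, 49)   # points where the fitted curves are compared
    preds = np.empty((n_datasets, x_eval.size))
    for i in range(n_datasets):
        y_train = true_f(x_train) + rng.normal(0, noise_sd, x_train.size)
        preds[i] = Polynomial.fit(x_train, y_train, degree)(x_eval)
    # Variance across training sets at each point, then averaged over the points.
    return float(preds.var(axis=0).mean())


var_line = spread_of_fits(degree=1)        # less flexible method
var_flexible = spread_of_fits(degree=9)    # more flexible method
print(f"straight line: {var_line:.4f}   degree-9 polynomial: {var_flexible:.4f}")
```

In this setup the degree-9 fit should show the larger spread, which is the higher variance of the more flexible method.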

24
00:02:43,260 --> 00:02:45,320
This is shown graphically as well.

25
00:02:45,420 --> 00:02:47,040
In this first graph on the left,

26
00:02:47,640 --> 00:02:55,840
we are trying to predict this relationship with a straight line. A straight line is a much less flexible

27
00:02:55,880 --> 00:03:01,120
method. Even if I change one or two data points,

28
00:03:01,370 --> 00:03:10,730
such as this blue point, the slope and the intercept of this line will not change as much. However, if you look

29
00:03:10,730 --> 00:03:13,450
at the function on the right,

30
00:03:13,490 --> 00:03:20,800
if I change even one or two points on this curve, the predicted output function will be very different.

31
00:03:23,070 --> 00:03:26,910
So you can see that the variance is very high

32
00:03:27,200 --> 00:03:36,810
if the flexibility increases. I mean, the more flexible the method, the higher will be the variance. This phenomenon

33
00:03:37,290 --> 00:03:39,400
of following the data too closely,

34
00:03:39,540 --> 00:03:46,980
as you see in the right graph, where we are even following the error in the observations, is called overfitting.

35
00:03:48,220 --> 00:03:49,130
When we overfit,

36
00:03:49,420 --> 00:03:55,980
we do get low training error, but the test error increases.
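
A small simulation can show this gap between training error and test error. It is only a sketch under assumptions: the sine-shaped true function, the noise level, and the degrees 1, 3, and 15 are hypothetical choices, not the lecture's example. Each polynomial is fitted on 20 training points and scored on unseen points lying between them, averaged over many repeats.

```python
import numpy as np
from numpy.polynomial import Polynomial


def true_f(x):
    # Hypothetical true relationship, assumed only for this illustration.
    return np.sin(2 * np.pi * x)


rng = np.random.default_rng(1)
x_train = np.linspace(0, 1, 20)
x_test = (x_train[:-1] + x_train[1:]) / 2   # unseen points between the training points
noise_sd = 0.3
n_repeats = 200
degrees = [1, 3, 15]                        # higher degree = more flexible

train_mse = {d: 0.0 for d in degrees}
test_mse = {d: 0.0 for d in degrees}
for _ in range(n_repeats):
    y_train = true_f(x_train) + rng.normal(0, noise_sd, x_train.size)
    y_test = true_f(x_test) + rng.normal(0, noise_sd, x_test.size)
    for d in degrees:
        fit = Polynomial.fit(x_train, y_train, d)
        # Accumulate the running average of the mean squared errors.
        train_mse[d] += np.mean((fit(x_train) - y_train) ** 2) / n_repeats
        test_mse[d] += np.mean((fit(x_test) - y_test) ** 2) / n_repeats

for d in degrees:
    print(f"degree {d:2d}: train MSE = {train_mse[d]:.3f}   test MSE = {test_mse[d]:.3f}")
```

The training error keeps falling as the degree goes up, while the degree-15 fit should do worse on the test points than the degree-3 fit.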

37
00:03:55,990 --> 00:04:04,150
Now let's talk about bias. Bias refers to that part of the error which is introduced by approximating a complicated

38
00:04:04,150 --> 00:04:07,720
real-life relationship with a simpler model.

39
00:04:07,750 --> 00:04:15,270
For example, we may be trying to fit a linear model between dependent and independent variables where

40
00:04:15,270 --> 00:04:17,390
a linear relationship is highly unlikely.

41
00:04:18,790 --> 00:04:26,180
You can see in this graph that the points can never be fitted well with a straight line. But still, if

42
00:04:26,180 --> 00:04:30,410
we select a linear model, it is always going to have some error.

43
00:04:30,740 --> 00:04:33,080
And that part of the error is called bias.

44
00:04:35,750 --> 00:04:39,550
And how is bias related to the flexibility of the model?

45
00:04:39,580 --> 00:04:46,750
You can see that the linear model, which is less flexible, is unable to fit this data. If I increase flexibility

46
00:04:46,870 --> 00:04:51,380
and allow it to curve, then it will better fit the points.

47
00:04:51,460 --> 00:04:59,230
So generally, if we increase flexibility, the bias error reduces. So you can see where the bias-variance

48
00:04:59,240 --> 00:05:08,360
trade-off is coming from. As we increase flexibility, error due to variance increases and error due to

49
00:05:08,360 --> 00:05:17,840
bias decreases. We want to decrease both, but when we try to decrease one, the other one starts

50
00:05:17,840 --> 00:05:27,160
to increase. So the challenge is to find that point where this sum is minimum. This is depicted graphically

51
00:05:27,160 --> 00:05:27,430
here.

52
00:05:28,340 --> 00:05:34,640
This orange line is showing us the variance, which is increasing with flexibility, and this blue line

53
00:05:34,760 --> 00:05:42,850
is for bias, which is decreasing with flexibility. And this red line is the sum of these two errors.

54
00:05:43,100 --> 00:05:49,870
We want to find this minimum point, where the sum is the minimum. Although we will not be able to compute

55
00:05:50,050 --> 00:05:52,690
bias and variance for our model,

56
00:05:52,690 --> 00:05:59,350
this concept will be used when we compare different models and their potential accuracy in predicting

57
00:05:59,350 --> 00:06:00,400
dependent variables.
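
On real data we cannot compute these two terms, because the true function is unknown. In a simulation where we pick the true function ourselves, we can estimate them. The sketch below assumes a sine-shaped true function, a fixed noise level, and polynomial degree as a stand-in for flexibility. All of these are hypothetical choices for illustration. It estimates squared bias and variance for each degree and reports where their sum is smallest.

```python
import numpy as np
from numpy.polynomial import Polynomial


def true_f(x):
    # Hypothetical true relationship, assumed only for this simulation.
    return np.sin(2 * np.pi * x)


rng = np.random.default_rng(42)
x_train = np.linspace(0, 1, 30)        # inputs held fixed; only the noise is redrawn
x_eval = np.linspace(0.02, 0.98, 49)   # points where bias and variance are measured
noise_sd = 0.3
n_datasets = 300
degrees = list(range(1, 13))           # polynomial degree stands in for flexibility

bias_sq, variance = [], []
for deg in degrees:
    preds = np.empty((n_datasets, x_eval.size))
    for i in range(n_datasets):
        y_train = true_f(x_train) + rng.normal(0, noise_sd, x_train.size)
        preds[i] = Polynomial.fit(x_train, y_train, deg)(x_eval)
    mean_pred = preds.mean(axis=0)     # average fitted curve over all training sets
    bias_sq.append(float(np.mean((mean_pred - true_f(x_eval)) ** 2)))
    variance.append(float(np.mean(preds.var(axis=0))))

total = np.array(bias_sq) + np.array(variance)
best_degree = degrees[int(np.argmin(total))]

for deg, b, v, t in zip(degrees, bias_sq, variance, total):
    print(f"degree {deg:2d}: bias^2 = {b:.4f}   variance = {v:.4f}   sum = {t:.4f}")
print("degree with the smallest sum:", best_degree)
```

The irreducible error is left out of the sum here because it is the same for every degree and does not move the minimum.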