1
00:00:04,780 --> 00:00:09,880
In the previous video, we looked at how to calculate the least squares solution to estimating a constant

2
00:00:09,880 --> 00:00:10,340
vector.

3
00:00:10,990 --> 00:00:16,210
We did not use any information about the noise, apart from the fact that its expected value was zero, or

4
00:00:16,210 --> 00:00:17,160
it had zero mean.

5
00:00:17,680 --> 00:00:18,460
But what happens if

6
00:00:18,460 --> 00:00:21,130
we do know some information about the noise? What happens if

7
00:00:21,130 --> 00:00:26,180
we know how accurate each measurement is, or how some measurements might be more accurate than others?

8
00:00:26,410 --> 00:00:28,900
How can we include this information into the solution?

9
00:00:29,150 --> 00:00:34,570
How can we use this information to calculate a better solution than if we didn't have this information?

10
00:00:35,530 --> 00:00:39,310
So now let's look at the simple example that we used last time, but let's extend it.

11
00:00:39,910 --> 00:00:43,830
So again, we have an engine and we have a temperature sensor attached to it.

12
00:00:44,530 --> 00:00:48,940
We're going to look at the relationship between the temperature and the RPM.

13
00:00:48,970 --> 00:00:53,950
So, again, from the previous video, we know that it's going to be a linear relationship that looks

14
00:00:53,950 --> 00:00:54,390
like this.

15
00:00:54,400 --> 00:01:01,690
So as the RPM increases, the temperature increases and it's going to be a function of the two parameters

16
00:01:01,690 --> 00:01:02,940
x1 and x2.

17
00:01:03,790 --> 00:01:08,350
So we're going to take a series of measurements at different RPMs and we're going to have the measurement

18
00:01:08,350 --> 00:01:15,700
y_i representing the temperature at RPM r_i, so we can have a series of measurements that follow this relationship

19
00:01:15,700 --> 00:01:16,030
here.

20
00:01:16,510 --> 00:01:19,690
And of course each of the measurements is not going to be perfect.

21
00:01:19,700 --> 00:01:21,650
We're going to have some noise on each of the components.

22
00:01:21,680 --> 00:01:23,080
So this is the v_i again.

23
00:01:25,660 --> 00:01:31,970
So we're going to estimate the parameters of this line of best fit, x1 and x2, from all this data here.

24
00:01:32,620 --> 00:01:34,870
So this is what we did in the previous video.

25
00:01:35,020 --> 00:01:37,750
But now we're going to include some extra information.

26
00:01:38,000 --> 00:01:42,000
What happens if, for each of these measurements, we actually have a bit more information?

27
00:01:42,370 --> 00:01:46,210
What happens if we knew how accurate each measurement might be?

28
00:01:46,240 --> 00:01:52,870
So what happens if we know that each measurement might have a certain variance of sigma_i squared?

29
00:01:53,350 --> 00:01:56,570
So you can see here that each measurement might have a different uncertainty.

30
00:01:57,100 --> 00:02:02,980
So how can we use this information to come up with a better estimate of what

31
00:02:02,980 --> 00:02:04,300
the line of best fit would be?

32
00:02:04,660 --> 00:02:06,430
Or what the solution to the matrix equation would be?

33
00:02:08,100 --> 00:02:14,290
So let's put this into mathematical terms. We have the matrix equation shown here that we want to solve.

34
00:02:14,580 --> 00:02:20,850
So the y_i are going to be the measurements, the matrix is going to be the model matrix, the linear matrix

35
00:02:20,850 --> 00:02:25,140
that describes how the measurements are a function of the estimates.

36
00:02:25,140 --> 00:02:29,850
So x is going to be the estimates and v_i is going to be the noise.

37
00:02:30,540 --> 00:02:34,860
So this is the same problem as we had last time, but we're going to include a bit more information

38
00:02:34,860 --> 00:02:35,920
in the problem definition.

39
00:02:36,480 --> 00:02:41,350
First off, again, we're going to assume that the noise has zero mean and is going to be uncorrelated.

40
00:02:41,370 --> 00:02:46,830
So this is just saying that for each of these values here, the size of one does not affect the other.

41
00:02:46,860 --> 00:02:47,900
There's no correlation.

42
00:02:47,910 --> 00:02:53,940
So if v1 is large, there's no reason to think that v2 is small or large, or anything that

43
00:02:53,940 --> 00:02:55,750
has to do with the noise up here.

44
00:02:56,100 --> 00:02:58,620
So each of these measurements can be considered independent.

45
00:02:59,760 --> 00:03:02,180
But now we're also going to include a bit more information.

46
00:03:02,190 --> 00:03:05,070
We know that each of the measurements is going to have a certain accuracy.

47
00:03:05,430 --> 00:03:12,400
So we know that the variance, so the expected value of v_i squared, is going to equal our sigma_i squared.

48
00:03:12,420 --> 00:03:15,420
So this is going to be how much uncertainty we have on each of the measurements.

49
00:03:15,880 --> 00:03:19,950
We won't know what the measurement value is exactly, but we know how accurate it might be.

50
00:03:20,730 --> 00:03:26,460
So using this information here, we can form a covariance matrix, and we've gone through that in

51
00:03:26,460 --> 00:03:26,970
the past.

52
00:03:27,190 --> 00:03:31,710
We're going to define the covariance matrix R as just the covariance of the different measurements.

53
00:03:31,710 --> 00:03:37,170
And all that means is just we're going to have a diagonal matrix here and each of the diagonal elements

54
00:03:37,170 --> 00:03:40,100
is going to be the variance of that single noise measurement.

55
00:03:40,500 --> 00:03:45,060
And you can see there's going to be zeros everywhere else because each of these measurements is uncorrelated.

56
00:03:45,420 --> 00:03:47,280
So it's just a single diagonal matrix.

57
00:03:50,020 --> 00:03:53,860
So the problem equation that we want to solve again is this matrix equation here.

58
00:03:53,890 --> 00:03:58,150
So we have the measurements, we have the model matrix, we have the estimate and we have the noise.

59
00:03:58,810 --> 00:04:03,540
We also know that the variance on each of the noise measurements is going to be equal to sigma squared.

60
00:04:03,910 --> 00:04:07,290
So each measurement can have a different variance or different uncertainty.

61
00:04:07,840 --> 00:04:13,150
And we know that the covariance matrix of all the noise or the uncertainties is just going to be equal

62
00:04:13,150 --> 00:04:18,550
to this matrix and it's just going to be a diagonal matrix with zeros on the other elements because

63
00:04:18,550 --> 00:04:20,830
there's no correlation between the different noise

64
00:04:20,830 --> 00:04:22,270
variables; they are all independent.

65
00:04:24,820 --> 00:04:30,340
Now, we know that to get a good estimate of X, we want to minimize the error residuals.

66
00:04:30,340 --> 00:04:34,870
So that's the difference between the measurements and what the predicted measurement would be if we

67
00:04:34,870 --> 00:04:36,070
had a good estimate.

68
00:04:37,120 --> 00:04:38,450
So this is the estimate.

69
00:04:38,500 --> 00:04:41,920
This is the measurement residual for each of the different measurements independently.

70
00:04:41,920 --> 00:04:47,530
And we can put them into a matrix equation again to work out the measurement residual.

71
00:04:49,390 --> 00:04:51,880
Now we're going to change the cost function from last time.

72
00:04:51,880 --> 00:04:57,460
So the cost function last time just minimized the sum of the residuals squared.

73
00:04:57,490 --> 00:05:03,660
But this time we're going to weight each residual by the inverse of the measurement uncertainty.

74
00:05:03,670 --> 00:05:05,440
So you can see that we have these sigmas here.

75
00:05:06,190 --> 00:05:11,050
So this is basically saying the more uncertainty we have on each of the measurements, the less effect

76
00:05:11,050 --> 00:05:11,710
it's going to have.

77
00:05:12,220 --> 00:05:19,540
So if we knew this one very, very precisely, this term here is going to be weighted more heavily than

78
00:05:19,540 --> 00:05:22,000
this term over here if this was a lot larger.

79
00:05:23,140 --> 00:05:27,520
So it's basically saying the more certain we are of a measurement, the more effect it's going to have on the cost function.

80
00:05:28,120 --> 00:05:35,740
So this least squares system here can be written out in matrix form again as the error residual matrix

81
00:05:35,740 --> 00:05:39,510
transpose times the inverse of the R matrix.

82
00:05:39,520 --> 00:05:45,230
So the inverse of the uncertainty times the measurement residual matrix again.

83
00:05:45,730 --> 00:05:50,170
So this is very similar to last time, except that now we have this inverse matrix here.

84
00:05:52,420 --> 00:05:58,810
So again, to work out what our best estimate is going to be, it's just going to be calculated by minimizing

85
00:05:58,810 --> 00:06:00,160
this cost function here.

86
00:06:00,640 --> 00:06:07,840
So if we minimize this cost function here with respect to x hat, what it means is it's going to make this x hat as close

87
00:06:07,840 --> 00:06:10,450
to the real value of x as possible.

88
00:06:13,890 --> 00:06:16,780
So now, like last time, let's expand the cost function.

89
00:06:17,070 --> 00:06:23,310
So this is the cost function and now we know that we can fill in our different equations for the matrix

90
00:06:23,310 --> 00:06:23,920
residuals.

91
00:06:23,920 --> 00:06:31,280
So if we go ahead and do that, we fill in y minus the model matrix times x hat again, and we can expand this whole thing out,

92
00:06:31,290 --> 00:06:36,390
so we can multiply each of these elements together, expand it out, and we end up with this long equation

93
00:06:36,390 --> 00:06:36,720
here.

94
00:06:37,650 --> 00:06:42,900
So this is going to look slightly different from the least squares cost function, because now we

95
00:06:42,900 --> 00:06:45,660
have these inverse R terms inside the equation.

96
00:06:46,800 --> 00:06:51,270
So again, to find the minimum, the first step again that we want to do is we want to differentiate

97
00:06:51,480 --> 00:06:54,660
this cost function with respect to our estimate X.

98
00:06:55,690 --> 00:06:58,270
So if we do that, we end up with this equation here.

99
00:06:58,290 --> 00:07:02,010
So this is the derivative of the cost function with respect to the estimates.

100
00:07:02,800 --> 00:07:07,970
So you can see that this term here comes from these two terms here.

101
00:07:08,730 --> 00:07:11,130
This last term here comes from this last term here.

102
00:07:12,210 --> 00:07:14,970
So it's going to be the derivative of the cost function.

103
00:07:14,970 --> 00:07:18,090
And we know that this is going to be a quadratic surface.

104
00:07:18,450 --> 00:07:22,130
So there's only going to be one minimum point and that's going to be the stationary point.

105
00:07:22,470 --> 00:07:24,150
So to find the stationary point,

106
00:07:24,570 --> 00:07:29,010
we set the derivative equal to zero and then we solve it for x.

107
00:07:30,870 --> 00:07:33,330
When we do that, we end up with this equation here.

108
00:07:33,360 --> 00:07:39,030
So this looks very similar to last time, except that now we have our inverse R here and inverse R

109
00:07:39,360 --> 00:07:39,780
here.

110
00:07:40,350 --> 00:07:45,920
So this equation is the least squares solution for when we have a weighted estimation.

111
00:07:45,930 --> 00:07:50,820
So when we have some measure of uncertainty on each of the measurements y_i.
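
As a rough sketch of that weighted least squares solution, here is how it might look in Python with NumPy. The symbol H for the model matrix, the straight-line model (temperature = x1 + x2 * RPM), and all the numbers below are assumptions made up for illustration; they are not taken from the video.

```python
import numpy as np

def weighted_least_squares(H, y, sigma):
    """Return x_hat = (H^T R^-1 H)^-1 H^T R^-1 y, with R = diag(sigma^2)."""
    # Inverse of the diagonal measurement covariance matrix R.
    R_inv = np.diag(1.0 / np.asarray(sigma, dtype=float) ** 2)
    # Solve the weighted normal equations rather than inverting explicitly.
    return np.linalg.solve(H.T @ R_inv @ H, H.T @ R_inv @ y)

# Hypothetical engine speeds, in thousands of RPM.
rpm = np.array([1.0, 2.0, 3.0, 4.0])
# Assumed model: temperature = x1 + x2 * rpm, so each row of H is [1, rpm_i].
H = np.column_stack([np.ones_like(rpm), rpm])
# Hypothetical temperature readings and their standard deviations.
y = np.array([52.0, 61.0, 69.0, 82.0])
sigma = np.array([1.0, 1.0, 5.0, 1.0])  # third reading is much less certain

x_hat = weighted_least_squares(H, y, sigma)
print("estimated [x1, x2]:", x_hat)
```

With all sigmas equal, this should reduce to the ordinary least squares solution from the previous video, since the weights then cancel out.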

112
00:07:53,430 --> 00:07:58,050
So now that we know some information about the uncertainty of the measurements, it would be good to

113
00:07:58,050 --> 00:08:02,810
now calculate how that uncertainty affects the uncertainty of the estimates.

114
00:08:02,820 --> 00:08:07,530
It'll be good to know how certain we are of the estimates. Since we know how much error there is in the

115
00:08:07,530 --> 00:08:08,070
measurements,

116
00:08:08,460 --> 00:08:12,660
we should be able to estimate how much error is in the estimate from those measurements.

117
00:08:13,800 --> 00:08:17,780
So if you look at the weighted least squares solution, we have a solution that looks like this.

118
00:08:17,790 --> 00:08:24,030
So the A matrix here is this combined term here that we just calculated on the last slide.

119
00:08:25,140 --> 00:08:31,260
So if you multiply our measurements by this A matrix, which is this

120
00:08:31,410 --> 00:08:33,900
matrix here, we get the estimate of x.

121
00:08:34,560 --> 00:08:41,970
Now, if we look at this form here, we want to work out how errors in y affect errors in x, and we

122
00:08:41,970 --> 00:08:45,030
can do this by looking at the principle of transformation of uncertainty.

123
00:08:45,330 --> 00:08:52,950
So in general, if we have an x vector, an A matrix and another vector f, if we know the uncertainty in

124
00:08:52,950 --> 00:09:00,840
x, we can transform it to the uncertainty in f by using the model matrix A, and we use this relationship

125
00:09:00,840 --> 00:09:01,150
here.

126
00:09:01,770 --> 00:09:09,150
So the covariance of x, or the uncertainty of x, multiplied by our A and then A transpose, is going

127
00:09:09,150 --> 00:09:17,640
to be our uncertainty or covariance matrix in f. So we can apply this process here to this matrix solution

128
00:09:17,640 --> 00:09:18,200
over here.

129
00:09:18,720 --> 00:09:24,540
So we know the uncertainty in our y measurements is going to be our R matrix.

130
00:09:24,810 --> 00:09:31,980
We can just fill in this equation, so we end up with R, the uncertainty in y, multiplied by the A matrix,

131
00:09:31,980 --> 00:09:33,900
which is this transformation matrix here.

132
00:09:34,410 --> 00:09:38,760
We're going to end up with the uncertainty in x, and we're going to call that uncertainty, the covariance

133
00:09:38,760 --> 00:09:39,870
of x, P.

134
00:09:40,380 --> 00:09:43,410
So P is the covariance matrix of the uncertainty.

135
00:09:44,640 --> 00:09:50,190
So if we fill in these equations here, fill in A and fill in R, we actually come up with a more

136
00:09:50,190 --> 00:09:51,510
simplified form.

137
00:09:51,780 --> 00:09:57,930
So we end up knowing that the uncertainty in x, or how good the estimate of

138
00:09:57,930 --> 00:10:04,380
x is, is actually just a function of the inverse of the variances of the measurements multiplied by

139
00:10:04,380 --> 00:10:08,160
our model matrix, and then we take the inverse of that.

140
00:10:08,490 --> 00:10:14,580
So this is how we can work out how accurate our solution is or how good our estimate is based on our

141
00:10:14,700 --> 00:10:18,120
understanding of the noise in the system or the noise of the measurements.
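
To make that last step a bit more concrete, here is a small NumPy sketch. It assumes the simplified form is P = (H^T R^-1 H)^-1 and checks it numerically against the transformation of uncertainty A R A^T. The symbol H for the model matrix and all the numbers are hypothetical, chosen only for illustration.

```python
import numpy as np

def estimate_covariance(H, sigma):
    """Return P = (H^T R^-1 H)^-1 for R = diag(sigma^2)."""
    R_inv = np.diag(1.0 / np.asarray(sigma, dtype=float) ** 2)
    return np.linalg.inv(H.T @ R_inv @ H)

# Hypothetical engine speeds (thousands of RPM) and measurement standard deviations.
rpm = np.array([1.0, 2.0, 3.0, 4.0])
H = np.column_stack([np.ones_like(rpm), rpm])  # assumed model: y = x1 + x2 * rpm
sigma = np.array([1.0, 1.0, 5.0, 1.0])
R = np.diag(sigma ** 2)

# Simplified form of the estimate covariance.
P = estimate_covariance(H, sigma)

# Same quantity via the transformation of uncertainty: x_hat = A y, so P = A R A^T.
A = P @ H.T @ np.linalg.inv(R)
P_propagated = A @ R @ A.T

print("P (simplified form):\n", P)
print("P (A R A^T):\n", P_propagated)
```

The diagonal of P holds the variances of the two estimated parameters, so its square roots give a rough one-sigma accuracy for each estimate.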