1
00:00:03,970 --> 00:00:09,130
We have looked at using random variables individually in the past examples, so now we're going to look

2
00:00:09,130 --> 00:00:12,820
at the concept of using multiple random variables in a single experiment.

3
00:00:13,700 --> 00:00:17,900
So let X be a random variable with a PDF of f_X(x) here.

4
00:00:18,260 --> 00:00:26,090
Also, let Y be a random variable with a PDF of f_Y(y) here. It is now possible to define a PDF

5
00:00:26,090 --> 00:00:32,930
for the joint probability, that is, the probability of X and Y, and we define this as a probability density

6
00:00:32,930 --> 00:00:37,700
function f_XY(x, y) or, for shorthand notation,

7
00:00:37,700 --> 00:00:41,660
We can just write it as f(x, y), with x and y being the input parameters.

8
00:00:42,050 --> 00:00:48,100
So we automatically know that this is a probability density function for random variables X and Y,

9
00:00:48,560 --> 00:00:50,870
so we can use either of these notations.

10
00:00:52,190 --> 00:00:58,400
So just like in basic probability, the joint probability gives us the probability of X and Y happening

11
00:00:58,400 --> 00:00:59,220
at the same time.

12
00:00:59,570 --> 00:01:04,970
So from the probability density function, the joint one, we can work out the joint probability and

13
00:01:04,970 --> 00:01:08,660
we can just do this by using our probability equation that we had before.

14
00:01:09,080 --> 00:01:12,780
But we're going to extend it to being a function of two variables.

15
00:01:13,190 --> 00:01:21,530
So now we have our integral between limits a and b for the X variable, and we integrate the function with

16
00:01:21,530 --> 00:01:28,130
respect to x, and then we integrate the function again with respect to y, this time between

17
00:01:28,130 --> 00:01:29,270
the limits c and d.

18
00:01:29,660 --> 00:01:35,030
So this joint probability equation here gives us the probability that the random variable X is between

19
00:01:35,180 --> 00:01:41,580
limits a and b while, at the same time, Y is between limits c and d.

20
00:01:42,170 --> 00:01:47,990
So this is just an extension of our equation to work out the probability from the density function using

21
00:01:47,990 --> 00:01:48,610
the integral.

22
00:01:48,830 --> 00:01:53,630
But now we have to do a double integral because we have two variables that we want to integrate over.
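As a rough sketch of that double integral in Python: the density f(x, y) = e^(-x-y), the midpoint rule and the function names below are assumptions chosen for illustration, not something shown on the slide.
```python
import math
def joint_pdf(x, y):
    # Assumed example density: two independent unit-rate exponentials on x, y >= 0.
    return math.exp(-x - y) if x >= 0 and y >= 0 else 0.0
def joint_probability(pdf, a, b, c, d, n=200):
    # Midpoint-rule approximation of the double integral of pdf over [a, b] x [c, d].
    dx = (b - a) / n
    dy = (d - c) / n
    total = 0.0
    for i in range(n):
        x = a + (i + 0.5) * dx
        for j in range(n):
            y = c + (j + 0.5) * dy
            total += pdf(x, y)
    return total * dx * dy
p = joint_probability(joint_pdf, 0.0, 1.0, 0.0, 1.0)
print(p)  # should be close to the analytic value (1 - e^-1)^2
```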

23
00:01:55,510 --> 00:02:00,790
We can also work out our marginal density functions. If we have a joint probability density function

24
00:02:00,790 --> 00:02:06,850
for random variables X and Y and we integrate it over one variable, we get the probability

25
00:02:06,850 --> 00:02:08,950
density function for the other variable.

26
00:02:09,280 --> 00:02:15,580
So to work out the density function for random variable X, we have to integrate our joint probability density

27
00:02:15,610 --> 00:02:18,330
over y. And to do the opposite,

28
00:02:18,340 --> 00:02:24,100
if we want to work out the density function for the random variable Y, we have to take the

29
00:02:24,100 --> 00:02:27,400
integral of the joint function with respect to x, over the full range of x.
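A minimal sketch of a marginal density, assuming an example joint density f(x, y) = x + y on the unit square (an illustrative choice, along with the helper names):
```python
def joint_pdf(x, y):
    # Assumed example density on the unit square: f(x, y) = x + y, which integrates to 1.
    return x + y if 0.0 <= x <= 1.0 and 0.0 <= y <= 1.0 else 0.0
def marginal_x(pdf, x, y_lo=0.0, y_hi=1.0, n=1000):
    # Integrate the joint density over y (midpoint rule) to get the marginal density of X at x.
    # The y range only needs to cover where the density is non-zero.
    dy = (y_hi - y_lo) / n
    return sum(pdf(x, y_lo + (j + 0.5) * dy) for j in range(n)) * dy
fx = marginal_x(joint_pdf, 0.3)
print(fx)  # analytically the marginal is x + 1/2, so about 0.8 here
```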

30
00:02:29,370 --> 00:02:33,990
We can also have a look at the expected value for when we're using a joint probability density function,

31
00:02:34,470 --> 00:02:39,780
So we covered in the past that the expected value when we're using a single random variable is

32
00:02:39,780 --> 00:02:41,180
just this equation shown here.

33
00:02:41,220 --> 00:02:45,260
So we have a function of X and we want to work out its expected value.

34
00:02:45,900 --> 00:02:53,130
The answer is simply the integral of that function of x multiplied by the probability

35
00:02:53,130 --> 00:02:56,730
density function of X, all integrated with respect to x.

36
00:02:57,450 --> 00:03:03,780
Now, when we go to multiple random variables, we just have to extend it by using a double

37
00:03:03,780 --> 00:03:06,090
integral over our X and Y.

38
00:03:06,120 --> 00:03:11,250
So it's pretty much the same equation, but now we're using a double integral for both parameters.

39
00:03:13,760 --> 00:03:19,340
So just like in basic probability, we can have independent random variables, and what this means is

40
00:03:19,340 --> 00:03:25,760
that if we have events A and B and they're independent, the probability of events A and

41
00:03:25,760 --> 00:03:26,470
B both happening

42
00:03:26,510 --> 00:03:32,510
is just the probability that event A happens multiplied by the probability that event

43
00:03:32,510 --> 00:03:33,360
B happens.

44
00:03:33,890 --> 00:03:36,590
So if they're independent, this relationship holds up.

45
00:03:37,490 --> 00:03:43,200
Now, a similar relationship can be found for the probability density functions of our random variables.

46
00:03:43,520 --> 00:03:49,610
So if we have a joint density function of X and Y, and the random variables X and Y are

47
00:03:49,610 --> 00:03:54,830
independent, it means that the joint density function is just going to be the same as multiplying the two

48
00:03:54,830 --> 00:03:57,350
individual marginal density functions together.

49
00:03:58,250 --> 00:04:04,220
So if this definition holds, it means that the random variables X and Y are independent of each other.

50
00:04:06,020 --> 00:04:11,370
So now let's have a look at the expected value for multiplication of independent random variables.

51
00:04:11,750 --> 00:04:16,910
So we want to work out the expected value when we multiply random variables X and Y together.

52
00:04:17,540 --> 00:04:22,820
So now, using the equation that we showed on the previous slide, we can work out our integral.

53
00:04:23,120 --> 00:04:27,710
So we're just multiplying our random variables X and Y, so that becomes the function, and we multiply that

54
00:04:27,710 --> 00:04:29,120
by our density function.

55
00:04:30,260 --> 00:04:32,280
So we can expand this equation out further.

56
00:04:32,630 --> 00:04:37,850
We know that the random variables in our joint probability density function are independent, so we can

57
00:04:37,850 --> 00:04:40,720
split it up into f_X and f_Y.

58
00:04:41,180 --> 00:04:45,680
That is, we can split it up into our X and Y probability density functions.

59
00:04:46,070 --> 00:04:48,420
So we end up with this equation shown here.

60
00:04:48,440 --> 00:04:53,750
So we're basically just splitting up our joint density function into the two individual component density

61
00:04:53,750 --> 00:04:54,290
functions.

62
00:04:55,220 --> 00:05:01,000
So next we can group the x terms and the y terms together and separate them out.

63
00:05:01,010 --> 00:05:03,500
And now we have two integrals multiplied together.

64
00:05:03,530 --> 00:05:09,470
So we have all the x terms in this integral and all the y terms in this integral here.

65
00:05:10,160 --> 00:05:15,200
Now, if you have a closer look at these two integrals, they might seem

66
00:05:15,200 --> 00:05:15,840
pretty familiar.

67
00:05:16,310 --> 00:05:22,700
This is because this term here and this term here are simply the expected values for the X and

68
00:05:22,700 --> 00:05:24,740
Y random variables.

69
00:05:25,310 --> 00:05:32,720
So using this, we know that, assuming independence, the expected value of random variables

70
00:05:32,720 --> 00:05:39,980
X and Y multiplied together is just going to be the expected value of the X random variable multiplied by the expected

71
00:05:39,980 --> 00:05:41,850
value of the Y random variable.

72
00:05:42,260 --> 00:05:46,130
And of course, this only holds if X and Y are independent.
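A quick numerical check of that product identity, as a sketch only: the two uniform distributions, the sample size and the seed are assumptions picked for illustration.
```python
import random
random.seed(0)  # fixed seed so the run is repeatable
n = 200_000
xs = [random.random() for _ in range(n)]  # X ~ Uniform(0, 1), an assumed example
ys = [random.random() for _ in range(n)]  # Y ~ Uniform(0, 1), drawn independently of X
mean_x = sum(xs) / n
mean_y = sum(ys) / n
mean_xy = sum(x * y for x, y in zip(xs, ys)) / n
print(mean_xy, mean_x * mean_y)  # the two estimates should be close to each other
```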

73
00:05:49,280 --> 00:05:55,400
We can also have a look at the expected value of the sum of independent random variables, so a function

74
00:05:55,400 --> 00:06:01,970
Z here is a function of two parameters, X and Y, and this function is just the sum of two independent

75
00:06:02,480 --> 00:06:09,930
functions: a function g, which is just a function of X, and a function h, which

76
00:06:09,930 --> 00:06:11,090
is a function of Y.

77
00:06:11,930 --> 00:06:18,770
So we want to work out the expected value of our function Z, and this just turns into the expected value

78
00:06:18,800 --> 00:06:19,640
of our function

79
00:06:19,640 --> 00:06:20,030
g plus

80
00:06:20,590 --> 00:06:21,440
function h.

81
00:06:23,190 --> 00:06:28,920
So now, writing out our expected value equation, we put the function that we

82
00:06:28,920 --> 00:06:33,690
want to work out the expected value of in front, and multiply it by the probability density function.

83
00:06:34,170 --> 00:06:37,060
Now, we can expand this sum out here into two terms.

84
00:06:37,350 --> 00:06:40,300
We can also expand out our probability density function.

85
00:06:40,320 --> 00:06:46,650
So if we assume X and Y are independent, we know that we can break it down into two different individual

86
00:06:46,650 --> 00:06:48,090
probability density functions.

87
00:06:48,780 --> 00:06:51,370
So when we do that, we can get this equation here.

88
00:06:51,750 --> 00:06:55,230
So we're splitting out our function g into this one

89
00:06:55,230 --> 00:06:57,590
and our function h into this one over here,

90
00:06:57,960 --> 00:07:01,260
each multiplied by our two individual probability density functions.

91
00:07:02,100 --> 00:07:07,590
And like we did in the last example, we're now going to split this double integral into an

92
00:07:07,590 --> 00:07:13,750
integral with respect to x and an integral with respect to y, because each term is just a product.

93
00:07:14,340 --> 00:07:20,130
So when we do that, we end up with this term here, and we can have a look at the different individual

94
00:07:20,160 --> 00:07:23,370
integrals and we might notice some commonalities between the two.

95
00:07:24,030 --> 00:07:29,010
Now, if you look at this integral here, it is just going to equal one, because

96
00:07:29,010 --> 00:07:34,290
this is just integrating the probability density function, and we know that the integral of a probability density function

97
00:07:34,500 --> 00:07:35,610
over its whole range has to equal one.

98
00:07:36,390 --> 00:07:39,840
We can do the same thing over here for X, that also has to equal one.

99
00:07:40,230 --> 00:07:47,370
So now we just end up with the sum of this integral and this integral, and going back to our basic expectation

100
00:07:47,370 --> 00:07:54,000
integral, we can see that this integral here and this integral here are just the expected values

101
00:07:54,000 --> 00:07:56,730
for a function G and a function H.

102
00:07:58,130 --> 00:08:03,980
So this is basically saying that the expected value of the sum of two

103
00:08:03,980 --> 00:08:09,650
independent random variables is just the sum of the expected values for the two independent variables.

104
00:08:10,220 --> 00:08:17,180
So if we had our function Z equal to X plus Y, then the expected value of this function is

105
00:08:17,180 --> 00:08:20,520
just going to be the expected value of X plus the expected value of Y.

106
00:08:21,230 --> 00:08:24,260
We assumed independence in this derivation, although the sum identity in fact also holds when X and Y are dependent.

107
00:08:25,040 --> 00:08:29,660
So these are two different identities for independent random variables that are useful to know.

108
00:08:32,550 --> 00:08:38,430
So now we can also look at dependent random variables, so the two random variables might not be independent,

109
00:08:38,590 --> 00:08:40,830
there might be some correlation between the two.

110
00:08:41,070 --> 00:08:46,760
So every time we sample from the joint density function, we might end up with correlated samples.

111
00:08:46,800 --> 00:08:52,860
So we can see on the right here, we have a graph that has the X random variable and Y random variable.

112
00:08:53,160 --> 00:08:57,450
And you can see that every time we sample from the density function, we might get different points that

113
00:08:57,450 --> 00:08:58,050
look like this.

114
00:08:58,560 --> 00:09:02,370
And it's pretty clear to see that these are correlated in some regard.

115
00:09:02,850 --> 00:09:07,340
We can see that they're basically following, in this case, a line of correlation.

116
00:09:07,740 --> 00:09:13,730
And this is actually a positive correlation, because as X gets bigger, Y gets bigger.

117
00:09:14,100 --> 00:09:16,700
You could also have cases of negative correlation.

118
00:09:17,010 --> 00:09:21,560
So that would be the other way around: when X gets bigger, Y gets smaller.

119
00:09:21,600 --> 00:09:26,430
So it'll be along the opposite slope or you might have zero correlation.

120
00:09:26,430 --> 00:09:30,310
So they might be independent, and that would just mean the points are random, all over the place.

121
00:09:30,310 --> 00:09:31,440
There's no pattern to them.

122
00:09:34,490 --> 00:09:38,870
So to work out the correlation between the different components, we can look at the covariance.

123
00:09:39,720 --> 00:09:45,260
So in the past we looked at the variance of a parameter and that was just the expected value of the

124
00:09:45,260 --> 00:09:48,100
random variable minus its mean, squared.

125
00:09:48,450 --> 00:09:53,090
So the covariance is very similar, but instead of being a function of a single random variable, it is now

126
00:09:53,090 --> 00:09:54,750
a function of two random variables.

127
00:09:55,160 --> 00:10:02,240
So here we have the expected value of the product of random variable X minus its mean, multiplied

128
00:10:02,240 --> 00:10:04,730
by random variable Y minus its mean.

129
00:10:05,450 --> 00:10:09,090
And we can expand this out and we can end up with this equation here.

130
00:10:09,120 --> 00:10:12,790
So this is how we can work out the covariance.

131
00:10:13,040 --> 00:10:19,130
So we basically look at the expected value of X times Y, minus the mean of X times the mean of Y.

132
00:10:19,820 --> 00:10:23,440
So using the covariance we can work out a correlation coefficient.

133
00:10:23,480 --> 00:10:25,100
So that's basically this term here.

134
00:10:25,460 --> 00:10:31,490
So taking the covariance and dividing it by the standard deviations of random variables X and Y gives

135
00:10:31,490 --> 00:10:35,030
us a correlation coefficient between negative one and one.

136
00:10:35,690 --> 00:10:42,350
Negative one means they're perfectly negatively correlated, positive one means perfectly positively correlated, and zero means they are

137
00:10:42,360 --> 00:10:43,450
not correlated at all.
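A small sketch of the covariance and the correlation coefficient estimated from paired samples; the data and function names are made up for illustration.
```python
import math
def mean(values):
    return sum(values) / len(values)
def covariance(xs, ys):
    # cov(X, Y) = E[XY] - E[X] E[Y], estimated from paired samples.
    return mean([x * y for x, y in zip(xs, ys)]) - mean(xs) * mean(ys)
def correlation(xs, ys):
    # Covariance divided by the product of the two standard deviations.
    return covariance(xs, ys) / math.sqrt(covariance(xs, xs) * covariance(ys, ys))
xs = [1.0, 2.0, 3.0, 4.0]
print(correlation(xs, [2.0, 4.0, 6.0, 8.0]))  # Y rises with X
print(correlation(xs, [8.0, 6.0, 4.0, 2.0]))  # Y falls as X rises
```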

138
00:10:45,150 --> 00:10:49,030
So here is a summary of all the different concepts we have gone over in this video.

139
00:10:49,080 --> 00:10:53,970
We've talked about the joint probability density function and how to work out the probability from it.

140
00:10:55,140 --> 00:11:00,270
We've described how to work out the expected value equation for when we have a joint probability density

141
00:11:00,270 --> 00:11:00,660
function.

142
00:11:01,440 --> 00:11:06,780
We looked at the independence condition for a joint density function, just being the product of the two

143
00:11:06,810 --> 00:11:08,310
individual density functions.

144
00:11:08,850 --> 00:11:11,310
We've looked at the concept of covariance.

145
00:11:11,580 --> 00:11:12,300
That is, how much

146
00:11:12,300 --> 00:11:14,840
one parameter is correlated with another parameter.

147
00:11:15,600 --> 00:11:21,210
We've worked out how to get the marginal density functions from our joint density function here, and

148
00:11:21,210 --> 00:11:21,740
we've gone over

149
00:11:21,840 --> 00:11:27,810
a few independence expectation identities that will come in handy every now and again:

150
00:11:27,810 --> 00:11:33,870
the expected value of the product of independent random variables is just

151
00:11:33,870 --> 00:11:34,650
going to be the expected

152
00:11:34,650 --> 00:11:39,720
values multiplied together, and likewise, for the sum, it is just going to be the expected values added

153
00:11:39,720 --> 00:11:40,170
together.

154
00:11:41,870 --> 00:11:47,360
Now that we've been talking about multiple random variables happening at the same time, we can generalize

155
00:11:47,360 --> 00:11:49,970
the single random variable into a vector form.

156
00:11:50,180 --> 00:11:57,290
So now we have a random vector, and basically a random vector is just a vector where each element inside

157
00:11:57,290 --> 00:11:59,480
the vector is its own random variable.

158
00:11:59,870 --> 00:12:07,550
So, for example, here we have a random vector X with random variables X1, X2, all the way up to Xn. Or

159
00:12:07,550 --> 00:12:14,400
we have this random vector Y here, which has random variables Y1, Y2, all the way down to Ym.

160
00:12:14,520 --> 00:12:20,360
So this is just a mathematical way of writing groups of random variables together in vector form.

161
00:12:22,030 --> 00:12:27,820
Now, the mean of a random vector can be calculated in the same way as for a single random variable,

162
00:12:27,820 --> 00:12:30,130
but it's done on a per-element basis.

163
00:12:30,430 --> 00:12:37,090
So if we have a random vector X here, the mean vector, or the expected value of this random vector X,

164
00:12:37,360 --> 00:12:42,010
is simply just the expected value of each of the different random variables individually.

165
00:12:44,230 --> 00:12:50,440
We can also work out the covariance. The covariance between two random vectors can be calculated,

166
00:12:50,440 --> 00:12:52,820
which forms a covariance matrix.

167
00:12:52,840 --> 00:12:55,370
So instead of a single value, it is now a matrix.

168
00:12:55,780 --> 00:13:03,190
So let's say we have random vectors X and Y, where we have the random variables X1 all the

169
00:13:03,190 --> 00:13:10,600
way to Xn and Y1 all the way to Ym. We can work out the covariance matrix using the expected

170
00:13:10,600 --> 00:13:18,850
value of our X random vector minus the X mean, multiplied by the Y random vector minus the Y mean,

171
00:13:18,850 --> 00:13:19,810
transposed.

172
00:13:20,350 --> 00:13:23,950
And again, we can simplify this down into this equation here.

173
00:13:24,430 --> 00:13:28,770
So when we do this operation here, we end up with a covariance matrix.

174
00:13:29,200 --> 00:13:35,650
So each element inside this covariance matrix is the covariance of

175
00:13:35,650 --> 00:13:39,520
one of those random variables with respect to a different random variable.

176
00:13:39,970 --> 00:13:44,170
So this term here is the covariance of random variable X1 with Y1.

177
00:13:44,530 --> 00:13:47,960
Here is the covariance of X1 with Y2.

178
00:13:48,370 --> 00:13:55,660
Likewise across for X1 all the way up to Ym, and down the matrix it is similar, all the way down

179
00:13:55,660 --> 00:13:57,430
to Xn and Ym.

180
00:14:00,020 --> 00:14:06,440
Since each element in the random vector X is itself a random variable, it is possible to calculate

181
00:14:06,440 --> 00:14:10,770
the covariance of each element in the vector with respect to each other element in the vector.

182
00:14:11,150 --> 00:14:14,170
And when we do this, we get the auto-covariance matrix.

183
00:14:14,180 --> 00:14:18,140
So this is just the covariance matrix of the random vector with itself.

184
00:14:18,620 --> 00:14:25,010
So here we are working out the expected value of our X random vector minus its mean, multiplied

185
00:14:25,010 --> 00:14:26,610
by the same thing transposed.

186
00:14:26,630 --> 00:14:30,830
And this gives us the covariance matrix of our random variable vector X.

187
00:14:31,340 --> 00:14:36,200
And if you have a look at what's contained inside this covariance matrix, it holds the variances and

188
00:14:36,200 --> 00:14:39,350
cross-covariances, which describe the probability density function.

189
00:14:39,860 --> 00:14:42,730
So on the diagonal here we have the variances.

190
00:14:42,740 --> 00:14:48,060
So sigma squared for X1 all the way down to sigma squared for Xn.

191
00:14:48,110 --> 00:14:53,150
So these are the variances of the different random variables, and our off-diagonal terms

192
00:14:53,150 --> 00:14:55,790
here are the cross-covariances.

193
00:14:56,070 --> 00:15:01,700
So these terms tell us how correlated the different variables are with respect to each other.

194
00:15:02,060 --> 00:15:07,940
So here we have the cross covariance between random variable X1 and random variable x2.

195
00:15:08,750 --> 00:15:11,300
And down here you can see that we have a mirror image.

196
00:15:11,300 --> 00:15:18,560
We have the cross-covariance of random variable X2 with respect to X1. The off-diagonal terms are

197
00:15:18,560 --> 00:15:21,090
going to be symmetrical, so they are replicated.
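As a rough sketch, the covariance matrix of a random vector can be estimated from samples like this; the function name and the data are made up for illustration.
```python
def covariance_matrix(samples):
    # samples: list of equal-length vectors, each one draw of the random vector X.
    count = len(samples)
    dim = len(samples[0])
    mean = [sum(s[i] for s in samples) / count for i in range(dim)]
    # C[i][j] = E[(X_i - mean_i) * (X_j - mean_j)], estimated by averaging over the samples.
    return [[sum((s[i] - mean[i]) * (s[j] - mean[j]) for s in samples) / count
             for j in range(dim)] for i in range(dim)]
samples = [[1.0, 2.0], [2.0, 4.0], [3.0, 6.0]]  # made-up data for illustration
c = covariance_matrix(samples)
print(c)  # variances on the diagonal, matching cross-covariances off it
```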

198
00:15:22,040 --> 00:15:27,560
Now, if we look at the properties of the covariance matrix, or the auto-covariance matrix, the covariance

199
00:15:27,560 --> 00:15:29,020
matrix is symmetrical.

200
00:15:29,030 --> 00:15:35,330
So this is basically because our cross-covariance for i, j is the same as the cross-covariance

201
00:15:35,330 --> 00:15:38,070
for j, i, because it's the same pair of random variables.

202
00:15:38,510 --> 00:15:45,200
This means that overall our covariance matrix C_X is equal to C_X transposed, because

203
00:15:45,200 --> 00:15:50,620
if we transpose this matrix, the diagonal stays the same and the off-diagonal terms flip around.

204
00:15:50,630 --> 00:15:52,060
They become the mirror image of each other.

205
00:15:52,430 --> 00:15:55,370
But we know that this term here is equal to this term here.

206
00:15:55,730 --> 00:15:58,080
Likewise, this term here is equal to this term here.

207
00:15:58,100 --> 00:16:01,910
So if we flip around the diagonal, we end up with the same matrix.

208
00:16:03,170 --> 00:16:09,040
And we also know that our covariance matrix is square because it is an n-by-n matrix.

209
00:16:09,800 --> 00:16:14,630
Now, for a covariance matrix to be valid, it always has to be positive semi-definite.

210
00:16:14,960 --> 00:16:21,920
So this just means that if we multiply our covariance matrix on both sides by any vector z, as z transpose times C_X times

211
00:16:21,920 --> 00:16:27,000
z, the result is always going to be greater than or equal to zero.

212
00:16:27,140 --> 00:16:30,060
And this means that the matrix is positive semi-definite.

213
00:16:30,440 --> 00:16:32,210
So this is important later on.

214
00:16:32,390 --> 00:16:36,770
So any probability distribution that has a covariance matrix that's positive semi-definite is

215
00:16:36,770 --> 00:16:37,250
valid.

216
00:16:37,460 --> 00:16:42,650
If it's not positive semi-definite or it's not symmetrical, it's not a valid covariance matrix for

217
00:16:42,650 --> 00:16:43,970
a probability density function.
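A minimal sketch of that validity check, restricted to the 2-by-2 case so it needs no linear algebra library; the function name and tolerance are assumptions for illustration.
```python
def is_valid_covariance_2x2(c, tol=1e-12):
    # Symmetry: the two off-diagonal terms must match.
    if abs(c[0][1] - c[1][0]) > tol:
        return False
    # For a symmetric 2x2 matrix, positive semi-definite is equivalent to
    # non-negative diagonal entries and a non-negative determinant.
    det = c[0][0] * c[1][1] - c[0][1] * c[1][0]
    return c[0][0] >= -tol and c[1][1] >= -tol and det >= -tol
print(is_valid_covariance_2x2([[2.0, 1.0], [1.0, 2.0]]))  # symmetric, determinant 3
print(is_valid_covariance_2x2([[1.0, 2.0], [2.0, 1.0]]))  # symmetric, determinant -3
```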
