1
00:00:00,090 --> 00:00:05,580
We've talked about the probability distributions for both discrete random variables and continuous random

2
00:00:05,580 --> 00:00:06,360
variables.

3
00:00:06,390 --> 00:00:12,240
Now we want to drill down a little further into a specific type of discrete random variable, which

4
00:00:12,240 --> 00:00:14,310
is the binomial random variable.

5
00:00:14,670 --> 00:00:20,250
So a binomial random variable for now, we'll just call it the discrete random variable x, but it's

6
00:00:20,250 --> 00:00:25,370
a special type of discrete random variable that meets these four criteria.

7
00:00:25,380 --> 00:00:32,790
We often indicate the binomial random variable like this with these parameters n and p, so we might

8
00:00:32,790 --> 00:00:37,980
write X(n, p) or B(n, p) for binomial.

9
00:00:37,980 --> 00:00:39,840
We'll see it this way as well.

10
00:00:39,840 --> 00:00:46,230
But the binomial random variable is always modeling what we call the number of successes in some sequence

11
00:00:46,230 --> 00:00:48,420
of n independent experiments.

12
00:00:48,420 --> 00:00:51,600
That's where that first criterion comes in: independent trials.

13
00:00:51,600 --> 00:00:54,540
So let's take the simplest example of flipping a coin.

14
00:00:54,540 --> 00:01:01,470
When we flip a coin multiple times and we call each coin flip one trial, those are independent trials

15
00:01:01,470 --> 00:01:07,470
because each coin flip is not affected by any of the other coin flips. Whether we flip heads or tails

16
00:01:07,470 --> 00:01:12,210
on one flip is totally independent of whether we flip heads or tails on another flip.

17
00:01:12,210 --> 00:01:13,860
So we have independent trials.

18
00:01:13,860 --> 00:01:18,390
We define the outcome of each trial as a success or a failure.

19
00:01:18,390 --> 00:01:24,390
And these ideas here of success or failure don't really have anything to do with indicating some measurement

20
00:01:24,390 --> 00:01:26,130
of good or bad.

21
00:01:26,130 --> 00:01:31,950
By success, we just mean the outcome that we're investigating or the outcome we're looking for.

22
00:01:31,950 --> 00:01:36,990
So if we're looking at the number of times that we flip heads, we would just define flipping heads

23
00:01:36,990 --> 00:01:40,200
as a success and anything else as a failure.

24
00:01:40,200 --> 00:01:43,680
In this case, the only other possible outcome is flipping tails.

25
00:01:43,680 --> 00:01:45,510
So flipping tails would be a failure.

26
00:01:45,510 --> 00:01:47,250
Flipping heads would be a success.

27
00:01:47,250 --> 00:01:52,830
But if we're investigating the number of tails that we flip instead, then we would simply define flipping

28
00:01:52,860 --> 00:01:57,750
tails as the success and everything else, in this case flipping heads as the failure.

29
00:01:57,750 --> 00:02:02,640
So we define the outcome we want as success and everything else as a failure.

30
00:02:02,640 --> 00:02:08,430
We have to be able to categorize outcomes in that way, which is partly why this can only be a discrete

31
00:02:08,430 --> 00:02:09,360
random variable.

32
00:02:09,360 --> 00:02:15,540
In order to model this situation with a binomial random variable, we also have to have a fixed number

33
00:02:15,540 --> 00:02:21,210
of trials, so we can't use a binomial random variable to model a situation where we want to look at

34
00:02:21,210 --> 00:02:22,020
probability

35
00:02:22,020 --> 00:02:27,990
when we flip a coin an infinite number of times. We have to specify we're flipping the coin ten times

36
00:02:27,990 --> 00:02:29,760
or we're flipping the coin 50 times.

37
00:02:29,760 --> 00:02:32,160
The number of trials has to be fixed.

38
00:02:32,160 --> 00:02:36,180
We have to predetermine how many times we're going to run this trial.

39
00:02:36,180 --> 00:02:39,090
And that fixed number of trials is this number

40
00:02:39,090 --> 00:02:39,960
n here.

41
00:02:39,960 --> 00:02:42,180
So this is a fixed number of trials,

42
00:02:42,180 --> 00:02:48,660
n. And then the probability of that success that we defined earlier has to be constant.

43
00:02:48,660 --> 00:02:54,570
So with the coin flip for every trial, for every time we flip the coin, the probability of success

44
00:02:54,570 --> 00:03:02,370
is always one half or one over two or 50% or 0.5, and that stays constant because each coin flip is

45
00:03:02,370 --> 00:03:03,060
independent.

46
00:03:03,060 --> 00:03:07,290
The probability of success can't change from trial to trial.

47
00:03:07,290 --> 00:03:13,440
If we want to model a situation with a binomial random variable, it has to stay constant and that probability

48
00:03:13,440 --> 00:03:15,720
of success is this value,

49
00:03:15,750 --> 00:03:23,010
p. So when we see a binomial random variable expressed as B(n, p) or X(n, p), these values indicate

50
00:03:23,010 --> 00:03:28,320
the number of trials that we're performing and then the probability of success on any particular trial.

51
00:03:28,320 --> 00:03:35,580
Remember too, that total probability always sums to one or 100%, which means that if we define the

52
00:03:35,580 --> 00:03:42,810
probability of success as P, then the probability of failure has to be one minus P, all of the other

53
00:03:42,810 --> 00:03:45,840
probability other than this probability of success.

54
00:03:45,840 --> 00:03:53,910
And so we often define the probability of failure as one minus p, and sometimes we write that as q. So q is one minus p.

55
00:03:53,940 --> 00:03:55,320
This is the probability of success,

56
00:03:55,350 --> 00:03:59,580
p, and the probability of failure is one minus p, sometimes written as

57
00:03:59,580 --> 00:04:06,330
q. So these are the criteria that we have to meet in order to model our situation with a binomial random

58
00:04:06,330 --> 00:04:06,960
variable.
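
As a rough sketch of those criteria in code, here is one way we might simulate a binomial random variable in Python. The helper name count_successes and the fixed seed are assumptions made for illustration; the point is only that we fix n in advance, run independent trials, and keep p the same on every trial.

```python
import random

def count_successes(n, p, rng):
    """Run n independent trials, each succeeding with probability p, and count the successes."""
    successes = 0
    for _ in range(n):
        # rng.random() is uniform on [0, 1), so this comparison is True with probability p
        if rng.random() < p:
            successes += 1
    return successes

rng = random.Random(0)  # fixed seed (an arbitrary choice) so the run is repeatable
heads_in_20_flips = count_successes(20, 0.5, rng)   # success = flipping heads
twos_in_10_rolls = count_successes(10, 1 / 6, rng)  # success = rolling a two
print(heads_in_20_flips, twos_in_10_rolls)
```

Each call returns one observed value of the random variable, which is always a whole number between 0 and n.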

59
00:04:06,960 --> 00:04:12,480
Keep in mind that we don't just have to limit ourselves either to a situation where there are only two

60
00:04:12,480 --> 00:04:16,140
outcomes, like with a coin flip, where there are only two outcomes: heads or tails.

61
00:04:16,140 --> 00:04:18,540
We can have more than two outcomes.

62
00:04:18,540 --> 00:04:25,080
So for instance, if we are rolling a six sided die and we want to define success as rolling a two,

63
00:04:25,110 --> 00:04:27,540
we would just define success as rolling a two.

64
00:04:27,570 --> 00:04:31,380
And then failure is rolling a one, three, four, five or six.

65
00:04:31,380 --> 00:04:33,870
It's rolling anything other than a two.

66
00:04:33,900 --> 00:04:39,540
Our probability of success would of course be one over six because the probability of rolling a two

67
00:04:39,570 --> 00:04:43,020
is one over six, and so the probability of success would be one over six.

68
00:04:43,020 --> 00:04:46,050
The probability of failure would be five over six.

69
00:04:46,050 --> 00:04:51,420
So not only can we have more than two possible outcomes, like with a die roll, but that probability

70
00:04:51,420 --> 00:04:54,330
of success doesn't have to be exactly one half.

71
00:04:54,330 --> 00:04:59,310
With the coin flip, we have just two equally likely outcomes, heads or tails,

72
00:04:59,310 --> 00:04:59,970
and

73
00:05:00,000 --> 00:05:03,150
the probability that either of those shows up is equivalent.

74
00:05:03,150 --> 00:05:09,180
It's one half for heads and one half for tails, but we can have more than two outcomes, and the probabilities

75
00:05:09,180 --> 00:05:11,730
of success and failure don't have to be equivalent.

76
00:05:11,730 --> 00:05:17,670
So we could use a binomial random variable for a coin flip because each coin flip is independent of

77
00:05:17,670 --> 00:05:18,840
all other coin flips.

78
00:05:18,840 --> 00:05:22,410
We can define the outcome of each trial as a success or a failure.

79
00:05:22,410 --> 00:05:27,570
If we're looking at how many times we flip heads, then we would define heads as a success and tails

80
00:05:27,570 --> 00:05:28,500
as a failure.

81
00:05:28,500 --> 00:05:30,330
We can define a fixed number of trials.

82
00:05:30,330 --> 00:05:35,370
Let's say we're flipping a coin 20 times, and the probability of success remains constant because

83
00:05:35,370 --> 00:05:40,110
the probability of flipping heads remains one half for every coin flip.

84
00:05:40,290 --> 00:05:47,700
But we could also use this to model a die roll if, let's say, we define success as rolling a two, because

85
00:05:47,700 --> 00:05:53,700
every die roll is independent. The roll of the die won't be affected by any of the previous rolls.

86
00:05:53,700 --> 00:05:56,250
We can define our success specifically.

87
00:05:56,250 --> 00:05:58,590
So let's say we want to roll a two.

88
00:05:58,590 --> 00:06:03,210
Then we would define rolling a two as a success and rolling anything else as a failure.

89
00:06:03,210 --> 00:06:07,200
So rolling a one, a three, a four, a five, or a six would be a failure.

90
00:06:07,410 --> 00:06:09,180
We can define a fixed number of trials.

91
00:06:09,180 --> 00:06:14,520
So let's say we roll the die ten times and the probability of success remains constant.

92
00:06:14,520 --> 00:06:18,420
The probability of rolling a two always remains one over six.

93
00:06:18,420 --> 00:06:22,530
That probability remains constant for every trial, for every die roll.

94
00:06:22,560 --> 00:06:28,530
Contrast this with something that doesn't meet the criteria for a binomial random variable, like pulling

95
00:06:28,530 --> 00:06:32,850
cards from a deck of cards without replacing the cards.

96
00:06:32,850 --> 00:06:39,330
So if we're pulling cards from a standard 52 card deck and we're not replacing them, then right away

97
00:06:39,330 --> 00:06:45,240
we know that our trials are not independent because when we pull the first card, we start out with 52 cards

98
00:06:45,240 --> 00:06:46,050
in the deck.

99
00:06:46,050 --> 00:06:50,340
And so the probability of pulling any one card is going to be one over 52.

100
00:06:50,370 --> 00:06:55,590
If we don't replace that card and we go to pull a second card, then the probability of pulling any

101
00:06:55,590 --> 00:07:01,380
of the remaining cards from the remaining deck is going to be one over the 51 remaining cards.

102
00:07:01,380 --> 00:07:04,260
So right away, the trials are not independent.

103
00:07:04,260 --> 00:07:07,110
The probability of success is going to be changing.
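
To make that changing probability concrete, here is a small sketch using exact fractions. Defining success as pulling an ace is an assumption added for illustration; the lecture only talks about pulling any one card.

```python
from fractions import Fraction

deck_size = 52
aces = 4  # assumed definition of success: pulling an ace

# First pull: 4 aces among 52 cards.
p_first = Fraction(aces, deck_size)

# Second pull, if the first card was an ace and is not replaced: 3 aces among 51 cards.
p_second_after_ace = Fraction(aces - 1, deck_size - 1)

# Second pull, if the first card was not an ace: 4 aces among 51 cards.
p_second_after_other = Fraction(aces, deck_size - 1)

print(p_first, p_second_after_ace, p_second_after_other)  # 1/13 1/17 4/51
```

The chance of success on the second pull depends on what happened on the first pull, so neither the independence criterion nor the constant-probability criterion holds.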

104
00:07:07,110 --> 00:07:13,650
We could not model that kind of scenario with a binomial random variable. Now that we know when we can

105
00:07:13,650 --> 00:07:20,610
use a binomial random variable, let's talk about the probability associated with this kind of variable.

106
00:07:20,610 --> 00:07:24,720
What we're interested in here is given by this formula.

107
00:07:24,750 --> 00:07:29,280
This is the probability of K successes in N attempts.

108
00:07:29,280 --> 00:07:31,860
So remember we said that we had a fixed number of trials,

109
00:07:31,860 --> 00:07:38,580
n. What we want to be able to calculate is the probability that we come up with K number of successes

110
00:07:38,580 --> 00:07:41,250
in N number of fixed trials.

111
00:07:41,250 --> 00:07:49,050
So if we're flipping a coin 20 times and we want to know the probability that we flip heads 13 times,

112
00:07:49,050 --> 00:07:54,180
then we would calculate here the probability of K equals 13 in N equals 20.

113
00:07:54,300 --> 00:07:58,740
And the way that we calculate that is with this binomial coefficient here.

114
00:07:58,740 --> 00:08:03,390
We talked about this earlier when we looked at permutations and combinations.

115
00:08:03,390 --> 00:08:10,050
So the formula, again as a reminder for this binomial coefficient is K successes in n attempts.

116
00:08:10,050 --> 00:08:16,950
It's the combination n choose k, and we calculate it as n factorial divided by k factorial times

117
00:08:16,950 --> 00:08:19,260
the quantity n minus k, factorial.
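
Written out, the binomial coefficient described here would look like this. The second line is a small worked case added for illustration, not a value taken from the slide.

```latex
\binom{n}{k} = \frac{n!}{k!\,(n-k)!}
\qquad\text{for example}\qquad
\binom{5}{1} = \frac{5!}{1!\,4!} = \frac{120}{1 \cdot 24} = 5
```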

118
00:08:19,290 --> 00:08:24,000
This is just a combination, so we have to calculate this value first.

119
00:08:24,000 --> 00:08:31,320
Plug it into this formula here for the binomial coefficient, and then we multiply that by p, the probability

120
00:08:31,320 --> 00:08:35,280
of success for our binomial random variable, raised to the power,

121
00:08:35,280 --> 00:08:40,770
this is the exponent here, of k, the number of successes that we're looking for, and then we multiply

122
00:08:40,770 --> 00:08:48,360
that by this one minus p, which, remember, we looked at here, is the probability of failure. One minus p,

123
00:08:48,360 --> 00:08:50,820
we also call it q. It's the probability of failure,

124
00:08:50,820 --> 00:08:52,950
where p is the probability of success.

125
00:08:53,130 --> 00:09:00,150
So one minus P is that failure probability and we raise that to the power of N minus K, where n again

126
00:09:00,150 --> 00:09:04,140
is the number of trials and K is the number of successes we're looking for.
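
Putting those pieces together, the formula being described can be written as follows. The coin-flip numbers on the second line (13 heads in 20 fair flips) are a worked illustration, with the decimal value rounded.

```latex
P(X = k) = \binom{n}{k}\, p^{k}\, (1-p)^{\,n-k}, \qquad k = 0, 1, \dots, n

P(X = 13) = \binom{20}{13} (0.5)^{13} (0.5)^{7} = \frac{77520}{1048576} \approx 0.0739
```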

127
00:09:04,140 --> 00:09:10,560
So let's just say to take an example that we're running a rideshare company, we're looking at a particular

128
00:09:10,560 --> 00:09:18,090
city where we operate our ridesharing company and we want to define success as one of our drivers arriving

129
00:09:18,090 --> 00:09:23,100
within 10 minutes of the moment when the passenger has requested the ride.

130
00:09:23,100 --> 00:09:27,720
We're interested in that figure because we don't want our passengers waiting longer than 10 minutes

131
00:09:27,720 --> 00:09:31,200
from the moment when they request a ride through our app.

132
00:09:31,200 --> 00:09:35,760
So we're going to define each trial as one ride.

133
00:09:35,760 --> 00:09:41,670
We're going to define success as a driver arriving within 10 minutes and failure as it takes longer

134
00:09:41,670 --> 00:09:44,130
than 10 minutes for the driver to arrive.

135
00:09:44,130 --> 00:09:47,910
We're just going to look at five trials or five rides.

136
00:09:47,910 --> 00:09:52,980
So we're going to randomly select five rides from our database and we're going to say that from the

137
00:09:52,980 --> 00:09:58,710
data we already have about the rides we've been operating in this city, that the probability of success

138
00:09:58,710 --> 00:09:59,700
for any particular

139
00:09:59,930 --> 00:10:01,700
ride is 75%.

140
00:10:01,700 --> 00:10:09,200
So in the past, 75% of passengers who request a ride have their driver arrive within 10 minutes of

141
00:10:09,200 --> 00:10:10,100
the request.

142
00:10:10,100 --> 00:10:12,640
So p is going to be 0.75.

143
00:10:12,650 --> 00:10:18,650
So in that circumstance we meet these conditions for a binomial random variable, so we can model this situation

144
00:10:18,650 --> 00:10:20,240
with a binomial random variable.

145
00:10:20,330 --> 00:10:26,630
And now let's say we want to look at the probability of finding K successes in N attempts.

146
00:10:26,630 --> 00:10:28,490
So we said n was going to be five.

147
00:10:28,490 --> 00:10:31,400
We're going to choose five rides randomly from our database.

148
00:10:31,400 --> 00:10:37,580
What's the probability that zero of those five have the driver arrive in 10 minutes or less?

149
00:10:37,580 --> 00:10:41,750
What's the probability that one in five arrives in 10 minutes or less?

150
00:10:41,750 --> 00:10:47,120
So in other words, this probability that we're going to be calculating allows us to understand, basically,

151
00:10:47,120 --> 00:10:53,540
if we choose five rides at random, what's the probability that the driver will arrive within 10 minutes

152
00:10:53,540 --> 00:10:55,220
for zero of those rides?

153
00:10:55,220 --> 00:10:59,000
One of those rides, two of those rides all the way up to five of those rides.

154
00:10:59,000 --> 00:11:02,480
So here's what some of those calculations would look like.

155
00:11:02,480 --> 00:11:08,750
The probability that zero drivers arrive within 10 minutes if we pull a random sample of five rides

156
00:11:08,750 --> 00:11:10,340
would be calculated this way.

157
00:11:10,340 --> 00:11:15,320
We have this binomial coefficient n choose k, or five choose zero.

158
00:11:15,320 --> 00:11:19,550
We're selecting five rides and we want the probability that we have zero successes.

159
00:11:19,550 --> 00:11:24,020
Zero drivers arrive within 10 minutes. And then we take our probability of success,

160
00:11:24,020 --> 00:11:26,690
p, which we said was 0.75.

161
00:11:26,690 --> 00:11:30,230
There's a 75% chance that a driver arrives within 10 minutes.

162
00:11:30,230 --> 00:11:37,790
So we raise 0.75 to the power of zero because we're looking here for zero successes, K successes.

163
00:11:37,790 --> 00:11:45,560
And then we take our probability of failure, which is just one minus p. So 1 - 0.75 is 0.25, and we

164
00:11:45,560 --> 00:11:47,570
raise that to the n minus k.

165
00:11:47,570 --> 00:11:50,480
In this case, that's 5 - 0, or 5.

166
00:11:50,480 --> 00:11:56,390
And we can use a calculator or software to calculate an approximate decimal value.

167
00:11:56,420 --> 00:12:00,320
Here we've rounded all of our decimal values to six decimal places.

168
00:12:00,560 --> 00:12:05,510
Then this second line here calculates the probability that, of five rides,

169
00:12:05,510 --> 00:12:06,410
exactly

170
00:12:06,410 --> 00:12:12,140
one driver arrives within 10 minutes, and we could continue calculating the probability that two of the

171
00:12:12,140 --> 00:12:13,490
five arrive within 10 minutes.

172
00:12:13,490 --> 00:12:19,670
Three of the five, four of the five all the way down to the last option, which is that of five rides,

173
00:12:19,670 --> 00:12:24,740
all five drivers arrive within 10 minutes of the ride request.

174
00:12:24,740 --> 00:12:27,500
So we could build out that table of probability.

175
00:12:27,500 --> 00:12:31,220
We could get a value for each of those number of successes.

176
00:12:31,220 --> 00:12:38,960
k: 0, 1, 2, 3, 4, 5. And then we can use this information to build out the probability distribution

177
00:12:38,960 --> 00:12:41,090
for a binomial random variable.
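
As a sketch of how we might build that table with software, assuming Python and a helper we are calling binomial_pmf (a name chosen here, not something from the lecture), we can apply the formula for each k with n = 5 and p = 0.75 and round to six decimal places:

```python
from math import comb

def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials with success probability p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

n, p = 5, 0.75  # five rides, 75% chance a driver arrives within 10 minutes
table = {k: round(binomial_pmf(k, n, p), 6) for k in range(n + 1)}
for k, prob in table.items():
    print(k, prob)
# 0 0.000977
# 1 0.014648
# 2 0.087891
# 3 0.263672
# 4 0.395508
# 5 0.237305
```

The six probabilities add up to 1, as the total probability always has to.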

178
00:12:41,180 --> 00:12:46,730
Remember, since that binomial random variable is a discrete random variable, it's a special kind of

179
00:12:46,730 --> 00:12:48,080
discrete random variable.

180
00:12:48,200 --> 00:12:55,160
Our probability distribution will be a probability mass function that looks like this.

181
00:12:55,160 --> 00:13:00,200
Here is the probability distribution for this specific scenario with the ride sharing company.

182
00:13:00,230 --> 00:13:06,050
This should remind us of the general form of a probability distribution, a probability mass function

183
00:13:06,050 --> 00:13:09,530
for a discrete random variable. Along the horizontal axis,

184
00:13:09,530 --> 00:13:17,000
here on the bottom, we place all of the discrete, countable possible outcomes for our variable.

185
00:13:17,000 --> 00:13:19,580
These are the numbers of our possible successes.

186
00:13:19,580 --> 00:13:26,150
We can have zero successes, one success, two successes, all the way up to five out of five successes.

187
00:13:26,150 --> 00:13:35,780
We see these values here coming from these values of k: 0, 1, 2, 3, 4, all the way up to 5.

188
00:13:35,780 --> 00:13:41,750
So we put those values along the horizontal axis and then along the vertical axis for our probability

189
00:13:41,750 --> 00:13:45,770
mass function, for our special type of discrete random variable,

190
00:13:45,800 --> 00:13:51,650
we plot the probability of each of these particular values along the horizontal axis.

191
00:13:51,650 --> 00:13:57,920
So the probability that we got zero successes in five attempts was almost zero.

192
00:13:57,920 --> 00:14:00,890
We found 0.000977.

193
00:14:00,890 --> 00:14:06,290
That probability is so small that our little bar of probability doesn't even appear here.

194
00:14:06,320 --> 00:14:12,710
The probability that we get one success in five attempts is just over 1%.

195
00:14:12,710 --> 00:14:22,700
And so we see here if this is 10% and this is 5% along our graph, we see here our 1% value, about

196
00:14:22,700 --> 00:14:29,690
a 1% chance that exactly one of our five drivers arrives in 10 minutes or less and we could continue

197
00:14:29,690 --> 00:14:35,930
plotting all of these probability values all the way up to the probability that all five of the five

198
00:14:35,930 --> 00:14:38,390
drivers arrive in 10 minutes or less.

199
00:14:38,390 --> 00:14:45,560
That probability for all five drivers arriving for five successes is just under 24%.

200
00:14:45,560 --> 00:14:47,180
And we see that value here.

201
00:14:47,180 --> 00:14:49,370
So we have here 20%.

202
00:14:49,370 --> 00:14:57,470
And then this line here representing 25% or 0.25, and we see our just less than 24% probability right

203
00:14:57,470 --> 00:14:58,910
under that line there.

204
00:14:58,910 --> 00:14:59,600
So this

205
00:14:59,690 --> 00:15:04,590
then becomes the probability distribution for this specific binomial random variable.

206
00:15:04,610 --> 00:15:11,000
The probability distribution will look different for every binomial random variable depending on the

207
00:15:11,000 --> 00:15:12,860
values of N and P.

208
00:15:12,890 --> 00:15:19,910
So in this scenario we chose N equals five, but if we chose instead, N equals 20 to run 20 trials

209
00:15:19,910 --> 00:15:26,150
to look at 20 different rides that take place in our app, then instead of just the values zero through

210
00:15:26,150 --> 00:15:30,080
five along the horizontal axis, we would have the values zero through 20.

211
00:15:30,080 --> 00:15:36,080
And then the exact distribution here depends on the fact that our historical data told us that we had

212
00:15:36,080 --> 00:15:40,460
a 75% chance of success, but maybe in a different city.

213
00:15:40,490 --> 00:15:47,990
Our chance of success or chance that a driver arrives within 10 minutes is 60% or 90% or only 20%.

214
00:15:47,990 --> 00:15:53,000
Whatever value we have for that chance of success, of course, that is going to affect the shape of

215
00:15:53,000 --> 00:15:53,870
our distribution.

216
00:15:53,870 --> 00:16:02,510
So this distribution is unique to n equals five trials with a P equals 0.75 chance of success.
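
To see how the shape moves as p changes, here is a rough sketch that finds the most likely number of successes for the success rates mentioned above, keeping n = 5. The helper names are assumptions made for this example.

```python
from math import comb

def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n trials with success probability p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

def most_likely_successes(n, p):
    """Return the k in 0..n with the largest probability."""
    return max(range(n + 1), key=lambda k: binomial_pmf(k, n, p))

n = 5
peaks = {p: most_likely_successes(n, p) for p in (0.2, 0.6, 0.75, 0.9)}
print(peaks)  # {0.2: 1, 0.6: 3, 0.75: 4, 0.9: 5}
```

The tallest bar shifts to the right as the chance of success grows, which is the change in shape being described.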

217
00:16:02,510 --> 00:16:05,660
But that is how we build the probability distribution.

218
00:16:05,810 --> 00:16:09,830
And of course, once we build it, it should roughly make sense to us.

219
00:16:09,830 --> 00:16:18,020
If we run five trials or five experiments and our expected probability of success is 75%, well, what's

220
00:16:18,020 --> 00:16:20,000
75% of five?

221
00:16:20,000 --> 00:16:21,800
It's 3.75.

222
00:16:21,800 --> 00:16:28,190
And so roughly, we would expect that most of our mass here, in our probability, mass function in

223
00:16:28,190 --> 00:16:32,540
this distribution would be clustered around the 3.75 mark.

224
00:16:32,540 --> 00:16:34,730
And we see that that's roughly true.

225
00:16:34,760 --> 00:16:40,490
Most of the weight of our distribution is clustered around these three, four or five values.

226
00:16:40,490 --> 00:16:46,070
More specifically, even the four value, which of course is the value closest to 3.75.

227
00:16:46,070 --> 00:16:53,420
So very roughly with some general intuition, this distribution looks like it could model our scenario

228
00:16:53,420 --> 00:16:57,740
where we have five trials and a 75% chance of success.

229
00:16:57,770 --> 00:17:03,770
That being said, let's get a little bit more specific and look at the properties of a binomial random

230
00:17:03,770 --> 00:17:10,730
variable B(n, p). And let's say here that yes, in fact, the calculation we just did is the calculation

231
00:17:10,730 --> 00:17:13,760
that we use for the mean of a binomial random variable.

232
00:17:13,760 --> 00:17:19,910
So once we know that we have a binomial random variable, it meets these conditions, we're going to

233
00:17:19,910 --> 00:17:23,900
use a binomial random variable to model our scenario.

234
00:17:23,930 --> 00:17:30,710
Then we unlock all of these formulas that apply to the binomial random variable or the binomial distribution.

235
00:17:30,710 --> 00:17:34,700
Here the mean is always going to be n times p.

236
00:17:34,700 --> 00:17:42,440
In our case, five times 0.75 or 3.75.

237
00:17:42,440 --> 00:17:46,280
So our mean our expected value is 3.75.

238
00:17:46,280 --> 00:17:52,910
So if we randomly select five rides, then we expect roughly 3.75 of them.

239
00:17:52,910 --> 00:17:59,870
In terms of sort of this long term average, we expect 3.75 of those five rides to have the driver arrive

240
00:17:59,870 --> 00:18:01,130
within 10 minutes.

241
00:18:01,130 --> 00:18:03,260
Of course, this is a discrete random variable.

242
00:18:03,260 --> 00:18:08,510
So out of five rides we'll never actually see exactly 3.75 rides meet

243
00:18:08,510 --> 00:18:13,370
that criterion. We'll always see either three or four or five or two or one or zero.

244
00:18:13,400 --> 00:18:17,450
We'll never see exactly 3.75, but that gives us a long term average.

245
00:18:17,450 --> 00:18:20,870
It gives us our expected value or our mean.

246
00:18:20,870 --> 00:18:26,270
So we can always find that as n times p in this case five times 0.75.

247
00:18:26,270 --> 00:18:30,710
And then the variance here is given by n times p times one minus p.

248
00:18:30,770 --> 00:18:36,890
Remember, sometimes, too, we express this one minus p probability of failure value as q.

249
00:18:36,920 --> 00:18:39,740
So we also see this written sometimes as npq.

250
00:18:39,740 --> 00:18:53,270
In our case, that's 5 times 0.75 times 0.25, or 0.9375.

251
00:18:53,270 --> 00:18:58,760
That's going to be the variance of this binomial distribution for the binomial random variable.

252
00:18:58,760 --> 00:19:02,450
And then standard deviation is always just the square root of variance.

253
00:19:02,450 --> 00:19:10,400
So for standard deviation, we would take the square root of 0.9375, and that's approximately equal

254
00:19:10,400 --> 00:19:14,900
to, if we round, 0.9682.

255
00:19:14,900 --> 00:19:20,120
And so now we have our mean, variance, and standard deviation for the binomial distribution.
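
Those three calculations are short enough to check in a few lines. This is a minimal sketch using the ride-share values n = 5 and p = 0.75; the variable names are our own.

```python
from math import sqrt

n, p = 5, 0.75              # five rides, 75% chance of success on each
mean = n * p                # n times p
variance = n * p * (1 - p)  # n times p times q, where q = 1 - p
std_dev = sqrt(variance)    # square root of the variance

print(mean, variance, round(std_dev, 4))  # 3.75 0.9375 0.9682
```

Only n and p are needed here; none of the individual probabilities has to be calculated first.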

256
00:19:20,120 --> 00:19:25,370
So to summarize, let's just say that we always have to start with these four conditions.

257
00:19:25,370 --> 00:19:31,100
We have to meet these four conditions in order for a binomial random variable to even be on the table.

258
00:19:31,100 --> 00:19:37,310
We can't model a situation as a binomial random variable unless we're meeting these four criteria.

259
00:19:37,310 --> 00:19:42,860
So we always have to go through and verify that each of these four criteria is holding.

260
00:19:43,040 --> 00:19:49,940
If we're meeting all of these criteria, then we can use a binomial random variable which then unlocks

261
00:19:49,940 --> 00:19:53,150
for us all of these probability calculations.

262
00:19:53,150 --> 00:19:59,570
We can specifically find the probability of K successes in N trials using this formula

263
00:19:59,600 --> 00:20:04,430
here, where this is the binomial coefficient that we calculate this way.

264
00:20:04,450 --> 00:20:10,480
We learned about this when we talked about permutations and combinations, so we can calculate the probability

265
00:20:10,480 --> 00:20:19,600
for every possible value of K, and K is always going to take on values between zero and N.

266
00:20:19,600 --> 00:20:27,010
So if n is our number of trials, K is always going to be given by the set of integers between zero

267
00:20:27,010 --> 00:20:32,740
and N, so k can be equal to zero and then it can be one, two, three, four, five, six, etc. all

268
00:20:32,740 --> 00:20:35,980
the way up to whatever our value of n is.

269
00:20:35,980 --> 00:20:42,400
So we can calculate probability for all of those different values of K, and then we can use those probability

270
00:20:42,400 --> 00:20:45,930
values to build the binomial distribution.

271
00:20:45,940 --> 00:20:50,950
This visual representation of the probability associated with our binomial random variable.

272
00:20:50,950 --> 00:20:56,680
And then regardless of whether or not we've already calculated all of this probability or built this

273
00:20:56,680 --> 00:21:01,930
binomial distribution, even if we don't have either of those things, all we need to find the mean,

274
00:21:01,930 --> 00:21:08,140
variance, and standard deviation of this distribution are the values of n and p, which we have from

275
00:21:08,140 --> 00:21:11,080
our criteria that we unpacked at the beginning.

276
00:21:11,080 --> 00:21:12,880
We know our fixed number of trials,

277
00:21:12,880 --> 00:21:14,920
n, and we know the probability of success,

278
00:21:14,920 --> 00:21:20,680
p. And those are the only two values we need to find mean, variance, and standard deviation, so we can go

279
00:21:20,680 --> 00:21:26,950
directly to calculating these values even before we've calculated all of this probability and used it

280
00:21:26,950 --> 00:21:29,980
to build this binomial distribution.

