1
00:00:05,430 --> 00:00:08,820
Welcome everyone to this section of the course on probability.

2
00:00:08,820 --> 00:00:12,630
We're going to be discussing rules, theorems and applications.

3
00:00:13,720 --> 00:00:20,590
Probability is the branch of mathematics that deals with how likely an event is to occur or how likely

4
00:00:20,590 --> 00:00:23,920
is it that a proposition or hypothesis is true?

5
00:00:24,130 --> 00:00:28,900
So this gives us two clear avenues to pursue with learning probability.

6
00:00:28,930 --> 00:00:35,710
We can try to analyze the probability of future events or try to analyze the probability of our own

7
00:00:35,710 --> 00:00:36,940
hypothesis.

8
00:00:37,180 --> 00:00:42,640
So in this section of the course, we're really focused on the first one: the probability of future events.

9
00:00:43,390 --> 00:00:50,800
So we're going to focus on this first aspect, such as what are the odds of so-and-so or Event X happening

10
00:00:50,800 --> 00:00:51,700
in the future?

11
00:00:51,880 --> 00:00:58,570
The actual idea of checking the probability of some hypothesis, such as an A/B test, is known as

12
00:00:58,570 --> 00:01:02,950
hypothesis testing and a later section in the course will focus on that.

13
00:01:02,950 --> 00:01:05,830
So probability is a very wide ranging topic.

14
00:01:05,830 --> 00:01:08,380
And for this section we're really focused on one idea:

15
00:01:08,380 --> 00:01:14,260
If you have an event that will occur in the future, what's the probability that it actually happens?

16
00:01:15,800 --> 00:01:20,420
Keep in mind that many of the ideas pertaining to calculating probability of future events are related

17
00:01:20,420 --> 00:01:21,670
to hypothesis testing.

18
00:01:21,680 --> 00:01:26,690
So a lot of what you learn here, you're going to be able to directly apply to hypothesis testing later

19
00:01:26,690 --> 00:01:27,110
on.

20
00:01:28,700 --> 00:01:34,490
Now, being able to make intelligent estimations of probability of future events occurring is a crucial

21
00:01:34,490 --> 00:01:37,010
element of a data-driven organization.

22
00:01:37,070 --> 00:01:41,810
And in this section of the course, we're going to be discovering a variety of tools available to us

23
00:01:41,960 --> 00:01:47,330
to understand different ways of calculating probabilities depending on the actual structure of the question

24
00:01:47,330 --> 00:01:48,020
at hand.

25
00:01:49,690 --> 00:01:53,830
Organizations often need to make decisions where the outcomes are uncertain.

26
00:01:54,040 --> 00:02:00,730
Using probability effectively can help mitigate risks and also help develop backup plans for low probability

27
00:02:00,730 --> 00:02:01,510
events.

28
00:02:03,400 --> 00:02:08,320
Keep in mind, often information needs to be reported to others in terms of probability.

29
00:02:08,650 --> 00:02:13,900
Imagine a hardware manufacturer needing to report the likelihood of defective devices.

30
00:02:15,290 --> 00:02:21,770
So you may want to have some sort of report, such as there is a 1% chance that a phone received by

31
00:02:21,770 --> 00:02:24,410
a customer will have a defective antenna.

32
00:02:24,530 --> 00:02:31,280
So often in data-driven organizations, you're thinking of things in terms of percent chances or probabilities

33
00:02:31,280 --> 00:02:32,150
or odds.

34
00:02:33,520 --> 00:02:39,490
Now there will be situations where you can't actually test every phone made, or there could be a situation

35
00:02:39,490 --> 00:02:45,070
where defective antennas only occur on a certain percentage of phones after a certain time period.

36
00:02:45,400 --> 00:02:48,460
Perhaps the defect rate could even increase with time.

37
00:02:48,490 --> 00:02:53,140
These are all different situations where we're going to have to apply different theories of probability

38
00:02:53,140 --> 00:02:53,860
to them.

39
00:02:55,190 --> 00:02:58,250
And keep in mind, this isn't just some sort of thought exercise.

40
00:02:58,280 --> 00:03:04,010
Many companies deal with such issues, and understanding probability is critical to the company's response.

41
00:03:04,040 --> 00:03:07,250
You can take a look at Antennagate with the iPhone 4.

42
00:03:07,280 --> 00:03:10,760
That's actually what the previous example was based on.

43
00:03:10,790 --> 00:03:11,360
Apple,

44
00:03:11,360 --> 00:03:16,080
the company, had issues with antennas in their version of the iPhone 4.

45
00:03:16,100 --> 00:03:21,710
So in order to understand what the best approach is, you need to understand the probability that a

46
00:03:21,710 --> 00:03:24,830
customer is actually going to receive a defective phone.

47
00:03:24,980 --> 00:03:31,450
If it's a large percentage of customers, then you may want to start an official recall program.

48
00:03:31,460 --> 00:03:37,490
If it's a small percentage of customers, you may want to do some sort of more targeted effort to help

49
00:03:37,490 --> 00:03:38,680
out those customers.

50
00:03:38,690 --> 00:03:43,160
And again, you have to think about this in terms of probability, since you're not going to know for

51
00:03:43,160 --> 00:03:48,860
100% certain whether every single phone received by every single customer is defective or not.

52
00:03:50,810 --> 00:03:55,780
And you also experience probabilistic thinking every day, such as in weather reports.

53
00:03:55,820 --> 00:03:58,340
You can see a weather report for Miami, Florida.

54
00:03:58,340 --> 00:04:03,390
And notice that precipitation itself is reported as a probability of happening.

55
00:04:03,410 --> 00:04:08,480
You don't know for sure if it's actually going to happen, but you do know the probability or odds of

56
00:04:08,480 --> 00:04:09,920
precipitation happening.

57
00:04:11,540 --> 00:04:17,990
Clearly we need tools and frameworks to help us understand how to report probabilistic information within

58
00:04:17,990 --> 00:04:19,130
an organization.

59
00:04:19,339 --> 00:04:25,610
We also need to understand the benefits of probabilistic thinking in general, since most outcomes are

60
00:04:25,610 --> 00:04:27,320
not going to be known for certain.

61
00:04:29,100 --> 00:04:34,470
So in this section of the course, we're first going to try to understand probability concepts such

62
00:04:34,470 --> 00:04:40,290
as the law of large numbers, the addition rule, unions and intersections, as well as Venn diagrams

63
00:04:40,290 --> 00:04:44,400
and how we can use Venn diagrams to understand probabilistic statements.

64
00:04:45,680 --> 00:04:51,620
Then we're also going to discuss concepts about likelihood, things like Bayes' theorem, discrete probability

65
00:04:51,620 --> 00:04:52,970
and random variables.

66
00:04:54,360 --> 00:04:59,880
Before we dive into a discussion of probability concepts and mathematics, let's quickly cover some

67
00:04:59,880 --> 00:05:03,360
notation conventions that are commonly used in probability.

68
00:05:03,390 --> 00:05:09,150
In my own opinion, this is something that makes probability hard for beginners at first, is that you're

69
00:05:09,150 --> 00:05:14,550
thrown into this world with very specific notation that you may not have seen before if you haven't

70
00:05:14,550 --> 00:05:17,190
taken a probability or statistics class.

71
00:05:18,770 --> 00:05:25,220
So we're often trying to calculate or express the probability of an event occurring, such as what is

72
00:05:25,220 --> 00:05:30,170
the probability that a randomly chosen part from an assembly line has a defect?

73
00:05:31,770 --> 00:05:34,440
So let's go through some basic notation.

74
00:05:34,590 --> 00:05:40,640
We use P with parentheses, P( ), to represent the probability of what is inside the parentheses.

75
00:05:40,650 --> 00:05:48,180
So for example, if we define event A as picking up a defective part from the assembly line,

76
00:05:48,180 --> 00:05:54,770
then P(A), or probability of A, is the probability that event A will occur.

77
00:05:54,780 --> 00:06:00,870
So in other words, P(A), or P with A inside the parentheses, is the same thing as asking what's the

78
00:06:00,870 --> 00:06:05,070
probability that you're going to pick up a defective part from the assembly line?

79
00:06:05,400 --> 00:06:12,600
Now something to note about this notation is that it typically is quite flexible and requires some context.

80
00:06:12,600 --> 00:06:18,850
So I gave you the context that event A occurring was the same as finding a defective part.

81
00:06:18,870 --> 00:06:22,590
That way I could just use the notation P(A).

82
00:06:22,620 --> 00:06:28,470
However, you're sometimes going to see more descriptive events if you don't want to include the context

83
00:06:28,470 --> 00:06:31,710
that event A is equal to finding a defective part.

84
00:06:31,740 --> 00:06:37,890
You sometimes see people write the entire statement inside those parentheses or some sort of shortened version

85
00:06:37,890 --> 00:06:38,730
of the statement.

86
00:06:38,730 --> 00:06:45,060
So instead of P(A), where event A is finding a defective part, you may actually see someone write

87
00:06:45,060 --> 00:06:49,860
P(part defective), which is a little easier to read if you know some context.

88
00:06:49,860 --> 00:06:53,520
So it's the probability of the part being defective.

89
00:06:53,610 --> 00:06:56,280
Now keep in mind that's the same thing as P(A) here,

90
00:06:56,280 --> 00:07:03,240
if I define event A as finding a defective part. Clearly, to understand this notation, you need a little

91
00:07:03,240 --> 00:07:04,320
bit of context.

92
00:07:04,320 --> 00:07:09,780
You can't just say to someone what's the probability of event A without describing what event A is to

93
00:07:09,780 --> 00:07:10,500
begin with.

94
00:07:12,460 --> 00:07:13,150
Keep in mind,

95
00:07:13,180 --> 00:07:14,610
either way is okay.

96
00:07:14,620 --> 00:07:20,080
But you should also know that the nature of using probability means we're constantly analyzing different

97
00:07:20,080 --> 00:07:20,860
events.

98
00:07:20,860 --> 00:07:25,660
So be prepared to see different ways of indicating events inside P( ).

99
00:07:27,360 --> 00:07:33,120
We should also note the different ways of reporting probabilities. In mathematics, we commonly report

100
00:07:33,120 --> 00:07:40,500
a probability as a value between zero and one, with zero meaning the event will not occur and one meaning

101
00:07:40,500 --> 00:07:42,690
the event will definitely occur.

102
00:07:44,520 --> 00:07:49,830
In business and common parlance, you often see these values translated to a percentage by multiplying

103
00:07:49,830 --> 00:07:50,690
by 100.

104
00:07:50,700 --> 00:07:57,120
And this allows you to report back a value between 0% and 100% rather than 0 to 1.

105
00:07:57,150 --> 00:08:03,230
However, for mathematical formulas you should be using values between zero and one instead of actual percentages.

106
00:08:03,240 --> 00:08:08,640
You typically don't report those percentages until you're discussing something with a decision maker

107
00:08:08,640 --> 00:08:13,530
or some sort of leadership position or just general context throughout your organization.

108
00:08:13,650 --> 00:08:18,330
Mathematically speaking, we're pretty much always dealing with probability in terms of a value between

109
00:08:18,330 --> 00:08:19,560
zero and one.
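As a small illustration of that split between the math and the reporting, here is a minimal Python sketch. The helper name to_percent is made up for this example, and the 0.01 value is just the 1% defective antenna figure from earlier in the lecture.

```python
def to_percent(probability: float) -> float:
    """Convert a probability in [0, 1] to a percentage in [0, 100]."""
    if not 0.0 <= probability <= 1.0:
        raise ValueError("a probability must lie between 0 and 1")
    return probability * 100.0


# Keep the value on the 0-to-1 scale for any formulas...
p_defective_antenna = 0.01

# ...and only convert to a percentage when writing the report.
report_line = f"{to_percent(p_defective_antenna):.0f}% chance of a defective antenna"
print(report_line)
```

The idea is simply to store and calculate with the 0-to-1 value and convert at the very last step, when the number goes into a report.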

110
00:08:21,670 --> 00:08:25,420
Probabilities also have their complement counterpart.

111
00:08:25,540 --> 00:08:32,590
For example, if we said P of A is the probability of a defective part, then how could we describe

112
00:08:32,590 --> 00:08:37,520
the opposite or the complement that is not finding a defective part?

113
00:08:37,539 --> 00:08:43,600
So picking up a piece of hardware from the assembly line and having it be OK, in other words not defective,

114
00:08:43,720 --> 00:08:48,880
that opposite or counterpart is known officially as the complement probability.

115
00:08:49,060 --> 00:08:57,040
The typical notation for this is going to be P(A'), read as P of A with an apostrophe or P of A prime, where A prime

116
00:08:57,040 --> 00:08:58,600
is the complement of event A.

117
00:09:00,130 --> 00:09:07,330
You sometimes also see this notation as A with a superscript c for the complement, where again A c is the complement event,

118
00:09:07,330 --> 00:09:13,960
essentially the complement counterpart to event A. So if event A is finding a defective part, then the

119
00:09:13,960 --> 00:09:20,590
complement of A, either A prime or A c, is the event of not finding a defective part.

120
00:09:23,250 --> 00:09:28,740
So again, the complement event, A prime, is event A not occurring.

121
00:09:29,100 --> 00:09:34,020
Note that this thinking is really not just for events with a binary outcome.

122
00:09:34,050 --> 00:09:38,970
So in the example we've been talking about, there's really only two possibilities for the piece of

123
00:09:38,970 --> 00:09:39,590
hardware.

124
00:09:39,600 --> 00:09:46,530
Either it's defective or not defective, but that is actually a smaller range of outcomes where it's

125
00:09:46,530 --> 00:09:54,480
binary, either defective or not defective, which means it represents event A or event A prime,

126
00:09:54,480 --> 00:10:00,870
but the complement of a particular event can actually contain multiple events.

127
00:10:01,830 --> 00:10:08,820
For example, let's imagine we had a production line that was making T-shirts and we had three sizes:

128
00:10:08,820 --> 00:10:11,490
small, medium and large.

129
00:10:11,580 --> 00:10:19,170
We could denote the probability of a customer choosing the size small as P(S).

130
00:10:19,500 --> 00:10:23,660
Now, in this particular situation, what is the complement?

131
00:10:23,670 --> 00:10:31,590
That's the event of a customer not choosing small, which actually means that the complement of

132
00:10:31,590 --> 00:10:39,780
S, written S prime, means the customer chose a size medium or a size large.

133
00:10:39,900 --> 00:10:46,290
So don't think of complements as just two events, the original and then the prime.

134
00:10:46,290 --> 00:10:51,510
An event can actually have a complement that covers multiple events, such as a customer choosing

135
00:10:51,510 --> 00:10:54,870
medium or the event that a customer chooses large.

136
00:10:54,870 --> 00:10:58,470
And hopefully that kind of expands the idea of what a complement is.

137
00:10:58,500 --> 00:11:03,150
It's just the inverse of the original, the counterpart, so to speak.
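To make the T-shirt example concrete, here is a minimal Python sketch that treats events as sets of outcomes. The variable names are made up for illustration; the three sizes are the ones from the example above.

```python
# Every possible outcome for the T-shirt example.
sizes = {"small", "medium", "large"}

# Event S: the customer chooses small.
event_s = {"small"}

# The complement S' is every outcome that is not in S.
complement_s = sizes - event_s

print(sorted(complement_s))  # ['large', 'medium']
```

Notice that the complement here holds two outcomes, medium and large, rather than a single "opposite" outcome.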

138
00:11:04,810 --> 00:11:10,990
Now, one crucial idea behind complements to keep in mind is that it's often the case that it's actually

139
00:11:10,990 --> 00:11:16,210
a lot easier to calculate the complement than the actual event we're interested in.

140
00:11:16,300 --> 00:11:20,620
This often has to do with the context of the question you're trying to answer or the problem you're

141
00:11:20,620 --> 00:11:21,560
trying to solve.

142
00:11:21,580 --> 00:11:26,350
But you should always keep in mind this little trick that you can calculate the complement in order

143
00:11:26,350 --> 00:11:30,040
to calculate the probability of the actual event that you're interested in.

144
00:11:30,310 --> 00:11:35,380
In this case, you need to understand that probabilities are always expressed as a likelihood between

145
00:11:35,380 --> 00:11:36,040
zero,

146
00:11:36,100 --> 00:11:41,260
meaning the event will certainly not occur, and one, meaning the event will certainly occur.

147
00:11:42,450 --> 00:11:49,650
This means a complement can also mathematically be defined as the following: the probability of the complement

148
00:11:49,650 --> 00:11:53,610
is equal to one minus the probability of the event.

149
00:11:53,610 --> 00:12:01,830
So P(A') is equal to one minus the probability of event A occurring, which makes sense because

150
00:12:01,830 --> 00:12:10,230
the complement and the event itself, those probabilities summed up together need to equal one because

151
00:12:10,230 --> 00:12:15,120
either the complement is going to occur or the actual event will occur.

152
00:12:15,120 --> 00:12:22,350
And those two, the event occurring or the event opposite or counterpart occurring, those should sum

153
00:12:22,380 --> 00:12:26,880
to one because that represents the entire space of events that are possible.
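Here is a minimal Python sketch of that complement rule, P(A') = 1 - P(A). The function name is made up for this example, and the 1/100 value is just the 1% defect figure used earlier, stored as an exact fraction to avoid rounding noise.

```python
from fractions import Fraction


def complement(p_event: Fraction) -> Fraction:
    """Return P(A') = 1 - P(A) for a probability P(A) in [0, 1]."""
    if not 0 <= p_event <= 1:
        raise ValueError("a probability must lie between 0 and 1")
    return 1 - p_event


# P(A): probability of picking a defective part (1%, as in the earlier example).
p_defective = Fraction(1, 100)

# P(A'): probability of picking a part that is not defective.
p_not_defective = complement(p_defective)

print(p_not_defective)                # 99/100
print(p_defective + p_not_defective)  # 1
```

The last line is the check described above: the event and its complement together cover every possible outcome, so their probabilities sum to one.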

154
00:12:29,690 --> 00:12:34,190
Now, one quick note on applying probability to real-world situations.

155
00:12:34,340 --> 00:12:41,360
When you're assigning binary events, try to match events and complements to their real-world counterparts:

156
00:12:41,360 --> 00:12:44,180
positive outcome versus negative outcome.

157
00:12:44,540 --> 00:12:46,040
Let me explain what I mean by that.

158
00:12:47,440 --> 00:12:54,430
For example, let's imagine that we need to define event A as the hardware being defective.

159
00:12:54,670 --> 00:12:57,470
What does this actually mean in the real world?

160
00:12:57,490 --> 00:13:05,110
That means the probability of A being equal to zero means there's no chance that the hardware is defective.

161
00:13:05,260 --> 00:13:11,020
The probability of event A being equal to one would mean that the hardware is certain to be defective.

162
00:13:11,050 --> 00:13:14,210
That means any piece you pick off the line is defective

163
00:13:14,230 --> 00:13:19,480
if the probability of A is equal to one. If the probability of A is equal to zero, then you have a perfect

164
00:13:19,480 --> 00:13:22,390
assembly line and there are no defective parts,

165
00:13:22,780 --> 00:13:24,970
whatever you pick up off the assembly line.

166
00:13:26,430 --> 00:13:29,760
Now, how would you actually write the alternative to this?

167
00:13:30,890 --> 00:13:37,610
Well, you could define event A as the hardware being functional, essentially not defective, in

168
00:13:37,610 --> 00:13:43,100
which case you would say the probability of A being equal to zero means there's no chance the hardware is functional,

169
00:13:43,100 --> 00:13:47,870
and the probability of A being equal to one means the hardware is certain to be functional.

170
00:13:48,080 --> 00:13:51,500
Notice how we're basically just inverting the definition of event

171
00:13:51,530 --> 00:13:56,900
A, going from "hardware is defective" to "hardware is functional".

172
00:13:56,930 --> 00:14:03,800
The reason I bring this up is because often in real world situations it's easier to think of a certain

173
00:14:03,800 --> 00:14:12,130
event as having a 0% or 100% chance than to think of its opposite event as having a 100% or 0% chance.

174
00:14:12,140 --> 00:14:16,920
So you have a lot of context here that you can play around with for the definition.

175
00:14:16,940 --> 00:14:23,090
So in this particular example, you may want to ask leadership or different colleagues for reporting

176
00:14:23,090 --> 00:14:25,120
metrics that make the most sense.

177
00:14:25,160 --> 00:14:32,450
Does leadership or other colleagues want the percentages of probability to be reported for hardware

178
00:14:32,450 --> 00:14:36,410
that's defective or for hardware that is functional?

179
00:14:36,590 --> 00:14:39,470
One case may make more sense than the other.

180
00:14:39,470 --> 00:14:47,210
It's really up to you to define the event's parameters and context to understand which reporting metric

181
00:14:47,210 --> 00:14:48,530
makes the most sense.

182
00:14:48,620 --> 00:14:51,020
Either is 100% correct.

183
00:14:51,050 --> 00:14:56,270
You just need to provide enough context to understand what the event actually is.

184
00:14:57,760 --> 00:15:02,440
You should try to choose this based on the situation and what reporting metrics make sense.

185
00:15:04,230 --> 00:15:09,540
In real world situations, we often need to describe the probabilities based on multiple events.

186
00:15:09,570 --> 00:15:14,640
This includes concepts such as conditional probability, intersections and unions.

187
00:15:14,790 --> 00:15:19,020
Let's explore the notation for these types of multiple event probabilities.

188
00:15:20,270 --> 00:15:26,090
Thinking back to our example of defective hardware, what if we had multiple events to consider?

189
00:15:26,330 --> 00:15:32,210
For example, what if there were multiple factories producing the same part, but the factories themselves

190
00:15:32,210 --> 00:15:34,520
had different quality assurance standards?

191
00:15:36,110 --> 00:15:43,760
So we now have two separate events, the event of picking a factory and then the other separate event

192
00:15:43,760 --> 00:15:45,470
of picking a part.

193
00:15:45,650 --> 00:15:49,310
Now you may be wondering, is it actually a separate event?

194
00:15:49,340 --> 00:15:56,090
Well, in this case, you could treat picking a part as conditional to picking a factory, since the

195
00:15:56,090 --> 00:16:00,770
probability of being defective could vary based on the actual factory that you chose.

196
00:16:02,470 --> 00:16:07,300
So you may find yourself going from the original question of what is the probability that a randomly

197
00:16:07,300 --> 00:16:12,880
chosen part from an assembly line has a defect, to an expanded question.

198
00:16:12,970 --> 00:16:18,760
So what's the probability that a randomly chosen part from an assembly line has a defect, given that

199
00:16:18,760 --> 00:16:22,270
it came from a specific factory like factory number two?

200
00:16:23,810 --> 00:16:27,380
Notice how the factory itself could be treated as an event.

201
00:16:27,380 --> 00:16:29,780
So choosing a particular factory.

202
00:16:29,930 --> 00:16:34,580
Again, this depends on the context of the scope of the question being asked.

203
00:16:34,610 --> 00:16:39,110
Do you actually have knowledge of what factory this came from, or are you just picking an assembly

204
00:16:39,110 --> 00:16:41,540
line at a particular factory?

205
00:16:43,120 --> 00:16:47,530
There are many real world situations where you will discover that you want to calculate the probability

206
00:16:47,530 --> 00:16:56,260
of event A given the condition that event B has occurred, such as event A being a defective part and

207
00:16:56,260 --> 00:16:59,890
event B being a specific factory that was chosen.

208
00:17:01,260 --> 00:17:03,120
So where am I actually going with this?

209
00:17:03,150 --> 00:17:08,460
Well, there's a particular notation for what's known as conditional probability, where we're dealing

210
00:17:08,460 --> 00:17:12,150
with multiple events and we have a condition on them.

211
00:17:12,150 --> 00:17:21,510
So the formula or notation looks like the following P of A given B, so the probability that event A

212
00:17:21,540 --> 00:17:25,560
will occur given that event B did occur.

213
00:17:26,770 --> 00:17:31,320
Note how this notation states that event B has in fact occurred.

214
00:17:31,330 --> 00:17:37,450
So we assume if event B has in fact occurred, what's now the probability of event A?

215
00:17:37,480 --> 00:17:44,440
So for example, if in fact you did choose a particular factory for event B, what's the probability

216
00:17:44,440 --> 00:17:46,900
of event A, a part being defective?

217
00:17:48,560 --> 00:17:49,820
And again, take careful note

218
00:17:49,820 --> 00:17:54,720
that event A and event B shown here are really just placeholders, and just like before,

219
00:17:54,740 --> 00:17:59,660
you may see them swapped out for other variables in the real world, but the logic and statement of

220
00:17:59,660 --> 00:18:01,640
the notation actually remain the same.

221
00:18:02,270 --> 00:18:07,780
So we could say the probability that the event X will occur given that event Y occurred.

222
00:18:07,790 --> 00:18:12,210
So you can swap these out for any sort of variable you want as a placeholder.

223
00:18:12,230 --> 00:18:18,440
Just note that the notation doesn't change: it's the probability that event X will occur given that event Y

224
00:18:18,440 --> 00:18:19,220
has occurred.

225
00:18:20,060 --> 00:18:25,190
So in other words, if we were to really spell this out, it's the probability that this blue event

226
00:18:25,190 --> 00:18:31,400
will occur, given the fact that this red event has in fact occurred, or in other words, the probability

227
00:18:31,400 --> 00:18:36,500
that the first event will occur, given the fact that the second event has in fact occurred.

228
00:18:38,150 --> 00:18:43,850
We can see the notation clearly states the probability of the first event being conditional on the second

229
00:18:43,850 --> 00:18:45,200
event having occurred.

230
00:18:45,230 --> 00:18:47,600
Thus the name conditional probability.

231
00:18:47,720 --> 00:18:51,860
You will see this notation quite a bit, especially when working with Bayesian statistics.
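To see what P(A given B) looks like with numbers, here is a minimal Python sketch of the factory example. The inspection counts and the names inspected and p_defective_given are hypothetical, made up purely for illustration, and the sketch estimates each probability as a simple ratio of counts.

```python
from fractions import Fraction

# Hypothetical inspection counts: factory -> (parts inspected, defective parts).
inspected = {
    "factory_1": (1000, 10),
    "factory_2": (500, 25),
}


def p_defective_given(factory: str) -> Fraction:
    """Estimate P(defective | factory) as defective parts / inspected parts."""
    total, defective = inspected[factory]
    return Fraction(defective, total)


# P(A): probability of a defective part when the factory is not known.
total_parts = sum(total for total, _ in inspected.values())
total_defective = sum(defective for _, defective in inspected.values())
p_defective = Fraction(total_defective, total_parts)

print(p_defective)                     # 7/300, about 2.3%
print(p_defective_given("factory_1"))  # 1/100
print(p_defective_given("factory_2"))  # 1/20
```

With these made-up numbers, knowing that the part came from factory number two moves the probability of a defect from about 2.3% to 5%, which is exactly the kind of change the "given B" part of the notation is meant to capture.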

232
00:18:53,340 --> 00:18:58,170
When thinking of separate events, we should also take the time to consider if they are independent

233
00:18:58,170 --> 00:18:59,430
or dependent.

234
00:18:59,610 --> 00:19:01,590
In our defective hardware example,

235
00:19:01,620 --> 00:19:07,260
clearly the probability of being defective is going to be dependent on the event of choosing a particular

236
00:19:07,260 --> 00:19:08,010
factory.

237
00:19:09,290 --> 00:19:15,320
However, there are going to be cases where events will be independent, in which case event A occurring

238
00:19:15,320 --> 00:19:18,860
has no effect on the probability of event B occurring and vice versa.

239
00:19:19,220 --> 00:19:23,170
Be very careful when labeling two events as completely independent.

240
00:19:23,180 --> 00:19:26,090
There could be factors actually linking the two events.

241
00:19:26,120 --> 00:19:28,580
Each case and situation will be different.

242
00:19:29,940 --> 00:19:35,280
For example, you may initially think that delays at one airport in New York City are independent of

243
00:19:35,280 --> 00:19:37,590
delays at another airport in Miami.

244
00:19:38,720 --> 00:19:43,730
However, if you traveled enough, you probably already know that weather delays at one airport can

245
00:19:43,730 --> 00:19:49,100
cause issues at another airport as landing times get pushed back across the entire nation.

246
00:19:49,220 --> 00:19:54,470
For example, if you have really common routes between Miami and New York City, then delays

247
00:19:54,470 --> 00:20:00,430
at Miami or New York City are going to affect landing times or takeoff times at the other airport.

248
00:20:00,440 --> 00:20:05,210
So even though you might initially think, hey, these are two different cities, so delays are two completely

249
00:20:05,210 --> 00:20:10,700
different independent events, if you know more and more about the actual domain, you may realize that

250
00:20:10,700 --> 00:20:13,550
these events are actually dependent on each other.

251
00:20:14,910 --> 00:20:20,460
So in this case, it's probably wiser to calculate a conditional probability using delays at other airports

252
00:20:20,460 --> 00:20:23,580
to decide the probability of a delay at the target airport.
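As a rough illustration of checking independence, here is a minimal Python sketch. The day counts are hypothetical, not real airport data. It relies on two standard facts that this lecture has not formally introduced yet: independent events satisfy P(A and B) = P(A) * P(B), and P(B given A) = P(A and B) / P(A).

```python
from fractions import Fraction

# Hypothetical tallies over 100 days (made-up numbers, not real airport data).
days = 100
nyc_delayed = 30    # days with delays in New York City (event A)
miami_delayed = 20  # days with delays in Miami (event B)
both_delayed = 15   # days with delays at both airports (A and B together)

p_nyc = Fraction(nyc_delayed, days)
p_miami = Fraction(miami_delayed, days)
p_both = Fraction(both_delayed, days)

# Independent events would satisfy P(A and B) == P(A) * P(B).
looks_independent = p_both == p_nyc * p_miami

# Conditional probability of a Miami delay given a New York City delay.
p_miami_given_nyc = p_both / p_nyc

print(looks_independent)           # False
print(p_miami, p_miami_given_nyc)  # 1/5 1/2
```

With these made-up counts, a Miami delay has a 20% chance overall but a 50% chance on days when New York City is delayed, so treating the two as independent would be misleading.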

253
00:20:25,010 --> 00:20:28,310
Finally, let's discuss intersections and unions.

254
00:20:29,410 --> 00:20:34,660
When studying probability of two events, it's often useful to visualize them using Venn diagrams.

255
00:20:35,860 --> 00:20:37,810
Imagine two separate events.

256
00:20:37,840 --> 00:20:38,860
A and B.

257
00:20:39,850 --> 00:20:45,930
For example, the event a of picking a defective phone and the event B of having the phone be the color

258
00:20:45,940 --> 00:20:46,600
black.

259
00:20:47,980 --> 00:20:50,260
You could describe these using a Venn diagram.

260
00:20:51,490 --> 00:20:56,950
If we run a test where both event A and event B occurred, then this would be known as the intersection

261
00:20:56,950 --> 00:21:01,720
of A and B, which is here in the Venn diagram, the middle part.

262
00:21:01,720 --> 00:21:07,120
So the case where event A occurred and event B occurred, essentially a defective black phone.

263
00:21:08,310 --> 00:21:16,170
The intersection of A and B can then be denoted with the following notation: A, and then

264
00:21:16,170 --> 00:21:18,900
this upside-down U symbol (∩), with B.

265
00:21:21,310 --> 00:21:28,300
Intersections can also help us mathematically define mutually exclusive events, meaning they have no

266
00:21:28,300 --> 00:21:32,110
elements in common and cannot both occur at the same time.

267
00:21:33,480 --> 00:21:37,830
This is the first image that we actually showed, where the circles are not touching,

268
00:21:37,830 --> 00:21:40,080
showing that the intersection is empty.

269
00:21:40,260 --> 00:21:46,140
That makes sense because there should be no overlap between event A and event B if they are completely

270
00:21:46,140 --> 00:21:47,560
mutually exclusive.

271
00:21:47,580 --> 00:21:51,420
So there are no elements in common and they cannot both occur at the same time,

272
00:21:51,420 --> 00:21:53,580
meaning an intersection is impossible.

273
00:21:55,080 --> 00:22:02,160
So mutually exclusive events A and B would be defined with the following notation, where you have an intersection

274
00:22:02,160 --> 00:22:08,970
equal to zero, meaning there's not going to be a situation that's possible for both event A and event

275
00:22:09,000 --> 00:22:10,260
B to occur.

276
00:22:10,350 --> 00:22:16,830
So maybe event A is going to be the fact that it's nighttime and event B could be the fact that it's

277
00:22:16,830 --> 00:22:17,700
daytime.

278
00:22:17,730 --> 00:22:20,850
There's no intersection where it's both night and day.

279
00:22:23,270 --> 00:22:26,810
A union of events can then describe the collection of all outcomes.

280
00:22:26,810 --> 00:22:33,170
So you have only event A occurring, only event B occurring, or both events A and B occurring.

281
00:22:34,650 --> 00:22:40,440
The union of A and B can then be denoted with the following notation: A, and then, this one is easier to

282
00:22:40,440 --> 00:22:42,060
remember, a U for union (∪),

283
00:22:42,060 --> 00:22:43,020
and then B.
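Python sets happen to mirror this notation closely, so here is a minimal sketch of the phone example. The phone ID numbers and set names are hypothetical, and the probabilities assume every phone is equally likely to be picked.

```python
from fractions import Fraction

# Hypothetical batch of six phones, identified by number.
phones = {1, 2, 3, 4, 5, 6}
defective = {2, 5}  # event A: the phone is defective
black = {1, 2, 3}   # event B: the phone is black
white = {4, 5, 6}   # a color that can never overlap with black

# Intersection (A ∩ B): defective AND black.
defective_and_black = defective & black

# Union (A ∪ B): defective OR black OR both.
defective_or_black = defective | black

# Assuming each phone is equally likely to be picked.
p_intersection = Fraction(len(defective_and_black), len(phones))
p_union = Fraction(len(defective_or_black), len(phones))

# Mutually exclusive events share no outcomes at all.
black_white_exclusive = black.isdisjoint(white)

print(defective_and_black, p_intersection)  # {2} 1/6
print(sorted(defective_or_black), p_union)  # [1, 2, 3, 5] 2/3
print(black_white_exclusive)                # True
```

The & operator plays the role of the upside-down U and the | operator plays the role of the U for union, while isdisjoint checks the mutually exclusive case where the circles in the Venn diagram do not touch.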

284
00:22:44,760 --> 00:22:49,980
Now that we understand the basic notation, let's start diving into some more probability topics in

285
00:22:49,980 --> 00:22:51,180
a lot more detail.

286
00:22:51,300 --> 00:22:55,500
Use these slides as a reference for the notation and how to read it

287
00:22:55,500 --> 00:23:00,300
when we learn about more complex probability statements. We'll see you at the next lecture.

