1
00:00:05,400 --> 00:00:07,080
Welcome, everyone, to the course.

2
00:00:07,080 --> 00:00:12,180
In this very first lecture, we're going to have a brief overview of the topics we cover in this course.

3
00:00:12,840 --> 00:00:16,010
First off, we want to thank you so much for enrolling.

4
00:00:16,020 --> 00:00:20,160
And before you begin with this lecture, you should remember to check out the very first lecture in

5
00:00:20,160 --> 00:00:22,180
the course, which is right before this one.

6
00:00:22,200 --> 00:00:26,940
It's an article lecture, so it basically just has a bunch of links with important information like

7
00:00:26,940 --> 00:00:30,810
how to ask a question in the course as well as the link for things like the course slides.

8
00:00:30,810 --> 00:00:35,400
So definitely make sure to check out the first lecture before continuing on with the course.

9
00:00:35,960 --> 00:00:39,570
Okay, so what is the course curriculum and what do we cover in this course?

10
00:00:39,780 --> 00:00:45,510
This course focuses on the mathematics behind the key concepts for data science and analysis, allowing

11
00:00:45,510 --> 00:00:50,610
you to understand the intuition behind many methods and procedures that you'll see being conducted in

12
00:00:50,610 --> 00:00:52,170
the field of data science.

13
00:00:52,500 --> 00:00:59,400
Now, to do this, we start off with this curriculum, core data concepts, visualizing data, combinatorics,

14
00:00:59,400 --> 00:01:02,880
probability, joint distributions and data distributions.

15
00:01:03,030 --> 00:01:08,700
Then we move on to discussing the normal distribution, sampling, hypothesis testing, and regression.

16
00:01:08,700 --> 00:01:12,690
So we cover all of that in the curriculum and it's a lot of material.

17
00:01:12,720 --> 00:01:15,540
Let me give you a very brief overview of each section.

18
00:01:16,850 --> 00:01:19,880
So we start off the course with core data concepts.

19
00:01:19,880 --> 00:01:24,860
Here, we really just talk about very basic ideas about data, and we also make sure we start off with a level

20
00:01:24,860 --> 00:01:25,660
playing field.

21
00:01:25,670 --> 00:01:30,440
We're going to talk about measurements of data such as mean, median, and mode, and then measurements of dispersion

22
00:01:30,440 --> 00:01:32,390
like variance and standard deviation.

23
00:01:33,590 --> 00:01:35,600
Then we'll move on to visualizing data.

24
00:01:35,600 --> 00:01:40,340
The reason we also cover this earlier in the course is because we use a lot of visualizations when describing

25
00:01:40,340 --> 00:01:41,900
things like data distributions.

26
00:01:41,900 --> 00:01:46,670
So when learning about data distributions like the normal distribution or binomial distribution, it's

27
00:01:46,670 --> 00:01:48,830
important to understand what you're visually looking at.

28
00:01:48,830 --> 00:01:54,800
So we give you a tour of a bunch of different ways to visualize data, because communicating your results

29
00:01:54,800 --> 00:01:56,930
is a really important part of data science.

30
00:01:58,040 --> 00:02:01,630
Then we start moving on to the more math-heavy parts of the course.

31
00:02:01,640 --> 00:02:03,170
So we start off with combinatorics.

32
00:02:03,170 --> 00:02:07,760
This is the study of counting, and this section helps you understand concepts like counting combinations

33
00:02:07,760 --> 00:02:12,950
and permutations of objects, like the number of different ways to order a deck of cards.

34
00:02:14,410 --> 00:02:19,210
Then we move on to probability and we talk a lot about the basics of probability in the section, but

35
00:02:19,210 --> 00:02:22,600
we also include things like conditional probability, including Bayes' theorem.

36
00:02:22,660 --> 00:02:24,400
It's a pretty big section of the course.

37
00:02:25,280 --> 00:02:28,220
Then we move on to talking about things like distributions.

38
00:02:28,250 --> 00:02:32,990
Our very first discussion is about joint distributions, talking about the relationships between data

39
00:02:32,990 --> 00:02:33,350
sets.

40
00:02:33,350 --> 00:02:38,330
So we talk here about concepts like covariance and the correlation coefficient.

41
00:02:39,460 --> 00:02:41,510
Then we move on to data distributions.

42
00:02:41,530 --> 00:02:45,130
We're going to start off with just an understanding of what a data distribution is, like what's the

43
00:02:45,130 --> 00:02:49,480
difference between a probability mass function versus a probability density function and how those relate

44
00:02:49,480 --> 00:02:51,570
to the idea of random variables.

45
00:02:51,580 --> 00:02:56,470
After that, we take you through a tour of the most popular and common data distributions such as Bernoulli

46
00:02:56,470 --> 00:03:00,340
distribution, Poisson, uniform, and a bunch of other distributions.

47
00:03:00,340 --> 00:03:04,900
I should point out that we actually have a special section for the normal distribution, so we have

48
00:03:04,900 --> 00:03:08,680
an entirely separate section on the normal distribution due to its unique properties.

49
00:03:08,680 --> 00:03:12,460
And also it's one of the most common distributions that occur in real-world data sets.

50
00:03:12,460 --> 00:03:14,110
So it's really important to understand this.

51
00:03:14,110 --> 00:03:19,210
And keep in mind this section in particular really focuses heavily on applying the statistics to

52
00:03:19,210 --> 00:03:20,170
real-world applications.

53
00:03:20,170 --> 00:03:24,670
But I should mention pretty much all the sections, especially the questions at the end, focus on applying

54
00:03:24,670 --> 00:03:29,140
what you've learned to something in the real world that you would encounter as a practicing data scientist.

55
00:03:30,370 --> 00:03:34,000
Then we'll talk about sampling, where we focus on ideas like the central limit theorem, which are

56
00:03:34,000 --> 00:03:38,440
really critical for understanding how to apply statistical concepts to real-world data sets.

57
00:03:38,980 --> 00:03:43,750
After this, we discuss hypothesis testing, which is super important for understanding

58
00:03:43,750 --> 00:03:48,950
terms like p-values or significance levels, or ideas like A/B testing on a website.

59
00:03:48,970 --> 00:03:54,070
This all has to do with hypothesis testing. And then we conclude the course with regression.

60
00:03:54,070 --> 00:03:58,630
So we touch on one of the most common statistical modeling techniques at the very end of the course,

61
00:03:58,630 --> 00:04:03,430
allowing you to use your skills to actually perform something like possible forecasting or modeling

62
00:04:03,430 --> 00:04:05,950
of outcomes based on inputs.

63
00:04:07,050 --> 00:04:07,530
All right.

64
00:04:07,530 --> 00:04:08,370
Let's get started.

65
00:04:08,400 --> 00:04:09,660
We'll see you at the next lecture.

