0
1
00:00:00,480 --> 00:00:08,330
Supervised learning is where each observation contains an input-output pair, just like when you're at
1

2
00:00:08,340 --> 00:00:10,620
school and you want to learn something.
2

3
00:00:10,770 --> 00:00:13,380
The teacher usually gives you some data.
3

4
00:00:13,920 --> 00:00:15,410
Suppose this is a math problem.
4

5
00:00:15,420 --> 00:00:18,770
The teacher gives you a math problem with the solution.
5

6
00:00:19,320 --> 00:00:25,950
And then after a few rounds, let's say several times, the teacher will give you a quiz or
6

7
00:00:25,950 --> 00:00:30,750
a test by giving you the question itself and not giving the answer.
7

8
00:00:30,750 --> 00:00:37,720
And you are required to write down your answer based on the data that have been used to train you.
8

9
00:00:37,860 --> 00:00:39,410
Supervised learning works the same way.
9

10
00:00:39,420 --> 00:00:42,660
We have some data. With 80 percent of this data,
10

11
00:00:42,690 --> 00:00:46,430
we will train the network. With the remaining 20 percent,
11

12
00:00:46,440 --> 00:00:47,500
we do the testing.
12
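The 80/20 split described here can be sketched in a few lines of Python. This is a minimal illustration, not the lecturer's own code: the function name `split_80_20`, the fixed seed, and the toy input-output pairs are all assumptions made for the example.

```python
import random


def split_80_20(observations, train_fraction=0.8, seed=42):
    """Shuffle the observations and cut them into a training and a testing part."""
    shuffled = list(observations)          # copy so the caller's list is untouched
    random.Random(seed).shuffle(shuffled)  # fixed seed keeps the split repeatable
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


# Toy input-output pairs (hypothetical): the input x and its "solution" 2 * x.
pairs = [(x, 2 * x) for x in range(10)]
train_pairs, test_pairs = split_80_20(pairs)
print(len(train_pairs), len(test_pairs))  # 8 2
```

Shuffling before cutting matters: if the data were sorted in some way, the testing part could otherwise contain only one kind of example.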

13
00:00:47,730 --> 00:00:53,250
Supervised learning is also the simplest one to understand. In supervised algorithms,
13

14
00:00:53,300 --> 00:01:01,350
you may not know the inner relationships of the data you are processing, but of course you do know very
14

15
00:01:01,350 --> 00:01:05,100
well which output you need from your model.
15

16
00:01:05,760 --> 00:01:13,290
For example, I need to be able to start predicting when users will cancel their subscription.
16

17
00:01:13,590 --> 00:01:22,540
Suppose this is a company like Netflix or a YouTube channel, and they want to keep their users subscribed.
17

18
00:01:22,890 --> 00:01:29,010
They want to have a method to predict which users are most likely to unsubscribe in the
18

19
00:01:29,020 --> 00:01:29,880
upcoming months.
19

20
00:01:30,030 --> 00:01:38,340
For this purpose, they have some user history, and they know which users unsubscribed and which users,
20

21
00:01:38,430 --> 00:01:41,640
based on their activity, just stayed subscribed.
21

22
00:01:41,880 --> 00:01:47,340
Let's say, for example, we have 10,000 records here.
22

23
00:01:47,790 --> 00:01:53,190
These are the users' histories, that is, their usage (activity) histories. From these,
23

24
00:01:53,190 --> 00:01:58,250
maybe five thousand of them have already canceled their subscription.
24

25
00:01:58,590 --> 00:02:06,490
And from these 10,000, the other 5,000 are still subscribed.
25

26
00:02:06,720 --> 00:02:16,060
What you can do is take data from the set of users who canceled their subscription after a few months.
26

27
00:02:16,140 --> 00:02:23,030
Let's take four thousand five hundred of these records and give them to a machine for training purposes.
27

28
00:02:23,040 --> 00:02:29,310
And then there is the other set, the very good users who just stayed subscribed to the
28

29
00:02:29,310 --> 00:02:29,980
product.
29

30
00:02:30,390 --> 00:02:33,840
Let's take another four thousand five hundred to train
30

31
00:02:33,840 --> 00:02:40,650
the network. We will show this data to our machine, to our system, and we will tell the system: this is
31

32
00:02:40,650 --> 00:02:41,970
the user history.
32

33
00:02:42,480 --> 00:02:51,510
These users unsubscribed from the product, but these are the good users who stayed subscribed and didn't
33

34
00:02:51,510 --> 00:02:55,260
unsubscribe from the product. After the learning process,
34

35
00:02:55,470 --> 00:03:04,060
it's time to test the system. Besides these nine thousand records, we have the remaining 1,000. From this group of one thousand,
35

36
00:03:04,080 --> 00:03:10,410
we are going to give our system a quiz. We randomly give the data to the system, and this time
36

37
00:03:10,710 --> 00:03:12,870
we are not giving the result.
37

38
00:03:12,870 --> 00:03:20,940
We are not telling the machine whether this user stayed subscribed or already unsubscribed from the product, and we will
38

39
00:03:20,940 --> 00:03:28,380
check if it can predict the correct subscribed or unsubscribed status for each user.
39

40
00:03:28,620 --> 00:03:33,150
This is just one example of how we can use supervised learning.
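To make the subscription example concrete, here is a rough Python sketch of the whole procedure: 4,500 users from each group for training, the remaining 1,000 for the quiz, and a check of how many answers the system gets right. Everything specific in it is an assumption for illustration only: the data is synthetic, the single feature (weekly hours watched) is hypothetical, and the nearest-class-mean rule stands in for whatever network or model would actually be trained.

```python
import random


def make_synthetic_users(n_per_class=5000, seed=0):
    """Hypothetical history: (weekly hours watched, label) for each user."""
    rng = random.Random(seed)
    cancelled = [(rng.gauss(2.0, 1.0), "unsubscribed") for _ in range(n_per_class)]
    stayed = [(rng.gauss(6.0, 1.0), "subscribed") for _ in range(n_per_class)]
    return cancelled, stayed


def split_per_class(cancelled, stayed, n_train_per_class=4500, seed=0):
    """Take the same number of users from each group for training; the rest is for testing."""
    rng = random.Random(seed)
    cancelled, stayed = cancelled[:], stayed[:]
    rng.shuffle(cancelled)
    rng.shuffle(stayed)
    train = cancelled[:n_train_per_class] + stayed[:n_train_per_class]
    test = cancelled[n_train_per_class:] + stayed[n_train_per_class:]
    rng.shuffle(test)  # the quiz questions arrive in random order
    return train, test


def fit_class_means(train):
    """'Training': remember the average feature value seen for each label."""
    sums, counts = {}, {}
    for hours, label in train:
        sums[label] = sums.get(label, 0.0) + hours
        counts[label] = counts.get(label, 0) + 1
    return {label: sums[label] / counts[label] for label in sums}


def predict(class_means, hours):
    """Answer with the label whose average is closest; the true label is not given."""
    return min(class_means, key=lambda label: abs(hours - class_means[label]))


def accuracy(class_means, test):
    """Fraction of quiz questions answered correctly."""
    correct = sum(predict(class_means, hours) == label for hours, label in test)
    return correct / len(test)


cancelled, stayed = make_synthetic_users()
train, test = split_per_class(cancelled, stayed)
class_means = fit_class_means(train)
test_accuracy = accuracy(class_means, test)
print(len(train), len(test), round(test_accuracy, 3))
```

The labels are used only in `fit_class_means` and when scoring in `accuracy`; `predict` sees the user history alone, which mirrors giving the question without the answer.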
