1
00:00:11,010 --> 00:00:16,980
Now, we've just looked at the API perspective on LDA, so we understand how it works in terms of inputs

2
00:00:16,980 --> 00:00:17,700
and outputs.

3
00:00:18,090 --> 00:00:22,170
We know what inputs to pass in and what outputs we should expect to get back.

4
00:00:23,040 --> 00:00:28,320
Let's also look at the actual API to understand more about how it works and to prepare you for the coding

5
00:00:28,320 --> 00:00:28,800
lecture.

6
00:00:29,730 --> 00:00:33,900
Let's assume we know how to use a CountVectorizer to build our data matrix X.

7
00:00:34,560 --> 00:00:39,060
We then create an object of type LatentDirichletAllocation, as we normally do.

8
00:00:40,320 --> 00:00:44,760
We then call the fit method on our object, passing in X as we normally do.

9
00:00:45,900 --> 00:00:50,880
Note that unlike supervised learning, unsupervised models only take in the input data set.

10
00:00:51,180 --> 00:00:52,620
But there are no targets.

11
00:00:53,160 --> 00:00:58,560
So in supervised learning, you'll recall that this looks like fit(X, y), where y is an array of targets,

12
00:00:58,980 --> 00:01:01,800
but we don't have targets, so there is no y array.

13
00:01:03,570 --> 00:01:09,060
Now, the next thing we do is call the transform method, which gives us back a new matrix called Z.

14
00:01:10,140 --> 00:01:12,840
Z represents our documents-by-topics matrix.

15
00:01:14,100 --> 00:01:20,910
Just like with X, Z is a common letter we use in machine learning to denote a specific thing. In particular,

16
00:01:20,920 --> 00:01:23,130
whereas X represents data we observed,

17
00:01:23,550 --> 00:01:25,710
Z represents data we did not observe.

18
00:01:26,670 --> 00:01:31,620
If you study machine learning further, you'll see these referred to as latent variables or hidden variables.

19
00:01:32,940 --> 00:01:37,950
Another difference with unsupervised learning is that we use this transform method instead of predict.

20
00:01:38,580 --> 00:01:41,700
This is just a scikit-learn convention, but it makes a lot of sense.

21
00:01:42,240 --> 00:01:47,400
With supervised learning, you are trying to predict a target, but with unsupervised learning you don't

22
00:01:47,400 --> 00:01:51,300
have any targets, so you're just transforming the input into a new variable.

23
00:01:52,050 --> 00:01:59,130
In this case, a vector of unknown topics. Both transform and predict can be generalized under what we

24
00:01:59,130 --> 00:02:00,090
call inference.

25
00:02:00,690 --> 00:02:04,050
So when we transform data to get Z, we call that inference.

26
00:02:04,380 --> 00:02:08,160
But when we predict to get Y hat, we also call that inference.

27
00:02:08,669 --> 00:02:13,200
So inference is a generic term that can mean either of these, depending on the context.

28
00:02:17,700 --> 00:02:22,950
In addition, note that it's possible to combine these two calls into a single call where we just say

29
00:02:22,950 --> 00:02:26,550
fit_transform and we pass in X, which gives us back Z.

30
00:02:27,480 --> 00:02:31,440
Now, at this point, you might be thinking: what happened to the other matrix?

31
00:02:33,090 --> 00:02:38,700
As you recall, I said LDA outputs two matrices, one with documents by topics and the other with

32
00:02:38,700 --> 00:02:39,930
topics by words.

33
00:02:40,590 --> 00:02:45,300
Well, the topics by words matrix is actually stored as an attribute called components.

34
00:02:45,900 --> 00:02:50,400
So to access this matrix, you simply use model.components_, with a trailing underscore.

35
00:02:51,750 --> 00:02:53,730
So let's think about why this makes sense.

36
00:02:54,420 --> 00:02:59,430
The key is that this model can be used to transform any document, even one that didn't exist in the

37
00:02:59,430 --> 00:03:00,210
training set.

38
00:03:00,870 --> 00:03:05,490
So suppose that you've already built your model, but your boss comes to you with new documents tomorrow

39
00:03:05,490 --> 00:03:06,660
called X_test.

40
00:03:07,620 --> 00:03:11,250
Luckily, you don't need to retrain your model, since that would take a lot of work.

41
00:03:11,850 --> 00:03:17,760
You can simply call model.transform, passing in X_test, and this will give you Z_test, which represents

42
00:03:17,760 --> 00:03:19,500
the topics for these new documents.

43
00:03:20,340 --> 00:03:25,950
This is unlike the components matrix, which is a property of the model rather than of the data being transformed.

44
00:03:26,700 --> 00:03:28,710
Think about that if it doesn't make sense.

45
00:03:29,220 --> 00:03:31,770
The model stores its representations of each topic.

46
00:03:31,830 --> 00:03:33,060
They are part of the model.

47
00:03:33,390 --> 00:03:35,880
This does not come from the documents you are transforming.
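
The calls described in this lecture can be sketched roughly as follows. This is a minimal illustration, not the code from the coding lecture: the toy documents, the choice of two topics, and the variable names (train_docs, new_docs, Z_alt, topics_by_words) are assumptions made up for the example.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation

# Toy corpus, invented purely for illustration.
train_docs = [
    "the cat sat on the mat",
    "dogs and cats are pets",
    "stocks fell as markets closed",
    "investors sold stocks and bonds",
]
new_docs = ["my cat chased the dog", "bond markets rallied"]

# Build the data matrix X (documents by words) from raw text.
vectorizer = CountVectorizer()
X = vectorizer.fit_transform(train_docs)

# Fit the model. Note there is no y: unsupervised models have no targets.
model = LatentDirichletAllocation(n_components=2, random_state=0)
model.fit(X)

# transform gives back Z, the documents-by-topics matrix.
Z = model.transform(X)

# fit_transform combines the two calls into one.
Z_alt = LatentDirichletAllocation(n_components=2, random_state=0).fit_transform(X)

# The topics-by-words matrix is stored on the model itself.
topics_by_words = model.components_

# New documents: reuse the fitted vectorizer and model, no retraining.
X_test = vectorizer.transform(new_docs)
Z_test = model.transform(X_test)
```

Here Z has one row per training document and one column per topic, while model.components_ has one row per topic and one column per vocabulary word, which is why it belongs to the model rather than to any particular set of documents.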