1
00:00:11,940 --> 00:00:17,880
So in this lecture, we will summarize what we learned in this section. This section, which is

2
00:00:17,880 --> 00:00:24,750
part of our module on machine learning, looked at topic modeling. As you saw, topic modeling is an unsupervised

3
00:00:24,750 --> 00:00:26,940
method, which is similar to clustering.

4
00:00:27,780 --> 00:00:33,210
We learned about two algorithms that could be applied to this task, namely latent Dirichlet allocation

5
00:00:33,450 --> 00:00:35,490
and non-negative matrix factorization.

6
00:00:36,390 --> 00:00:42,030
LDA is quite a complex algorithm, but it was helpful to see it from the perspective of inputs and outputs.

7
00:00:42,600 --> 00:00:48,180
If you were more advanced, then you also learned a little bit about graphical models and how LDA assumes

8
00:00:48,180 --> 00:00:49,470
documents are generated.

9
00:00:50,610 --> 00:00:56,190
One major difference between LDA and a simple mixture model is that a new topic is sampled for every

10
00:00:56,190 --> 00:00:56,670
word.

11
00:00:57,180 --> 00:01:00,510
In a simple mixture model, a topic would only be sampled once.

12
00:01:01,710 --> 00:01:06,750
We also looked at non-negative matrix factorization, which is an algorithm derived from recommender

13
00:01:06,750 --> 00:01:07,530
systems.

14
00:01:08,190 --> 00:01:13,590
In fact, we started this lecture by looking at recommenders to better understand the motivation behind

15
00:01:13,590 --> 00:01:15,570
the matrix factorization approach.

16
00:01:16,230 --> 00:01:21,330
We then realized how this approach could be immediately applied to topic modeling, since the model

17
00:01:21,330 --> 00:01:25,080
parameters happen to have the same format as the outputs of LDA.

18
00:01:26,130 --> 00:01:31,770
In addition, we also noted that we could go in the reverse direction and apply LDA to recommender systems

19
00:01:31,770 --> 00:01:32,250
as well.