1
00:00:11,120 --> 00:00:17,630
So in this lecture, we will be looking at the problem of sentiment analysis. As with our other application-based

2
00:00:17,630 --> 00:00:19,340
sections of this course,

3
00:00:19,760 --> 00:00:22,160
this section will be split up into two parts.

4
00:00:22,730 --> 00:00:28,670
The first lecture, which is this lecture, will focus on describing sentiment analysis and why we would

5
00:00:28,670 --> 00:00:29,600
want to do it.

6
00:00:30,200 --> 00:00:34,820
The subsequent lectures will focus on specific solution methods we will use.

7
00:00:39,370 --> 00:00:41,230
So what is sentiment analysis?

8
00:00:41,980 --> 00:00:43,780
Consider the following sentences.

9
00:00:44,350 --> 00:00:46,390
Wow, that was a really great film.

10
00:00:47,170 --> 00:00:48,790
And now consider this sentence.

11
00:00:49,270 --> 00:00:52,720
I can't believe I wasted two hours of my life on that film.

12
00:00:53,650 --> 00:00:56,410
So what is the difference between these two sentences?

13
00:00:57,100 --> 00:01:02,410
Well, it's probably very clear that one of these sentences is a very positive reaction to a film,

14
00:01:02,710 --> 00:01:04,330
while the other is very negative.

15
00:01:04,870 --> 00:01:06,910
This is the concept of sentiment.

16
00:01:07,630 --> 00:01:13,330
Typically, we categorize sentiment as being either positive, negative or somewhere in between.

17
00:01:18,060 --> 00:01:23,190
Note that sentiment analysis is typically thought of as a classification task.

18
00:01:23,910 --> 00:01:30,930
Sometimes we simply call it sentiment classification, which is, in my opinion, a bit more clear. Now,

19
00:01:30,930 --> 00:01:34,440
because sentiment can be thought of as being either positive or negative,

20
00:01:34,800 --> 00:01:39,660
we often build binary classifiers to discriminate between these two classes.

21
00:01:41,210 --> 00:01:46,400
However, you will find instances where there might be three categories or even five categories.

22
00:01:46,970 --> 00:01:51,900
For example, in some datasets, the categories are positive, negative and neutral.

23
00:01:52,520 --> 00:01:53,660
In other datasets,

24
00:01:53,900 --> 00:01:56,300
perhaps we'd like to take a more fine-grained view.

25
00:01:56,630 --> 00:02:01,730
So we have positive, very positive, negative, very negative and neutral.

26
00:02:02,750 --> 00:02:06,860
So there are some different possibilities when we want to categorize sentiment.

27
00:02:07,640 --> 00:02:10,669
However, note that this does not always have to be the case.

28
00:02:11,720 --> 00:02:18,270
It would be just as natural, for instance, to treat sentiment analysis like a regression task. Perhaps

29
00:02:18,290 --> 00:02:23,840
zero would represent neutral sentiment, greater than zero would be positive, and less than zero would

30
00:02:23,840 --> 00:02:24,440
be negative.
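As a minimal sketch of that convention, here is how a real-valued score could be mapped back to a label. The function name `score_to_label` and the `tolerance` parameter are assumptions made up for illustration; they do not come from any particular library.

```python
def score_to_label(score: float, tolerance: float = 0.0) -> str:
    """Map a real-valued sentiment score to a coarse label.

    Zero is neutral, above zero is positive, below zero is negative.
    `tolerance` (a hypothetical knob) widens the neutral band around zero.
    """
    if score > tolerance:
        return "positive"
    if score < -tolerance:
        return "negative"
    return "neutral"


for score in (1.2, -0.7, 0.0):
    print(score, score_to_label(score))
```

With a nonzero `tolerance`, small scores on either side of zero would also count as neutral, which may be preferable when the regression output is noisy.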

31
00:02:25,400 --> 00:02:31,970
In fact, one common and related application in machine learning is the recommender system. In these

32
00:02:31,970 --> 00:02:32,570
systems,

33
00:02:32,570 --> 00:02:35,900
we typically analyze and try to predict ratings.

34
00:02:36,530 --> 00:02:42,020
One very common rating system, which I'm sure you've seen and which is used for hotels, movies and

35
00:02:42,020 --> 00:02:48,770
online courses, goes from one to five stars, five stars being the best and one star being the worst.

36
00:02:50,310 --> 00:02:56,190
Interestingly, in these scenarios, we normally treat it like a regression task, which is interesting

37
00:02:56,190 --> 00:03:00,870
because there's no real difference between five different star ratings and five different sentiment

38
00:03:00,870 --> 00:03:01,680
categories.

39
00:03:02,160 --> 00:03:06,840
In other words, why do we treat this like regression in recommender systems, but we treat it like

40
00:03:06,840 --> 00:03:09,090
classification in sentiment analysis?

41
00:03:10,230 --> 00:03:16,380
In any case, I would personally opt to do a regression in the case where you have many categories since

42
00:03:16,380 --> 00:03:19,830
the categories are not independent but related on a scale.

43
00:03:20,460 --> 00:03:24,450
We know that five is better than four, four is better than three and so forth.

44
00:03:25,260 --> 00:03:29,070
Likewise, we know that very positive is more positive than positive.

45
00:03:29,400 --> 00:03:31,380
Positive is more positive than neutral.

46
00:03:31,680 --> 00:03:34,320
Neutral is more positive than negative and so forth.

47
00:03:35,070 --> 00:03:40,560
This is unlike other kinds of classification tasks, like predicting an object inside an image.

48
00:03:41,220 --> 00:03:46,710
For example, if we're trying to discriminate between cats, dogs and bicycles, there's no ordering

49
00:03:46,710 --> 00:03:47,820
to these categories.

50
00:03:48,510 --> 00:03:52,680
So for sentiment, it makes more sense to do regression, in my opinion.
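To make the ordering argument concrete, here is a small sketch comparing a classification-style loss with a regression-style loss on ordered categories. The numeric encoding in `SCALE` is an assumption chosen for illustration; any evenly spaced scale would show the same effect.

```python
# Ordered sentiment categories placed on a numeric scale (an assumed encoding).
SCALE = {
    "very negative": -2,
    "negative": -1,
    "neutral": 0,
    "positive": 1,
    "very positive": 2,
}


def zero_one_loss(true_label: str, predicted_label: str) -> int:
    """Classification view: every wrong answer is equally wrong."""
    return 0 if true_label == predicted_label else 1


def squared_error(true_label: str, predicted_label: str) -> int:
    """Regression view: the penalty grows with distance on the scale."""
    return (SCALE[true_label] - SCALE[predicted_label]) ** 2


truth = "very positive"
for guess in ("positive", "very negative"):
    print(guess, zero_one_loss(truth, guess), squared_error(truth, guess))
```

Under the zero-one loss, guessing positive and guessing very negative are equally bad when the truth is very positive. Under the squared error, the near miss costs 1 and the far miss costs 16, which reflects the fact that the categories sit on a scale.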

51
00:03:53,820 --> 00:03:58,890
Thus, if you're the lead data scientist on your project and you have the opportunity to choose the

52
00:03:58,890 --> 00:04:01,860
kind of task you want to do, that might be a good choice.

53
00:04:02,430 --> 00:04:07,320
But if you're given a dataset like we will be in this course, then we'll just use what we have.

54
00:04:11,890 --> 00:04:16,149
OK, so just to be super clear, what is the task of sentiment analysis?

55
00:04:16,990 --> 00:04:23,110
This is a task where we are given a document and we want to predict the sentiment of that document.

56
00:04:23,770 --> 00:04:29,590
As mentioned, the sentiment can be defined as a category like positive or negative or as a real number

57
00:04:29,590 --> 00:04:32,500
like +1.2. Either way,

58
00:04:32,530 --> 00:04:38,260
note that this is another example of supervised learning, since the sentiment will be the target.

59
00:04:39,130 --> 00:04:44,290
Furthermore, this should help give you a sense of how a sentiment analysis dataset would be structured.

60
00:04:44,920 --> 00:04:46,990
As with other supervised learning methods,

61
00:04:47,350 --> 00:04:51,340
we will be given a table of two columns: the text and the target.

62
00:04:51,940 --> 00:04:57,280
The text contains the document we want to make a prediction for, and the target contains the sentiment

63
00:04:57,280 --> 00:04:57,790
value.
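As a rough sketch of that two-column layout, here is a tiny made-up dataset loaded with Python's standard `csv` module. The column names `text` and `target` and the two rows (the example sentences from earlier in this lecture) are assumptions for illustration, not the dataset used in this course.

```python
import csv
import io

# A tiny, made-up dataset in the two-column layout: text and target.
RAW = '''text,target
"Wow, that was a really great film.",positive
"I can't believe I wasted two hours of my life on that film.",negative
'''

rows = list(csv.DictReader(io.StringIO(RAW)))
texts = [row["text"] for row in rows]      # documents the model reads
targets = [row["target"] for row in rows]  # sentiment values to predict

for text, target in zip(texts, targets):
    print(f"{target:>8} | {text}")
```

For a regression setup, the `target` column would simply hold numbers instead of category names; the shape of the table stays the same.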

64
00:04:58,810 --> 00:05:03,940
So at this point, you should already have a pretty decent idea of how you would go about building a

65
00:05:03,940 --> 00:05:06,400
machine learning model to solve this task.

66
00:05:06,820 --> 00:05:07,450
In fact,

67
00:05:07,480 --> 00:05:12,190
note that it is no different from spam detection. As you recall,

68
00:05:12,220 --> 00:05:14,350
this is just another instance of my rule:

69
00:05:14,620 --> 00:05:15,940
all data is the same.

70
00:05:20,630 --> 00:05:26,840
So the next topic of this lecture will be to try and understand why sentiment analysis is useful in

71
00:05:26,840 --> 00:05:27,710
the real world.

72
00:05:28,400 --> 00:05:34,070
As always, I like to think of this in terms of money because most students understand money and most

73
00:05:34,070 --> 00:05:37,790
students understand that, either as a business or as an individual,

74
00:05:38,090 --> 00:05:39,620
your goal is to make more money.

75
00:05:40,490 --> 00:05:45,740
So how can you use sentiment analysis to make more money for your business or yourself?

76
00:05:46,580 --> 00:05:50,060
In order to understand this, it's helpful to look at some examples.

77
00:05:52,370 --> 00:05:57,260
One example is reputation management, which involves monitoring of social media.

78
00:05:58,070 --> 00:06:03,800
Basically, the idea is you want to automatically look at tweets and posts on Reddit, Facebook and

79
00:06:03,800 --> 00:06:06,710
so forth that users are making about your company.

80
00:06:08,060 --> 00:06:14,000
You can then pass these through your sentiment analyzer and create reports showing user sentiment statistics.
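As a minimal sketch of such a report, suppose a sentiment analyzer has already produced one label per post. The function name `sentiment_report` and the sample labels below are hypothetical; the analyzer itself is not shown.

```python
from collections import Counter


def sentiment_report(labels):
    """Return the share of posts per predicted sentiment label."""
    total = len(labels)
    if total == 0:
        return {}
    counts = Counter(labels)
    return {label: count / total for label, count in counts.items()}


# Hypothetical analyzer output for four posts about a company.
predicted = ["positive", "negative", "positive", "neutral"]
for label, share in sentiment_report(predicted).items():
    print(f"{label}: {share:.0%}")
```

In practice you would likely compute these shares per day or per product, so that a change in customer sentiment shows up as a trend rather than a single number.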

81
00:06:15,500 --> 00:06:20,270
Note that this can all be done programmatically, so no one has to go in and read tweets themselves,

82
00:06:20,540 --> 00:06:22,400
which saves your company time and money.

83
00:06:23,180 --> 00:06:27,020
You can then take action based on what your customers have said about your company.

84
00:06:27,890 --> 00:06:30,950
Furthermore, note that this is not limited to your own company.

85
00:06:31,310 --> 00:06:33,620
You can do the same with your competitors as well.

86
00:06:34,160 --> 00:06:39,050
So if you're Apple, you might be interested in the sentiment behind the Google Pixel or the Samsung

87
00:06:39,050 --> 00:06:39,680
Galaxy.

88
00:06:41,340 --> 00:06:46,590
Another example is to predict the sentiment of customers who are using your customer support.

89
00:06:47,220 --> 00:06:52,950
Perhaps the action taken by your customer support team is different based on the sentiment of the customer.

90
00:06:54,590 --> 00:07:01,190
Another popular example is to use sentiment analysis on news articles, tweets and other media concerning

91
00:07:01,190 --> 00:07:05,510
companies and industries whose stock prices might be affected by the news.

92
00:07:06,170 --> 00:07:11,060
So if your model predicts that the sentiment of an article is negative, you might want to consider

93
00:07:11,060 --> 00:07:13,230
selling your stock, and vice versa

94
00:07:13,250 --> 00:07:19,430
if the sentiment of an article is positive. By making more intelligent trades, you can profit by buying

95
00:07:19,430 --> 00:07:21,890
and selling stocks at the right times.