1
00:00:11,110 --> 00:00:17,230
So in this lecture, we will continue looking at our notebook for human activity recognition. In this

2
00:00:17,230 --> 00:00:17,650
video,

3
00:00:17,680 --> 00:00:22,010
our goal is to check how well our model performs using only static features.

4
00:00:22,510 --> 00:00:27,580
That is, we will not use any time series, but only features derived from the time series.

5
00:00:28,210 --> 00:00:30,520
So this is more like traditional machine learning.

6
00:00:31,330 --> 00:00:36,190
We'll also take this opportunity to test the other machine learning models suited to this problem,

7
00:00:36,430 --> 00:00:38,710
like the ones we discussed in the previous section.

8
00:00:39,910 --> 00:00:45,380
So we'll start by importing the StandardScaler from scikit-learn. For tabular data,

9
00:00:45,400 --> 00:00:49,900
it's useful to standardize the features for certain models like neural networks.

10
00:00:52,340 --> 00:00:57,350
The next step is to define a new function called load_features. The job of this function is

11
00:00:57,350 --> 00:01:02,070
to return the train and test features, both of which will be two-dimensional NumPy arrays.

12
00:01:05,110 --> 00:01:10,740
Inside the function, we'll begin by using pd.read_csv to load in the X_train file.

13
00:01:14,240 --> 00:01:17,570
The next step is to convert the DataFrame into a NumPy array.

14
00:01:20,180 --> 00:01:23,130
The next step is to do the same process for the test set.

15
00:01:27,470 --> 00:01:32,240
The next step is to use the StandardScaler to standardize both the training and test features.
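
Here's a minimal sketch of what a function like this might look like. Note that the file paths, the whitespace separator, the missing header row, and the synthetic demo data are all assumptions for illustration, not the notebook's actual code.

```python
import os
import tempfile

import numpy as np
import pandas as pd
from sklearn.preprocessing import StandardScaler


def load_features(train_path, test_path):
    # Assumption: whitespace-separated numeric files with no header row.
    df_train = pd.read_csv(train_path, header=None, sep=r"\s+")
    df_test = pd.read_csv(test_path, header=None, sep=r"\s+")

    # Convert the DataFrames into 2-D NumPy arrays.
    feat_train = df_train.to_numpy()
    feat_test = df_test.to_numpy()

    # Fit the scaler on the training features only, then apply the
    # same transformation to the test features.
    scaler = StandardScaler()
    feat_train = scaler.fit_transform(feat_train)
    feat_test = scaler.transform(feat_test)
    return feat_train, feat_test


# Demo on synthetic data written to temporary files (hypothetical names).
rng = np.random.default_rng(0)
tmp_dir = tempfile.mkdtemp()
train_path = os.path.join(tmp_dir, "train_features.txt")
test_path = os.path.join(tmp_dir, "test_features.txt")
np.savetxt(train_path, rng.normal(5.0, 2.0, size=(50, 4)))
np.savetxt(test_path, rng.normal(5.0, 2.0, size=(20, 4)))

feat_train, feat_test = load_features(train_path, test_path)

# The feature dimensionality is the number of columns in either array.
D = feat_train.shape[1]
print(feat_train.shape, feat_test.shape, D)
```

Fitting the scaler on the training set only keeps test-set statistics out of the preprocessing step.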

16
00:01:37,820 --> 00:01:41,720
The next step is to call our new function to get the train and test features.

17
00:01:47,370 --> 00:01:52,170
The next step is to determine the dimensionality of our features, which is the number of columns in

18
00:01:52,170 --> 00:01:53,040
either array.

19
00:01:57,120 --> 00:02:01,260
The next step is to create a feedforward ANN, which you already know how to do.

20
00:02:05,790 --> 00:02:09,990
The next step is to call the compile function, which has the same arguments as before.

21
00:02:14,700 --> 00:02:20,550
The next step is to create a checkpoint. Note that we've given this a new file name so as not to conflict

22
00:02:20,760 --> 00:02:22,070
with the existing file.

23
00:02:26,340 --> 00:02:30,060
The next step is to call the fit function to begin the training process.

24
00:02:38,950 --> 00:02:41,220
The next step is to plot the loss per epoch.

25
00:02:46,080 --> 00:02:49,890
OK, so again, the train loss is much better than the test loss.

26
00:02:53,190 --> 00:02:55,920
The next step is to plot the accuracy per epoch.

27
00:03:00,090 --> 00:03:05,690
Interestingly, this model seems to do pretty well. It seems to do better than the time series version.

28
00:03:10,070 --> 00:03:12,380
The next step is to load in our best model.

29
00:03:16,660 --> 00:03:20,230
The next step is to compute our test predictions and the accuracy.

30
00:03:27,340 --> 00:03:34,090
OK, so you can see that our suspicions are correct. In fact, using the features led to a better-performing

31
00:03:34,090 --> 00:03:36,700
model compared to using the time series.

32
00:03:39,490 --> 00:03:43,090
The next step is to try a few machine learning models from scikit-learn.

33
00:03:47,660 --> 00:03:50,160
So we'll start by testing logistic regression.

34
00:03:50,990 --> 00:03:54,660
Note that for some reason, the default optimizer didn't work that well.

35
00:03:54,830 --> 00:03:56,810
So I've set the solver to liblinear.

36
00:04:03,530 --> 00:04:06,250
OK, so the train accuracy seems pretty high.

37
00:04:10,480 --> 00:04:15,550
And we see that the test accuracy is also pretty high, even better than our neural network.

38
00:04:16,390 --> 00:04:20,140
This suggests that our data might have close to a linear decision boundary.
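
As a rough sketch, the logistic regression step might look something like the following. The synthetic data from make_classification is an assumption standing in for the real features, and it is binary to keep the example simple, so the printed scores say nothing about the actual dataset.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Assumption: synthetic tabular data in place of the lecture's features.
X, y = make_classification(
    n_samples=600, n_features=20, n_informative=10, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Standardize using statistics from the training set only.
scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)

# Use the liblinear solver instead of the default.
model = LogisticRegression(solver="liblinear")
model.fit(X_train, y_train)

train_acc = model.score(X_train, y_train)
test_acc = model.score(X_test, y_test)
print("train accuracy:", train_acc)
print("test accuracy:", test_acc)
```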

39
00:04:23,400 --> 00:04:26,190
The next step is to test the support vector machine.

40
00:04:29,680 --> 00:04:31,660
Again, the train score is pretty good.

41
00:04:36,250 --> 00:04:39,850
So the test score is also good, but not as good as logistic regression.

42
00:04:42,760 --> 00:04:48,400
So to test this idea that the data might be close to linearly separable, we're going to test the support

43
00:04:48,400 --> 00:04:52,090
vector machine again, but this time we'll use a linear kernel.

44
00:04:56,370 --> 00:04:59,490
So the train score is better when we use a linear kernel.

45
00:05:04,190 --> 00:05:07,190
And again, our test score beats the nonlinear version.
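
To make the comparison concrete, here's a small sketch that fits a support vector machine with the default RBF kernel and then with a linear kernel. The synthetic data is an assumption for illustration only, so which kernel wins here tells you nothing about the real features.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Assumption: synthetic tabular data in place of the lecture's features.
X, y = make_classification(
    n_samples=600, n_features=20, n_informative=10, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)

# SVC defaults to the nonlinear RBF kernel.
svm_rbf = SVC()
svm_rbf.fit(X_train, y_train)

# Same model class, but restricted to a linear decision boundary.
svm_linear = SVC(kernel="linear")
svm_linear.fit(X_train, y_train)

scores = {
    "rbf": (svm_rbf.score(X_train, y_train), svm_rbf.score(X_test, y_test)),
    "linear": (
        svm_linear.score(X_train, y_train),
        svm_linear.score(X_test, y_test),
    ),
}
for name, (train_score, test_score) in scores.items():
    print(name, "train:", train_score, "test:", test_score)
```

If the linear kernel matches or beats the RBF kernel on the test set, that is some evidence the classes are close to linearly separable in feature space.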

46
00:05:10,490 --> 00:05:12,650
The next step is to test the random forest.

47
00:05:17,870 --> 00:05:21,770
So, as usual, the random forest gets a perfect score on the train set.

48
00:05:26,270 --> 00:05:29,750
Unfortunately, the random forest performed poorly on the test set.
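
Here's a minimal sketch of how you might check for that gap between train and test scores. Again, the synthetic data and the random_state value are assumptions for illustration, so the exact numbers will differ from the notebook.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Assumption: synthetic tabular data in place of the lecture's features.
X, y = make_classification(
    n_samples=600, n_features=20, n_informative=10, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Tree-based models do not need standardized inputs, so we skip scaling.
forest = RandomForestClassifier(random_state=0)
forest.fit(X_train, y_train)

train_score = forest.score(X_train, y_train)
test_score = forest.score(X_test, y_test)

# A large gap between the two scores is a sign of overfitting.
print("train:", train_score)
print("test:", test_score)
print("gap:", train_score - test_score)
```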

49
00:05:31,730 --> 00:05:33,800
OK, so what have we learned in this lecture?

50
00:05:34,430 --> 00:05:38,630
We've learned that feature engineering is very useful. Working with raw time

51
00:05:38,630 --> 00:05:39,710
series sounds cool,

52
00:05:39,710 --> 00:05:45,080
but sometimes the more practical approach might be to just make some features and use regular machine

53
00:05:45,080 --> 00:05:48,860
learning. Also, don't discount linear models.

54
00:05:49,130 --> 00:05:51,760
They should always be part of your list of things to try.
