1
00:00:11,710 --> 00:00:16,810
In this lecture we are going to discuss an application of RNNs that I think would be nice because

2
00:00:16,810 --> 00:00:20,180
most other resources don't cover anything like this.

3
00:00:20,290 --> 00:00:25,780
In particular we are going to look at how to apply RNNs to image classification.

4
00:00:25,780 --> 00:00:27,850
Now you might think that's pretty weird.

5
00:00:27,850 --> 00:00:34,290
I thought CNNs were for images and RNNs were for sequences, and indeed that's generally the case.

6
00:00:34,420 --> 00:00:37,530
But in this lecture I hope to expand your mind a little bit.

7
00:00:42,620 --> 00:00:45,550
So basically this is all about your imagination.

8
00:00:45,770 --> 00:00:50,350
If you recall, when we took our dumbest possible approach, what did we do?

9
00:00:50,360 --> 00:00:55,060
Originally we considered tabular data such as what you might collect in a survey.

10
00:00:55,370 --> 00:00:59,190
Each column just represents your answers to my survey questions.

11
00:00:59,210 --> 00:01:01,380
How long did you study for my math exam?

12
00:01:01,400 --> 00:01:06,080
How many hours did you spend playing video games? Have you completed all the assignments in the course

13
00:01:06,080 --> 00:01:07,670
so far? And so on.

14
00:01:09,140 --> 00:01:13,050
These are called features and they make up the input feature vector.

15
00:01:13,190 --> 00:01:18,620
Well the dumbest possible approach for dealing with images was just to flatten the pixels of the image

16
00:01:18,650 --> 00:01:21,050
and pretend that it's a feature vector.

17
00:01:21,050 --> 00:01:25,230
Pretend that each pixel value is an answer to a survey question.

18
00:01:25,340 --> 00:01:30,460
In other words, this is all make-believe. You are just using your imagination.

19
00:01:30,500 --> 00:01:33,470
We did the same thing with time series sequences.

20
00:01:33,470 --> 00:01:39,440
Instead of thinking of it as a sequence, if we just want to pass the data into an ANN, then we just pretend

21
00:01:39,440 --> 00:01:45,510
the sequence is a feature vector and that each value of the sequence is your answer to a survey question.

22
00:01:45,680 --> 00:01:47,270
Again all make believe.

23
00:01:47,420 --> 00:01:50,300
Stock prices are not answers to a survey question.

24
00:01:50,390 --> 00:01:56,670
It's just imaginary.

25
00:01:56,710 --> 00:02:01,790
So how about we think of other kinds of data using this imaginary perspective.

26
00:02:01,840 --> 00:02:03,070
Here's what we know.

27
00:02:03,360 --> 00:02:06,750
A multi-dimensional time series is a T by D matrix.

28
00:02:06,760 --> 00:02:11,890
This is a two-dimensional matrix where each column is a time series.

29
00:02:11,890 --> 00:02:15,900
By the way, note that this image is a little misleading because each time series is a row.

30
00:02:16,000 --> 00:02:17,730
It goes from left to right.

31
00:02:17,740 --> 00:02:20,470
This is just due to the way we visualize time series.

32
00:02:20,470 --> 00:02:25,070
It would look very strange if I gave you a time series that went from top to bottom.

33
00:02:25,240 --> 00:02:26,430
So just keep that in mind.

34
00:02:26,440 --> 00:02:32,860
A time series stored in a matrix would go along the columns because the number of rows T is the number

35
00:02:32,860 --> 00:02:33,850
of time steps.

36
00:02:38,990 --> 00:02:44,260
Now consider a black-and-white image such as what we have in MNIST and Fashion MNIST.

37
00:02:44,270 --> 00:02:46,910
This is a height by width matrix.

38
00:02:46,910 --> 00:02:51,290
Importantly, it is also a two-dimensional matrix.

39
00:02:51,290 --> 00:02:52,120
Each (i, j)th

40
00:02:52,130 --> 00:02:57,500
entry of the matrix is the pixel intensity at row i, column j.

41
00:02:57,680 --> 00:02:59,120
But here's the trick.

42
00:02:59,360 --> 00:03:05,930
Since a multi-dimensional time series is a two dimensional matrix and an image is also a two dimensional

43
00:03:05,930 --> 00:03:06,850
matrix,

44
00:03:07,010 --> 00:03:13,730
how about we just use our imagination to pretend that an image is a multi-dimensional time series?

45
00:03:18,910 --> 00:03:23,470
Using this method, you can think of the RNN as like an image scanner.

46
00:03:23,470 --> 00:03:29,620
It scans each pixel of the image from top to bottom looking at each row one at a time and updating the

47
00:03:29,620 --> 00:03:33,480
hidden state based on each row it encounters.

48
00:03:33,490 --> 00:03:36,820
From here, our code preparation doesn't involve anything we haven't seen already.
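
To make the scanner picture concrete, here is a rough sketch of my own, not the lecture code. It uses a plain RNN update (not an LSTM) in NumPy, with a random array standing in for one 28 by 28 image. The names `scan_image`, `Wx`, `Wh`, `b` and the hidden size `M = 16` are assumptions chosen for illustration.

```python
import numpy as np

def scan_image(image, Wx, Wh, b):
    """Run a plain RNN over the rows of one image, top to bottom."""
    T, D = image.shape            # T rows act as time steps, D columns as features
    h = np.zeros(Wh.shape[0])     # initial hidden state
    for t in range(T):
        x_t = image[t]            # row t is the D-dimensional input at "time" t
        h = np.tanh(x_t @ Wx + h @ Wh + b)  # update hidden state from this row
    return h                      # final hidden state summarizes the whole image

rng = np.random.default_rng(0)
T, D, M = 28, 28, 16              # M (hidden size) is an arbitrary choice
image = rng.random((T, D))        # stand-in for one grayscale image
Wx = rng.normal(scale=0.1, size=(D, M))   # input-to-hidden weights (random, untrained)
Wh = rng.normal(scale=0.1, size=(M, M))   # hidden-to-hidden weights (random, untrained)
b = np.zeros(M)

h_final = scan_image(image, Wx, Wh, b)
print(h_final.shape)  # (16,)
```

Nothing is trained here. The point is only that the loop visits one row per step, exactly as it would visit one time step of a T by D time series.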

49
00:03:41,970 --> 00:03:44,010
Step number one is to load in the data.

50
00:03:44,070 --> 00:03:46,320
That's just MNIST. From this,

51
00:03:46,320 --> 00:03:51,180
we get our input data as an array which is of size N by T by D.

52
00:03:51,390 --> 00:03:54,640
Here T is 28, but D is also 28.

53
00:03:54,870 --> 00:04:01,410
Step number two is to instantiate our model. For this we can use an LSTM network and have the final

54
00:04:01,410 --> 00:04:04,860
dense layer with 10 outputs and a softmax activation.

55
00:04:05,730 --> 00:04:10,860
After this we call fit as usual and plot the loss and accuracy per iteration.
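
The steps just described might look roughly like this in Keras. This is one possible sketch, not necessarily the code shown later in the course. The function names `build_model` and `train`, the hidden size `M = 128`, the Adam optimizer, and the epoch count are all my assumptions.

```python
import tensorflow as tf

def build_model(T=28, D=28, K=10, M=128):
    """LSTM over the T rows of an image, then a K-way softmax."""
    i = tf.keras.layers.Input(shape=(T, D))   # each image is a T x D "time series"
    x = tf.keras.layers.LSTM(M)(i)            # final hidden state only
    x = tf.keras.layers.Dense(K, activation="softmax")(x)
    model = tf.keras.Model(i, x)
    model.compile(
        optimizer="adam",                      # assumed optimizer
        loss="sparse_categorical_crossentropy",
        metrics=["accuracy"],
    )
    return model

def train(epochs=10):
    """Step 1: load MNIST. Step 2: build the model. Step 3: fit."""
    (x_train, y_train), (x_test, y_test) = tf.keras.datasets.mnist.load_data()
    x_train, x_test = x_train / 255.0, x_test / 255.0   # shape N x 28 x 28
    model = build_model()
    r = model.fit(
        x_train, y_train,
        validation_data=(x_test, y_test),
        epochs=epochs,
    )
    return model, r
```

Calling `train()` downloads MNIST and fits the model. The returned `r.history` dictionary holds the per-epoch `loss`, `val_loss`, `accuracy`, and `val_accuracy` values that can be plotted.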

56
00:04:10,860 --> 00:04:12,110
So very easy.

57
00:04:12,150 --> 00:04:16,790
All it required was a different way of thinking about images.

58
00:04:16,800 --> 00:04:21,840
As an exercise, you might want to try implementing this before looking at my code to practice what you've learned so

59
00:04:21,840 --> 00:04:23,010
far.

60
00:04:23,010 --> 00:04:25,620
Remember this doesn't require anything new.

61
00:04:25,620 --> 00:04:28,380
Just putting together old things in new ways.

62
00:04:31,080 --> 00:04:36,900
As a side note, you might also want to try global max pooling as well to see if it improves performance.
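
To illustrate what global max pooling would change, here is a small NumPy sketch under my own assumptions. `H` is a random stand-in for the hidden states after each of the T rows, and `M = 16` is an arbitrary hidden size. In Keras this would correspond to using `LSTM(M, return_sequences=True)` followed by `GlobalMaxPooling1D()`.

```python
import numpy as np

rng = np.random.default_rng(0)
T, M = 28, 16
H = rng.normal(size=(T, M))   # stand-in: one hidden state per row of the image

last_state = H[-1]            # what the network uses without pooling: final step only
pooled = H.max(axis=0)        # global max pooling: max over all T steps, per hidden unit

print(last_state.shape, pooled.shape)  # (16,) (16,)
```

Both vectors have size M, so the dense layer after them does not need to change. The pooled version just lets every row of the image contribute, not only the last one.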
