1
00:00:11,080 --> 00:00:17,140
In this video, we will be doing code preparation for CNNs in order to preview the code that we will

2
00:00:17,140 --> 00:00:18,010
write next.

3
00:00:18,430 --> 00:00:23,860
As before, the main purpose of this lecture is to show you the syntax so that you are not surprised

4
00:00:23,860 --> 00:00:25,030
when we use it later.

5
00:00:25,540 --> 00:00:30,490
Now, whether or not you understand what this code is doing will depend on whether you watched the

6
00:00:30,490 --> 00:00:31,530
previous lectures.

7
00:00:31,870 --> 00:00:35,260
If you did, then you will know what's going on under the hood.

8
00:00:35,500 --> 00:00:37,210
If not, that's OK too.

9
00:00:37,420 --> 00:00:39,610
You can simply treat this like an API.

10
00:00:44,320 --> 00:00:50,050
OK, so the first thing to discuss is that the structure of this code is no different than what we had

11
00:00:50,050 --> 00:00:50,870
for ANNs.

12
00:00:52,030 --> 00:00:57,760
You'll find that the setup for this video, meaning the previous conceptual lectures, is way more meaty

13
00:00:57,760 --> 00:00:59,420
than the actual implementation.

14
00:01:00,010 --> 00:01:04,420
Thanks to our hard work in the previous section, we already know most of what to do.

15
00:01:05,500 --> 00:01:09,600
The new parts are really just learning about the syntax for the new layers.

16
00:01:10,000 --> 00:01:13,630
So without further ado, let's review what we've already learned.

17
00:01:14,170 --> 00:01:17,020
We know that the first step is to build our model.

18
00:01:17,380 --> 00:01:21,160
The syntax for that is precisely what we will learn very shortly.

19
00:01:22,180 --> 00:01:24,430
The next step is to call the compile function.

20
00:01:24,970 --> 00:01:30,520
The arguments to this, such as the loss, will be dependent on what task you are doing, but it works

21
00:01:30,520 --> 00:01:31,810
the same way as before.

22
00:01:32,860 --> 00:01:34,810
The next step is to call the fit function.

23
00:01:35,350 --> 00:01:40,480
This time, the input data X_train and X_test are now N by T by D arrays.

24
00:01:41,530 --> 00:01:45,130
After training the model, we would like to use it to make predictions.

25
00:01:45,550 --> 00:01:49,280
Thus, we use the predict function, passing in the input data

26
00:01:49,450 --> 00:01:55,930
we would like to make predictions for. OK, so this is exactly the same as before, except that now the

27
00:01:55,930 --> 00:01:57,950
data has the shape N by T by D.
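
The compile, fit, and predict steps described above can be sketched as follows. This is a minimal illustration, not the lecture's actual code: the sizes N, T, and D, the random data, the MSE loss, and the tiny placeholder model are all assumptions chosen only to make the example run.

```python
import numpy as np
from tensorflow.keras.layers import Input, Conv1D, GlobalMaxPooling1D, Dense
from tensorflow.keras.models import Model

# Hypothetical sizes: N samples, each a time series of length T with D features.
N, T, D = 64, 20, 3
X_train = np.random.randn(N, T, D).astype("float32")
y_train = np.random.randn(N, 1).astype("float32")
X_test = np.random.randn(16, T, D).astype("float32")

# Placeholder model; the layer syntax is covered later in the lecture.
i = Input(shape=(T, D))
x = Conv1D(8, 3, activation="relu")(i)
x = GlobalMaxPooling1D()(x)
x = Dense(1)(x)
model = Model(i, x)

# Same three calls as for an ANN: compile, fit, predict.
model.compile(loss="mse", optimizer="adam")  # loss depends on the task (assumed regression here)
model.fit(X_train, y_train, epochs=2, verbose=0)
preds = model.predict(X_test, verbose=0)
print(preds.shape)
```

The only difference from the ANN workflow is the shape of the arrays passed to fit and predict.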

28
00:02:02,780 --> 00:02:06,090
The next step is to understand the syntax for building a CNN.

29
00:02:06,770 --> 00:02:08,610
So this should be pretty straightforward.

30
00:02:09,140 --> 00:02:13,310
We'll start with what a convolutional layer looks like. Mathematically,

31
00:02:13,310 --> 00:02:18,740
you can think of this as doing W * X + b and then passing it through some activation function.

32
00:02:19,670 --> 00:02:22,730
As you recall, the star means convolution.

33
00:02:23,720 --> 00:02:27,340
In this case, W is the filter and B is the bias term.

34
00:02:27,890 --> 00:02:31,820
As per usual, the activation function is typically ReLU.
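
To make the W * X + b formula concrete, here is a small pure-Python sketch for a single input feature and a single filter. The function name conv1d_relu and the example numbers are made up for illustration. As is conventional in deep learning, the "convolution" slides the filter without flipping it, and no padding is used.

```python
def conv1d_relu(x, w, b):
    """Slide filter w over sequence x, add bias b, then apply ReLU."""
    k = len(w)
    out = []
    for t in range(len(x) - k + 1):  # no padding: output is shorter than the input
        z = sum(w[i] * x[t + i] for i in range(k)) + b
        out.append(max(z, 0.0))  # ReLU clips negative values to zero
    return out

x = [3.0, 1.0, 2.0, 5.0, 0.0]  # input sequence (T = 5)
w = [-1.0, 0.0, 1.0]           # filter of size 3
b = 0.5                        # bias term
y = conv1d_relu(x, w, b)
print(y)  # [0.0, 4.5, 0.0]
```

A real convolutional layer does the same thing across all D input features and for many filters at once.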

35
00:02:33,440 --> 00:02:36,500
OK, so let's check out the arguments for Conv1D.

36
00:02:37,070 --> 00:02:41,870
The first argument is the number of output feature maps. As mentioned previously,

37
00:02:41,990 --> 00:02:45,980
this usually grows larger at every subsequent layer of the CNN.

38
00:02:47,140 --> 00:02:53,120
The next argument is the filter size. Typically, we choose small values relative to the input size.

39
00:02:53,350 --> 00:02:56,350
So normally you see values like three, five and seven.

40
00:02:57,870 --> 00:03:04,080
The next argument is the activation. As mentioned, in modern times, we typically just use the ReLU.

41
00:03:05,100 --> 00:03:06,000
At this point,

42
00:03:06,030 --> 00:03:07,950
these are the only arguments we need to know.
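
The three arguments just described might look like this in Keras. The values 32 and 3 and the input sizes are arbitrary choices for illustration, not values taken from the lecture.

```python
import numpy as np
from tensorflow.keras.layers import Conv1D

# 32 output feature maps, filter size 3, ReLU activation.
layer = Conv1D(32, 3, activation="relu")

# Hypothetical batch: N=4 samples, T=100 time steps, D=5 features.
x = np.random.randn(4, 100, 5).astype("float32")
y = layer(x)
print(y.shape)  # (4, 98, 32)
```

With the default settings there is no padding, so the time dimension shrinks from 100 to 100 - 3 + 1 = 98, while the feature dimension becomes the number of feature maps.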

43
00:03:12,760 --> 00:03:18,670
Also note that for two dimensional convolution, the arguments would be exactly the same, except now

44
00:03:18,670 --> 00:03:21,970
the layer is called Conv2D instead of Conv1D.

45
00:03:23,360 --> 00:03:28,730
One option, which is normally not used, is that you can specify a different height and width for the

46
00:03:28,730 --> 00:03:33,530
kernel, but if you only pass in a single number, it assumes that they are the same.
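
As a small sketch of the kernel size option, the two layers below differ only in how the filter size is given. The specific numbers are arbitrary.

```python
from tensorflow.keras.layers import Conv2D

# A single number means the kernel height and width are the same.
square = Conv2D(32, 3, activation="relu")

# A tuple specifies a different height and width (less common).
rect = Conv2D(32, (3, 5), activation="relu")

print(tuple(square.kernel_size), tuple(rect.kernel_size))
```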

47
00:03:38,340 --> 00:03:45,030
OK, so here's an example of a full CNN using the Conv1D layer that we just learned about. You

48
00:03:45,030 --> 00:03:48,830
should be able to match this code with the previous diagrams we've seen.

49
00:03:50,130 --> 00:03:53,840
So we start with an input layer which specifies the input size.

50
00:03:54,240 --> 00:03:59,520
As mentioned, this is T by D for a D-dimensional time series of length T.

51
00:04:00,180 --> 00:04:03,750
The next step is to use a Conv1D. As before,

52
00:04:03,750 --> 00:04:08,060
all these values are hyperparameters which can be freely chosen for the most part.

53
00:04:09,880 --> 00:04:15,760
The next step is to use pooling. By default, if you just pass in two, it assumes that the stride is

54
00:04:15,760 --> 00:04:19,120
also two, such that the time dimension will shrink by half.
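
The halving of the time dimension can be checked with a short sketch. The input shape here is an arbitrary assumption.

```python
import numpy as np
from tensorflow.keras.layers import MaxPooling1D

# Passing in 2 sets the pool size; the stride defaults to the pool size.
pool = MaxPooling1D(2)

# Hypothetical feature maps: N=4, T=98, 32 features.
x = np.random.randn(4, 98, 32).astype("float32")
y = pool(x)
print(y.shape)  # (4, 49, 32)
```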

55
00:04:21,140 --> 00:04:27,200
The next step is to do another Conv1D. Notice that the number of feature maps is increasing, as

56
00:04:27,200 --> 00:04:30,950
is typical. The next step is to do another max pool.

57
00:04:32,060 --> 00:04:36,020
The next step is to do yet another Conv1D. Notice

58
00:04:36,020 --> 00:04:38,480
the number of feature maps has doubled once again.

59
00:04:39,850 --> 00:04:45,370
Now, remember that the number of convolutional layers is up to you. It's your job to experiment and

60
00:04:45,370 --> 00:04:48,510
find out what works best for your particular data set.

61
00:04:49,660 --> 00:04:54,750
At this point, we're going to stop doing convolutions and have one final global max pooling.

62
00:04:55,330 --> 00:04:57,640
Alternatively, you could just use flatten.

63
00:04:58,450 --> 00:05:04,630
The final step is to have our final dense layer with however many outputs we need for the current task.

64
00:05:05,200 --> 00:05:10,460
OK, so that's everything you need to know in order to build a 1D convolutional neural network.
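
Putting the steps above together, a 1D CNN along these lines might look like the sketch below. The code shown on screen in the lecture is not part of this transcript, so the sizes T, D, and K and the feature map counts 32, 64, and 128 are assumptions that only follow the described pattern of doubling at each layer.

```python
from tensorflow.keras.layers import (
    Input, Conv1D, MaxPooling1D, GlobalMaxPooling1D, Dense
)
from tensorflow.keras.models import Model

# Hypothetical sizes: series length T, D features, K outputs.
T, D, K = 100, 5, 3

i = Input(shape=(T, D))                   # input is T by D
x = Conv1D(32, 3, activation="relu")(i)   # first convolution
x = MaxPooling1D(2)(x)                    # time dimension shrinks by half
x = Conv1D(64, 3, activation="relu")(x)   # feature maps double
x = MaxPooling1D(2)(x)
x = Conv1D(128, 3, activation="relu")(x)  # feature maps double again
x = GlobalMaxPooling1D()(x)               # alternatively, Flatten()
x = Dense(K)(x)                           # as many outputs as the task needs

model = Model(i, x)
print(model.output_shape)  # (None, 3)
```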

65
00:05:10,960 --> 00:05:12,220
I hope you've seen that once

66
00:05:12,220 --> 00:05:13,590
you know how to build ANNs,

67
00:05:13,750 --> 00:05:15,310
CNNs are pretty simple.

68
00:05:19,970 --> 00:05:25,550
Now, for completeness' sake, I'd also like to show you at this point how to build a CNN for images.

69
00:05:25,940 --> 00:05:29,770
As you recall, it's possible to convert a time series into an image.

70
00:05:30,170 --> 00:05:32,590
I hope you'll agree that this is surprisingly simple.

71
00:05:33,200 --> 00:05:35,370
Only two changes need to take place.

72
00:05:36,560 --> 00:05:41,930
Firstly, the shape of the input now has three dimensions: height, width, and number of features.

73
00:05:42,890 --> 00:05:48,050
Now, for typical everyday images, this value is usually three, since we use three-channel colour

74
00:05:48,050 --> 00:05:54,140
systems in cameras and computers, for example, RGB or HSV. For time series,

75
00:05:54,170 --> 00:05:57,890
this will simply be however many variables your time series has.

76
00:05:59,290 --> 00:06:04,870
The second thing that has to change is that anywhere you previously saw 1D, you now see 2D.

77
00:06:05,440 --> 00:06:08,410
So, as mentioned, whichever way you want to treat your time series,

78
00:06:08,590 --> 00:06:10,870
the CNN itself looks nearly the same.
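
The two changes described above, a three-dimensional input shape and 2D layers in place of 1D layers, could be sketched as follows. The image size H by W by C, the number of outputs K, and the feature map counts are assumptions for illustration only.

```python
from tensorflow.keras.layers import (
    Input, Conv2D, MaxPooling2D, GlobalMaxPooling2D, Dense
)
from tensorflow.keras.models import Model

# Hypothetical sizes: height H, width W, C features (3 for colour images), K outputs.
H, W, C, K = 32, 32, 3, 10

i = Input(shape=(H, W, C))                # change 1: input has three dimensions
x = Conv2D(32, 3, activation="relu")(i)   # change 2: every 1D layer becomes 2D
x = MaxPooling2D(2)(x)
x = Conv2D(64, 3, activation="relu")(x)
x = MaxPooling2D(2)(x)
x = Conv2D(128, 3, activation="relu")(x)
x = GlobalMaxPooling2D()(x)
x = Dense(K)(x)

model = Model(i, x)
print(model.output_shape)  # (None, 10)
```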