1
00:00:05,480 --> 00:00:13,160
‫Now, let's create the structure of our first artificial neural network model.

2
00:00:13,340 --> 00:00:18,950
‫Before starting, let's set the random seed to 42 using these two statements.

3
00:00:20,120 --> 00:00:24,460
‫Random seed is used to replicate the same result every time.
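
The two seeding statements themselves are not captured in this transcript. As a rough sketch of what seeding does, assuming NumPy is the source of randomness (in a TensorFlow setup, `tf.random.set_seed(42)` would typically be called as well), something like this shows the effect:

```python
import numpy as np

# Seed NumPy's global random number generator (42 is an arbitrary choice).
np.random.seed(42)
first_draw = np.random.rand(3)

# Re-seeding with the same number restarts the same random sequence.
np.random.seed(42)
second_draw = np.random.rand(3)

# Both draws are identical because the seed was identical.
print(np.array_equal(first_draw, second_draw))  # True
```

Any integer works as the seed, as long as the same one is reused on later runs.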

4
00:00:25,430 --> 00:00:32,900
‫You can use any number instead of 42. If you use that same number in the future when you run the same

5
00:00:32,900 --> 00:00:33,260
‫code,

6
00:00:33,620 --> 00:00:38,210
‫you will get the same output. As we discussed in the theory lecture,

7
00:00:38,330 --> 00:00:45,680
‫there are multiple occasions where our neural network generates random numbers, such as when assigning the

8
00:00:45,680 --> 00:00:53,660
‫initial weights. Using a random seed will help you reproduce the same result

9
00:00:53,870 --> 00:00:59,240
‫by using the same initial weights every time. Let's just run this.

10
00:01:01,940 --> 00:01:09,080
‫So for our problem, we have observations in the form of 28 by 28 pixels.

11
00:01:10,250 --> 00:01:12,960
‫The observations are in the form of a 2D array.

12
00:01:14,150 --> 00:01:18,440
‫And as an output, we want ten categories.

13
00:01:19,460 --> 00:01:21,710
‫These categories are exclusive.

14
00:01:22,970 --> 00:01:29,870
‫That means a single image can be a T-shirt, a top, or a boot, but not more than one of them.

15
00:01:33,460 --> 00:01:35,140
‫This is what we are planning to do.

16
00:01:37,450 --> 00:01:44,710
‫We first convert our 2D observations into flat 1D observations.

17
00:01:45,640 --> 00:01:49,390
‫So instead of a 2D array of 28 by 28 pixels,

18
00:01:49,780 --> 00:01:54,100
‫we want 784 pixels in our input layer.
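
To make the flattening step concrete, here is a minimal sketch using NumPy on a placeholder array. The array contents are made up for illustration and are not the course's dataset:

```python
import numpy as np

# A placeholder 28x28 "image" holding the values 0..783 (not real pixel data).
image = np.arange(28 * 28).reshape(28, 28)

# Flatten row by row into a single 1D array of 784 values.
flat = image.reshape(-1)

print(image.shape)  # (28, 28)
print(flat.shape)   # (784,)
```

The second row of the image starts at position 28 of the flat array, so no pixel is lost; only the layout changes.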

19
00:01:55,390 --> 00:01:59,020
‫Then we are going to create two hidden layers.

20
00:02:00,430 --> 00:02:08,950
‫The activation function that we are going to use for the hidden layers will be ReLU. As discussed in

21
00:02:08,950 --> 00:02:09,930
‫the theory lecture,

22
00:02:10,120 --> 00:02:13,780
‫we usually prefer ReLU for the hidden layers of classification models.

23
00:02:15,670 --> 00:02:20,020
‫And in the output layer, since these 10 categories are exclusive

24
00:02:20,720 --> 00:02:23,740
‫and this is a classification model,

25
00:02:23,860 --> 00:02:26,480
‫we will be using softmax activation.

26
00:02:28,720 --> 00:02:32,830
‫We have already discussed these activation types in our theory lecture.

27
00:02:33,100 --> 00:02:35,620
‫That's why we are not going to discuss them here.

28
00:02:38,020 --> 00:02:43,510
‫Now let's start creating this neural network using the Sequential API of Keras.

29
00:02:46,270 --> 00:02:48,940
‫First, we need to create a model object.

30
00:02:50,110 --> 00:02:52,600
‫So our object variable name is model.

31
00:02:53,200 --> 00:02:57,280
‫And we are creating it using this function:

32
00:02:57,360 --> 00:02:59,620
‫keras.models.Sequential.

33
00:03:01,640 --> 00:03:05,390
‫In this sequential object, we can add different layers.

34
00:03:05,930 --> 00:03:08,030
‫We will start with our input layer.

35
00:03:08,540 --> 00:03:12,200
‫We'll move on to hidden layer one, then to hidden layer two.

36
00:03:13,280 --> 00:03:15,170
‫And then to the output layer.

37
00:03:17,240 --> 00:03:22,730
‫So first, for the input layer, we can write model dot add,

38
00:03:23,390 --> 00:03:25,460
‫and then keras dot layers.

39
00:03:26,570 --> 00:03:34,820
‫And then, since we want to convert this 2D array of 28 by 28 pixels to 784 pixels in a single

40
00:03:34,820 --> 00:03:37,010
‫array, we are using Flatten.

41
00:03:39,680 --> 00:03:48,200
‫And then we need to provide the input shape of our X variables. Since our X variable is a 2D

42
00:03:48,200 --> 00:03:50,300
‫array of 28 by 28 pixels,

43
00:03:50,660 --> 00:03:57,380
‫we set input_shape equal to a list of two values, that is, 28

44
00:03:57,460 --> 00:03:58,550
‫comma 28.

45
00:04:01,950 --> 00:04:04,300
‫Then our second layer is the first hidden layer.

46
00:04:06,330 --> 00:04:14,550
‫So in the next step, we are adding another layer, that is, model dot add, then keras dot layers.

47
00:04:14,850 --> 00:04:17,630
‫And since this is a dense layer, we will write Dense.

48
00:04:18,990 --> 00:04:24,060
‫And here we need to mention the number of neurons we want in this layer.

49
00:04:26,100 --> 00:04:29,880
‫So in hidden layer one, we need 300 neurons.

50
00:04:30,240 --> 00:04:32,310
‫That's why we are writing 300.

51
00:04:32,970 --> 00:04:36,180
‫And then we want the ReLU activation function.

52
00:04:36,630 --> 00:04:42,840
‫That's why we are writing activation equal to relu. In the next step,

53
00:04:43,050 --> 00:04:44,760
‫we want another hidden layer.

54
00:04:45,750 --> 00:04:47,760
‫So we are following the same process.

55
00:04:48,480 --> 00:04:51,120
‫That is, we are writing model.add.

56
00:04:52,020 --> 00:04:55,620
‫And in brackets we are writing keras dot layers dot Dense.

57
00:04:56,400 --> 00:04:59,100
‫And in this layer we want 100 neurons.

58
00:05:00,300 --> 00:05:01,380
‫That's why we are writing

59
00:05:01,410 --> 00:05:06,830
‫100 and then activation equal to relu. Since this is also a hidden layer,

60
00:05:07,380 --> 00:05:12,810
‫we want the activation function to be ReLU. Next, in the output layer,

61
00:05:13,320 --> 00:05:15,420
‫we want 10 different categories.

62
00:05:15,630 --> 00:05:18,570
‫That's why we have to add 10 neurons

63
00:05:18,600 --> 00:05:20,040
‫to this layer.

64
00:05:21,360 --> 00:05:24,390
‫And since the classes are exclusive,

65
00:05:25,050 --> 00:05:28,170
‫we have to use softmax activation.

66
00:05:30,540 --> 00:05:33,220
‫So we'll write model dot add,

67
00:05:33,930 --> 00:05:36,680
‫and then keras dot layers dot Dense,

68
00:05:37,200 --> 00:05:42,660
‫and then the number of neurons, which is 10, and activation equal to softmax.
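
Putting the spoken steps together, the model could be written roughly as follows. This is a sketch that assumes TensorFlow 2.x with its bundled Keras; the exact import used on screen is not in the transcript, and newer Keras releases may warn about `input_shape` and suggest a `keras.Input` layer instead:

```python
from tensorflow import keras

# Create an empty Sequential model and add the layers one by one.
model = keras.models.Sequential()

# Input layer: flatten each 28x28 image into a 1D array of 784 values.
model.add(keras.layers.Flatten(input_shape=[28, 28]))

# Hidden layer 1: 300 neurons with ReLU activation.
model.add(keras.layers.Dense(300, activation="relu"))

# Hidden layer 2: 100 neurons with ReLU activation.
model.add(keras.layers.Dense(100, activation="relu"))

# Output layer: 10 neurons, one per exclusive class, with softmax activation.
model.add(keras.layers.Dense(10, activation="softmax"))
```

Layers are applied in the order they are added, so the Flatten layer has to come first.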

69
00:05:43,770 --> 00:05:52,890
‫I hope you remember what ReLU and softmax are. ReLU is zero for all negative inputs and equal to

70
00:05:52,980 --> 00:05:55,800
‫the input for all positive inputs,

71
00:05:57,460 --> 00:06:03,290
‫whereas softmax turns the outputs into class probabilities that sum to one.
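
As a quick refresher, the two activations can be sketched in plain NumPy. The function names and the example scores below are illustrative only and are not taken from Keras:

```python
import numpy as np

def relu(x):
    # Zero for negative inputs, the input itself for positive inputs.
    return np.maximum(0.0, x)

def softmax(z):
    # Subtracting the max improves numerical stability without changing the result.
    exp = np.exp(z - np.max(z))
    return exp / exp.sum()

scores = np.array([-2.0, 0.5, 3.0])  # arbitrary example values

print(relu(scores))            # the negative entry becomes 0
print(softmax(scores).sum())   # sums to 1, up to floating-point rounding
```

The largest score gets the largest softmax probability, which is why the output layer can be read as class probabilities.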

72
00:06:05,920 --> 00:06:13,000
‫If you want an additional hidden layer, you can always add one between any of

73
00:06:13,000 --> 00:06:16,450
‫these layers. In a later part of the course,

74
00:06:17,000 --> 00:06:21,910
‫we will see how to choose the number of neurons in each layer.

75
00:06:24,010 --> 00:06:29,240
‫Let's run this. After creating this model structure,

76
00:06:30,790 --> 00:06:34,030
‫you can look at it using the summary method.

77
00:06:34,330 --> 00:06:39,640
‫So you write your object name, that is, model, and then dot summary.

78
00:06:42,400 --> 00:06:47,990
‫The model summary method displays all the model layers, including each layer's

79
00:06:48,010 --> 00:06:54,340
‫name, its output shape, and its number of parameters.

80
00:06:56,680 --> 00:06:58,570
‫So these are the layer names.

81
00:06:59,230 --> 00:07:00,970
‫Second is the output shape.

82
00:07:01,840 --> 00:07:03,760
‫This is the number of outputs.

83
00:07:04,180 --> 00:07:07,360
‫And this is the batch size of the input.

84
00:07:08,170 --> 00:07:10,190
‫Since we have not fixed the batch size,

85
00:07:10,580 --> 00:07:12,230
‫this is None.

86
00:07:12,640 --> 00:07:15,340
‫None means there is no limit on the number of input samples.

87
00:07:18,060 --> 00:07:29,100
‫And next is the number of trainable parameters. Since our input has 784 variables and we are passing

88
00:07:29,130 --> 00:07:32,220
‫each of these variables into 300 different neurons,

89
00:07:32,790 --> 00:07:36,420
‫we have an individual weight for each of these linkages.

90
00:07:37,050 --> 00:07:42,180
‫So the total number of weights is 784 times 300. In addition,

91
00:07:42,240 --> 00:07:48,360
‫there are 300 bias variables, one associated with each of these neurons.

92
00:07:49,110 --> 00:07:54,390
‫So 784 times 300 plus 300 will give you this number.

93
00:07:55,380 --> 00:08:01,500
‫Our neural network is trying to optimize this many parameters for this layer.

94
00:08:02,610 --> 00:08:05,990
‫Similarly, these are the trainable parameters for this layer.

95
00:08:06,720 --> 00:08:09,760
‫Again, this will be 300 times 100.

96
00:08:10,710 --> 00:08:15,110
‫There are 300 times 100 linkages between these two layers,

97
00:08:15,480 --> 00:08:22,440
‫and each of those linkages has an associated weight. And the neurons in this layer,

98
00:08:22,500 --> 00:08:26,190
‫that is, 100 neurons, have 100 different bias values.

99
00:08:26,730 --> 00:08:29,760
‫So 300 times 100 plus 100.

100
00:08:30,230 --> 00:08:35,130
‫So 30,100 trainable parameters are associated with this layer.

101
00:08:36,240 --> 00:08:41,160
‫Similarly, 1010 trainable parameters are associated with this layer.
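
The parameter arithmetic for the three dense layers can be checked with a few lines of plain Python. The helper name `dense_params` is hypothetical and used only for this illustration:

```python
def dense_params(n_inputs, n_neurons):
    # One weight per input-to-neuron link, plus one bias per neuron.
    return n_inputs * n_neurons + n_neurons

hidden1 = dense_params(784, 300)  # 784 * 300 + 300
hidden2 = dense_params(300, 100)  # 300 * 100 + 100
output = dense_params(100, 10)    # 100 * 10 + 10
total = hidden1 + hidden2 + output

print(hidden1, hidden2, output, total)  # 235500 30100 1010 266610
```

The Flatten layer contributes nothing here because it only reshapes the input and has no weights.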

102
00:08:44,130 --> 00:08:48,960
‫So at the bottom, you get the total number of trainable parameters

103
00:08:49,320 --> 00:08:50,610
‫in this neural network.

104
00:08:50,880 --> 00:08:56,210
‫So our neural network will try to optimize this many parameters to get the best result.

105
00:08:59,350 --> 00:09:08,410
‫Now, if you want to look at our neural network, you can do

106
00:09:08,410 --> 00:09:09,680
‫that using pydot.

107
00:09:10,420 --> 00:09:12,190
‫So you have to import pydot.

108
00:09:12,760 --> 00:09:22,920
‫And if pydot is not installed on your system, you can install it using pip install pydot

109
00:09:23,230 --> 00:09:28,490
‫or conda install pydot in your command prompt.

110
00:09:30,910 --> 00:09:39,250
‫So you just write keras.utils.plot_model and then give your object name.

111
00:09:40,780 --> 00:09:46,960
‫And if you run this, you will get the structure of your neural network.

112
00:09:47,470 --> 00:09:49,270
‫So here we have the input layer.

113
00:09:49,900 --> 00:09:54,000
‫Then we are flattening the 2D array into a 1D array.

114
00:09:54,430 --> 00:09:56,360
‫So that's why we have a Flatten layer.

115
00:09:57,520 --> 00:09:59,830
‫And then we have two dense hidden layers.

116
00:10:00,460 --> 00:10:04,750
‫And we have an output layer which is giving us the class probabilities.

117
00:10:06,850 --> 00:10:12,970
‫So after creating the structure of your neural network, you can also visualize this structure using

118
00:10:12,970 --> 00:10:13,180
‫this

119
00:10:13,180 --> 00:10:13,540
‫command.

120
00:10:16,300 --> 00:10:24,640
‫As I said earlier, our model is trying to optimize weights and biases that are represented by this number

121
00:10:24,940 --> 00:10:25,640
‫to get the output.

122
00:10:27,990 --> 00:10:35,770
‫And if you remember, in the theory lecture we discussed that weights are assigned randomly at initialization.

123
00:10:38,050 --> 00:10:41,650
‫To get the values of those weights and biases,

124
00:10:42,340 --> 00:10:49,810
‫there is a get_weights method that you can use.

125
00:10:52,420 --> 00:10:54,700
‫So I can write my object name,

126
00:10:54,730 --> 00:10:59,920
‫that is, model, and then the layer number. For the second layer,

127
00:11:00,010 --> 00:11:01,140
‫I can write layers

128
00:11:01,750 --> 00:11:06,580
‫and then 1, since the index of the second object is 1.

129
00:11:07,150 --> 00:11:09,390
‫And then I can use the get_weights

130
00:11:09,400 --> 00:11:10,150
‫method.

131
00:11:12,210 --> 00:11:16,270
‫I'm storing this information into two new variables, weights and biases.

132
00:11:16,930 --> 00:11:25,960
‫So if I just output the weights, you can see these are the randomly generated weights.

133
00:11:28,090 --> 00:11:36,070
‫There are 784 times 300 such weights in this layer.

134
00:11:37,630 --> 00:11:40,960
‫So if you just view the shape, you can see that

135
00:11:43,240 --> 00:11:52,580
‫there are 784 rows and 300 columns in our weights. All these weights are randomly assigned at

136
00:11:52,580 --> 00:11:53,510
‫initialization.

137
00:11:54,740 --> 00:12:02,450
‫Similarly, we can also look at the bias values. Biases are initialized to zero.

138
00:12:02,870 --> 00:12:08,530
‫And if you just check the shape of biases, this should be 300.
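
The weight inspection described here might look like the sketch below. It assumes TensorFlow 2.x with its bundled Keras and rebuilds the same architecture so that it runs on its own; the variable names `weights` and `biases` follow the narration:

```python
from tensorflow import keras

# Rebuild the same architecture so this snippet is self-contained.
model = keras.models.Sequential([
    keras.layers.Flatten(input_shape=[28, 28]),
    keras.layers.Dense(300, activation="relu"),
    keras.layers.Dense(100, activation="relu"),
    keras.layers.Dense(10, activation="softmax"),
])

# layers[1] is the first hidden layer (layers[0] is the Flatten layer).
weights, biases = model.layers[1].get_weights()

print(weights.shape)  # (784, 300)
print(biases.shape)   # (300,)
print(biases.any())   # False, since biases start at zero by default
```

The weight values themselves differ from run to run unless a random seed has been set beforehand.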

139
00:12:14,370 --> 00:12:19,920
‫You can see that there are 300 biases. In the next video,

140
00:12:20,040 --> 00:12:23,100
‫we will compile and train our model.

141
00:12:23,750 --> 00:12:24,110
‫Thank you.