1
00:00:00,330 --> 00:00:05,970
Hello, everyone, and welcome to this amazing session, in which we are going to see how to integrate

2
00:00:06,000 --> 00:00:09,190
TensorBoard callbacks with TensorFlow.

3
00:00:09,210 --> 00:00:16,410
In this session we are going to look at how to log information from our training process or from

4
00:00:16,410 --> 00:00:23,670
all our different experiments into TensorBoard, how to view model graphs, how to do hyperparameter tuning

5
00:00:23,670 --> 00:00:32,610
with TensorBoard, how to view distributions, histograms, and time series, how to log image data like confusion

6
00:00:32,610 --> 00:00:39,390
matrices and ROC plots, and finally, how to do profiling with TensorBoard.

7
00:00:39,390 --> 00:00:45,750
In one of our previous sessions, we saw the importance of working with callbacks, as they permit us

8
00:00:45,750 --> 00:00:53,880
to modify certain key information during the training process and also store certain information and

9
00:00:53,880 --> 00:00:56,180
do certain modifications during the training.

10
00:00:56,190 --> 00:01:02,610
That said, we have this TensorBoard callback right here, which we spoke of last time but didn't really

11
00:01:02,610 --> 00:01:03,360
get into.

12
00:01:03,360 --> 00:01:10,110
And so in this session we are going to go in depth and see how to make use of TensorBoard to visualize

13
00:01:10,110 --> 00:01:14,090
on a web interface vital training information.

14
00:01:14,100 --> 00:01:20,340
So notice also that this TensorBoard callback is going to be used in a similar way to the way we had

15
00:01:20,340 --> 00:01:22,110
done with the previous callbacks.

16
00:01:22,110 --> 00:01:29,340
So here we define the callback, and then just here, you see, in this callbacks argument we pass in the

17
00:01:29,340 --> 00:01:31,700
TensorBoard callback right here.

18
00:01:31,710 --> 00:01:33,960
So let's go ahead and copy this out.

19
00:01:33,960 --> 00:01:40,230
We have this; we could just copy from here. We've copied this to the clipboard, and now we go ahead and

20
00:01:40,230 --> 00:01:43,680
see how to pass this in our training process.

21
00:01:43,680 --> 00:01:46,350
So we see exactly how it works.

22
00:01:47,130 --> 00:01:51,750
So here we open up this callbacks section; you see the different callbacks we have created previously.

23
00:01:51,750 --> 00:01:55,770
And now we're going to include our TensorBoard callback.

24
00:01:55,770 --> 00:01:59,130
So just here, let's add this text cell and a code cell.

25
00:01:59,130 --> 00:02:00,810
So we paste this out.

26
00:02:01,410 --> 00:02:06,000
Then let's take all this off, and let's run

27
00:02:06,000 --> 00:02:11,920
the cell. That looks fine, so now let's move on to our training process. Here

28
00:02:11,920 --> 00:02:13,500
we are going to have the callback.

29
00:02:13,500 --> 00:02:22,500
So we have callback equal to the TensorBoard callback, and then we're ready to train our model.

30
00:02:22,680 --> 00:02:24,480
So let's check on this.

31
00:02:24,480 --> 00:02:28,620
We've gotten an error: unexpected keyword argument 'callback'.

32
00:02:28,890 --> 00:02:29,580
Oh, okay.

33
00:02:29,580 --> 00:02:31,170
It should be callbacks.

34
00:02:31,170 --> 00:02:32,460
Let's run that again.
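
The step just described can be sketched roughly as follows. This is only a minimal illustration, not the notebook's actual code: the tiny model, the random data, and the temporary log directory are all stand-ins assumed for the example. The point is the plural `callbacks` keyword of `fit()`, which takes a list.

```python
import os
import tempfile

import numpy as np
import tensorflow as tf

# Hypothetical log directory; the session simply uses a folder named "logs".
LOG_DIR = os.path.join(tempfile.mkdtemp(), "logs")

# Stand-in model and random data, only so that the sketch runs on its own.
model = tf.keras.Sequential([
    tf.keras.Input(shape=(4,)),
    tf.keras.layers.Dense(1, activation="sigmoid"),
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])

x = np.random.rand(32, 4).astype("float32")
y = np.random.randint(0, 2, size=(32, 1)).astype("float32")

tensorboard_callback = tf.keras.callbacks.TensorBoard(log_dir=LOG_DIR)

# Note the plural keyword: fit() takes `callbacks`, a list of callback objects.
history = model.fit(
    x, y,
    validation_data=(x, y),
    epochs=2,
    verbose=0,
    callbacks=[tensorboard_callback],
)

# In a notebook, the dashboard can then be opened with the magics:
#   %load_ext tensorboard
#   %tensorboard --logdir <path to the log directory>
print(sorted(os.listdir(LOG_DIR)))
```

Passing `callback=` instead of `callbacks=` reproduces the "unexpected keyword argument" error seen above.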

35
00:02:33,480 --> 00:02:38,340
Now, while our model is training, you'll notice that there is this logs folder which has been created.

36
00:02:38,340 --> 00:02:43,800
And the reason why the logs folder has been created is that we actually specified it in this TensorBoard

37
00:02:43,800 --> 00:02:46,110
callback call.

38
00:02:46,110 --> 00:02:52,870
So here we have this log_dir, the log directory argument, and in it we're going to store some information, or rather

39
00:02:52,890 --> 00:02:54,240
some training information.

40
00:02:54,240 --> 00:03:00,390
And it's this training information that TensorBoard will use to display very important training information

41
00:03:00,390 --> 00:03:02,020
on the web interface.

42
00:03:02,040 --> 00:03:06,750
Now, you could click open here and you'll see that you have these two folders. In this train folder,

43
00:03:06,750 --> 00:03:12,720
you see, you have the information concerning the training data, which has been stored here, and the information

44
00:03:12,720 --> 00:03:14,970
for the validation which has been stored right here.

45
00:03:15,390 --> 00:03:22,410
The next thing we'll do is copy out this command here, and then run it just below.

46
00:03:22,410 --> 00:03:25,800
So let's get to where we have visualizations.

47
00:03:25,800 --> 00:03:29,670
We reduce this one, and we have the visualizations here.

48
00:03:29,700 --> 00:03:31,260
Now we're going to have this.

49
00:03:31,290 --> 00:03:36,240
We're going to see how to replace these visualizations with our TensorBoard visualizations.

50
00:03:36,240 --> 00:03:37,770
So let's paste this out. We have

51
00:03:37,770 --> 00:03:40,890
tensorboard --logdir, then the path, logs.

52
00:03:41,280 --> 00:03:46,680
Then, we created this log dir variable here, which takes in our path.

53
00:03:46,680 --> 00:03:48,390
So let's get back.

54
00:03:48,390 --> 00:03:52,950
And then instead of putting our path directly, we actually just have the log dir.

55
00:03:52,950 --> 00:03:55,200
So we have the log dir.

56
00:03:55,440 --> 00:04:03,690
Now, before running this, we are going to add this code cell and then load the TensorBoard notebook

57
00:04:03,690 --> 00:04:04,440
extension.

58
00:04:04,440 --> 00:04:14,190
So here we have load_ext tensorboard. We run this, and you see it's already loaded.

59
00:04:14,190 --> 00:04:15,330
We had loaded this already.

60
00:04:15,330 --> 00:04:18,270
Anyway, we have that, and then TensorBoard now.

61
00:04:18,270 --> 00:04:25,050
So we run this and then we should expect to have some interface which contains all our log data.

62
00:04:25,080 --> 00:04:27,810
No dashboards are active for the current data set.

63
00:04:28,380 --> 00:04:30,450
Check out on scalar is what we have.

64
00:04:30,600 --> 00:04:31,200
Anyway.

65
00:04:31,200 --> 00:04:32,790
Let's, let's do this.

66
00:04:32,790 --> 00:04:33,990
Let's check on this.

67
00:04:33,990 --> 00:04:37,380
Let's have logs logs.

68
00:04:37,380 --> 00:04:38,610
Run this again.

69
00:04:39,090 --> 00:04:39,600
Okay.

70
00:04:39,600 --> 00:04:40,470
And there we go.

71
00:04:40,470 --> 00:04:46,260
We see now that we have this interface which pops up. So what do we see here?

72
00:04:46,260 --> 00:04:48,540
We have these logs.

73
00:04:48,540 --> 00:04:50,610
We have both train and validation.

74
00:04:50,610 --> 00:04:52,800
You could pick this out last.

75
00:04:52,830 --> 00:04:54,870
You could take only the training data.

76
00:04:54,870 --> 00:04:56,370
You can see only train data.

77
00:04:56,370 --> 00:04:58,290
And then here you have these scalars.

78
00:04:58,290 --> 00:04:59,820
These scalars are basically

79
00:05:00,290 --> 00:05:07,160
all this information which we pass in here, like the loss and the metrics that we had defined, these

80
00:05:07,160 --> 00:05:08,000
metrics here.

81
00:05:08,000 --> 00:05:10,940
So we are going to get all this metrics information.

82
00:05:10,940 --> 00:05:17,840
So unlike previously where we had to manually do this step and do this step for the loss and the accuracy,

83
00:05:17,840 --> 00:05:23,390
now this is done automatically, so let's run this and then we will compare what we get here with what

84
00:05:23,390 --> 00:05:24,040
we get from

85
00:05:24,050 --> 00:05:24,830
TensorBoard.

86
00:05:24,920 --> 00:05:28,880
You can see here we have this loss, and then we get back to

87
00:05:28,880 --> 00:05:30,920
TensorBoard. We have this list of scalars.

88
00:05:32,420 --> 00:05:35,570
Let's reduce this accuracy. Let's view the loss first.

89
00:05:35,570 --> 00:05:36,950
So let's reduce this.

90
00:05:36,980 --> 00:05:41,780
Okay, we check it out. We have this epoch loss; this is what we get for the loss.

91
00:05:41,780 --> 00:05:44,060
Let's include the validation.

92
00:05:44,060 --> 00:05:49,400
So you see, you have this plot right here. Clicking on this,

93
00:05:49,400 --> 00:05:50,390
you could also expand it.

94
00:05:50,390 --> 00:05:52,370
So you see, this is the plot we get.

95
00:05:52,370 --> 00:05:54,200
Now what do you notice?

96
00:05:54,200 --> 00:05:58,940
You notice that it's exactly the same as what we had here, but this time around, we didn't have to

97
00:05:58,940 --> 00:05:59,750
write any code.

98
00:05:59,750 --> 00:06:07,130
All the information was automatically logged in this file here, and TensorBoard took care of the rest.

99
00:06:07,130 --> 00:06:10,520
So that's how this works. It's really very interesting.

100
00:06:11,300 --> 00:06:13,910
And it's the kind of tool you want to master.

101
00:06:13,910 --> 00:06:20,210
Because when working on different machine learning experiments, you wouldn't want to always have to

102
00:06:20,210 --> 00:06:23,690
log all these values by hand or manually.

103
00:06:23,690 --> 00:06:25,700
You want to have this done automatically.

104
00:06:25,700 --> 00:06:30,380
Now, one of the interesting points is that you have all the metrics here.

105
00:06:30,380 --> 00:06:31,880
You just have to select any one.

106
00:06:31,880 --> 00:06:35,270
So let's look at the accuracy, which we've seen already.

107
00:06:35,270 --> 00:06:36,560
We can check out this.

108
00:06:36,560 --> 00:06:37,370
Let's click here.

109
00:06:37,370 --> 00:06:37,710
Okay.

110
00:06:37,790 --> 00:06:42,200
So we see this accuracy, and then let's scroll down.

111
00:06:42,200 --> 00:06:44,540
You see, you could compare it with what we have here.

112
00:06:44,540 --> 00:06:46,640
See, we have this zip around.

113
00:06:47,210 --> 00:06:49,800
This should be the ninth epoch, or let's say the eighth epoch.

114
00:06:49,820 --> 00:06:51,770
Anyway, let's come back and check here.

115
00:06:53,180 --> 00:06:54,770
Let's scroll up from here.

116
00:06:54,770 --> 00:06:57,530
It's actually an interface in this other interface.

117
00:06:57,530 --> 00:07:00,700
So we have this; you see this ninth epoch.

118
00:07:00,710 --> 00:07:01,490
What do you see here?

119
00:07:01,490 --> 00:07:08,000
Notice on this here, you have the name, train, then the smoothed value and the value.

120
00:07:08,930 --> 00:07:09,800
What do we have?

121
00:07:09,800 --> 00:07:13,460
Step nine, the step, the time, and that's it.

122
00:07:13,580 --> 00:07:15,680
Okay, so there we go.

123
00:07:16,400 --> 00:07:23,450
We see how we could plot all this automatically, and then you could look at any point, so you could go

124
00:07:23,450 --> 00:07:27,680
through each and every point and then get all the exact values.

125
00:07:27,830 --> 00:07:29,450
So that's how we look at this.

126
00:07:29,450 --> 00:07:30,260
Let's reduce this.

127
00:07:30,260 --> 00:07:35,150
We could now take, let's say, precision, so you could monitor the precision.

128
00:07:36,650 --> 00:07:38,630
You could also check out the number of false negatives.

129
00:07:38,630 --> 00:07:42,500
You see how, as you keep on training, the number of false negatives keeps reducing.

130
00:07:42,500 --> 00:07:44,450
And then the false positives.

131
00:07:45,050 --> 00:07:46,130
What do we have here?

132
00:07:46,130 --> 00:07:47,680
False positives.

133
00:07:47,690 --> 00:07:48,620
That's loading.

134
00:07:50,210 --> 00:07:54,770
While that's loading, let's scroll down and we have the loss.

135
00:07:55,010 --> 00:07:56,480
We've seen this loss already.

136
00:07:56,480 --> 00:07:59,540
Let's look at true negatives.

137
00:08:00,350 --> 00:08:02,300
That's loading. True positives.

138
00:08:02,990 --> 00:08:06,710
There is the plot we get for the true positives and the true negatives.

139
00:08:06,710 --> 00:08:12,710
And then there is the section where we have this evaluation. Evaluation is practically the validation

140
00:08:13,130 --> 00:08:16,940
metric and loss that we are plotting against the number of iterations.

141
00:08:16,940 --> 00:08:21,110
So what we have here is just the validation accuracy versus the iterations.

142
00:08:21,110 --> 00:08:23,660
If you take off the train, you see nothing really changes here.

143
00:08:23,660 --> 00:08:25,730
But when you do this, you see all that goes.

144
00:08:25,730 --> 00:08:31,760
So that's our validation and we could monitor this and observe that the highest accuracy we have is

145
00:08:31,760 --> 00:08:36,260
94.09%, or no, it's 94.21%.

146
00:08:36,320 --> 00:08:37,220
Scroll down.

147
00:08:37,220 --> 00:08:38,510
We have precision.

148
00:08:38,510 --> 00:08:39,650
The highest precision

149
00:08:39,650 --> 00:08:45,410
here is 93.6, no, the value is 94.39%.

150
00:08:45,410 --> 00:08:52,160
As of now, we've been able to log this information just from compiling our model.

151
00:08:52,160 --> 00:08:57,530
So because we pass our loss and the different metrics, we're able to log the information and visualize

152
00:08:57,530 --> 00:08:58,730
it on TensorBoard.

153
00:08:58,880 --> 00:09:06,170
But there are other possibilities; that is, it's also possible for us to log information manually instead

154
00:09:06,170 --> 00:09:08,960
of just logging only this information.

155
00:09:08,960 --> 00:09:13,580
So what we could do is log, for example, image data.

156
00:09:13,580 --> 00:09:17,900
We could even log these different learning rates here.

157
00:09:17,900 --> 00:09:25,520
So we are going to log or we can log the learning rate values for each and every epoch.

158
00:09:25,820 --> 00:09:31,760
And this gives us the freedom to log just any kind of scalar or quantity we want.

159
00:09:31,760 --> 00:09:34,820
So here we have this metric.

160
00:09:34,820 --> 00:09:37,970
We're going to create a metric dir, a metric directory.

161
00:09:38,720 --> 00:09:41,000
This is going to be logs.

162
00:09:41,000 --> 00:09:42,860
And then here we have metrics.

163
00:09:43,010 --> 00:09:50,570
Then what we're going to do now is create this train writer, because now we are doing this manually.

164
00:09:50,570 --> 00:09:52,310
So we create our train writer.

165
00:09:52,310 --> 00:09:58,640
We have tf.summary.create_file_writer.

166
00:09:58,640 --> 00:09:59,750
And then here we

167
00:09:59,890 --> 00:10:02,020
have our metric directory.

168
00:10:02,590 --> 00:10:03,160
So that's good.

169
00:10:03,160 --> 00:10:09,460
We have our train writer created based on this metric directory, which is going to be in the

170
00:10:09,460 --> 00:10:09,880
logs.

171
00:10:09,880 --> 00:10:13,410
So in those logs we're going to create a metrics directory.

172
00:10:13,420 --> 00:10:14,200
So that's it.

173
00:10:14,200 --> 00:10:17,230
We have our train writer; we could run the cell.

174
00:10:18,190 --> 00:10:21,370
Then just in here we have with.

175
00:10:21,370 --> 00:10:33,250
So once we're done with this, we have with train_writer.as_default(), the train writer as default.

176
00:10:33,250 --> 00:10:36,640
We want to have this logged.

177
00:10:36,640 --> 00:10:39,460
So we have tf.summary.scalar.

178
00:10:39,580 --> 00:10:43,960
And then we specify that we're dealing with the learning rate.

179
00:10:43,960 --> 00:10:48,460
So we name it learning rate, and then we're going to pass in the actual learning rate.

180
00:10:48,460 --> 00:10:51,670
So here you pass in the learning rate, and then we pass in the epoch.

181
00:10:51,670 --> 00:10:56,890
So, as you could see here, you have data. Look at this pop-up.

182
00:10:56,890 --> 00:11:01,150
You see here you have data, and then here you have step.

183
00:11:01,150 --> 00:11:05,170
So you have the name of this scalar, you have the data, and you have the step.

184
00:11:05,170 --> 00:11:06,460
You also have the description.

185
00:11:06,460 --> 00:11:12,100
So here we have the name, which is learning rate.

186
00:11:12,100 --> 00:11:15,160
We have the data, which is this learning rate.

187
00:11:15,160 --> 00:11:18,770
And then we have the step, which is the epoch.

188
00:11:18,790 --> 00:11:27,190
Now, with this, we would have to put it in each and every one of these branches, since either

189
00:11:27,190 --> 00:11:28,810
we get into this or get into that.

190
00:11:28,810 --> 00:11:33,280
But to avoid writing this twice, we could set a learning rate variable.

191
00:11:33,280 --> 00:11:42,910
So in here we define this learning rate, and then here too, we have our learning rate.

192
00:11:44,080 --> 00:11:45,940
There we go, we have the learning rate.

193
00:11:45,940 --> 00:11:50,850
And then out of this we return our learning rate.

194
00:11:51,550 --> 00:11:52,600
Okay, so that's it.

195
00:11:52,600 --> 00:11:57,070
Now we return the learning rate and then we are also logging this data.

196
00:11:57,070 --> 00:12:00,550
So here we should change this and have learning rate.

197
00:12:00,550 --> 00:12:01,230
So that's it.

198
00:12:01,240 --> 00:12:05,920
Now we have this set, and then we could go on and train. Before training,

199
00:12:05,920 --> 00:12:12,160
recall that each time you want to log this kind of data, this kind of custom data, the first thing you

200
00:12:12,160 --> 00:12:16,210
do is create your writer; you create this file writer.

201
00:12:16,210 --> 00:12:22,960
After creating the file writer based on a given directory that you set, you now go ahead and then put

202
00:12:22,960 --> 00:12:27,250
this in this train writer scope right here.
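
The two steps recalled here (create a file writer, then log inside its scope) might look like the sketch below. The directory is a temporary stand-in for the session's logs/metrics folder, and the logged values are made up purely for illustration.

```python
import os
import tempfile

import tensorflow as tf

# Hypothetical location; the session uses a "metrics" folder inside "logs".
METRIC_DIR = os.path.join(tempfile.mkdtemp(), "logs", "metrics")

# Step 1: create a file writer for the chosen directory.
train_writer = tf.summary.create_file_writer(METRIC_DIR)

# Step 2: log scalars inside the writer's scope.
with train_writer.as_default():
    for epoch in range(3):
        # Made-up values, for illustration only.
        tf.summary.scalar("learning rate", data=0.01 * (0.5 ** epoch), step=epoch)

# Make sure everything is written to disk before inspecting the folder.
train_writer.flush()
event_files = os.listdir(METRIC_DIR)
print(event_files)
```

Any scalar logged this way shows up under the Scalars tab next to the metrics that the TensorBoard callback writes automatically.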

203
00:12:27,250 --> 00:12:29,830
So with this, we can run this.

204
00:12:29,830 --> 00:12:30,580
Now.

205
00:12:30,580 --> 00:12:31,630
We've run this already.

206
00:12:31,630 --> 00:12:37,990
We could run this now, and then, since this is our scheduler callback, we'll have to add it in our

207
00:12:37,990 --> 00:12:38,470
fit.

208
00:12:38,470 --> 00:12:40,400
So let's go ahead and add it in

209
00:12:40,420 --> 00:12:41,500
our fit method here.

210
00:12:41,500 --> 00:12:43,840
Let's take now just five epochs.

211
00:12:43,840 --> 00:12:45,730
So here we have the

212
00:12:45,730 --> 00:12:50,560
TensorBoard callback, and then we have the scheduler callback.

213
00:12:50,560 --> 00:12:53,680
Okay, so we run this now and see what we get.

214
00:12:53,890 --> 00:12:55,000
That's training.

215
00:12:55,930 --> 00:12:57,490
Our training is now complete.

216
00:12:57,490 --> 00:12:59,890
Let's go ahead and check this here.

217
00:12:59,890 --> 00:13:01,060
We have our logs.

218
00:13:01,060 --> 00:13:06,460
You see we have this metrics folder, and we have this logged. Next step, we go on to visualize.

219
00:13:06,460 --> 00:13:13,870
So here we have TensorBoard, and then here, anyway, let's run this first so you'll

220
00:13:13,870 --> 00:13:15,100
see what we get.

221
00:13:15,100 --> 00:13:19,300
As you could see right here, we now have this learning rate which has been logged and which we can

222
00:13:19,300 --> 00:13:20,140
visualize.

223
00:13:20,140 --> 00:13:29,950
So you see, as we go from this epoch to this epoch to this one, if we set the smoothing to zero,

224
00:13:29,950 --> 00:13:36,580
we have this. Now, the reason why after this we don't get any value is because of what we have here.

225
00:13:36,580 --> 00:13:43,120
So we turn the learning rate into a tensor here, and what we should be doing is getting its NumPy value.

226
00:13:43,120 --> 00:13:46,630
So we should have this learning rate, right?

227
00:13:46,630 --> 00:13:52,510
Here, learning rate equal to learning rate, because this is going to be converted into a tensor.

228
00:13:52,510 --> 00:13:55,870
So we have learning_rate.numpy(), and that should be good.

229
00:13:55,960 --> 00:13:58,210
Okay, we run this again and that should work.
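
Putting the scheduler and the logging together, a sketch under a few assumptions could look like this. The decay rule `lr * tf.math.exp(-0.1)` and the epoch threshold are assumed here, and the writer path is a temporary stand-in. What matters is that the decayed value comes back as a tensor, so it is converted to a plain Python float before being logged and returned.

```python
import os
import tempfile

import tensorflow as tf

# Hypothetical metrics directory standing in for the session's logs/metrics.
METRIC_DIR = os.path.join(tempfile.mkdtemp(), "logs", "metrics")
train_writer = tf.summary.create_file_writer(METRIC_DIR)


def scheduler(epoch, lr):
    """Keep the rate for the first epochs, then decay it; log it every epoch."""
    if epoch <= 1:  # assumed threshold, chosen only for the example
        learning_rate = lr
    else:
        # tf.math.exp returns a tensor, so convert the result to a plain float.
        learning_rate = float((lr * tf.math.exp(-0.1)).numpy())

    # Log once, after the branch, instead of repeating the call in each branch.
    with train_writer.as_default():
        tf.summary.scalar("Learning Rate", data=learning_rate, step=epoch)
    return learning_rate


scheduler_callback = tf.keras.callbacks.LearningRateScheduler(scheduler, verbose=0)
```

The callback is then passed to `fit()` alongside the TensorBoard callback, for example `callbacks=[tensorboard_callback, scheduler_callback]`.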

230
00:13:58,630 --> 00:14:02,770
And then, to have an immediate response, let's set this to just one.

231
00:14:02,770 --> 00:14:08,560
So if the number of epochs is greater than one, then we would have the learning rate being

232
00:14:08,560 --> 00:14:09,190
modified.

233
00:14:09,190 --> 00:14:12,520
So let's get back again to training.

234
00:14:12,520 --> 00:14:16,510
This time around, let's say we just have three epochs, okay?

235
00:14:16,540 --> 00:14:22,300
This time around we have this actual value logged, and so we get back.

236
00:14:22,300 --> 00:14:27,130
Let's reduce this custom training loop, we get back to our visualizations.

237
00:14:27,130 --> 00:14:28,720
We will run this again.

238
00:14:29,290 --> 00:14:37,780
Getting back here, we see we start with this loss and then this drops after this second epoch right

239
00:14:37,780 --> 00:14:38,200
here.

240
00:14:38,590 --> 00:14:45,340
As you may have noticed, each and every time we run a new training process, the previous values, or

241
00:14:45,340 --> 00:14:49,540
the previously logged data, are being deleted.

242
00:14:49,540 --> 00:14:52,930
So what we could do now is we'll modify this file name.

243
00:14:52,940 --> 00:14:59,710
That's this log folder we're using here, so that the folder name now depends

244
00:14:59,830 --> 00:15:01,030
on the current time.

245
00:15:01,030 --> 00:15:04,290
So here we're going to have datetime,

246
00:15:05,680 --> 00:15:07,120
datetime.datetime.now(),

247
00:15:07,120 --> 00:15:19,120
and then we call strftime, string from time, and format this output.

248
00:15:19,120 --> 00:15:20,750
So here we're going to have.

249
00:15:20,770 --> 00:15:30,070
percent d for the day, percent m for the month, the year, and then we specify the exact time.

250
00:15:30,070 --> 00:15:35,350
So here you're going to have the hour, the minute, and the second.

251
00:15:35,350 --> 00:15:38,380
Now let's go ahead and import datetime up here.

252
00:15:38,380 --> 00:15:40,780
We have datetime.

253
00:15:41,110 --> 00:15:43,540
So here we import datetime.

254
00:15:43,540 --> 00:15:46,510
Simply that should be imported.

255
00:15:46,510 --> 00:15:47,060
Okay.

256
00:15:47,080 --> 00:15:49,150
So we run that and that should be fine.

257
00:15:49,180 --> 00:15:49,400
Okay.

258
00:15:49,420 --> 00:15:50,650
We've imported datetime.

259
00:15:50,650 --> 00:15:55,810
Let's get back, and then let's print out this log dir right here.

260
00:15:55,810 --> 00:15:58,680
Let's have log dir, print it out, and see what we get.

261
00:15:58,690 --> 00:16:02,410
You see, we have this logs, and this is actually our new folder.

262
00:16:02,410 --> 00:16:08,380
Now, if you print this out again, you see, we're going to have a different folder.

263
00:16:08,380 --> 00:16:16,780
And this is important because now we do not need to delete our previous runs each and every time. From here,

264
00:16:16,780 --> 00:16:19,210
let's take this off.

265
00:16:19,210 --> 00:16:23,290
And then let's set this as our current time.

266
00:16:23,290 --> 00:16:29,920
So current time is equal to this.

267
00:16:30,640 --> 00:16:40,180
Take this off, here's our current time, and then we have this plus current time.

268
00:16:40,480 --> 00:16:41,110
Okay?

269
00:16:41,110 --> 00:16:42,400
And then, yeah, we do the same.

270
00:16:42,400 --> 00:16:44,980
We have this plus current time.

271
00:16:45,280 --> 00:16:49,360
Let's take this off and then have the current time here.

272
00:16:50,020 --> 00:16:52,030
Current time.

273
00:16:52,840 --> 00:16:55,630
If we get back to this training, we can run this again.

274
00:16:55,630 --> 00:17:01,840
Training is now complete. As we could see here, in these logs, we have this new folder created, which

275
00:17:01,840 --> 00:17:07,780
depends on the time at which we decided to do the training.

276
00:17:07,780 --> 00:17:10,060
And then in this metrics folder we also have this.

277
00:17:10,060 --> 00:17:14,440
But what we want to have is actually just this one folder which contains the train, the validation, and the

278
00:17:14,440 --> 00:17:15,010
metrics.

279
00:17:15,010 --> 00:17:19,120
So we should modify this right up here.

280
00:17:19,810 --> 00:17:26,890
So instead of having this metrics part before, we should take this off and then add

281
00:17:26,890 --> 00:17:27,790
it later on.

282
00:17:27,940 --> 00:17:31,450
We have plus and then we add this.

283
00:17:31,480 --> 00:17:36,520
Okay, so we have this slash and then slash, then we should take this now.

284
00:17:36,520 --> 00:17:37,210
So that's it.

285
00:17:37,210 --> 00:17:44,170
We have recreated this and then we will rerun again to avoid this kind of error.

286
00:17:44,170 --> 00:17:47,320
So let's go ahead and retrain our model.

287
00:17:47,320 --> 00:17:50,440
Let's let's have it to be here.

288
00:17:50,440 --> 00:17:51,130
That's fine.

289
00:17:51,280 --> 00:17:54,700
Let's say two epochs and then I'll run that again.

290
00:17:55,150 --> 00:17:55,570
Okay.

291
00:17:55,570 --> 00:17:56,590
The training is complete.

292
00:17:56,590 --> 00:17:59,860
Now you'll see that if you open this up, you have train and validation.

293
00:17:59,860 --> 00:18:03,790
This is what we had previously, but now we have metrics, train, and validation.

294
00:18:03,790 --> 00:18:05,110
This is actually what we want.

295
00:18:05,110 --> 00:18:12,640
We want to be able to log all this into this one directory and you now notice how we do not have to

296
00:18:12,640 --> 00:18:14,560
erase previous logs.
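
As a small sketch of the folder naming just described: the exact format string is an assumption reconstructed from the narration (day, month, year, then hour, minute, second), and the variable names are illustrative only.

```python
import datetime

# Assumed format: day, month, year, then hour, minute, second.
CURRENT_TIME = datetime.datetime.now().strftime("%d%m%y-%H%M%S")

# One folder per run; the TensorBoard callback adds train/ and validation/ inside it.
LOG_DIR = "./logs/" + CURRENT_TIME

# The manual metrics writer goes into the same run folder, so everything stays together.
METRIC_DIR = "./logs/" + CURRENT_TIME + "/metrics"

print(LOG_DIR)
print(METRIC_DIR)
```

Because the timestamp changes on every run, each training run gets its own folder and earlier logs are left untouched.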

297
00:18:14,560 --> 00:18:18,520
So let's go down to run this again.

298
00:18:18,520 --> 00:18:19,930
We run this.

299
00:18:19,930 --> 00:18:21,280
Take this off.

300
00:18:23,020 --> 00:18:24,010
Okay, here we go.

301
00:18:24,010 --> 00:18:26,260
We have this information now logged.

302
00:18:26,290 --> 00:18:32,560
You see, we could take all the previous logs out and focus on just this log here.

303
00:18:32,560 --> 00:18:34,680
So let's have this one.

304
00:18:34,690 --> 00:18:36,850
This should be zero five.

305
00:18:36,850 --> 00:18:39,160
The same entry, seven one.

306
00:18:39,160 --> 00:18:43,180
So we want to focus on this one, which ends with seven one.

307
00:18:43,300 --> 00:18:45,480
And here we go.

308
00:18:45,490 --> 00:18:48,460
So we have the metrics, the train, and the validation for this.

309
00:18:48,460 --> 00:18:50,860
We are not interested in log in this.

310
00:18:50,860 --> 00:18:51,420
So that's it.

311
00:18:51,490 --> 00:18:53,590
Let's take this off now and then get back.

312
00:18:53,590 --> 00:19:01,390
So you see here we have the learning rate, we have the epoch accuracy, we have these true negatives,

313
00:19:01,390 --> 00:19:04,210
we have the recall and that's it.

314
00:19:04,210 --> 00:19:10,540
So here we've now seen how to create these directories, which depend on the current date and

315
00:19:10,540 --> 00:19:11,230
time.

316
00:19:11,230 --> 00:19:18,790
Now, the next thing we'll be looking at is how to do this logging when we're doing a custom

317
00:19:18,790 --> 00:19:19,360
training.

318
00:19:19,360 --> 00:19:22,870
So let's get back to where we did this custom training loop.

319
00:19:22,870 --> 00:19:29,410
We had this custom training loop, and then we have this fit method which comes directly with TensorFlow.

320
00:19:29,410 --> 00:19:36,550
So what if we actually use the custom training loop, and we do not have

321
00:19:36,550 --> 00:19:43,360
the possibility of just simply saying, okay, callbacks, TensorBoard callback, and then the job is done?

322
00:19:43,360 --> 00:19:47,410
And what if we just have this custom training loop?

323
00:19:47,410 --> 00:19:53,290
In this case we are going to use exactly the same process we've just followed here.

324
00:19:53,290 --> 00:19:57,640
So we're just going to create this file writer.

325
00:19:57,640 --> 00:19:59,740
So let's copy all this.

326
00:19:59,920 --> 00:20:08,110
And then, just as we did here, we're just going to write these scalar values, and then

327
00:20:08,110 --> 00:20:11,950
create this scalar, put in the data, and then specify the step.

328
00:20:11,950 --> 00:20:14,170
So that's basically how we're going to function.

329
00:20:14,170 --> 00:20:16,840
Now, let's get back to this custom training loop.

330
00:20:16,840 --> 00:20:18,700
We're going to add this code cell.

331
00:20:18,700 --> 00:20:25,210
And then in here we have current time as usual. Now let's call this custom directory.

332
00:20:25,210 --> 00:20:29,920
We have logs current time, and then we have custom.

333
00:20:29,950 --> 00:20:32,710
Let's call this custom train writer.

334
00:20:32,710 --> 00:20:35,200
Let's call this custom train writer.

335
00:20:35,290 --> 00:20:37,150
Custom train writer.

336
00:20:37,150 --> 00:20:40,270
And you could also define a custom validation writer.

337
00:20:40,840 --> 00:20:42,070
We could have that too.

338
00:20:42,070 --> 00:20:45,370
So let's have here custom,

339
00:20:46,990 --> 00:20:51,370
custom dir, and in here you also specify train.

340
00:20:51,370 --> 00:20:59,590
So notice that since we were using this fit method, we automatically got this

341
00:20:59,590 --> 00:21:00,910
train and validation.

342
00:21:00,910 --> 00:21:08,970
So what happens in the background is these two file writers are created, that is, the train and the validation.

343
00:21:08,980 --> 00:21:10,510
And we're just going to do exactly that.

344
00:21:10,510 --> 00:21:19,000
Here we have custom, and then let's say we have custom train directory, custom train, and then custom

345
00:21:19,000 --> 00:21:20,230
validation.

346
00:21:20,980 --> 00:21:21,910
So we have that.

347
00:21:21,910 --> 00:21:24,180
And then here we have custom validation.

348
00:21:24,190 --> 00:21:29,770
Now we specify our writer, we have custom train and custom validation.

349
00:21:29,770 --> 00:21:32,470
Then here we have custom train.

350
00:21:33,580 --> 00:21:36,910
Custom train, custom validation.

351
00:21:37,240 --> 00:21:39,820
Okay, so I think this is okay.

352
00:21:39,820 --> 00:21:49,420
We could now run this, and then let's copy out this code we had put here, in the section on the scheduler,

353
00:21:49,420 --> 00:21:51,490
so let's simply copy this out.

354
00:21:51,490 --> 00:21:55,630
You see how easy it becomes when you have already done this.

355
00:21:55,630 --> 00:22:03,100
So here, instead of only printing this out, let's have this.

356
00:22:03,100 --> 00:22:08,800
So with our, let's copy the name from here, with our custom train writer.

357
00:22:08,890 --> 00:22:12,820
And then here we have with our custom train writer.

358
00:22:13,090 --> 00:22:15,220
Custom train writer.

359
00:22:15,220 --> 00:22:17,590
We have the loss.

360
00:22:17,590 --> 00:22:18,820
So we have the loss.

361
00:22:18,820 --> 00:22:20,320
Let's call it train loss.

362
00:22:20,320 --> 00:22:22,480
We have the training loss.

363
00:22:22,480 --> 00:22:23,560
We have the data.

364
00:22:23,590 --> 00:22:25,690
The data is now this loss here.

365
00:22:25,720 --> 00:22:26,620
It's this loss.

366
00:22:26,620 --> 00:22:30,370
So we have the data which is passed, which is now the loss.

367
00:22:30,640 --> 00:22:33,970
And then the step is the epoch we have here.

368
00:22:33,970 --> 00:22:35,050
So that's it.

369
00:22:35,080 --> 00:22:36,280
Okay, we've logged this.

370
00:22:36,280 --> 00:22:40,480
Let's now go ahead and log the accuracy.

371
00:22:40,480 --> 00:22:49,060
So we paste this out, and then we have training accuracy, and then the accuracy.

372
00:22:49,060 --> 00:22:58,840
So we paste this out: metric.result(), okay, metric.result().

373
00:22:58,840 --> 00:23:00,850
And then we have the step specified.

374
00:23:00,850 --> 00:23:02,830
So this is for the training process.

375
00:23:02,830 --> 00:23:06,460
We could separate this block and that's it.
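A minimal sketch of this logging step, assuming a custom training loop that produces a loss and a Keras metric each epoch; the writer location, metric type, and the placeholder values below are stand-ins for illustration only:
```python
import tempfile
import tensorflow as tf
# Stand-in writer and metric; in the notebook these come from earlier cells.
custom_train_writer = tf.summary.create_file_writer(tempfile.mkdtemp())
metric = tf.keras.metrics.BinaryAccuracy()
logged_steps = []
for epoch in range(3):
    # Placeholder values standing in for a real training epoch.
    loss = tf.constant(1.0 / (epoch + 1))
    metric.update_state(tf.constant([[1.0], [0.0]]), tf.constant([[0.9], [0.2]]))
    # Everything written inside this block goes to the train writer's folder.
    with custom_train_writer.as_default():
        tf.summary.scalar("Training Loss", data=loss, step=epoch)
        tf.summary.scalar("Training Accuracy", data=metric.result(), step=epoch)
    logged_steps.append(epoch)
    metric.reset_state()
custom_train_writer.flush()
```
The validation side would follow the same pattern with its own writer and its own scalar names.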

376
00:23:07,000 --> 00:23:07,370
Okay.

377
00:23:07,390 --> 00:23:08,290
So we've done this.

378
00:23:08,290 --> 00:23:12,910
We now simply copy this out and then do the same for the validation.

379
00:23:13,120 --> 00:23:19,960
So in here, instead of writing this out like this, we could simply put here custom

380
00:23:19,960 --> 00:23:22,780
Val. We have custom

381
00:23:22,780 --> 00:23:25,090
Val, take this off here.

382
00:23:25,090 --> 00:23:28,900
We have validation loss.

383
00:23:29,050 --> 00:23:31,060
Here we have the loss.

384
00:23:31,060 --> 00:23:32,920
But this is loss val, loss

385
00:23:32,920 --> 00:23:34,570
val, that's fine.

386
00:23:34,570 --> 00:23:35,530
We have metric

387
00:23:35,530 --> 00:23:37,360
val, metric

388
00:23:37,360 --> 00:23:39,580
val, that's fine.

389
00:23:39,580 --> 00:23:47,050
We have validation accuracy, validation accuracy.

390
00:23:47,050 --> 00:23:51,360
Now we have to take this off, and then we have the val.

391
00:23:51,520 --> 00:23:53,410
Okay, so that sounds fine.

392
00:23:54,100 --> 00:23:55,240
Everything looks okay.

393
00:23:55,240 --> 00:23:57,310
We could run this here.

394
00:23:57,310 --> 00:24:05,920
So let's run this. We run this, and then we run the model, and then we start the training.

395
00:24:07,630 --> 00:24:08,740
Training now complete.

396
00:24:08,740 --> 00:24:10,780
Let's go ahead and see what we have.

397
00:24:10,810 --> 00:24:12,310
You could check out this log.

398
00:24:12,350 --> 00:24:17,950
You see, we have our values now logged in here, custom train and validation.

399
00:24:17,950 --> 00:24:19,460
So we have this logged.

400
00:24:19,480 --> 00:24:23,840
We now go ahead and rerun TensorBoard.

401
00:24:23,860 --> 00:24:26,380
As you could see, you have all these values here.

402
00:24:26,380 --> 00:24:35,710
As usual, let's take this off, toggle all runs, and then let's pick this very last

403
00:24:35,710 --> 00:24:36,040
one.

404
00:24:36,040 --> 00:24:43,330
So we pick this last one and also this one, because this is the train and this is the validation.

405
00:24:43,720 --> 00:24:49,630
Then we come right here and check out the training accuracy and the training loss.

406
00:24:50,350 --> 00:24:50,980
There we go.

407
00:24:50,980 --> 00:24:52,810
We have train accuracy.

408
00:24:52,810 --> 00:24:54,670
We have train loss.

409
00:24:55,150 --> 00:25:01,810
If we do this, you see, okay, we have the train accuracy, we have the train loss,

410
00:25:01,810 --> 00:25:06,490
we have the validation accuracy and we have the validation loss.

411
00:25:06,640 --> 00:25:12,580
Now, if we want to take off all the information stored in the logs, you could use this command.

412
00:25:12,580 --> 00:25:16,900
So we remove all this information and then we specify the logs.

413
00:25:16,900 --> 00:25:20,020
So we run this and then open this up.

414
00:25:20,020 --> 00:25:23,380
You see, you don't have that logs folder anymore at this point.
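The exact command is not spelled out here; in a notebook it is typically a shell call that recursively removes the logs folder. A Python equivalent, as a sketch only (clear_logs is a hypothetical helper, demonstrated on a throwaway directory rather than a real logs folder):
```python
import os
import shutil
import tempfile
def clear_logs(log_root):
    """Delete the whole log directory tree; do nothing if it is absent."""
    shutil.rmtree(log_root, ignore_errors=True)
# Demo on a temporary directory that mimics the logs/custom/train layout.
demo_root = os.path.join(tempfile.mkdtemp(), "logs")
os.makedirs(os.path.join(demo_root, "custom", "train"))
clear_logs(demo_root)
```
Clearing old runs this way keeps TensorBoard's run list from filling up with stale experiments.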

415
00:25:23,380 --> 00:25:27,540
We'll go ahead and see how to display image data with TensorBoard.

416
00:25:27,550 --> 00:25:33,790
So unlike previously where we've been displaying information like the loss, the different metrics,

417
00:25:33,790 --> 00:25:41,260
now we'll see how to implement or rather we're going to display image data like the confusion matrix

418
00:25:41,260 --> 00:25:42,550
we had seen previously.

419
00:25:43,030 --> 00:25:44,260
Let's get back here.

420
00:25:44,260 --> 00:25:46,750
We have this confusion matrix right here.

421
00:25:46,750 --> 00:25:54,160
And what we'll do now is after each epoch we are going to display this confusion matrix with

422
00:25:54,160 --> 00:25:54,700
TensorBoard.

423
00:25:54,700 --> 00:26:00,280
That said, we're going to copy out all this code we used in displaying this confusion matrix right

424
00:26:00,280 --> 00:26:00,950
here.

425
00:26:00,970 --> 00:26:08,770
So we have this code, and then we have this log images callback right here with this on_epoch_end method.

426
00:26:08,770 --> 00:26:15,610
And then in this method we are going to paste this code we used in visualizing the confusion matrix

427
00:26:15,610 --> 00:26:16,300
previously.

428
00:26:16,300 --> 00:26:23,740
So here we have this, from the labels and inputs right up to this, where we have the confusion matrix based on the

429
00:26:23,740 --> 00:26:28,030
threshold, and then we're going to visualize this confusion matrix.

430
00:26:28,030 --> 00:26:35,140
But now since we're working with a callback, what we're doing is at the end of each and every epoch

431
00:26:35,140 --> 00:26:38,650
we are going to display this with TensorBoard.

432
00:26:38,680 --> 00:26:45,760
Now that this is set, for now we've actually just been able to visualize this.

433
00:26:45,760 --> 00:26:47,740
But how do we put this?

434
00:26:48,010 --> 00:26:51,190
How do we make this work with TensorBoard?

435
00:26:51,190 --> 00:26:54,010
What we're going to have here is we create a buffer.

436
00:26:54,550 --> 00:27:04,720
We have this buffer, io.BytesIO; here we have BytesIO, and so that's our buffer. We are going

437
00:27:04,720 --> 00:27:08,980
to save this image, the confusion matrix image, in this buffer.

438
00:27:08,980 --> 00:27:19,390
So we have this plt.savefig, and then we save it in that buffer, and we specify that the

439
00:27:19,390 --> 00:27:23,230
format should be PNG.

440
00:27:23,920 --> 00:27:27,160
So we have the PNG format and that's okay.

441
00:27:27,160 --> 00:27:29,200
So now we have this buffer.

442
00:27:29,200 --> 00:27:32,830
We've saved that into our buffer.

443
00:27:33,100 --> 00:27:37,750
The next step we will take is create an image of this buffer.

444
00:27:37,750 --> 00:27:39,880
So from here we have this image.

445
00:27:39,880 --> 00:27:49,300
Let's take this up. We have this image; we use the TensorFlow tf.image.decode_png method, which takes

446
00:27:49,300 --> 00:27:50,770
in this buffer.

447
00:27:50,770 --> 00:27:55,210
So we have our buffer.getvalue(), and the number of channels equals three.

448
00:27:55,210 --> 00:27:55,930
That's it.
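The buffer steps above can be sketched as follows, assuming matplotlib draws the figure; the 2x2 matrix values are placeholders, not results from the model:
```python
import io
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import tensorflow as tf
# Placeholder 2x2 confusion matrix; the real one comes from the model's predictions.
cm = [[50, 2], [3, 45]]
fig = plt.figure(figsize=(4, 4))
plt.imshow(cm, cmap="Blues")
plt.title("Confusion matrix")
# Save the current figure into an in-memory buffer as PNG bytes.
buffer = io.BytesIO()
plt.savefig(buffer, format="png")
plt.close(fig)
# Decode the PNG bytes into a uint8 tensor of shape (height, width, 3).
image = tf.image.decode_png(buffer.getvalue(), channels=3)
```
Going through an in-memory buffer avoids writing a temporary image file to disk on every epoch.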

449
00:27:55,930 --> 00:28:02,560
We have this image, and then once we get this image, we then write it to TensorBoard.

450
00:28:02,560 --> 00:28:07,690
So we have this image writer which we've created right here, similar to what we've done already.

451
00:28:07,690 --> 00:28:14,620
We use this create_file_writer; let's modify this and have here the image directory.

452
00:28:14,620 --> 00:28:22,090
So we have the image directory and then we create this file writer, or rather we create this image

453
00:28:22,090 --> 00:28:22,780
writer.

454
00:28:22,780 --> 00:28:29,770
So with this image writer as default, what we're going to do now is, instead of having this

455
00:28:29,770 --> 00:28:36,070
summary.scalar as we used to have here, now we're going to use summary.image.

456
00:28:36,070 --> 00:28:45,610
So you could see here that TensorBoard permits us to write not only scalars but also images, and

457
00:28:45,610 --> 00:28:47,650
that's basically all we needed to do here.

458
00:28:47,650 --> 00:28:56,650
So let's have this and then run this cell. We make sure we have this run; we run this, we run this,

459
00:28:56,650 --> 00:29:01,570
and then we have this log images callback here.

460
00:29:02,050 --> 00:29:02,860
Copy that.

461
00:29:03,100 --> 00:29:06,310
Now that's copied, we have reduced.

462
00:29:06,440 --> 00:29:07,460
This one.

463
00:29:07,460 --> 00:29:14,390
And then right here we run these metrics, and then compile and run this.

464
00:29:14,660 --> 00:29:21,380
We get this error; well, we have this ValueError, no step set, so let's get

465
00:29:21,380 --> 00:29:27,590
back to this callback and then we specify the step.

466
00:29:27,590 --> 00:29:33,020
So right here we have the step, step equal to epoch.
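A minimal sketch of such a callback with the step passed explicitly. LogImagesCallback and its make_image argument are hypothetical names for illustration, and the blank image stands in for the decoded confusion matrix plot:
```python
import tempfile
import tensorflow as tf
class LogImagesCallback(tf.keras.callbacks.Callback):
    """Writes one image summary per epoch (hypothetical helper for illustration)."""
    def __init__(self, image_writer, make_image):
        super().__init__()
        self.image_writer = image_writer
        self.make_image = make_image  # callable returning a (height, width, 3) uint8 tensor
    def on_epoch_end(self, epoch, logs=None):
        # tf.summary.image expects a batch: (batch, height, width, channels).
        image = tf.expand_dims(self.make_image(), axis=0)
        with self.image_writer.as_default():
            # Omitting step= (with no default step set) raises the ValueError about no step being set.
            tf.summary.image("Confusion matrix", image, step=epoch)
image_writer = tf.summary.create_file_writer(tempfile.mkdtemp())
callback = LogImagesCallback(image_writer, lambda: tf.zeros([8, 8, 3], dtype=tf.uint8))
callback.on_epoch_end(0)  # direct call for demonstration; model.fit calls this each epoch
```
Using the epoch as the step is what gives TensorBoard one image per epoch to slide through.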

467
00:29:33,020 --> 00:29:36,010
So we run that again and this should be fine.

468
00:29:36,020 --> 00:29:37,340
Training is going on.

469
00:29:37,340 --> 00:29:40,910
And then the image data is being logged into

470
00:29:40,910 --> 00:29:44,750
TensorBoard at the end of each epoch. Training is now done.

471
00:29:44,750 --> 00:29:48,710
We could go ahead and then run this on

472
00:29:48,710 --> 00:29:55,010
TensorBoard. Once training is done, we could now visualize these confusion matrices on TensorBoard.

473
00:29:55,010 --> 00:30:03,170
So we run these two cells, and here is what we get: you'd see here step zero, step one, and then step

474
00:30:03,170 --> 00:30:03,470
two.

475
00:30:03,500 --> 00:30:09,080
This is because we actually ran this for three epochs. So we could notice how, let's come back to the

476
00:30:09,080 --> 00:30:09,620
top.

477
00:30:09,620 --> 00:30:17,990
We notice how here we have 52 three and then as we keep training, this drops to 35.

478
00:30:17,990 --> 00:30:21,700
And then finally here, let's scroll down a little.

479
00:30:21,710 --> 00:30:26,480
Finally, here we have 240.

480
00:30:26,480 --> 00:30:34,190
So this tells us that the last epoch wasn't helpful in improving the number of false negatives. You could

481
00:30:34,190 --> 00:30:40,820
see also, even with the validation that, yeah, we had 16 false negatives, eight false negatives,

482
00:30:40,820 --> 00:30:46,580
and then this value rose up to 111 false negatives.

483
00:30:46,580 --> 00:30:55,070
So it's kind of similar to what we have with the test data, which is exactly why we're logging this in

484
00:30:55,070 --> 00:30:56,000
TensorBoard.

485
00:30:56,240 --> 00:31:02,900
And now that you know how to log image data with TensorBoard from the example of the confusion matrix,

486
00:31:02,900 --> 00:31:07,310
what you could do is directly log these ROC plots.

487
00:31:07,310 --> 00:31:15,380
You could also log data like this one right here, where on the test data you're going to put out

488
00:31:15,380 --> 00:31:19,460
the actual value and what the model predicts.

489
00:31:19,460 --> 00:31:22,970
And so that's it for this section on logging image data.

490
00:31:23,000 --> 00:31:28,970
Now let's move on to visualizing model graphs with TensorBoard. To visualize a graph,

491
00:31:28,970 --> 00:31:36,530
we're going to rerun this command to delete all the logs we've stored so far, and then we run this

492
00:31:36,530 --> 00:31:38,930
TensorBoard callback once more.

493
00:31:38,930 --> 00:31:40,280
So we have that.

494
00:31:40,280 --> 00:31:42,200
And then let's get back to metrics.

495
00:31:42,200 --> 00:31:44,300
We run this and that's fine.

496
00:31:44,930 --> 00:31:46,610
Now we have the training done.

497
00:31:46,610 --> 00:31:49,790
Let's go ahead and rerun this again.

498
00:31:49,790 --> 00:31:54,320
So we run these two cells again and, as expected, here's what we get.
