1
00:00:00,820 --> 00:00:04,310
In the previous video, we looked at
training a deep neural network to learn

2
00:00:04,310 --> 00:00:07,590
to predict the next value
in a windowed sequence.

3
00:00:07,590 --> 00:00:11,920
We also looked at some basic
hyperparameter tuning to pick a good learning

4
00:00:11,920 --> 00:00:16,800
rate for the gradient descent, which then
allowed us to further improve the model.

5
00:00:16,800 --> 00:00:20,690
In this video, we'll wrap up this week by
going through the workbook that shows you

6
00:00:20,690 --> 00:00:21,700
how to do all of that.

7
00:00:24,101 --> 00:00:27,160
We'll run this code to see our
current version of TensorFlow.

8
00:00:27,160 --> 00:00:27,930
And as before,

9
00:00:27,930 --> 00:00:32,010
if you're using something earlier than
2.0, you'll see it reported here.

10
00:00:32,010 --> 00:00:36,808
Use the code cell above to install
the latest version of TensorFlow or

11
00:00:36,808 --> 00:00:39,300
a nightly version like I'm using.

12
00:00:39,300 --> 00:00:43,861
Running this block will set up the data
series and all the constants for

13
00:00:43,861 --> 00:00:45,802
the window size and all that.

14
00:00:45,802 --> 00:00:50,430
This code will create
the windowed dataset as before.
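
To make the windowing idea concrete, here is a minimal plain-Python sketch. It is not the workbook's own code, which this transcript does not show; the helper name `make_windows` and the toy series are hypothetical. Each window of consecutive values becomes the input, and the value that follows it becomes the label to predict.

```python
def make_windows(series, window_size):
    """Split a series into (window, next value) training pairs."""
    pairs = []
    for start in range(len(series) - window_size):
        window = series[start:start + window_size]  # input features
        label = series[start + window_size]         # value to predict
        pairs.append((window, label))
    return pairs


# Toy series, purely for illustration.
pairs = make_windows([10, 20, 30, 40, 50, 60], window_size=3)
for window, label in pairs:
    print(window, "->", label)
```

A series of six values with a window size of three yields three training pairs.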

15
00:00:50,430 --> 00:00:54,340
And here is our DNN where we have
three layers in a Sequential model.

16
00:00:54,340 --> 00:00:56,770
The first has ten neurons
activated by ReLU.

17
00:00:56,770 --> 00:00:58,960
The second is the same, and

18
00:00:58,960 --> 00:01:02,470
the third is a single dense neuron giving
us back the predicted value.

19
00:01:03,570 --> 00:01:06,520
We'll compile it with mean
squared error loss and

20
00:01:06,520 --> 00:01:08,820
stochastic gradient
descent as an optimizer.
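
The model described above could look roughly like the following Keras sketch. The layer sizes, activations, loss, and optimizer type come from the narration; the `window_size` value and the learning rate are assumptions, since the transcript does not state them.

```python
import tensorflow as tf

window_size = 20  # assumed value; not stated in the transcript

model = tf.keras.models.Sequential([
    tf.keras.Input(shape=(window_size,)),
    tf.keras.layers.Dense(10, activation="relu"),  # first layer: ten neurons, ReLU
    tf.keras.layers.Dense(10, activation="relu"),  # second layer: same as the first
    tf.keras.layers.Dense(1),                      # single neuron: the predicted value
])

# Mean squared error loss with stochastic gradient descent.
# The learning rate here is a placeholder assumption.
model.compile(
    loss="mse",
    optimizer=tf.keras.optimizers.SGD(learning_rate=1e-6),
)
```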

21
00:01:10,290 --> 00:01:14,470
After 100 epochs it's done, and
we can plot the forecast versus the data.

22
00:01:16,320 --> 00:01:18,970
Then we can print the mean absolute error.
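
As a reminder of what that number measures, here is a small plain-Python sketch of mean absolute error: the average of the absolute differences between the actual values and the forecast. The function name and the sample values are made up for illustration and are not taken from the workbook.

```python
def mean_absolute_error(actual, forecast):
    """Average absolute difference between actual and forecast values."""
    if len(actual) != len(forecast):
        raise ValueError("actual and forecast must have the same length")
    return sum(abs(a - f) for a, f in zip(actual, forecast)) / len(actual)


# Errors are 1, 0 and 2, so the mean is 1.0.
mae = mean_absolute_error([3.0, 5.0, 8.0], [2.0, 5.0, 10.0])
print(mae)
```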

23
00:01:18,970 --> 00:01:20,370
Don't worry if you get a different value,

24
00:01:20,370 --> 00:01:23,000
remember there's going to be some
random noise in the dataset.

25
00:01:24,420 --> 00:01:27,300
If we run this code block
we'll retrain over 100 epochs,

26
00:01:27,300 --> 00:01:31,790
but we'll use a callback to
call the learning rate scheduler,

27
00:01:31,790 --> 00:01:34,430
which will then adjust
the learning rate over each epoch.
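
One common way to sweep the learning rate is to grow it exponentially with the epoch number. The sketch below shows such a schedule in plain Python. The starting rate of 1e-8 and the tenfold increase every 20 epochs are assumptions, not values stated in this transcript. In Keras, a function of this shape is what you would typically pass to `tf.keras.callbacks.LearningRateScheduler`, which is then supplied to `fit` through its `callbacks` argument.

```python
def lr_for_epoch(epoch, start_lr=1e-8, epochs_per_decade=20):
    """Learning rate that grows tenfold every `epochs_per_decade` epochs."""
    return start_lr * 10 ** (epoch / epochs_per_decade)


# Learning rates for a 100-epoch sweep (assumed schedule).
rates = [lr_for_epoch(epoch) for epoch in range(100)]
print(rates[0], rates[-1])
```

Because the rate changes every epoch, each epoch's loss can later be plotted against the rate that produced it.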

28
00:01:35,610 --> 00:01:38,790
When it's done we can then
plot the results of the loss

29
00:01:38,790 --> 00:01:40,610
against the learning rates.

30
00:01:40,610 --> 00:01:44,480
We can then inspect the lower part of
the curve before it gets unstable.

31
00:01:44,480 --> 00:01:46,210
And we'll come up with the value.

32
00:01:46,210 --> 00:01:50,221
In this case it looks to be about two
notches to the left of 10 to the minus 5.

33
00:01:50,221 --> 00:01:53,823
So I'll say it's 8 times 10 to
the minus 6, or thereabouts.
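
The "two notches" reading can be checked with a little arithmetic. The sketch below assumes the plot uses a logarithmic x-axis whose minor ticks sit at integer multiples of 10 to the minus 6 within that decade, which is a typical layout but is not confirmed by the transcript.

```python
# Tick positions from 1e-6 up to 1e-5 on a log axis (assumed layout).
ticks = [k * 1e-6 for k in range(1, 11)]

# The last tick is 1e-5; two notches to its left is the third from the end.
chosen_lr = ticks[-3]
print(chosen_lr)
```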

34
00:01:55,954 --> 00:02:00,969
I'll then retrain and this time
I'll do it for 500 epochs with that

35
00:02:00,969 --> 00:02:06,350
learning rate. When it's done, I can
then plot the loss against the epoch.

36
00:02:06,350 --> 00:02:09,370
So we can see how the loss
progressed over training time.

37
00:02:11,260 --> 00:02:14,430
We can see that it fell sharply and
then flattened out.

38
00:02:14,430 --> 00:02:18,501
But again, if we remove the first ten
epochs, we'll see the latter ones more

39
00:02:18,501 --> 00:02:22,652
clearly and it still shows the loss
smoothly decreasing at 500 epochs.

40
00:02:22,652 --> 00:02:25,092
So it's actually still learning.

41
00:02:25,092 --> 00:02:27,391
Let's now plot the forecast
against the data and

42
00:02:27,391 --> 00:02:29,980
we can see that the predictions
still look pretty good.

43
00:02:31,440 --> 00:02:35,602
And when we print out the value of the
mean absolute error, we have improved even

44
00:02:35,602 --> 00:02:37,407
further over the earlier value.

45
00:02:42,134 --> 00:02:43,810
So that wraps up this week.

46
00:02:43,810 --> 00:02:47,080
Go through the workbook yourself and
experiment with different neural network

47
00:02:47,080 --> 00:02:49,430
definitions, changing
around the layers and

48
00:02:49,430 --> 00:02:51,362
stuff like that to see if
we can make it even better.

49
00:02:52,422 --> 00:02:56,582
Next week we're going to take this to the
next level by using neural network types

50
00:02:56,582 --> 00:03:00,422
like recurrent neural networks, which
have sequencing capabilities built in.

51
00:03:00,422 --> 00:03:01,592
I'll see you there.