In the previous video, you got a look at RNNs and how they can be used for sequence-to-vector or sequence-to-sequence
prediction. Let's now take a look at coding them for
the problem at hand and seeing if we can get good predictions in our
time series using them. One thing you'll see in
the rest of the lessons going forward is that
I like to write a little bit of code to find a good learning rate
for the optimizer. It can be pretty quick to
train, and from there we can save a lot of time in
our hyperparameter tuning. So here's the code
for training the RNN with two layers
each with 40 cells. To tune the learning rate, we'll set up a callback,
which you can see here. Every epoch this just changes the learning rate a
little so that it steps all the way from 1 times 10 to the minus 8 to 1 times
10 to the minus 6. You can see that setup
here while training. I've also introduced
a new loss function called Huber, which
you can see here. The Huber function is a loss function that's
less sensitive to outliers than squared error, and as this data
can get a little bit noisy, it's worth giving it a shot. If I run this for 100 epochs and measure the loss at each epoch, I will see that
my optimum learning rate for stochastic gradient
descent is between about 10 to the minus 5
and 10 to the minus 6. So I'm going to set it to 5
times 10 to the minus 5. So now, I'll compile
my model with that learning rate and the stochastic gradient
descent optimizer. After training for 500 epochs, I will get this chart, with an MAE on the validation
set of about 6.35. It's not bad, but I wonder
if we can do better. So here's the loss and
the MAE during training, with the chart on
the right zoomed into the last few epochs. As you can see,
the trend was genuinely downward until a little
after 400 epochs, when it started getting unstable. Given this, it's probably worth only training for
about 400 epochs. When I do that, I get these results. That's pretty much the same, with the MAE only a tiny
little bit higher, but we've saved 100 epochs' worth of training to get
it. So it's worth it. A quick look at the training MAE and loss gives us these results. So we've done quite well, and that was just
using a simple RNN. Let's see how we can
improve this with LSTMs and you'll see
that in the next video.
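
To picture what "two layers each with 40 cells" means, here is a small pure-Python sketch of two stacked simple recurrent layers. It is not the code shown in the video: the helper names, the random initialisation, and the 20-step window are all assumptions made for illustration, and no training or output layer is included.

```python
import math
import random


def make_rnn_layer(input_dim, units, rng):
    """Create random weights for one simple recurrent layer (illustrative init)."""
    scale = 1.0 / math.sqrt(units)
    return {
        "w_in": [[rng.uniform(-scale, scale) for _ in range(units)]
                 for _ in range(input_dim)],
        "w_rec": [[rng.uniform(-scale, scale) for _ in range(units)]
                  for _ in range(units)],
        "bias": [0.0] * units,
    }


def run_rnn_layer(layer, sequence):
    """Apply h_t = tanh(x_t W_in + h_{t-1} W_rec + b) and return every h_t."""
    units = len(layer["bias"])
    state = [0.0] * units  # the hidden state starts at zero
    outputs = []
    for x in sequence:
        new_state = []
        for j in range(units):
            total = layer["bias"][j]
            total += sum(x[i] * layer["w_in"][i][j] for i in range(len(x)))
            total += sum(state[k] * layer["w_rec"][k][j] for k in range(units))
            new_state.append(math.tanh(total))
        state = new_state
        outputs.append(state)
    return outputs


def count_params(layer):
    """Trainable values in one layer: input weights, recurrent weights, biases."""
    units = len(layer["bias"])
    return len(layer["w_in"]) * units + units * units + units


rng = random.Random(0)
layer_1 = make_rnn_layer(input_dim=1, units=40, rng=rng)   # reads the raw series
layer_2 = make_rnn_layer(input_dim=40, units=40, rng=rng)  # reads layer 1's outputs

window = [[0.1 * t] for t in range(20)]      # hypothetical window of 20 time steps
hidden_1 = run_rnn_layer(layer_1, window)    # 20 steps, 40 values per step
hidden_2 = run_rnn_layer(layer_2, hidden_1)  # 20 steps, 40 values per step
```

The first layer has to emit an output at every time step so that the second layer has a sequence to read. Whether the network then uses only the last step of `hidden_2` or all of them is the sequence-to-vector versus sequence-to-sequence choice.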
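
The learning-rate sweep described above can be sketched as a plain function of the epoch number. The transcript only says the rate changes "a little" each epoch, so the geometric spacing here is an assumption, and `sweep_learning_rate` is a hypothetical name. The start, end, and epoch count are the figures quoted in the video.

```python
def sweep_learning_rate(epoch, start=1e-8, end=1e-6, epochs=100):
    """Learning rate for a 0-based epoch, rising geometrically from start to end."""
    # Assumption: each epoch multiplies the rate by the same constant factor.
    growth = (end / start) ** (1.0 / (epochs - 1))
    return start * growth ** epoch


# One learning rate per epoch of the sweep.
schedule = [sweep_learning_rate(epoch) for epoch in range(100)]
```

Plotting the loss recorded at each epoch against `schedule` gives the kind of chart used to pick a learning rate by eye. In Keras, a function of this shape is typically wired in through the `LearningRateScheduler` callback.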
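
To see why the Huber loss is less sensitive to outliers, here is a minimal sketch of its standard definition. The threshold `delta=1.0` is an assumption, since the video does not state one, and the sample values are made up.

```python
def huber(error, delta=1.0):
    """Huber loss for one error: quadratic when small, linear when large."""
    abs_error = abs(error)
    if abs_error <= delta:
        return 0.5 * error * error
    return delta * (abs_error - 0.5 * delta)


def mean_huber(y_true, y_pred, delta=1.0):
    """Average Huber loss over paired true and predicted values."""
    return sum(huber(t - p, delta) for t, p in zip(y_true, y_pred)) / len(y_true)


small = huber(0.5)             # inside the threshold, same as 0.5 * error ** 2
large = huber(3.0)             # outside the threshold, grows linearly
squared_large = 0.5 * 3.0 ** 2  # what a squared-error loss would charge instead
```

An error of 3.0 costs 2.5 under Huber but 4.5 under half squared error, and that gap widens as the error grows, so a single noisy point pulls the weights around less.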
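
The MAE figures quoted for the validation set are mean absolute errors between the forecast and the actual series. As a rough sketch, with a hypothetical helper name and made-up numbers rather than the course data:

```python
def mean_absolute_error(y_true, y_pred):
    """Average of |actual - forecast| over paired values."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


actual = [3.0, 5.0, 8.0]    # made-up validation values
forecast = [2.0, 7.0, 8.0]  # made-up model predictions
mae = mean_absolute_error(actual, forecast)
```

Because MAE is in the same units as the series, it is easy to compare two training runs directly, such as the 500-epoch and 400-epoch runs.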