1
00:00:01,530 --> 00:00:07,590
OK, back to our MATLAB. Let's just not change them for now.

2
00:00:07,750 --> 00:00:13,020
Later, I'm going to change validation and testing, but for now it's OK, 70 percent for training,

3
00:00:13,020 --> 00:00:21,760
15 percent of samples, which here is three samples for validation and 15 percent for testing.
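
A minimal Python sketch of the random 70/15/15 division described here, assuming the 20 samples of this example. The name `divide_random` and its parameters are hypothetical illustrations, not the tool's own functions.

```python
import random

def divide_random(n_samples, val_ratio=0.15, test_ratio=0.15, seed=0):
    """Randomly split sample indices into training, validation and test sets.

    Validation and test sizes are rounded from their ratios; training
    receives whatever is left (70 percent with the default ratios).
    """
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # random data division
    n_val = round(n_samples * val_ratio)
    n_test = round(n_samples * test_ratio)
    n_train = n_samples - n_val - n_test
    train = indices[:n_train]
    val = indices[n_train:n_train + n_val]
    test = indices[n_train + n_val:]
    return train, val, test

train, val, test = divide_random(20)
print(len(train), len(val), len(test))  # prints 14 3 3
```

With 20 samples, 15 percent gives three samples each for validation and testing, and the remaining 14 go to training.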

4
00:00:22,110 --> 00:00:28,530
Now, just click on Next. In this part, we are going to adjust the number of neurons in the hidden layer.

5
00:00:28,650 --> 00:00:36,660
But the question is why we can't change the number of neurons in the output layer, why we only have the option

6
00:00:36,660 --> 00:00:40,440
to change the number of neurons in a hidden layer.

7
00:00:40,710 --> 00:00:43,710
The answer is actually pretty obvious.

8
00:00:44,250 --> 00:00:51,510
If you have more than one neuron in our output layer, it means we will have more than one output.

9
00:00:51,780 --> 00:00:55,000
And the question is why do we need more than one output?

10
00:00:55,020 --> 00:00:56,250
We have only one output.

11
00:00:56,250 --> 00:01:00,960
If we choose two neurons, it will give us two outputs, but we have only one output.

12
00:01:00,990 --> 00:01:06,640
So in this structure, we only need to have one neuron for the output layer.

13
00:01:06,990 --> 00:01:09,180
However, it's different for the hidden layer.

14
00:01:09,180 --> 00:01:15,270
We can change it to different numbers for our training, and I will show you the result with a different

15
00:01:16,140 --> 00:01:19,570
number of hidden neurons in the hidden layer.

16
00:01:20,160 --> 00:01:22,210
So let's just click Next.

17
00:01:22,560 --> 00:01:24,950
Now the network is ready to be trained.

18
00:01:25,230 --> 00:01:30,000
We just divided the samples into training, validation and testing.

19
00:01:30,150 --> 00:01:33,780
Here we can choose the training algorithm.

20
00:01:33,780 --> 00:01:35,190
By default, it is Levenberg-Marquardt.

21
00:01:35,190 --> 00:01:35,990
We can change it.

22
00:01:36,270 --> 00:01:38,570
Let's take a look at this explanation.

23
00:01:39,060 --> 00:01:47,130
This algorithm typically requires more memory but less time. Training automatically stops when generalization

24
00:01:47,130 --> 00:01:55,520
stops improving, as indicated by an increase in the mean square error of the validation samples.

25
00:01:55,800 --> 00:02:02,100
So it really depends on the type of data and samples that we have, but we have other options here.

26
00:02:02,100 --> 00:02:05,220
We have this one, Bayesian Regularization, here.

27
00:02:05,580 --> 00:02:13,440
This algorithm typically requires more time, but can result in good generalization for difficult,

28
00:02:13,500 --> 00:02:15,940
small or noisy data sets.

29
00:02:16,230 --> 00:02:24,240
So again, it's really important to know the type of data set and to understand what each of them is

30
00:02:24,240 --> 00:02:27,920
for and then choose the best one to fit your data.

31
00:02:28,120 --> 00:02:35,280
The next one is scaled conjugate gradient. This algorithm requires less memory. Training automatically stops when

32
00:02:35,280 --> 00:02:40,670
generalization stops improving, as indicated by an increase in the mean square

33
00:02:40,680 --> 00:02:47,790
error. In the matter of generalization, this is just like Levenberg-Marquardt, but using less memory.

34
00:02:48,660 --> 00:02:50,460
I'm going to just set it to the first one.

35
00:02:50,730 --> 00:02:51,420
Let's start

36
00:02:51,420 --> 00:02:52,830
training. We can train.

37
00:02:53,730 --> 00:02:57,330
It will take a little time based on the data that you have.

38
00:02:57,610 --> 00:02:59,610
Our network now is training.

39
00:03:00,150 --> 00:03:07,020
Sometimes if you have lots of data, it might take even several hours to train the network.

40
00:03:07,500 --> 00:03:10,590
But for now I have only 20 samples here.

41
00:03:10,590 --> 00:03:13,920
This is the neural network training information.

42
00:03:13,920 --> 00:03:18,840
It just gives us some information about the training that we had.

43
00:03:19,290 --> 00:03:22,140
We can call it any time in programming

44
00:03:22,140 --> 00:03:25,500
with the command nntraintool.

45
00:03:25,860 --> 00:03:28,200
This is some information about the algorithm.

46
00:03:28,200 --> 00:03:30,060
The data division was random.

47
00:03:30,300 --> 00:03:35,880
It just selects the data randomly, and this is what we use most of the time.

48
00:03:36,090 --> 00:03:42,990
Unless you have some data, for example, you have a parameter and it's very important for you to select

49
00:03:42,990 --> 00:03:48,000
three samples from the beginning, three samples from the end and three samples from the middle.

50
00:03:48,150 --> 00:03:55,800
You can alter it in programming, but by default it is random for selecting the data. For training,

51
00:03:55,800 --> 00:03:59,520
it just uses Levenberg-Marquardt. For the performance,

52
00:03:59,850 --> 00:04:05,250
it's using mean squared error, and the calculation is just using MEX.

53
00:04:06,150 --> 00:04:08,450
Let's take a look at the progress section.

54
00:04:09,360 --> 00:04:17,580
The epoch is just the number of times that it tries to train the network using the algorithm which we

55
00:04:17,580 --> 00:04:19,290
chose. Each time,

56
00:04:19,290 --> 00:04:27,300
it will change the weights based on this algorithm and then conduct the training process one more time.

57
00:04:27,480 --> 00:04:32,010
It can go from zero to 1000 times.

58
00:04:32,250 --> 00:04:37,350
It will stop the training process if it reaches one thousand epochs.

59
00:04:37,500 --> 00:04:41,100
The next one is about the time. We didn't set any time.

60
00:04:41,100 --> 00:04:45,870
We can do it in programming, but it just took two seconds.

61
00:04:46,290 --> 00:04:54,180
If you have big data, you can adjust this time and you can, for example, say: train for three minutes,

62
00:04:54,600 --> 00:04:59,790
and after it reaches three minutes, stop the training process. Or just

63
00:04:59,910 --> 00:05:06,250
limit it to several hours. If you have a complicated system and it requires lots of time for training,

64
00:05:06,270 --> 00:05:07,680
you can just limit the time.

65
00:05:07,890 --> 00:05:14,160
But in our example, and in most examples, we actually don't need to set the time, and it's going

66
00:05:14,160 --> 00:05:15,540
to be less than five minutes.

67
00:05:16,620 --> 00:05:20,280
The performance shows the performance, the error.

68
00:05:20,610 --> 00:05:26,850
If the error reaches zero, then it stops the training, because that's the best solution that we are looking

69
00:05:26,850 --> 00:05:27,180
for.

70
00:05:27,180 --> 00:05:28,660
Zero for the error.

71
00:05:29,490 --> 00:05:30,930
Next one is the gradient.

72
00:05:31,950 --> 00:05:37,720
It just stops the training whenever the gradient reaches 1e-7.

73
00:05:39,060 --> 00:05:40,850
But why not to zero?

74
00:05:41,220 --> 00:05:48,510
Because we know that, for the derivative, if the derivative reaches a flat point, then that's it.

75
00:05:48,810 --> 00:05:52,430
We need to stop the training, and that's the result that we are looking for.

76
00:05:53,550 --> 00:05:57,750
Here we have Mu, and the last one is validation checks.

77
00:05:58,110 --> 00:06:04,440
It has a different color, it's green, and it has reached the maximum validation checks.

78
00:06:04,890 --> 00:06:08,490
Each time the validation fails, we will give it one.

79
00:06:09,180 --> 00:06:17,520
And then if six times we have a failed result for the quizzes that we are conducting during the training,

80
00:06:17,760 --> 00:06:24,420
then there is no point in continuing this training, and we are just getting an overtrained result.

81
00:06:24,570 --> 00:06:30,870
So we have to stop the training. Whichever item reaches its limit

82
00:06:30,960 --> 00:06:37,110
first is going to stop the training process, and it will be shown in green here.
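
A rough Python sketch of the stopping rules walked through here: epochs, time, performance, gradient and validation checks, where whichever limit is reached first ends the training. The names `train_with_stopping`, `step` and `toy_step` are hypothetical and only illustrate the idea; they are not the tool's own code. The default limits mirror the values mentioned in this walkthrough (1000 epochs, no time limit, zero error, 1e-7 gradient, six validation checks).

```python
import time

def train_with_stopping(step, max_epochs=1000, max_time=float("inf"),
                        goal=0.0, min_grad=1e-7, max_fail=6):
    """Call step(epoch) until the first stopping limit is reached.

    `step` is a hypothetical callback that performs one training pass and
    returns (training_error, gradient_size, validation_error).
    Returns (last_epoch, reason_for_stopping).
    """
    start = time.monotonic()
    best_val = float("inf")
    fails = 0  # validation checks: passes since the best validation error
    for epoch in range(1, max_epochs + 1):
        train_err, grad, val_err = step(epoch)
        if val_err < best_val:
            best_val, fails = val_err, 0   # validation improved, reset counter
        else:
            fails += 1                     # validation failed, count one
        if train_err <= goal:
            return epoch, "performance goal"
        if grad <= min_grad:
            return epoch, "minimum gradient"
        if fails >= max_fail:
            return epoch, "validation checks"
        if time.monotonic() - start >= max_time:
            return epoch, "time limit"
    return max_epochs, "maximum epochs"

def toy_step(epoch):
    """Made-up numbers: validation error is lowest at epoch 5, then rises."""
    train_err = 1.0 / epoch
    grad = 1.0 / epoch
    val_err = abs(epoch - 5) + 1.0
    return train_err, grad, val_err

print(train_with_stopping(toy_step))  # prints (11, 'validation checks')
```

In this toy run the validation error gets worse for six passes in a row after epoch 5, so the validation checks limit is the first one reached and training stops at epoch 11.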

83
00:06:37,110 --> 00:06:39,390
We can also see here validation is the.
