1
00:00:00,470 --> 00:00:10,370
Now what we need to do is define the optimizers, that is, the optimization scheme or method.

2
00:00:10,370 --> 00:00:18,740
And in order to do that, first we have to do something we didn't do before, which is assigning

3
00:00:18,740 --> 00:00:23,600
the data to a device. So, self dot

4
00:00:24,430 --> 00:00:29,350
self.X dot to(device).

5
00:00:29,560 --> 00:00:34,320
And here we're going to do it again and again.

6
00:00:34,330 --> 00:00:41,470
The first one is the whole X, which is the location in x and time.

7
00:00:42,100 --> 00:00:43,750
And then y_train.

8
00:00:44,480 --> 00:00:45,350
And.

9
00:00:47,620 --> 00:00:49,390
X_train.

10
00:00:51,580 --> 00:01:01,450
All of them have to go to the device. And regarding X, self.X will require

11
00:01:03,320 --> 00:01:04,490
Requires.

12
00:01:06,910 --> 00:01:07,810
Gradient.

13
00:01:10,050 --> 00:01:13,190
Which is important for our training process.

14
00:01:13,200 --> 00:01:20,880
Of course, y_train and X_train will not require gradients, because they are simply data points.
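
The device assignment described above can be sketched roughly as follows. This is a minimal illustration, not the lecture's actual class: the tensor shapes and random values are hypothetical stand-ins, and plain variables are used instead of self attributes.

```python
import torch

# Use a GPU when one is available, otherwise fall back to the CPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Hypothetical stand-in data: columns are (x, t).
X = torch.rand(100, 2)        # all collocation points
X_train = torch.rand(20, 2)   # training inputs (plain data)
y_train = torch.zeros(20, 1)  # training targets (plain data)

# Move everything to the same device.
X = X.to(device)
X_train = X_train.to(device)
y_train = y_train.to(device)

# Only X needs gradients; X_train and y_train stay as plain data.
# Setting the flag after .to(device) keeps X a leaf tensor.
X.requires_grad_(True)
```

Setting the flag after the move is a deliberate choice in this sketch: calling .to(device) on a tensor that already requires gradients can return a non-leaf copy on a different device.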

15
00:01:20,970 --> 00:01:21,870
So.

16
00:01:23,230 --> 00:01:23,840
Here.

17
00:01:23,890 --> 00:01:30,670
We have finished assigning these variables to the device, and now let's do the optimizer

18
00:01:30,670 --> 00:01:31,400
thing.

19
00:01:31,420 --> 00:01:34,690
So here is the optimization.

20
00:01:37,960 --> 00:01:44,410
Or the optimizer setting, whatever you want to call it.

21
00:01:44,980 --> 00:01:48,010
And we start with self dot

22
00:01:49,840 --> 00:01:50,580
Adam.

23
00:01:53,070 --> 00:01:57,180
Equals torch dot optim.

24
00:01:58,410 --> 00:01:59,100
Dot.

25
00:02:00,410 --> 00:02:00,830
Adam.

26
00:02:01,610 --> 00:02:02,210
Adam.

27
00:02:04,150 --> 00:02:11,230
And self dot model dot parameters.

28
00:02:12,580 --> 00:02:17,290
So this is the first optimizer, which is Adam.

29
00:02:17,470 --> 00:02:21,940
The second optimizer is, well, self dot

30
00:02:23,070 --> 00:02:27,360
optimizer, we call it, and torch dot

31
00:02:27,770 --> 00:02:28,770
optim

32
00:02:29,010 --> 00:02:30,730
dot

33
00:02:31,440 --> 00:02:33,120
LBFGS.

34
00:02:33,840 --> 00:02:43,770
And basically, what we will do is start with Adam to optimize the whole

35
00:02:43,770 --> 00:02:44,520
network.

36
00:02:44,560 --> 00:02:46,740
A rough optimization of the network.

37
00:02:46,740 --> 00:02:56,310
And after Adam, we are going to be more exact and tweak these weights to get a

38
00:02:56,310 --> 00:02:57,690
more accurate number.

39
00:02:57,690 --> 00:03:09,540
So this is what we usually use, and this kind of optimizer

40
00:03:11,720 --> 00:03:16,190
is good for limited memory.

41
00:03:16,190 --> 00:03:19,670
So this is what we are usually going to use.

42
00:03:19,670 --> 00:03:27,160
And it's called Broyden-Fletcher-Goldfarb-Shanno.

43
00:03:27,170 --> 00:03:30,560
So this is the type of optimizer.

44
00:03:30,560 --> 00:03:37,880
And then we have to pass some parameters, or some

45
00:03:42,090 --> 00:03:43,800
values, to this optimizer.

46
00:03:43,860 --> 00:03:46,650
I think it's better just to put it here.

47
00:03:47,310 --> 00:03:51,270
So the first one is we need the parameters.

48
00:03:51,460 --> 00:03:54,630
I think we have to put it here this way.

49
00:03:55,390 --> 00:04:01,960
And again, here we put the same parameters and then we put the learning rate.

50
00:04:02,000 --> 00:04:03,730
Why it might move.

51
00:04:04,590 --> 00:04:07,920
Learning rate equals one.

52
00:04:08,830 --> 00:04:09,820
And.

53
00:04:11,070 --> 00:04:12,060
It should.

54
00:04:12,060 --> 00:04:13,710
I don't want it like this.

55
00:04:13,710 --> 00:04:14,420
I want.

56
00:04:14,430 --> 00:04:15,390
Yeah, this one.

57
00:04:17,600 --> 00:04:18,830
This is no problem.

58
00:04:19,250 --> 00:04:24,710
So the learning rate is one, and maximum

59
00:04:25,840 --> 00:04:27,160
Iteration.

60
00:04:27,850 --> 00:04:33,520
Iteration, well, equals 50,000.

61
00:04:34,510 --> 00:04:35,470
And.

62
00:04:36,430 --> 00:04:37,510
Maximum.

63
00:04:37,750 --> 00:04:46,900
These are just values that you need to consider, some parameters that have to be set for this

64
00:04:48,410 --> 00:04:49,190
Optimizer.

65
00:04:53,980 --> 00:04:55,090
History.

66
00:04:57,530 --> 00:04:58,370
History.

67
00:05:00,240 --> 00:05:03,060
Size is going to be 50.

68
00:05:04,720 --> 00:05:06,310
Tolerance.

69
00:05:09,780 --> 00:05:15,330
grad equals 1e-7.

70
00:05:19,710 --> 00:05:20,820
Tolerance

71
00:05:22,770 --> 00:05:24,840
Change equals.

72
00:05:28,780 --> 00:05:30,940
This is also a syntax: np

73
00:05:32,440 --> 00:05:33,070
dot

74
00:05:34,440 --> 00:05:35,160
finfo,

75
00:05:36,280 --> 00:05:37,210
float,

76
00:05:39,580 --> 00:05:42,020
dot eps.

77
00:05:43,890 --> 00:05:47,760
And the line search.

78
00:05:52,640 --> 00:05:53,480
fn,

79
00:05:57,840 --> 00:05:59,220
strong Wolfe.

80
00:05:59,250 --> 00:06:00,030
Wolfe.

81
00:06:04,330 --> 00:06:09,190
I think that's it for this one.
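
Putting the dictated values together, the two optimizer definitions might look like the sketch below. The small network is a hypothetical stand-in for self.model, and the minus sign in 1e-7 is an assumption (a gradient tolerance of 1e7 would make little sense). The unfinished "maximum" argument from the recording is left out, so PyTorch's default applies.

```python
import numpy as np
import torch

# Hypothetical stand-in for self.model: inputs (x, t), one output.
model = torch.nn.Sequential(
    torch.nn.Linear(2, 20),
    torch.nn.Tanh(),
    torch.nn.Linear(20, 1),
)

# First optimizer: Adam, for the rough optimization.
adam = torch.optim.Adam(model.parameters())

# Second optimizer: L-BFGS, for the fine tuning afterwards.
optimizer = torch.optim.LBFGS(
    model.parameters(),
    lr=1.0,                 # learning rate of one
    max_iter=50000,         # maximum iterations per .step() call
    history_size=50,        # number of past updates kept in memory
    tolerance_grad=1e-7,    # assumed value, see note above
    tolerance_change=1.0 * np.finfo(float).eps,  # machine epsilon
    line_search_fn="strong_wolfe",
)
```

Both optimizers receive the same model.parameters(), so they update the same weights one after the other.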

82
00:06:10,190 --> 00:06:12,230
And here we should put it like this.

83
00:06:13,460 --> 00:06:21,650
Okay, so we start with Adam, we optimize with Adam, and then we use the L-BFGS optimizer.

84
00:06:21,650 --> 00:06:23,960
It has many parameters.

85
00:06:23,960 --> 00:06:30,260
We have to look up what these parameters are and then change them based on whatever is needed, and of course

86
00:06:30,260 --> 00:06:31,580
based on the results we have.
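
As a rough illustration of the Adam-then-L-BFGS idea, here is a toy two-stage fit. Everything in it is hypothetical (the tiny network, the sine target, the iteration counts, the Adam learning rate), chosen only so the sketch runs quickly. It is not the training loop from this course.

```python
import torch

torch.manual_seed(0)

# Hypothetical toy problem: fit y = sin(3x) on [-1, 1].
model = torch.nn.Sequential(
    torch.nn.Linear(1, 16),
    torch.nn.Tanh(),
    torch.nn.Linear(16, 1),
)
x = torch.linspace(-1, 1, 64).unsqueeze(1)
y = torch.sin(3 * x)


def loss_fn():
    # Mean squared error between the prediction and the target.
    return torch.mean((model(x) - y) ** 2)


# Stage 1: rough optimization with Adam.
adam = torch.optim.Adam(model.parameters(), lr=1e-2)
for _ in range(200):
    adam.zero_grad()
    loss = loss_fn()
    loss.backward()
    adam.step()
loss_after_adam = loss_fn().item()

# Stage 2: fine tuning with L-BFGS (small max_iter to keep this fast).
lbfgs = torch.optim.LBFGS(
    model.parameters(),
    lr=1.0,
    max_iter=200,
    history_size=50,
    line_search_fn="strong_wolfe",
)


def closure():
    # L-BFGS re-evaluates the loss several times per step,
    # so it needs a function that recomputes loss and gradients.
    lbfgs.zero_grad()
    loss = loss_fn()
    loss.backward()
    return loss


lbfgs.step(closure)
loss_after_lbfgs = loss_fn().item()
```

Unlike Adam, a single lbfgs.step(closure) call runs up to max_iter inner iterations, which is why the second stage has no explicit Python loop here.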
