1
00:00:00,510 --> 00:00:05,910
So you see, we are almost at the end of our notebook here, because the only thing left to do is we

2
00:00:05,910 --> 00:00:13,560
have to call the error fit gradient function over and over again in a loop and then update our values

3
00:00:13,980 --> 00:00:17,400
to do the actual fitting of our model function.

4
00:00:18,530 --> 00:00:19,230
So let's do it.

5
00:00:20,040 --> 00:00:25,220
I will define here the number of iterations to be maybe 1000.

6
00:00:25,230 --> 00:00:29,550
I think we need quite a few of those, several thousand, but let's start with one thousand.

7
00:00:30,060 --> 00:00:35,970
And then we must define some update rate. And to be honest, looking at these errors here:

8
00:00:36,480 --> 00:00:41,480
I know already that this is the correct set of parameters.

9
00:00:41,490 --> 00:00:49,290
But even if it's correct, the gradient will never be zero, which is due to the random numbers that

10
00:00:49,290 --> 00:00:49,970
we have added.

11
00:00:50,010 --> 00:00:54,120
So the fit is never really perfect, so we will always have some error and some gradient.

12
00:00:55,020 --> 00:00:58,350
So these will be the smallest gradients that we will encounter.

13
00:00:59,220 --> 00:01:07,890
So if these are the smallest gradients, then such values shouldn't really affect a0 at all.

14
00:01:07,900 --> 00:01:14,970
So we need a value of h that is much smaller than the reciprocal of these values, so much smaller

15
00:01:14,970 --> 00:01:15,660
than this one.

16
00:01:16,590 --> 00:01:17,640
So I don't know.

17
00:01:17,640 --> 00:01:23,280
Maybe I will just add two more zeros here, so we will have to test and see if this works or not.

18
00:01:24,720 --> 00:01:28,080
And then we will start with some values for our a.

19
00:01:28,650 --> 00:01:34,530
And as I said previously, we will of course not use a0 because, yeah, that would be stupid,

20
00:01:34,530 --> 00:01:36,720
because then we would have the perfect fit right away.

21
00:01:37,170 --> 00:01:39,930
We have to start from some random values.

22
00:01:40,380 --> 00:01:46,830
We could, of course, guess something: first of all, fit this function by eye and then improve the

23
00:01:46,830 --> 00:01:48,300
fit by using this algorithm.

24
00:01:48,840 --> 00:01:55,020
But here I want to do it in a very brute-force way and just say, OK, I use two times np dot

25
00:01:55,590 --> 00:02:00,270
random dot rand of four, minus one.

26
00:02:01,170 --> 00:02:03,390
So this means a is now a vector.

27
00:02:03,600 --> 00:02:10,530
It has four components for the four coefficients, and all the values are in the range between minus one

28
00:02:10,530 --> 00:02:10,979
and one.

29
00:02:11,190 --> 00:02:12,720
And they are randomly selected.
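
A minimal sketch of the random start described above, assuming NumPy is imported as np. np.random.rand(4) draws four uniform samples in [0, 1), so scaling by two and subtracting one maps them to [-1, 1).

```python
import numpy as np

# Four uniform random numbers in [0, 1), rescaled to [-1, 1):
# one starting value per polynomial coefficient.
a = 2 * np.random.rand(4) - 1
print(a)  # different values on every run
```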

30
00:02:12,780 --> 00:02:18,060
For example, we can check: these are now the starting points for our a coefficients.

31
00:02:19,080 --> 00:02:22,980
And when I run this once again, of course, then the starting parameters are different.

32
00:02:24,510 --> 00:02:25,170
All right.

33
00:02:25,830 --> 00:02:34,430
So now the only thing that we have to do is we have to write a loop: for i in range of iterations.

34
00:02:36,810 --> 00:02:42,510
Now, the only thing that we have to do is we have to update the values for a: a is the old a.

35
00:02:42,510 --> 00:02:48,390
And then I told you that we will go along the opposite direction of the gradient and we will multiply

36
00:02:48,390 --> 00:02:49,320
by the step width.

37
00:02:49,890 --> 00:02:53,130
And then we will just update this with this error.

38
00:02:53,950 --> 00:02:58,790
And now, of course, we have to use our parameters a and not a0.

39
00:02:59,850 --> 00:03:01,410
And I think this is already it.
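
The loop described above could look roughly like the following sketch. The names polynomial_model, error_fit_gradient, h and data, the mean-squared error, the analytic gradient, the x range, the noise level and all coefficients except the leading 15 are assumptions made for illustration, not the notebook's actual definitions.

```python
import numpy as np

def polynomial_model(a, x):
    # a[0] + a[1]*x + a[2]*x**2 + ... (lowest order first)
    return sum(a_j * x**j for j, a_j in enumerate(a))

def error_fit_gradient(model, a, data):
    # Gradient of the mean squared error with respect to each coefficient.
    # This analytic form is only valid for the polynomial model above.
    x, y = data
    residual = model(a, x) - y
    return np.array([2 * np.mean(residual * x**j) for j in range(len(a))])

rng = np.random.default_rng(0)
a0 = np.array([15.0, 2.0, -1.0, 0.5])   # illustrative "true" coefficients
x = np.linspace(-2, 2, 50)
data = (x, polynomial_model(a0, x) + rng.normal(0, 1, x.size))

iterations = 5000
h = 0.01                                 # step width (update rate)
a = 2 * rng.random(4) - 1                # random start in [-1, 1)
print(a)
for i in range(iterations):
    # step against the gradient, scaled by the step width
    a = a - h * error_fit_gradient(polynomial_model, a, data)
print(a)
```

With these assumed settings the step width is small enough for the loop to stay stable; a larger h or a wider x range may make the updates diverge.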

40
00:03:01,860 --> 00:03:07,460
So when we run this, then we should end up with some parameters.

41
00:03:07,560 --> 00:03:15,750
And so we could, for example, say here at the starting point, print a, and then at the end print a

42
00:03:15,750 --> 00:03:16,140
again.

43
00:03:19,200 --> 00:03:22,320
So you see the A values have been updated.

44
00:03:23,900 --> 00:03:24,410
All right.

45
00:03:24,740 --> 00:03:31,160
So we do not really know if these are good values yet, but we could check this by comparing a with

46
00:03:31,160 --> 00:03:35,720
a0, because we know that the a0 values are the good values.

47
00:03:36,710 --> 00:03:39,050
And it doesn't look good yet.

48
00:03:39,530 --> 00:03:43,610
So probably we have to increase the number of iterations.

49
00:03:44,180 --> 00:03:45,890
Let's go with 10000.

50
00:03:46,580 --> 00:03:48,740
So you see, it takes a moment now,

51
00:03:52,250 --> 00:03:56,810
and this time at least the last two coefficients fit quite well.

52
00:03:57,350 --> 00:04:00,770
And maybe this is a bit surprising for you now.

53
00:04:00,770 --> 00:04:07,520
But the last two coefficients are the most important ones because they determine the highest order polynomial

54
00:04:07,520 --> 00:04:07,980
terms.

55
00:04:08,030 --> 00:04:10,790
So this would be the coefficient for X to the power of three.

56
00:04:11,300 --> 00:04:12,860
This one for x squared.

57
00:04:13,160 --> 00:04:18,079
And of course, these will determine the shape of the function, especially for large X values.

58
00:04:18,769 --> 00:04:23,720
So it's really, really good that we see that these values are quite close to the actual values.

59
00:04:24,710 --> 00:04:27,170
So maybe just go to 100K.

60
00:04:27,860 --> 00:04:31,640
So you see here it runs for a few seconds now.

61
00:04:31,640 --> 00:04:37,580
And maybe if your PC is slower than mine, this could take quite a few seconds.

62
00:04:38,450 --> 00:04:39,770
Let's see what's happening here.

63
00:04:43,890 --> 00:04:52,740
Actually, I could leave this here because we print the values anyway, so this time, let's see.

64
00:04:53,280 --> 00:04:54,090
Um.

65
00:04:54,840 --> 00:05:04,020
Yeah, it didn't really get much better, but I would say, let's just look at the plots because it's

66
00:05:04,020 --> 00:05:06,150
hard to say just by comparing the parameters.

67
00:05:06,690 --> 00:05:12,930
So even though these are the parameters that we have used for creating the data points, they do not

68
00:05:12,930 --> 00:05:18,870
necessarily have to be the perfect parameters for the fit because we have added this random noise.

69
00:05:19,620 --> 00:05:23,280
So I would say, let's just go ahead and plot the function.

70
00:05:24,150 --> 00:05:25,810
So let me copy this one.

71
00:05:26,880 --> 00:05:37,350
And what we will do is we will use a plot here, so we will use plt; we will plot the actual line,

72
00:05:37,860 --> 00:05:43,940
and we will plot, of course, the function with our fitted coefficients.

73
00:05:44,460 --> 00:05:51,030
And then in the background, we will use a scatter plot, sorry, using the data points.

74
00:05:51,590 --> 00:05:53,310
We will use data zero,

75
00:05:53,310 --> 00:05:55,620
comma, data one.

76
00:05:57,570 --> 00:05:59,610
And you see, it looks really good.

77
00:06:00,090 --> 00:06:04,270
So even though, as I said, the parameters do not agree perfectly.

78
00:06:05,170 --> 00:06:11,880
The fit looks good, and this could be, for example, for the following reason: this here is the

79
00:06:11,880 --> 00:06:14,460
zeroth-order term, and this is just a shift in the y direction.

80
00:06:14,970 --> 00:06:24,450
So actually, our fit is shifted by 18 units to the top compared to the function that we have used to

81
00:06:24,450 --> 00:06:25,260
create the data.

82
00:06:25,800 --> 00:06:28,830
But the linear term is also quite different.

83
00:06:28,830 --> 00:06:30,960
So maybe this compensates the whole thing.

84
00:06:32,010 --> 00:06:41,310
So what we could do is we could also add here the other fit or not the other fit, but the function

85
00:06:41,310 --> 00:06:44,370
that we have used to create the data and.

86
00:06:46,320 --> 00:06:48,510
Now, I'm a bit surprised.

87
00:06:49,050 --> 00:06:55,740
I just stopped the recording because I was surprised that the orange function that we have just plotted

88
00:06:55,740 --> 00:06:59,970
doesn't really reproduce our data points, so something was off here.

89
00:07:00,450 --> 00:07:03,330
I went back to the very start of this section.

90
00:07:03,330 --> 00:07:08,280
So where we have started with the interpolation and generated the data, I have seen that here.

91
00:07:08,280 --> 00:07:12,480
These are actually the coefficients and the first coefficient is actually 15.

92
00:07:12,930 --> 00:07:19,200
And for some reason, I redefined here, let me search for it.

93
00:07:20,600 --> 00:07:30,050
Here in 3.2.3.1, I redefined a0 to be minus two for some reason, so that was, of course,

94
00:07:30,050 --> 00:07:30,410
wrong.

95
00:07:30,770 --> 00:07:37,730
So let's update it to 15, and then, let's say, let's rerun the whole notebook: restart and run

96
00:07:37,730 --> 00:07:37,930
all.

97
00:07:39,620 --> 00:07:43,620
And now, of course, it takes a few seconds.

98
00:07:43,640 --> 00:07:49,880
Actually, it's very fast, except for the very last thing, because here we will have to loop 100000

99
00:07:49,880 --> 00:07:50,420
times.

100
00:07:50,550 --> 00:07:51,470
This takes a while.

101
00:07:53,880 --> 00:08:01,390
And then we can compare the plots and we can compare the coefficients, this time in an accurate manner.

102
00:08:01,410 --> 00:08:04,380
So sorry for having made this mistake.

103
00:08:04,820 --> 00:08:08,550
I hope this wasn't too confusing, but this is sometimes how it goes.

104
00:08:08,550 --> 00:08:14,460
You make mistakes, and a good method for finding these mistakes is always plotting,

105
00:08:14,460 --> 00:08:20,190
I think, because I saw that the orange function was way too low, so I noticed there had to be something

106
00:08:20,190 --> 00:08:23,160
wrong here and now everything looks good.

107
00:08:23,790 --> 00:08:30,450
You see, the orange function was the one that we have used for creating this data, and then we have

108
00:08:30,450 --> 00:08:32,100
added the noise on top of it.

109
00:08:32,850 --> 00:08:38,669
And the black function is the one that we have obtained from the fitting.

110
00:08:39,539 --> 00:08:46,200
So it could really be that, due to the random noise that we have added, the black function is actually

111
00:08:46,200 --> 00:08:52,380
better than the orange one, and we could even verify this by calculating the error.

112
00:08:52,830 --> 00:08:56,460
So this time we can use our error fit function, so that it isn't useless.

113
00:08:57,000 --> 00:08:59,670
We can just go ahead and calculate here.

114
00:09:00,660 --> 00:09:09,300
Print error fit of polynomial model, a0, data, and then of a, data.

115
00:09:10,050 --> 00:09:12,280
So the first one will be our fit.

116
00:09:12,330 --> 00:09:13,890
The second one will be a0,

117
00:09:14,250 --> 00:09:16,650
so the function we have used initially.

118
00:09:17,880 --> 00:09:21,370
And you really see, our fit has a lower error.

119
00:09:21,390 --> 00:09:28,860
So really, at least for the error that we have defined, our fit is better than the function that we

120
00:09:28,860 --> 00:09:29,790
have started with.
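
The comparison described above can be sketched as follows. Everything here is an assumption for illustration: the error is taken to be the mean squared error, the coefficients other than 15 and the noise level are made up, and a least-squares fit from numpy.polynomial stands in for the converged gradient loop.

```python
import numpy as np
from numpy.polynomial import polynomial as P

rng = np.random.default_rng(1)
a0 = np.array([15.0, 2.0, -1.0, 0.5])   # illustrative generating coefficients
x = np.linspace(-2, 2, 50)
y = P.polyval(x, a0) + rng.normal(0, 1, x.size)   # noisy data

def mse(a):
    # mean squared error of the polynomial with coefficients a
    return np.mean((P.polyval(x, a) - y) ** 2)

a_fit = P.polyfit(x, y, 3)   # least-squares stand-in for the fitted a
err_true, err_fit = mse(a0), mse(a_fit)
print(err_true, err_fit)
```

Because the least-squares coefficients minimize this error on the noisy data, err_fit cannot exceed err_true, which is the effect seen in the notebook.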

121
00:09:30,120 --> 00:09:33,630
So this is maybe a bit counterintuitive and a bit unexpected.

122
00:09:33,630 --> 00:09:38,970
So for me, it's also unexpected, but it makes sense because due to the randomly added points here,

123
00:09:38,970 --> 00:09:44,790
it can happen that the ideal function for fitting this data is different than the function that we have

124
00:09:44,790 --> 00:09:45,900
started out with.

125
00:09:47,070 --> 00:09:51,300
And so now if we compare the coefficients, they are a bit different.

126
00:09:51,660 --> 00:09:57,150
But you see, it makes sense because our fit is actually the better fit to the data.

127
00:09:57,830 --> 00:10:05,580
I think this is a very good closing remark and really the ultimate proof that our fit works very, very

128
00:10:05,580 --> 00:10:08,430
well and that this method is very useful.

129
00:10:09,120 --> 00:10:17,610
And so comparing this result with the spline methods, we have now really fitted our data with a model

130
00:10:17,610 --> 00:10:18,180
function.

131
00:10:18,510 --> 00:10:24,990
And this is really what you want to do in many physical problems where you have some mechanism that

132
00:10:24,990 --> 00:10:30,810
follows some physical law and then you measure some experimental data and then you do not always want

133
00:10:30,810 --> 00:10:34,860
to have the perfect fit, which would be a spline function.

134
00:10:35,280 --> 00:10:37,890
It would be the mathematically ideal fit.

135
00:10:38,310 --> 00:10:45,510
But you want to have the physically optimal fit, so you want to fit it to the physical model.

136
00:10:46,080 --> 00:10:50,550
And therefore, for example, you disregard these thermal fluctuation effects.

137
00:10:51,420 --> 00:10:58,230
So this is, at least for many scientific problems, the better way to go: to really think first about

138
00:10:58,230 --> 00:11:03,510
what our model function would be, and then fit this model function to your data.

139
00:11:04,620 --> 00:11:09,870
So in our case, we have used a polynomial here, and as for our model function,

140
00:11:09,870 --> 00:11:15,870
We have made it really general, so you could now go ahead and test with a higher order polynomial.

141
00:11:16,350 --> 00:11:21,600
But please be warned, this can often lead to problems, because the higher the order of the polynomial,

142
00:11:21,930 --> 00:11:27,810
the higher these terms will get and the higher the error will get, and you will soon run into some memory

143
00:11:27,810 --> 00:11:28,260
issues.

144
00:11:28,740 --> 00:11:33,300
But in general, it should work for you to also consider higher-order polynomials.

145
00:11:33,780 --> 00:11:37,970
And of course, also you could use a completely different model function.

146
00:11:37,980 --> 00:11:43,170
You could use an exponential function, a sine function, anything that you want, basically.

147
00:11:43,860 --> 00:11:46,350
And this will be the only thing that you have to update.

148
00:11:46,710 --> 00:11:51,300
Of course, you have to be a bit careful here with the parameters, the coefficients, so that you still

149
00:11:51,300 --> 00:11:53,580
call them and define them as an array.

150
00:11:53,940 --> 00:12:00,270
But if you do this, then you have to, of course, also take care a bit of the error function.

151
00:12:00,270 --> 00:12:06,750
But the general idea of defining the error function and then defining the error fit gradient

152
00:12:06,750 --> 00:12:15,300
function and then looping over and over again so that we can update the value of a, this will all remain

153
00:12:15,300 --> 00:12:19,020
the same no matter which function you use for the fit.
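
As a rough illustration of swapping in a different model function, the sketch below fits a hypothetical exponential model with the same update loop. The model exp_model, the finite-difference gradient numerical_gradient, the step width 0.1 and all data values are assumptions chosen for this example, not taken from the notebook.

```python
import numpy as np

def exp_model(a, x):
    # hypothetical replacement model: a[0] + a[1] * exp(-x)
    return a[0] + a[1] * np.exp(-x)

def error_fit(model, a, data):
    # mean squared error between model and data
    x, y = data
    return np.mean((model(a, x) - y) ** 2)

def numerical_gradient(model, a, data, eps=1e-6):
    # central finite differences, so no model-specific derivative is needed
    grad = np.zeros_like(a)
    for j in range(a.size):
        step = np.zeros_like(a)
        step[j] = eps
        grad[j] = (error_fit(model, a + step, data)
                   - error_fit(model, a - step, data)) / (2 * eps)
    return grad

rng = np.random.default_rng(2)
x = np.linspace(0, 3, 60)
data = (x, exp_model(np.array([1.0, 3.0]), x) + rng.normal(0, 0.1, x.size))

a = 2 * rng.random(2) - 1            # random start in [-1, 1)
for i in range(5000):
    a = a - 0.1 * numerical_gradient(exp_model, a, data)
print(a)
```

Only exp_model and the length of a changed relative to the polynomial case; the error function and the loop have the same shape as before.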