1
00:00:00,120 --> 00:00:05,460
So in the previous lecture, we defined our model function, which is supposed to be a polynomial.

2
00:00:05,970 --> 00:00:12,180
And then we tested that if we use the parameters that we used in the beginning, that indeed we

3
00:00:12,180 --> 00:00:13,620
get the correct function.

4
00:00:14,880 --> 00:00:19,920
So this is, of course, not the solution that we are going to use because, yeah, we are trying to

5
00:00:19,920 --> 00:00:24,810
figure out these parameters and we don't just want to use the correct solution right away.

6
00:00:25,050 --> 00:00:31,320
This was just for testing if our model function works and we see it works quite well.

7
00:00:32,310 --> 00:00:39,330
So say, we don't know these values yet and say we use here instead some other values, for example,

8
00:00:40,330 --> 00:00:44,250
a starting point could be just ones.

9
00:00:45,210 --> 00:00:53,640
So like these values, then you see the function looks totally different and there is a deviation from

10
00:00:53,640 --> 00:00:55,020
the actual data points.

11
00:00:55,830 --> 00:01:00,680
So how would we now say how good the fit is for this?

12
00:01:00,690 --> 00:01:02,460
We need an error function.

13
00:01:03,250 --> 00:01:04,830
So let me just, for the notebook,

14
00:01:04,830 --> 00:01:08,820
restore this, and let's come to the error function.

15
00:01:08,970 --> 00:01:14,100
We will now define the error, which basically also defines the quality of our fit.

16
00:01:14,670 --> 00:01:20,850
As I wrote down, there are actually many reasonable definitions of an error function, but a very common

17
00:01:20,850 --> 00:01:27,430
choice, especially in the beginning, is the sum of squared differences method.

18
00:01:27,840 --> 00:01:35,100
So basically, you sum over all your data points and you just take the difference in the y values and

19
00:01:35,100 --> 00:01:38,160
you square all of the values and then you add them up.

20
00:01:38,940 --> 00:01:41,460
And this gives you, of course, always a positive number.

21
00:01:42,060 --> 00:01:48,480
And since you square it, when there is a single point with a large difference, this will give

22
00:01:48,480 --> 00:01:49,620
you a very large error.

23
00:01:50,220 --> 00:01:56,580
So such an error function tries to basically minimize the distance between fit and data for

24
00:01:56,580 --> 00:02:03,600
all points at the same time, and mainly it tries to avoid some very large distance in one of these

25
00:02:04,320 --> 00:02:04,740
terms.

26
00:02:06,020 --> 00:02:12,440
So the function F is the fit function that is determined by the coefficients.

27
00:02:13,340 --> 00:02:20,060
And we have several coefficients a i so in our case, we have a zero, a one, a two and a three because

28
00:02:20,060 --> 00:02:21,520
we have a third order polynomial.

29
00:02:21,530 --> 00:02:22,940
So four parameters.

30
00:02:23,510 --> 00:02:27,040
But the way that we have programmed everything is very general.

31
00:02:27,080 --> 00:02:29,720
We could also use a ninth order polynomial.

32
00:02:29,720 --> 00:02:30,440
It would work.

33
00:02:31,550 --> 00:02:34,430
And the points x i and y i.

34
00:02:34,670 --> 00:02:37,040
These are the data points that we try to fit.

35
00:02:37,040 --> 00:02:42,920
And I think there we had 21 data points between minus five and plus five.
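
Written out, the definition described here would read roughly as follows. This is a sketch that assumes the sum of squared differences form and the one-to-n indexing mentioned in the lecture; the symbol E for the error is an assumption, not taken from the notebook:

```latex
E(a_0, \dots, a_3) = \sum_{i=1}^{n} \bigl( y_i - f(x_i; a_0, \dots, a_3) \bigr)^2
```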

36
00:02:44,330 --> 00:02:48,560
So now let's go ahead and just take this definition of the error and define it.

37
00:02:50,840 --> 00:02:52,580
So I will call this error fit.

38
00:02:52,610 --> 00:02:54,280
The name isn't really that important.

39
00:02:54,290 --> 00:03:02,000
You could give it another name, and the arguments will be f, the function that we have, the coefficients,

40
00:03:02,780 --> 00:03:04,970
and the data points, of course.

41
00:03:05,420 --> 00:03:15,950
So f will be the fit function or the model, then the coefficients will be the values that we try

42
00:03:15,960 --> 00:03:16,670
to optimize.

43
00:03:16,670 --> 00:03:27,800
So the a i that we try to optimize, and then we have the data, which will be the data

44
00:03:29,270 --> 00:03:32,300
that we try to fit.

45
00:03:34,190 --> 00:03:37,400
And once again, we just add things up here.

46
00:03:37,400 --> 00:03:41,000
So you see, whenever we have these sums, we can just use a loop.

47
00:03:41,600 --> 00:03:46,820
And as I said, there are other methods, but I think the loop is just fine, even though it's a bit

48
00:03:47,000 --> 00:03:53,720
clumsy, maybe, but it doesn't matter here because all of these calculations are very quick anyway.

49
00:03:53,720 --> 00:03:54,830
So it doesn't really matter.

50
00:03:54,830 --> 00:03:58,640
If we try to optimize things here, we can just go ahead.
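
One of the other methods alluded to here could be a vectorized NumPy sum instead of a loop. The following is only a sketch: the names `err_fit_vectorized` and `model`, and the straight-line example data, are hypothetical and not taken from the lecture's notebook. It also assumes that the model function accepts a whole array of x values.

```python
import numpy as np

def err_fit_vectorized(f, coeffs, data):
    # Assumes f accepts a NumPy array of x values and returns an array.
    residuals = data[1] - f(data[0], coeffs)
    return np.sum(residuals**2)

# Hypothetical polynomial model, for illustration only.
def model(x, coeffs):
    return sum(c * x**k for k, c in enumerate(coeffs))

# Made-up data: 21 points between -5 and 5 on the line y = 2x + 1.
x = np.linspace(-5, 5, 21)
data = np.array([x, 2.0 * x + 1.0])

# Error of the (deliberately wrong) guess y = 1 + x against that data.
print(err_fit_vectorized(model, [1.0, 1.0], data))
```

For a data set this small, the loop and the vectorized sum should be equally usable; the difference would only matter for much larger arrays.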

51
00:03:59,270 --> 00:04:03,050
So here is maybe a thing where you could make a mistake if you're not careful.

52
00:04:03,470 --> 00:04:11,240
So we calculate a sum here, and we have an index i that ranges from...

53
00:04:11,420 --> 00:04:15,320
And then here I wrote it down as from one to n.

54
00:04:15,740 --> 00:04:20,240
But in our case, we have programmed it to start at zero and it goes to n.

55
00:04:20,240 --> 00:04:22,330
So this, I would say, is totally fine.

56
00:04:22,340 --> 00:04:25,940
Since we don't include n, we go from zero to n minus one.

57
00:04:26,510 --> 00:04:27,710
So it's the same thing.

58
00:04:27,710 --> 00:04:34,820
Basically. And here you have to be careful, because if you write in range length of data, then

59
00:04:35,180 --> 00:04:43,340
this value would just be two, because we have x and y, so the length would then be just two.

60
00:04:43,790 --> 00:04:48,830
But if we now just look at the X coordinates, for example, then we have the actual numbers of our

61
00:04:48,830 --> 00:04:49,340
points.

62
00:04:49,880 --> 00:04:52,260
So make sure you don't forget this one here.

63
00:04:52,280 --> 00:04:59,240
Otherwise, the code will still work, but it will give you an incorrect error because it will only

64
00:04:59,240 --> 00:05:02,690
take into account the first two data points, which is, of course, not good.
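
The pitfall described here can be checked directly. This is a minimal sketch assuming the data is stored as an array with the x values in row 0 and the y values in row 1, as the lecture suggests; the y values below are placeholders, not the lecture's actual data.

```python
import numpy as np

# Hypothetical stand-in for the lecture's data: row 0 holds x, row 1 holds y.
x = np.linspace(-5, 5, 21)
y = x**3  # placeholder y values, not the lecture's actual data
data = np.array([x, y])

print(len(data))     # number of rows (x and y), so 2
print(len(data[0]))  # number of data points, so 21
```

Looping over `range(len(data))` would therefore only visit the first two points, while `range(len(data[0]))` visits all of them.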

65
00:05:04,250 --> 00:05:09,380
But anyway, since we have it correct now, we can add up the error, so it will be the old error plus

66
00:05:10,490 --> 00:05:18,110
and then we just write the square here. In the brackets, we have data one, which will be the y coordinates,

67
00:05:18,470 --> 00:05:23,300
and then the i-th point, minus f of

68
00:05:25,320 --> 00:05:33,630
data zero, i, which is the x coordinate, and then we have the coefficients, which are just the

69
00:05:33,630 --> 00:05:36,780
coefficients.

70
00:05:39,110 --> 00:05:45,920
And then we could say, just for testing purposes, we want to print the error just to see if it works,

71
00:05:46,100 --> 00:05:52,340
because this we will delete later on, and then we can return the error after the loop.
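
Put together, the function dictated above might look like the sketch below. The names `err_fit`, `model`, and `true_coeffs`, as well as the example coefficients and the noise-free data, are assumptions made for illustration; the lecture's notebook may differ in its names and details.

```python
import numpy as np

def model(x, coeffs):
    """Hypothetical polynomial model: sum of coeffs[k] * x**k."""
    result = 0.0
    for k in range(len(coeffs)):
        result += coeffs[k] * x**k
    return result

def err_fit(f, coeffs, data):
    """Sum of squared differences between data[1] (y) and f(data[0], coeffs)."""
    error = 0.0
    for i in range(len(data[0])):  # loop over the points, not over the two rows
        error += (data[1][i] - f(data[0][i], coeffs))**2
    return error

# Assumed example data: 21 points between -5 and 5 from made-up coefficients.
x = np.linspace(-5, 5, 21)
true_coeffs = [1.0, -2.0, 0.5, 0.25]  # hypothetical values
data = np.array([x, model(x, true_coeffs)])

# With the generating coefficients the error should be essentially zero.
print(err_fit(model, true_coeffs, data))
# With all coefficients set to one the error should be clearly larger.
print(err_fit(model, [1.0, 1.0, 1.0, 1.0], data))
```

Because every term added inside the loop is a square, the running error can never decrease from one iteration to the next.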

72
00:05:53,570 --> 00:06:04,240
So we run this, and now we can write error fit of our polynomial model.

73
00:06:05,600 --> 00:06:11,870
Then we use a zero for the starting point for the parameters, and then we use the data points for the

74
00:06:11,870 --> 00:06:12,320
data.

75
00:06:13,170 --> 00:06:14,000
And when I run this.

76
00:06:19,070 --> 00:06:19,850
What's going on?

77
00:06:20,240 --> 00:06:24,300
I think for some reason, this is a markdown cell, so this is just my fault.

78
00:06:24,300 --> 00:06:26,000
I have somehow messed something up.

79
00:06:26,000 --> 00:06:30,950
Let's copy it in an actual cell that is a code cell, and now it works.

80
00:06:30,950 --> 00:06:31,340
Yes.

81
00:06:31,970 --> 00:06:38,180
So you see here the error will be added up and we start from a very small value.

82
00:06:38,180 --> 00:06:44,090
And then you see it just keeps on increasing, which is clear since we are adding squared numbers here.

83
00:06:44,090 --> 00:06:48,560
So these can only be positive and the error will increase.

84
00:06:48,560 --> 00:06:53,750
And then at some point the loop will stop and it will ultimately give us the error.

85
00:06:54,860 --> 00:06:55,250
All right.

86
00:06:55,250 --> 00:06:56,880
So it seems to be reasonable.

87
00:06:56,930 --> 00:06:57,860
So let's comment this out.

88
00:06:57,900 --> 00:06:59,300
This one, we don't need it anymore.

89
00:07:00,050 --> 00:07:03,200
And so now the error just gives us a single value.

