1
00:00:00,420 --> 00:00:06,689
So especially in the previous section, you've seen that we can use fitting procedures to describe very

2
00:00:06,689 --> 00:00:13,560
complicated physical problems and to understand what is going on. To do this, we have formulated

3
00:00:13,560 --> 00:00:14,550
a model function.

4
00:00:14,850 --> 00:00:19,860
We have then formulated an error function between the data and our model function.

5
00:00:20,190 --> 00:00:25,980
And then we used the so-called gradient descent methods to minimize this error until we had the perfect

6
00:00:25,980 --> 00:00:26,280
fit.

7
00:00:27,480 --> 00:00:32,520
However, it turns out that when you have many objects, for example, in the previous section, many

8
00:00:32,520 --> 00:00:39,000
of these oscillators, then it takes an enormous amount of time to really do this gradient descent and

9
00:00:39,000 --> 00:00:41,010
to minimize the error in this way.

10
00:00:41,970 --> 00:00:48,420
So instead, a different approach would be to not use gradient descent, so not just update the parameters

11
00:00:48,420 --> 00:00:53,010
along the gradient direction, but instead to use randomness.

12
00:00:53,580 --> 00:00:57,210
So it means we randomly change one of the parameters.

13
00:00:57,630 --> 00:01:03,600
And if this change reduces the error function, then it's considered a good change and it is accepted.

14
00:01:04,200 --> 00:01:09,420
However, when the change increases the error, it's considered a bad change and then it will just

15
00:01:09,420 --> 00:01:10,410
be disregarded.
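
The accept-or-disregard rule just described can be sketched in a few lines of Python. This is only a minimal illustration, not the code used later in the course: the names `random_search` and `toy_error`, the step size, and the number of steps are all assumptions made for this example.

```python
import random

def random_search(error, params, step=0.1, n_steps=10_000, seed=0):
    """Randomly change one parameter at a time; keep only changes that reduce the error."""
    rng = random.Random(seed)
    params = list(params)
    current = error(params)
    for _ in range(n_steps):
        i = rng.randrange(len(params))         # pick one parameter at random
        old = params[i]
        params[i] = old + rng.uniform(-step, step)  # random trial change
        trial = error(params)
        if trial < current:
            current = trial                    # good change: accept it
        else:
            params[i] = old                    # bad change: disregard it
    return params, current

def toy_error(p):
    # Hypothetical error function with its minimum at (1, -2)
    return (p[0] - 1.0) ** 2 + (p[1] + 2.0) ** 2

best, err = random_search(toy_error, [0.0, 0.0])
print(best, err)
```

Note that no gradient is ever computed here; each step only needs the error before and after the trial change.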

16
00:01:11,550 --> 00:01:19,140
So it means that we use randomness to gain knowledge, and this is really the idea of Monte Carlo simulations.

17
00:01:20,040 --> 00:01:26,820
So really exploiting randomness and luck, and we will start with a very simple and mathematical example.

18
00:01:26,820 --> 00:01:33,570
We will approximate the value of pi by placing random points in a square.

19
00:01:33,570 --> 00:01:39,900
Then we count how many points are inside a certain region, which is a circle, and how many

20
00:01:39,900 --> 00:01:41,010
points are outside.
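
As a rough preview, this counting idea might look as follows in Python. The sketch assumes the circle has radius 1 and is inscribed in the square from -1 to 1, so the fraction of points inside approaches pi/4; the function name `estimate_pi` and the number of points are chosen only for illustration.

```python
import random

def estimate_pi(n_points, seed=0):
    """Estimate pi from the fraction of random points that fall inside the unit circle."""
    rng = random.Random(seed)
    inside = 0
    for _ in range(n_points):
        # Random point in the square [-1, 1] x [-1, 1]
        x = rng.uniform(-1.0, 1.0)
        y = rng.uniform(-1.0, 1.0)
        if x * x + y * y <= 1.0:   # point lies inside the circle
            inside += 1
    # Area ratio circle / square = pi / 4 (assumed geometry, see above)
    return 4.0 * inside / n_points

pi_estimate = estimate_pi(100_000)
print(pi_estimate)
```

The estimate gets better only slowly as more points are used, since the statistical error shrinks like one over the square root of the number of points.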

21
00:01:41,610 --> 00:01:48,510
And then we will turn to a physical problem: we will simulate the collective behavior of a magnet.

22
00:01:49,110 --> 00:01:52,950
So you see, a magnet consists of individual tiny magnets.

23
00:01:53,280 --> 00:01:58,320
And all of these magnets can point along all three directions in space, and they interact with each

24
00:01:58,320 --> 00:01:58,590
other.

25
00:01:59,490 --> 00:02:06,870
So it means when you try to minimize our error here, then it really takes an enormous amount of time.

26
00:02:07,590 --> 00:02:15,690
So instead, you minimize the error by just randomly reorienting individual magnets and checking if

27
00:02:15,690 --> 00:02:16,890
the error has reduced.

28
00:02:17,550 --> 00:02:20,760
And if this is the case, then you accept the change.

29
00:02:20,760 --> 00:02:23,160
And if not, then you disregard the change.

30
00:02:24,000 --> 00:02:29,580
And we will really try this for this magnet and we will see that it brings about a lot of advantages.

