1
00:00:03,470 --> 00:00:09,120
Hello, bonjour, gruss gott, laba diena,

2
00:00:09,840 --> 00:00:15,310
Welcome to the the 9th week of Statistical Mechanics: Algorithms and Computations

3
00:00:15,310 --> 00:00:19,310
from the Physics Department of Ecole Normale Supérieure.

4
00:00:19,880 --> 00:00:26,480
During our course, we have mostly concentrated on equilibrium statistical mechanics

5
00:00:27,740 --> 00:00:32,600
Our algorithms allowed us to compute partition functions,

6
00:00:32,600 --> 00:00:36,600
energies and condensed fractions

7
00:00:36,600 --> 00:00:41,570
but they were rarely concerned with time-dependent phenomena.

8
00:00:43,910 --> 00:00:47,480
Physical time played a minor role

9
00:00:47,480 --> 00:00:51,780
except in the molecular dynamics calculations for the hard disks.

10
00:00:53,360 --> 00:00:57,570
Monte Carlo time really was of secondary importance

11
00:00:58,150 --> 00:01:04,300
for example the Wolff cluster algorithm that we discussed last week

12
00:01:04,300 --> 00:01:08,850
had a dynamics that had nothing in common

13
00:01:08,850 --> 00:01:12,850
with the physical dynamics of the Ising model.

14
00:01:12,850 --> 00:01:18,090
In this lecture, we explore time-dependent phenomena

15
00:01:20,860 --> 00:01:26,820
and strangely enough we are right back with the local Monte Carlo algorithm,

16
00:01:27,860 --> 00:01:35,850
as it provides often an excellent model for a time evolution in a physical system.

17
00:01:36,880 --> 00:01:44,310
So, what we are looking for is an efficient implementation of markov_ising.py

18
00:01:45,780 --> 00:01:51,840
in dynamical Monte Carlo, we are stuck with a given dynamical rule

19
00:01:52,620 --> 00:01:58,290
and we want to implement this rule as efficiently as possible.

20
00:01:59,590 --> 00:02:04,300
In this lecture we will treat markov_ising.py

21
00:02:05,740 --> 00:02:10,790
and study the faster-than-the-clock implementation

22
00:02:10,790 --> 00:02:15,130
and also its main limitation, that's called the futility.

23
00:02:15,910 --> 00:02:18,200
In this week's tutorial

24
00:02:18,200 --> 00:02:23,860
we plunge into the great and very general method called simulated annealing.

25
00:02:25,330 --> 00:02:32,540
Simulated annealing is a tool for solving difficult optimization problems

26
00:02:32,540 --> 00:02:36,540
with or without a relation with physics.

27
00:02:38,620 --> 00:02:41,800
It consists in reformulating

28
00:02:41,800 --> 00:02:44,290
the optimization problem

29
00:02:44,290 --> 00:02:48,290
in terms of an artificial physical system

30
00:02:48,290 --> 00:02:52,990
whose ground-state at zero temperature or at infinite pressure

31
00:02:52,990 --> 00:02:56,990
contains the solution to the original task.

32
00:02:58,700 --> 00:03:05,570
This state is then approached through a slow decrease in temperature

33
00:03:05,570 --> 00:03:09,570
or a slow increase in pressure.

34
00:03:11,190 --> 00:03:14,010
We will apply simulated annealing

35
00:03:14,010 --> 00:03:21,230
to the famous 13-spheres problem, that already Isaac Newton worked on.

36
00:03:23,360 --> 00:03:28,370
Simulated annealing gives a great solution.

37
00:03:28,370 --> 00:03:31,260
In this week's homework session

38
00:03:31,260 --> 00:03:34,530
- the final homework session of this course -

39
00:03:34,530 --> 00:03:38,900
we will also be concerned with simulated annealing

40
00:03:39,700 --> 00:03:46,230
in the 13-spheres problem and in the equally famous traveling salesman problem.

41
00:03:47,210 --> 00:03:50,230
So, let's get started

42
00:03:50,230 --> 00:03:54,850
with week 9 of Statistical Mechanics: Algorithms and Computations.

43
00:03:59,440 --> 00:04:04,950
To begin with, let us consider a magnetic field as shown here

44
00:04:04,950 --> 00:04:08,950
and a single Ising spin

45
00:04:08,950 --> 00:04:11,570
whose dynamics we want to study.

46
00:04:12,840 --> 00:04:16,650
We set the temperature equal to 1

47
00:04:16,650 --> 00:04:19,850
and the energy of this spin

48
00:04:19,850 --> 00:04:23,850
is equal to -h if it is up

49
00:04:23,850 --> 00:04:27,850
and is equal to +h if it is down.

50
00:04:28,360 --> 00:04:31,570
In short, the energy of sigma

51
00:04:31,570 --> 00:04:36,680
is -h*sigma, where sigma can be plus or minus one.

52
00:04:38,260 --> 00:04:45,270
To simulate the dynamics, we use the Metropolis algorithm

53
00:04:45,990 --> 00:04:50,080
min(1, pi_new / p_old)

54
00:04:51,680 --> 00:04:58,820
So the probability to flip the spin from minus to plus

55
00:04:58,820 --> 00:05:02,820
is equal to one, because the statistical weight

56
00:05:02,820 --> 00:05:08,230
of the new configuration is larger than the statistical weight of the old configuration.

57
00:05:10,030 --> 00:05:15,640
And the probability to flip from +1 to -1

58
00:05:15,640 --> 00:05:23,730
is equal to min(1, pi_new / pi_old)

59
00:05:24,100 --> 00:05:29,470
= exp(-beta h) / exp(beta h)

60
00:05:29,470 --> 00:05:33,470
= exp(- 2 beta h)

61
00:05:33,470 --> 00:05:37,470
In Python, this gives the following program

62
00:05:37,470 --> 00:05:41,470
naive_spin_dynamics.py, where we have set

63
00:05:41,470 --> 00:05:43,270
beta=1

64
00:05:43,990 --> 00:05:48,400
Now, notice that if at time t

65
00:05:48,400 --> 00:05:50,390
the spin is down

66
00:05:50,390 --> 00:05:56,170
it will certainly be up at time t+1

67
00:05:57,060 --> 00:06:02,460
This is the essence of Metropolis dynamics

68
00:06:02,460 --> 00:06:06,970
and we don't discuss here whether this is realistic.

69
00:06:08,930 --> 00:06:13,580
In any case, this algorithm satisfies detailed balance

70
00:06:13,580 --> 00:06:18,730
and in the long run - at long times -

71
00:06:19,420 --> 00:06:23,200
the two configurations up and down

72
00:06:23,200 --> 00:06:27,200
will be visited with their corresponding

73
00:06:27,200 --> 00:06:39,740
statistical weights: pi_up and pi_down.

74
00:06:39,740 --> 00:06:43,740
So now, let's look at the movie version

75
00:06:43,740 --> 00:06:46,630
of naive_spin_dynamics.py.

76
00:06:46,630 --> 00:06:52,720
Here is output of this program at high temperature:

77
00:06:52,720 --> 00:06:56,720
there are a lot of flips.

78
00:06:58,260 --> 00:07:02,650
And here is output at lower temperature

79
00:07:03,820 --> 00:07:09,480
you see: a long time nothing is happening and the spin is in the up position,

80
00:07:10,750 --> 00:07:15,860
then there is a flip, immediately followed by a back-flip,

81
00:07:17,660 --> 00:07:20,310
and then followed by a long desert

82
00:07:21,280 --> 00:07:22,590
of the spin being up

83
00:07:23,510 --> 00:07:29,200
and here is the dynamics of our system at very low temperature.

84
00:07:30,540 --> 00:07:36,950
Nothing going on, and just an occasional flip

85
00:07:38,630 --> 00:07:43,830
and our computer heating up without any result.

86
00:07:44,630 --> 00:07:51,450
Now remember: the probability to flip from down to up is equal to one

87
00:07:52,650 --> 00:08:00,880
and the probability from up to down is equal to exp(-2h)

88
00:08:02,230 --> 00:08:07,700
so now, to make progress, we choose a particular value of h.

89
00:08:08,690 --> 00:08:12,430
In fact, we choose log(6)/2

90
00:08:13,500 --> 00:08:18,180
so now, the probability to flip from up to down

91
00:08:18,180 --> 00:08:23,300
is exp(-2 log(6) / 2)

92
00:08:23,820 --> 00:08:28,270
=exp(-log(6))

93
00:08:28,270 --> 00:08:32,270
=1/6

94
00:08:32,270 --> 00:08:35,650
So look at this flip-die

95
00:08:35,650 --> 00:08:39,650
it has 5 neutral faces

96
00:08:39,650 --> 00:08:43,320
and one face that says "flip"

97
00:08:44,330 --> 00:08:48,310
so, simulating the Metropolis algorithm

98
00:08:48,310 --> 00:08:52,750
at magnetic field log(6)/2

99
00:08:52,750 --> 00:08:55,900
is like rolling this die

100
00:08:55,900 --> 00:08:59,900
so, let's roll it..

101
00:09:02,130 --> 00:09:07,500
First trial: no result, the spin does not flip

102
00:09:07,500 --> 00:09:10,160
Second trial

103
00:09:11,240 --> 00:09:13,180
Third time

104
00:09:13,180 --> 00:09:18,640
we are getting quite tired.. no result..

105
00:09:18,640 --> 00:09:21,460
Fifth attempt

106
00:09:22,300 --> 00:09:25,460
Sixth.. no..

107
00:09:26,220 --> 00:09:31,200
Seventh.. so now we have to flip the spin

108
00:09:31,570 --> 00:09:36,600
In Python this gives the following program: naive_throw.py

109
00:09:37,320 --> 00:09:40,300
and here is output of this program

110
00:09:41,890 --> 00:09:45,250
so now, let us program this

111
00:09:45,250 --> 00:09:49,810
problem a little more intelligently than we have done before.

112
00:09:51,250 --> 00:09:58,910
So, the probability of throwing a blank face of our beautiful flip-die

113
00:09:58,910 --> 00:10:02,910
is equal to 5/6

114
00:10:03,350 --> 00:10:10,720
The probability of throwing twice a blank face is equal to (5/6)^2

115
00:10:12,380 --> 00:10:28,540
and the probability of throwing k times a blank face followed by once the flip face is given by (5/6)^k times 1/6

116
00:10:29,750 --> 00:10:33,030
or (1 - 5/6)

117
00:10:33,030 --> 00:10:41,440
So this is equal to (5/6)^k - (5/6)^(k+1)

118
00:10:42,690 --> 00:10:49,250
so this can be made into a tower sampling problem between 0 and 1

119
00:10:49,690 --> 00:10:52,590
with horizontal lines at 5/6

120
00:10:52,590 --> 00:10:56,590
(5/6)^2, (5/6)^3 ...

121
00:10:57,870 --> 00:11:01,110
we throw a pebble into this tower

122
00:11:01,110 --> 00:11:05,110
and if it falls for example in the upper box

123
00:11:05,110 --> 00:11:08,490
we have a new flip right away

124
00:11:08,490 --> 00:11:12,490
if it falls into the second box, we have to wait once

125
00:11:13,700 --> 00:11:17,700
if it falls into the third box, we have to wait twice

126
00:11:19,610 --> 00:11:23,230
to find out the number of times we have to wait

127
00:11:24,350 --> 00:11:31,170
we have to compute the box, which means (5/6)^(k+1)

128
00:11:31,170 --> 00:11:36,940
smaller than the random number smaller than (5/6)^k

129
00:11:39,340 --> 00:11:43,970
Taking logarithms, we find that k

130
00:11:43,970 --> 00:11:50,940
is the integer part of the logarithm of the random number divided by the logarithm of 5/6.

131
00:11:52,130 --> 00:11:57,150
And this means - in our naive_throw program -

132
00:11:58,310 --> 00:12:02,240
that the time after which we have a new flip

133
00:12:02,240 --> 00:12:07,420
is given by t_flip=..

134
00:12:16,220 --> 00:12:23,360
So, in Python this gives the program fast_throw.py

135
00:12:25,280 --> 00:12:31,410
its output is indistinguishable from the naive program

136
00:12:33,370 --> 00:12:40,450
but it advances one iteration per accepted flip

137
00:12:41,210 --> 00:12:49,520
and not per time, it is for this reason that it is called a faster-than-the-clock algorithm.

138
00:12:49,520 --> 00:12:53,520
Of course, a very simple one.

139
00:12:53,520 --> 00:13:00,180
We may now write the fast_spin_dynamics program, as shown here

140
00:13:01,230 --> 00:13:04,180
it will remain exactly one step

141
00:13:04,180 --> 00:13:08,180
in the down-spin configuration with spin equal to -1

142
00:13:08,180 --> 00:13:13,020
and an average of 6 steps in the up-spin configuration

143
00:13:14,190 --> 00:13:17,020
so that the magnetization

144
00:13:18,040 --> 00:13:24,220
on average, will be 5/7

145
00:13:25,060 --> 00:13:30,100
We can obtain the same result also by a pedestrian approach

146
00:13:31,930 --> 00:13:34,460
So now, before continuing

147
00:13:34,890 --> 00:13:41,550
please take a moment to download, run and modify the programs we just discussed

148
00:13:43,120 --> 00:13:49,380
So we had the die-throwing program naive_throw.py

149
00:13:49,380 --> 00:13:56,400
and the fast_throw.py, that implemented the faster-than-the-clock idea

150
00:13:57,480 --> 00:14:02,880
The output of the two programs is completely indistinguishable

151
00:14:05,040 --> 00:14:15,430
and then we had the related programs naive_spin_dynamics.py and fast_spin_dynamics.py

152
00:14:15,430 --> 00:14:21,800
that implemented the dynamics of a single spin in the magnetic field.

153
00:14:26,770 --> 00:14:31,210
From a single spin in a field, we now pass on

154
00:14:31,210 --> 00:14:36,090
to a full-fledged simulation of the Ising model on N sites

155
00:14:36,090 --> 00:14:40,090
in the faster-than-the-clock flavor

156
00:14:40,090 --> 00:14:44,240
We consider a spin sigma, all the spins

157
00:14:45,140 --> 00:14:47,190
that depends on time

158
00:14:47,190 --> 00:14:51,190
and we denote by sigma_k

159
00:14:51,190 --> 00:14:55,190
the configuration that comes out of sigma

160
00:14:55,190 --> 00:14:59,190
by flipping the spin number k

161
00:14:59,780 --> 00:15:03,450
The energy change that is associated

162
00:15:03,450 --> 00:15:13,200
with this spin flip is delta_E = E[sigma_k] - E[sigma]

163
00:15:14,630 --> 00:15:17,200
In the Metropolis algorithm

164
00:15:17,200 --> 00:15:21,330
the probability to go from sigma to sigma_k

165
00:15:22,370 --> 00:15:29,500
is given by (1/N) min(1, exp(-beta delta_E))

166
00:15:31,120 --> 00:15:39,700
the 1/N stands for the probability of sampling the spin number k

167
00:15:41,000 --> 00:15:47,150
In the Ising model and its variances (for example spin glasses)

168
00:15:47,150 --> 00:15:52,940
at low temperature, most of the spin flips are rejected.

169
00:15:53,770 --> 00:16:01,830
So, let's roll back our sleeves and implement the faster-than-the-clock approach for the Ising model.

170
00:16:02,530 --> 00:16:05,830
Let's avoid all rejections.

171
00:16:07,470 --> 00:16:09,900
But there is a problem.

172
00:16:11,230 --> 00:16:17,910
In the die-rolling algorithm fast_throw.py

173
00:16:18,910 --> 00:16:24,320
we notice that the quantity that was important was not 1/6

174
00:16:24,320 --> 00:16:27,250
the probability to have a flip

175
00:16:27,250 --> 00:16:33,720
but it was rather 5/6, the probability of doing nothing

176
00:16:35,710 --> 00:16:39,140
Now, in our N-sites Ising model

177
00:16:39,910 --> 00:16:44,020
we also have to compute the probability of doing nothing

178
00:16:45,320 --> 00:16:52,880
but as we have N choices for doing something, computing this probability takes a bit of time.

179
00:16:54,390 --> 00:16:58,850
So we can compute now lambda, the probability of doing nothing

180
00:16:58,850 --> 00:17:02,850
that was 5/6 before, is given by 1 minus

181
00:17:02,850 --> 00:17:07,940
the sum over all the probabilities we just wrote down a few seconds ago

182
00:17:09,630 --> 00:17:13,210
and we can implement the faster-than-the-clock approach

183
00:17:13,970 --> 00:17:20,970
through the tower sampling from lambda, lambda^2, lambda^3, ...

184
00:17:24,230 --> 00:17:28,600
In Python, this is implemented in this program

185
00:17:28,600 --> 00:17:32,600
dynamic_ising.py

186
00:17:33,770 --> 00:17:37,900
and you see that there are two tower samplings

187
00:17:39,190 --> 00:17:45,250
the first tower sampling is in lambda, it is from 0 to 1

188
00:17:46,200 --> 00:17:51,850
and it computes after which time we accept a new spin flip

189
00:17:54,620 --> 00:17:58,060
then we have to decide which spin to flip

190
00:17:59,530 --> 00:18:02,670
and this again is a discrete probability

191
00:18:02,670 --> 00:18:08,680
and it is done in a second tower sampling from lambda to 1

192
00:18:09,640 --> 00:18:17,170
and it is there that we compute k, the value of the spin that is going to flip.

193
00:18:18,600 --> 00:18:22,910
Output of dynamic_ising.py

194
00:18:22,910 --> 00:18:26,910
is completely indistinguishable

195
00:18:26,910 --> 00:18:30,910
from markov_ising.py

196
00:18:30,910 --> 00:18:34,910
and the only problem that we have

197
00:18:34,910 --> 00:18:38,910
is that doing one iteration

198
00:18:38,910 --> 00:18:44,750
of this algorithm takes of the order of N operations

199
00:18:46,310 --> 00:18:52,680
besides that, there is no rejection in dynamic_ising.py

200
00:18:53,620 --> 00:18:56,940
The complete recalculation of lambda

201
00:18:56,940 --> 00:19:01,840
- the probability of doing nothing - can be avoided

202
00:19:02,450 --> 00:19:09,530
because flipping a spin somewhere in the system does not change the local environment elsewhere.

203
00:19:11,900 --> 00:19:15,760
So the number of environments of a spin

204
00:19:15,760 --> 00:19:23,520
with respect to its neighbors fall into a finite number of classes:

205
00:19:23,520 --> 00:19:29,620
a number of up spins and down spins that are neighbors of a given spin.

206
00:19:31,290 --> 00:19:36,650
And flipping a spin somewhere in the system just changes

207
00:19:36,650 --> 00:19:40,650
the numbers of members of each class.

208
00:19:42,480 --> 00:19:49,190
This was first implemented by Bortz, Kalos, and Lebowitz in 1975

209
00:19:50,120 --> 00:19:53,460
They called their algorithm

210
00:19:53,460 --> 00:20:00,490
the n-fold way, where n stands for the number of classes.

211
00:20:02,390 --> 00:20:07,020
Let's take as an example the Ising model in two dimensions

212
00:20:07,020 --> 00:20:09,250
with periodic boundary conditions

213
00:20:11,080 --> 00:20:16,220
As you see here, the number of classes is 10

214
00:20:17,040 --> 00:20:22,040
from a up spin with four neighboring up spins

215
00:20:22,040 --> 00:20:26,990
to a down spin with four neighboring down spin

216
00:20:27,320 --> 00:20:30,990
The tower of N probabilities

217
00:20:30,990 --> 00:20:37,210
can thus be reduced to a tower of 10 classes of spin flips

218
00:20:37,210 --> 00:20:42,040
if only you know the number of spins per class

219
00:20:43,300 --> 00:20:49,190
So, in this example the central spin that we want to flip

220
00:20:49,830 --> 00:20:55,900
has just one plus neighbor, so it lives in class 3

221
00:20:56,950 --> 00:21:00,970
flipping it brings it into class 8

222
00:21:00,970 --> 00:21:07,060
and makes change also the classes of the neighboring spins

223
00:21:07,630 --> 00:21:15,600
this very elementary book-keeping is written up in dynamic_ising_patch.py

224
00:21:17,710 --> 00:21:21,630
With respect to the original version (dynamic_ising.py)

225
00:21:21,630 --> 00:21:27,350
where each spin flip involved of the order of N operations

226
00:21:27,770 --> 00:21:35,050
now we can do one spin flip even for a large system in of the order of one operation

227
00:21:35,820 --> 00:21:39,050
At small temperature, this algorithm

228
00:21:39,050 --> 00:21:43,050
dynamic_ising_patch.py is much faster

229
00:21:43,050 --> 00:21:46,620
than the original markov_ising.py

230
00:21:47,310 --> 00:21:52,130
but remember that it implements exactly the same dynamics

231
00:21:52,130 --> 00:21:57,130
moving from accepted spin flip to the next accepted spin flip.

232
00:21:58,640 --> 00:22:02,820
So, before continuing, please take a moment

233
00:22:02,820 --> 00:22:07,940
to download, run  and modify the two programs we discussed in this section

234
00:22:08,720 --> 00:22:11,940
For once there was dynamic_ising.py

235
00:22:11,940 --> 00:22:16,560
that worked with a tower of N probabilities

236
00:22:17,750 --> 00:22:22,440
and the nice n-fold way algorithm

237
00:22:22,440 --> 00:22:25,250
first implemented by Kalos

238
00:22:25,770 --> 00:22:29,550
Bortz and Lebowitz in 1975

239
00:22:29,550 --> 00:22:33,240
that uses the n-fold way and goes

240
00:22:33,240 --> 00:22:39,600
from the tower of N probabilities to a tower of 10 probabilities.

241
00:22:40,400 --> 00:22:46,530
So the complexity of this algorithm is O(1) per accepted spin flip.

242
00:22:51,200 --> 00:22:56,500
So in conclusion, we have given in this lecture an introduction

243
00:22:56,500 --> 00:23:00,500
to faster-than-the-clock algorithms, that implement

244
00:23:00,500 --> 00:23:04,500
efficiently the dynamics of our system.

245
00:23:05,080 --> 00:23:10,990
This algorithm spend for per accepted move

246
00:23:10,990 --> 00:23:13,950
and no longer per attempted move.

247
00:23:14,620 --> 00:23:21,690
But, going from the word "attempted" to the word "accepted" move involves book-keeping.

248
00:23:22,480 --> 00:23:27,700
This book-keeping makes also that in many applications

249
00:23:27,700 --> 00:23:31,700
they are not as good as they seem to be

250
00:23:31,700 --> 00:23:36,030
because to go from one configuration a to a configuration b

251
00:23:36,030 --> 00:23:40,320
you may have to do some book-keeping as we showed in the Ising model

252
00:23:40,910 --> 00:23:46,740
and then on the next move you will simply go back from b to a.

253
00:23:46,740 --> 00:23:50,740
This is called the futility problem.

254
00:23:51,470 --> 00:23:54,110
To overcome this futility problem

255
00:23:54,110 --> 00:23:57,510
many people have implemented

256
00:23:57,830 --> 00:24:04,770
quite elaborate algorithms that even increase the expense per accepted move

257
00:24:06,080 --> 00:24:08,770
In this week's tutorial

258
00:24:08,770 --> 00:24:11,770
we will take up the study of dynamics algorithms

259
00:24:11,770 --> 00:24:15,770
from another point of view, the one of simulated annealing.

260
00:24:16,410 --> 00:24:22,020
Simulated annealing is a great optimization algorithm.

261
00:24:23,100 --> 00:24:29,290
It also shows that the sampling approach is not only related to integration

262
00:24:29,290 --> 00:24:32,290
as we have said for many weeks now

263
00:24:32,290 --> 00:24:39,350
but it can also be used for searching and for orientation in complicated spaces.

264
00:24:40,770 --> 00:24:46,650
So have fun with this week's tutorial and homework sessions

265
00:24:47,190 --> 00:24:49,300
and see you again next week

266
00:24:49,300 --> 00:24:54,610
in the final week of Statistical Mechanics: Algorithms and Computations

267
00:24:54,610 --> 00:25:00,340
for a discussion of the alpha and the omega of Monte Carlo

268
00:25:01,060 --> 00:25:05,500
followed by a resume and by a big party..