1
00:00:03,210 --> 00:00:08,580
Hello, bonjour, mingalarbar, kem che,

2
00:00:09,220 --> 00:00:15,190
Welcome to the 9th tutorial of Statistical Mechanics: Algorithms and Computations

3
00:00:15,190 --> 00:00:19,190
from the Physics Department of Ecole Normale Supérieure.

4
00:00:20,260 --> 00:00:24,050
The Monte Carlo method that we have mastered

5
00:00:24,050 --> 00:00:26,410
in the last two months and a half

6
00:00:27,080 --> 00:00:29,720
in classical and quantum systems

7
00:00:29,720 --> 00:00:32,990
has far outgrown its origin.

8
00:00:34,090 --> 00:00:36,000
In this week's lecture already

9
00:00:36,000 --> 00:00:39,000
we studied time-dependent phenomena

10
00:00:39,000 --> 00:00:43,000
and faster-than-time algorithms.

11
00:00:43,000 --> 00:00:49,580
Today, Monte Carlo methods are literally everywhere in science and engineering

12
00:00:50,670 --> 00:00:55,370
and time has come to present at least one application

13
00:00:55,370 --> 00:00:57,820
in a wider context of these methods.

14
00:00:57,820 --> 00:01:04,250
The application that we have chosen is concerned with optimization methods.

15
00:01:09,010 --> 00:01:10,700
For concreteness

16
00:01:10,700 --> 00:01:14,700
we consider a problem that is familiar to all of you

17
00:01:14,700 --> 00:01:16,240
and to all of us

18
00:01:16,240 --> 00:01:18,310
you know what it is?

19
00:01:18,310 --> 00:01:22,310
Well, it is the packing problem

20
00:01:22,310 --> 00:01:26,310
that builds the logo of our course.

21
00:01:26,310 --> 00:01:30,310
What you see here is a red unit sphere

22
00:01:30,860 --> 00:01:35,070
and a bunch of blue unit spheres around it.

23
00:01:36,730 --> 00:01:40,120
All of the blue ones touch the red one

24
00:01:40,120 --> 00:01:44,120
but none of the blue ones overlap with each other.

25
00:01:46,010 --> 00:01:47,750
So the question is

26
00:01:47,750 --> 00:01:54,710
how many of the blue unit spheres we can place around the red unit sphere?

27
00:01:56,870 --> 00:02:01,270
So, let's find the first solution to this packing problem.

28
00:02:02,400 --> 00:02:07,080
What you see here are six blue spheres

29
00:02:07,080 --> 00:02:12,990
grouped around a central red sphere in its equatorial plane.

30
00:02:13,560 --> 00:02:16,990
Now we add two more layers

31
00:02:16,990 --> 00:02:22,470
one on the top and one on the bottom, in fact we put three more spheres on the top

32
00:02:22,470 --> 00:02:24,850
and three more spheres on the bottom

33
00:02:24,850 --> 00:02:28,850
all of these spheres touch each other

34
00:02:28,850 --> 00:02:37,800
and touch the red sphere, and we have 3+6+3: we have found a packing with 12 spheres

35
00:02:38,430 --> 00:02:41,800
As you already saw in the logo

36
00:02:41,800 --> 00:02:44,360
we can distort this packing a little bit

37
00:02:44,360 --> 00:02:49,700
and create space around the blue spheres

38
00:02:50,950 --> 00:02:53,910
Let us find a second solution

39
00:02:54,570 --> 00:02:58,800
now not with three layers, but with four layers.

40
00:03:00,350 --> 00:03:05,300
We start with one sphere at the bottom

41
00:03:05,300 --> 00:03:10,630
we add five spheres on the second level

42
00:03:10,630 --> 00:03:14,630
then five spheres on the third level

43
00:03:14,630 --> 00:03:18,630
and then a final sphere on top.

44
00:03:19,170 --> 00:03:25,970
Now, we have 1+5+5+1 spheres

45
00:03:25,970 --> 00:03:29,110
all of them touch the red sphere

46
00:03:29,110 --> 00:03:33,110
but in this icosahedral packing

47
00:03:33,110 --> 00:03:36,700
none of them touches each other.

48
00:03:36,700 --> 00:03:40,700
Again, we make a little space

49
00:03:40,700 --> 00:03:45,030
and the question arises whether we have enough space

50
00:03:45,030 --> 00:03:49,030
to add a 13th blue sphere

51
00:03:49,660 --> 00:03:52,050
to this ensemble of 12 spheres.

52
00:03:52,050 --> 00:03:57,150
This is the famous 13-spheres problem

53
00:03:57,150 --> 00:04:01,650
and it was first discussed by Newton and Gregory

54
00:04:01,650 --> 00:04:04,500
in 1694.

55
00:04:05,370 --> 00:04:09,740
So clearly there are different possible packings,

56
00:04:09,740 --> 00:04:15,360
and you understand that a sampling problem is just around the corner.

57
00:04:16,090 --> 00:04:18,570
So, from a wider perspective

58
00:04:18,570 --> 00:04:21,400
the subject of this tutorial

59
00:04:21,400 --> 00:04:25,400
is the relationship between the sampling approach

60
00:04:25,400 --> 00:04:28,760
and for once not the integration

61
00:04:28,760 --> 00:04:36,030
but searching, and orientation in complicated configuration spaces.

62
00:04:37,000 --> 00:04:40,030
Next, in this tutorial

63
00:04:40,030 --> 00:04:42,410
Michael..

64
00:04:42,410 --> 00:04:48,430
next in this tutorial, Michael will show us how to formulate

65
00:04:48,430 --> 00:04:52,430
the packing problem not in terms of spheres

66
00:04:52,430 --> 00:04:56,430
on the surface of a sphere, but in terms of disks

67
00:04:56,430 --> 00:04:58,410
on the surface of the sphere.

68
00:04:58,410 --> 00:05:03,780
Then we will go on to discuss strategies

69
00:05:03,780 --> 00:05:09,280
to find solutions that were once so well-hidden

70
00:05:09,280 --> 00:05:17,710
that before the advent of Monte Carlo it took years and decades to find them.

71
00:05:22,090 --> 00:05:24,730
Let us transform the problem a little bit

72
00:05:24,730 --> 00:05:27,070
and search for the maximal radius

73
00:05:27,070 --> 00:05:33,860
of 13 blue outer spheres of radius R around a central red sphere.

74
00:05:33,860 --> 00:05:37,860
If this maximal radius is larger or equal to 1

75
00:05:37,860 --> 00:05:41,860
then there is enough space for Newton's 13th sphere

76
00:05:41,860 --> 00:05:43,400
otherwise not.

77
00:05:44,210 --> 00:05:47,620
As Werner just explained, we can simplify the problem a bit

78
00:05:47,620 --> 00:05:53,360
by considering the projections of the 13 blue outer spheres

79
00:05:53,360 --> 00:05:56,280
onto disks inside the red sphere.

80
00:05:56,280 --> 00:05:59,680
Let us start by considering the simple case

81
00:05:59,680 --> 00:06:03,190
with only two blue outer spheres

82
00:06:03,190 --> 00:06:07,860
touching each other and the central red sphere, as is illustrated here.

83
00:06:08,250 --> 00:06:13,440
The centers of the two blue disks are a distance of 2R apart

84
00:06:13,440 --> 00:06:19,240
while the centers of the projection disks are a distance of 2r apart.

85
00:06:20,340 --> 00:06:29,860
You can see that 2R relates to 2r like (1+R) relates to 1

86
00:06:29,860 --> 00:06:34,300
you might remember this relation as Thales' theorem.

87
00:06:34,840 --> 00:06:43,110
It gives us a simple relation between the radii of disks touching on a sphere and spheres touching on a sphere,

88
00:06:43,110 --> 00:06:50,300
it is given by R = 1 / (1/r - 1)

89
00:06:50,300 --> 00:06:59,510
likewise we can see that the disks have to be at least a distance 2r apart, lest they overlap.

90
00:06:59,510 --> 00:07:03,510
So let us now follow our winning strategy of the last two months

91
00:07:03,860 --> 00:07:08,510
we randomly sample the 13 positions of the blue disks

92
00:07:08,510 --> 00:07:11,300
on the surface of the unit sphere.

93
00:07:11,300 --> 00:07:17,940
In a first naive approach, we uniformly sample these positions and check for overlaps.

94
00:07:17,940 --> 00:07:21,940
Of course, we are only interested in legal configurations

95
00:07:22,160 --> 00:07:26,130
that is configurations of non-overlapping disks,

96
00:07:26,130 --> 00:07:28,970
and we reject in case of an overlap.

97
00:07:28,970 --> 00:07:35,080
In Python, this direct-sampling approach can be implements as is shown here

98
00:07:35,530 --> 00:07:42,740
Of course this program works and will eventually give us legal configurations of 13 disks on a sphere.

99
00:07:42,740 --> 00:07:47,200
This works very well for small radii r

100
00:07:47,200 --> 00:07:52,520
but it suffers from a huge rejection rate for larger disks.

101
00:07:52,520 --> 00:07:58,420
Already for disk radius r=0.32

102
00:07:58,550 --> 00:08:05,690
the program will generate 10000 illegal configurations before finding a legal one

103
00:08:05,690 --> 00:08:10,760
for disks of radius r=0.34

104
00:08:10,760 --> 00:08:15,570
the program will generate hundreds of thousands of illegal configurations

105
00:08:15,570 --> 00:08:19,570
and for r=0.36

106
00:08:19,570 --> 00:08:23,980
it will generate more than 10000000 illegal configurations.

107
00:08:24,660 --> 00:08:29,060
This value, r=0.36,

108
00:08:29,060 --> 00:08:35,750
corresponds to a sphere radius R=0.5625

109
00:08:35,750 --> 00:08:42,100
and you will see in a few minutes that this is still far away from the close-packing configuration.

110
00:08:42,800 --> 00:08:47,440
The problem of finding legal configurations of disks on a sphere

111
00:08:47,440 --> 00:08:53,160
is related to the problem of finding legal configurations of hard disks in a box,

112
00:08:53,160 --> 00:08:55,950
as we considered in week 2 of this course.

113
00:08:55,950 --> 00:09:02,850
You will remember that in tutorial 2 we discussed the program direct_disks_any.py

114
00:09:02,850 --> 00:09:06,850
which allowed us to first place the points in the plane

115
00:09:06,850 --> 00:09:10,850
and only then compute the minimal distance and density.

116
00:09:11,510 --> 00:09:14,250
We can adapt this program to our needs

117
00:09:14,250 --> 00:09:18,500
by placing the points on the sphere using the Gaussian random numbers

118
00:09:18,500 --> 00:09:21,700
we learned about in week 4 of this course

119
00:09:21,700 --> 00:09:24,510
and then computing the minimal distance.

120
00:09:24,510 --> 00:09:27,820
This gives the python program shown here.

121
00:09:27,820 --> 00:09:33,090
So, is there a smarter way to generate large-radius configurations?

122
00:09:33,090 --> 00:09:34,620
Of course there is..

123
00:09:34,620 --> 00:09:38,430
Alberto will in a few moments explain a method

124
00:09:38,430 --> 00:09:42,000
to use a Markov chain algorithm for this purpose,

125
00:09:42,000 --> 00:09:47,880
but before moving on, please take a moment to download, run and modify the programs

126
00:09:47,880 --> 00:09:52,450
direct_sphere_disks.py and its alternative version

127
00:09:52,450 --> 00:09:57,280
direct_sphere_disks_any.py that we discussed in this section.

128
00:10:02,120 --> 00:10:06,980
Now, after the direct sampling let's go ahead

129
00:10:06,980 --> 00:10:09,540
with the Markov chain sampling,

130
00:10:09,540 --> 00:10:13,540
following our nine week old strategy

131
00:10:13,540 --> 00:10:16,190
for tackling difficult problems.

132
00:10:17,890 --> 00:10:24,140
In Python, this gives markov_sphere_disks.py

133
00:10:24,860 --> 00:10:30,850
where here we start from a legal configuration of points

134
00:10:30,850 --> 00:10:34,850
on the sphere, and propose little moves

135
00:10:34,850 --> 00:10:38,850
which are accepted if there is no overlap,

136
00:10:39,490 --> 00:10:42,850
so that during our Markov chain simulation

137
00:10:42,850 --> 00:10:44,730
the disks never touch.

138
00:10:45,160 --> 00:10:48,730
This suggests that we should slightly

139
00:10:48,730 --> 00:10:51,310
swell the disks

140
00:10:51,430 --> 00:10:56,970
at certain time during the simulation, by a percentage gamma

141
00:10:56,970 --> 00:11:02,690
of the maximal increase that would keep the configuration legal.

142
00:11:03,310 --> 00:11:06,690
In Python this is done in the function

143
00:11:06,690 --> 00:11:11,310
resize_disks.py, that is shown here

144
00:11:11,310 --> 00:11:16,390
where we can set the percentage gamma of the maximal increase

145
00:11:16,390 --> 00:11:17,840
that we wish to use.

146
00:11:18,410 --> 00:11:23,390
Alternating long runs of markov_sphere_disks.py

147
00:11:23,390 --> 00:11:26,510
with resize_disks.py

148
00:11:26,510 --> 00:11:31,290
allows to construct a very efficient program which is actually

149
00:11:31,290 --> 00:11:35,290
a very general optimization strategy.

150
00:11:35,800 --> 00:11:39,180
In our example of 13 spheres

151
00:11:39,180 --> 00:11:42,440
we can see that we can reach densities

152
00:11:42,440 --> 00:11:46,440
that we never dreamt of using direct sampling.

153
00:11:47,550 --> 00:11:53,160
The idea of combining long Markov chains with careful resizing

154
00:11:53,160 --> 00:11:57,870
is reminiscent of what in metallurgy we call annealing

155
00:11:58,250 --> 00:12:03,440
which means that we cool down from high temperature a metal very slowly

156
00:12:03,440 --> 00:12:08,890
in order to drive out the grain boundaries and the other imperfections

157
00:12:08,890 --> 00:12:11,500
and this makes the metal less brittle.

158
00:12:12,640 --> 00:12:20,380
On the computer, the name simulated annealing was coined by Kirkpatrick, Gelatt and Vecchi

159
00:12:20,380 --> 00:12:24,380
more than 30 years ago in a very famous paper.

160
00:12:24,380 --> 00:12:28,380
In our problem of 13 disks on a sphere

161
00:12:28,380 --> 00:12:30,400
each time step

162
00:12:30,400 --> 00:12:36,450
is composed by a resizing and 10000 iterations of the Markov chain

163
00:12:36,450 --> 00:12:40,450
After each resizing

164
00:12:40,450 --> 00:12:42,690
the step of the Markov chain

165
00:12:42,690 --> 00:12:47,620
which is set by the standard deviation sigma of the Gaussians

166
00:12:47,620 --> 00:12:50,090
is automatically adjusted

167
00:12:50,090 --> 00:12:56,800
in order to keep the acceptance probability in accordance with the 1/2 rule.

168
00:12:57,880 --> 00:13:01,460
We see that the packing density eta

169
00:13:01,460 --> 00:13:04,840
slowly increases during a run

170
00:13:04,840 --> 00:13:09,410
until the disks settle in a jammed configuration

171
00:13:09,410 --> 00:13:13,410
and the parameter sigma goes to zero.

172
00:13:13,410 --> 00:13:18,420
Naturally there are many inequivalent jammed configurations

173
00:13:18,420 --> 00:13:21,470
and some of them are non-optimal

174
00:13:21,470 --> 00:13:28,640
but they can trap the simulating annealing routine forever and this is a clear limitation of the method.

175
00:13:29,520 --> 00:13:33,560
However if you run the simulated annealing routine

176
00:13:33,560 --> 00:13:38,990
many times with different random numbers and keeping gamma small

177
00:13:38,990 --> 00:13:41,630
you will observe that the solution

178
00:13:41,630 --> 00:13:46,120
is in most of the cases the highest density solution

179
00:13:47,270 --> 00:13:51,160
this means that the optimal packing configuration

180
00:13:51,160 --> 00:13:53,690
has a large basin of  attraction.

181
00:13:53,690 --> 00:13:57,690
On our problem of 13 disks on a unit sphere

182
00:13:57,950 --> 00:14:01,690
the maximal value of R that we can obtain

183
00:14:01,690 --> 00:14:05,690
and actually you will obtain in this week's homework,

184
00:14:05,690 --> 00:14:12,950
is R=0.916467..

185
00:14:13,830 --> 00:14:17,620
with our method we cannot actually exclude

186
00:14:17,620 --> 00:14:20,080
the value R=1

187
00:14:20,610 --> 00:14:25,180
in fact this is exclude by a proof given by the mathematicians

188
00:14:25,180 --> 00:14:30,400
Schutte and van der Waerden in a famous paper of 1951.

189
00:14:30,400 --> 00:14:33,640
Before listening to Vivien,

190
00:14:33,640 --> 00:14:35,990
(welcome back Vivien)

191
00:14:35,990 --> 00:14:43,540
please download, run and modify the program simulated_annealing.py that we discussed in this section.

192
00:14:43,540 --> 00:14:48,570
Please make sure to understand how the Markov chain is implemented

193
00:14:48,570 --> 00:14:52,570
and the function resize_disks.

194
00:14:57,260 --> 00:15:03,460
Let us now generalize what Alberto has explained to us, and this will allow us to illustrate

195
00:15:03,460 --> 00:15:09,300
the successes of the simulated annealing method, but also its limitations.

196
00:15:10,670 --> 00:15:17,740
As a first generalization, let us consider the case not of 13 disks on the sphere

197
00:15:17,740 --> 00:15:22,970
but the case of any number N of spheres around the inner sphere.

198
00:15:24,490 --> 00:15:30,780
The program simulated_annealing.py can be easily modified, and you obtain the following result

199
00:15:30,780 --> 00:15:35,450
for the maximal density eta_max as a function of N.

200
00:15:36,310 --> 00:15:40,440
You observe that eta_max grows slowly with N

201
00:15:40,440 --> 00:15:45,180
but it seems that it does not go to the value

202
00:15:45,180 --> 00:15:50,780
eta_close_packing = pi / 2 sqrt(3)

203
00:15:50,780 --> 00:15:55,520
which is the maximal density for the close packing in two dimensions

204
00:15:55,520 --> 00:16:01,680
However we would expect to reach this asymptotic

205
00:16:01,680 --> 00:16:06,170
since a lot of small disks on a sphere

206
00:16:06,170 --> 00:16:10,710
would feel like living in a two-dimensional plane

207
00:16:10,710 --> 00:16:14,710
and thus adopt the hexagonal ordering.

208
00:16:16,920 --> 00:16:23,450
Yet, in fact, this method will still reach this asymptotic

209
00:16:23,450 --> 00:16:28,000
and it can really be proven that the convergence is very slow

210
00:16:28,840 --> 00:16:34,620
The ratio between the maximal density eta_max and eta_close_packing

211
00:16:34,620 --> 00:16:40,610
behaves as 1 - const / N^(1/3)

212
00:16:40,610 --> 00:16:44,610
Let us now make an analogy

213
00:16:44,610 --> 00:16:52,310
we can imagine the jammed configuration as local minima in a vast and complex landscape

214
00:16:53,660 --> 00:17:00,580
the optimal configuration will correspond to the global minimum, that we can call the ground state.

215
00:17:02,880 --> 00:17:07,630
In fact, many physical problems can be recast in this description

216
00:17:07,630 --> 00:17:13,750
and in general the method of the simulated annealing will work

217
00:17:13,750 --> 00:17:17,750
because if one takes an equilibrium system

218
00:17:17,750 --> 00:17:22,520
and send the temperature to zero, it will adopt the ground state.

219
00:17:24,600 --> 00:17:28,710
However, this argument is purely formal

220
00:17:28,710 --> 00:17:37,470
because in practice simulated annealing cannot be done at a very very small velocity

221
00:17:38,100 --> 00:17:42,890
which we would require for this method to work always.

222
00:17:42,890 --> 00:17:46,890
The nice behavior we have discussed is in fact

223
00:17:46,890 --> 00:17:51,320
limited to simple cases: disks, spheres, ..

224
00:17:52,550 --> 00:17:57,240
For instance in the case of polydisperse disks

225
00:17:57,240 --> 00:18:01,240
the method of simulated annealing

226
00:18:01,240 --> 00:18:05,240
does not find the optimal configuration, but still

227
00:18:05,240 --> 00:18:07,770
finds good configurations.

228
00:18:09,960 --> 00:18:15,480
In real life, simulated annealing must be practiced with care

229
00:18:15,480 --> 00:18:20,300
especially in cases where there are imperfections and disorder

230
00:18:20,300 --> 00:18:27,530
because then there are a lot of metastable states and the system may be stuck in them.

231
00:18:27,530 --> 00:18:31,530
If the imperfections are too pronounced

232
00:18:31,530 --> 00:18:37,400
the result of the algorithm is very often a glass rather than a crystalline matter.

233
00:18:39,300 --> 00:18:50,760
For the case of the polydisperse disks on a sphere, that is to say disks with varied radius, this is what happens.

234
00:18:52,010 --> 00:18:57,880
The disorder prevents the system from reaching its optimal configuration

235
00:18:58,450 --> 00:19:04,020
and leads the system into one of the jammed configurations

236
00:19:04,020 --> 00:19:11,220
whose basins of attraction is much larger than in the case of monodisperse disks

237
00:19:11,220 --> 00:19:15,220
Before continuing, let me grab the object of the week

238
00:19:15,220 --> 00:19:19,650
the flip die, and let's flip faster than the clock.

239
00:19:23,010 --> 00:19:27,230
So, in conclusion, we have studied in this Tutorial 9

240
00:19:27,230 --> 00:19:32,820
one of the many connection between statistical mechanics and the rest of life.

241
00:19:33,830 --> 00:19:40,480
You will continue on the subject of packing in this week's homework

242
00:19:40,480 --> 00:19:45,840
but don't forget that there is a connection between sampling

243
00:19:45,840 --> 00:19:50,650
not only with integration as we have studied throughout this course

244
00:19:50,650 --> 00:19:55,910
but also with searching and orientation in complicated spaces

245
00:19:57,310 --> 00:20:03,510
So finally good luck and have a lot of fun with

246
00:20:03,510 --> 00:20:07,510
Homework session 9, the final homework session of this course.

247
00:20:07,510 --> 00:20:17,800
Next week, we will have a lecture on the alpha and the omega of Monte Carlo

248
00:20:17,800 --> 00:20:26,160
and the final tutorial will be devoted to a review and to a big party.

249
00:20:27,600 --> 00:20:35,960
And thanks again for your interest, and for the very active part that you're taking in our course.