Hello, bonjour, konnichiwa, god dag, welcome to the first tutorial of Statistical Mechanics: Algorithms and Computations, from the Physics Department of Ecole Normale Superieure. Our subject today will be the convergence of Markov chain Monte Carlo algorithms, and our main tool will be this stone, this pebble, moving around the 3x3 pebble game. It will allow us to derive and illustrate a profound mathematical result that is of great practical importance. We will show that the convergence of Markov chain Monte Carlo algorithms is exponential. This means that a Markov chain Monte Carlo algorithm is characterized by a time scale, the convergence time tau. It is on this scale that we approach equilibrium and that the pebble positions on the 3x3 pebble game become equally probable. After a few times tau, a Markov chain Monte Carlo algorithm is in equilibrium for all intents and purposes, in the same way that, after a few nuclear half-lives, a radioactive compound has completely decayed. Let us now return to the 3x3 pebble game and understand its exponential convergence. On the 3x3 grid, we label the sites from 0 in the lower left corner to 8 in the upper right corner. The Markov chain Monte Carlo algorithm, as discussed in Lecture 1, consists of moving the pebble with probability 1/4 to the right, to the left, up, or down. When a move is not possible, it is rejected and the pebble remains in the same position for the next iteration. In Python, this gives the following program. The list 'neighbor' contains nine lists with four sites each, corresponding to the moves to the right, up, to the left, and down. So here is the configuration at time 0, with the pebble on site 8. The function randint(0, 3) samples the random numbers 0, 1, 2 and 3 with equal probability. If we choose 0, we'll try to move to the right, but this move is rejected. If we have randint equal to 1, we'll move up, but again this move will be rejected. 
If randint is equal to 2, we'll move to the left; and if it is equal to 3, we'll move down. Before running this algorithm, let us ask a few questions. First question: if at time t=0 the pebble is in the upper right corner, on site 8, on which sites can it be at time t=1? Of course, it can be on the sites 5, 7 and 8. In fact, the pebble will remain on site 8 with probability 1/2, and it will move to sites 5 and 7 with probability 1/4 each. Question 2: at what time does the probability to be at site 0 become finite? The answer is that it becomes finite at time t=4. It is zero at t=0, 1, 2 and 3, because it takes at least 4 steps to connect site 8 with site 0. The probability to be at site 0 is non-zero for times t=4 and larger. Now, the final question in this sequence. Question 3: what is the probability to be on site 0 at time t=4, given that the pebble started at site 8 at t=0? Attention, the probability is a fraction, and it is not easy to find. The answer is that this probability is equal to 6/256, or 3/128. The denominator (256) is equal to 4 to the power of 4, because there are four moves and in each move we have four possibilities. The numerator (6) corresponds to the 6 ways to connect site 8 to site 0 in four steps. From our stone-age 3x3 pebble game we now move on to a program capable of doing graphics. The red pebble that you see here starts on site 8, then moves on to neighboring sites, hops back, goes on, touches site 0, and so on and so on, until - after a long time - all the positions become equally likely. Next in this tutorial, Michael will discuss the simulation without watching pebbles hopping around the 3x3 pebble game. But before moving on, please take a moment to download, run and modify the programs we just discussed. On the Coursera website, you will find the program pebble_basic.py that we just discussed. You will also see pebble_basic_movie.py, which produces the nice graphics output. Now let us modify a bit the simulation program for the 3x3 pebble game. 
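The basic pebble-game program discussed above can be sketched as follows (the neighbor table and the names site and t_max follow the discussion; the exact pebble_basic.py on the website may differ in details):

```python
import random

# Neighbor table for the 3x3 pebble game (site 0 lower left, site 8 upper
# right): for each site, the sites reached by moving right, up, left, down.
# A move that would leave the grid maps the site onto itself (rejection).
neighbor = [[1, 3, 0, 0], [2, 4, 0, 1], [2, 5, 1, 2],
            [4, 6, 3, 0], [5, 7, 3, 1], [5, 8, 4, 2],
            [7, 6, 6, 3], [8, 7, 6, 4], [8, 8, 7, 5]]
site = 8                  # configuration at time 0: pebble on site 8
t_max = 10
for t in range(t_max):
    # randint(0, 3) picks right, up, left or down with probability 1/4
    site = neighbor[site][random.randint(0, 3)]
print(site)
```

A rejected move simply reads off the pebble's own site from the table, so the pebble stays put for that iteration, exactly as described above.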
Rather than having one long run, we will now consider many runs, and each of the runs will start at site 8, in the upper right corner of the grid. Now this program contains an additional loop over the runs, and we have a total of n_runs runs in the program. The program finally outputs the final position of the pebble at time t_max. Output for t_max=4 is shown here. Let us now represent the output using two-dimensional histograms, first for the case t_max=0. For this time, all the runs are still on site 8, and there are no pebbles on other sites. As all the pebbles are on site 8, their proportion on site 8 is equal to one, here represented in white. All the other sites are shown in black, as they do not appear in the output of the program. At t_max=1, that is, after one iteration, the output of our program looks as follows, and the corresponding histogram is shown here. At this time, about half of the pebbles are still on site 8, while about a quarter of the pebbles are on site 5 and on site 7 each. The other sites are still black, because they cannot be reached within one move. At t_max=2, we have the following histogram. We see that now some pebbles also reach the sites 2, 4 and 6. For larger times we see how the pebble diffuses over the entire 3x3 pebble game, until the color becomes more and more uniform. Please take a moment to make sure that you understand the color code in this figure, and test your understanding in the following little quiz. We consider the program with n_runs=100. As indicated by the color bar on the right of the diagram, the color of site 0 represents the number 0.09, which is the output of the program. Question 1: what is the meaning of this number 0.09? The correct answer is answer 2. The histogram simply analyzes the output of a simulation with n_runs=100. Question 2: on the starting site 8, the probability falls from 1 at t=0 to one half at t=1, and is always larger than the probability at any other site. Is this always true? 
Even if we choose another starting site at t=0? It is false, as can be seen from the case of the pebble game starting at site 4, the center. The probability to be found at the starting point at t=1 is zero in this case. Question 3: would our results change if, instead of playing 100000 consecutive pebble games, we would let 100000 pebbles move across the grid at the same time? The result would not change at all. The probabilities represented by the color code in the figure can be correctly interpreted as the normalized density of an ensemble of a large number of pebbles moving across the sites, without interactions between them. Next in this Tutorial 1, Alberto will discuss the simulation using a compact analytic method, the transfer matrix. It will give us the exact values of the probabilities that we just discussed. Before moving on, take your time to download, run and modify the programs. On the Coursera website, you will find the program pebble_multirun.py and also the program that allows you to generate the histogram at time t_max. There is also a program which generates all the histograms at times t=0, t=1 and so on, in one run. As indicated by Michael, we now discuss the transfer matrix method, which allows us to solve exactly the Monte Carlo dynamics of the 3x3 pebble game. For larger systems, this method does not apply, but its lesson remains valid. If we start at the right upper corner at t=0, we know that the probability to find the pebble at site 8 is one. At time t=1, we know that with probability 1/4 we will be at the site 5 or 7, and with probability 1/2 at the site 8. To compute the exact probabilities at all time steps, we need to introduce a vector pi_t, which contains the probabilities to find the pebble at time t at the sites 0, 1, 2 and so on. As explained by Werner in Lecture 1, the Monte Carlo algorithm is nothing but a matrix, the transfer matrix of transition probabilities p(a->b) for moving from the site a to the site b. 
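The multirun experiment that Michael described above can be sketched as follows (a sketch with our own variable names, in the spirit of pebble_multirun.py): many independent games start at site 8, and the final positions at time t_max are histogrammed.

```python
import random

neighbor = [[1, 3, 0, 0], [2, 4, 0, 1], [2, 5, 1, 2],
            [4, 6, 3, 0], [5, 7, 3, 1], [5, 8, 4, 2],
            [7, 6, 6, 3], [8, 7, 6, 4], [8, 8, 7, 5]]
t_max = 4
n_runs = 10000
count = [0] * 9                         # histogram of final positions
for run in range(n_runs):
    site = 8                            # each run starts on site 8
    for t in range(t_max):
        site = neighbor[site][random.randint(0, 3)]
    count[site] += 1
proportions = [c / float(n_runs) for c in count]
print(proportions)  # proportions[0] should come out close to 6/256 = 0.0234...
```

Running the n_runs games one after the other or all at once makes no difference, since the pebbles do not interact: the histogram estimates the same probabilities either way.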
In the 3x3 pebble game, this matrix is 9x9, because there are 9 configurations. Let's fill in this matrix. The probability to go from site 0 to site 0 is 1/2, the probability to go from site 0 to site 1 is 1/4, the probability to go from site 1 to site 2 is 1/4, so let's fill in the entire matrix. Note the factor 1/4 in front of the matrix. It is now crucial to realize that the probability vector at time t+1 is the product of the transfer matrix and the probability vector at time t. In Python, this gives the following program, where, thanks to numpy, we can write the matrix-vector product in one line. The output is the probability vector at times 0, 1, 2 and so on, as shown here. This is the entire Monte Carlo dynamics, for an infinite number of pebble games, and it is exact. For example, the probability on site 0 is zero at times 0, 1, 2 and 3, and is 6/256 - that is, about 0.023 - at time 4, as you proved earlier. We also see that the equilibrium value of 1/9, which is 0.111, is reached very fast, for all sites in the system. At small times, the matrix-vector multiplication modifies the probability vector, while at longer times this multiplication does not change the vector. Does this ring a bell with you? It means that the equilibrium probability vector is an eigenvector of the transfer matrix, with eigenvalue 1. Eigenvalues and eigenvectors can be computed in one line in Python, and this is the modified program pebble_transfer.py and the nine eigenvalues of the transfer matrix. We see that 1 is the largest eigenvalue, the second largest eigenvalue is 0.75, then we have 0.5 and -0.5. We have already understood the role of the eigenvalue 1: it is associated with the equilibrium eigenvector. Now we will study what the other eigenvalues, and in particular 0.75, are good for. To do so, we consider again the output of pebble_transfer.py. Here we find the probability to reach any site, starting from the upper right corner. 
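In the spirit of pebble_transfer.py, the transfer matrix can be built from the neighbor table and iterated with numpy (a sketch with our own variable names):

```python
import numpy as np

neighbor = [[1, 3, 0, 0], [2, 4, 0, 1], [2, 5, 1, 2],
            [4, 6, 3, 0], [5, 7, 3, 1], [5, 8, 4, 2],
            [7, 6, 6, 3], [8, 7, 6, 4], [8, 8, 7, 5]]
# Each of the four moves from site a carries probability 1/4 to the site it
# lands on (possibly a itself, for rejected moves): hence the factor 1/4.
transfer = np.zeros((9, 9))
for a in range(9):
    for b in neighbor[a]:
        transfer[b, a] += 0.25
pi = np.zeros(9)
pi[8] = 1.0                        # at time 0, the pebble is on site 8
for t in range(4):
    pi = np.dot(transfer, pi)      # probability vector at time t+1
print(pi[0])                       # exactly 6/256 = 0.0234375
eigenvalues = np.sort(np.linalg.eigvals(transfer).real)
print(eigenvalues[-2:])            # largest eigenvalue 1, second largest 0.75
```

One matrix-vector product per time step replaces an infinite number of pebble games: pi is the exact probability vector, not a statistical estimate.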
Let us now subtract the equilibrium value, 0.1111, and take the absolute value. This is done in pebble_transfer_sub.py, and this is the output. Let us now plot the output for the site 0, on a semilog scale. The straight line indicates an exponential convergence to equilibrium, as announced at the beginning of the tutorial. The slope of this line is the same as that of the function (0.75)^t, the second eigenvalue to the power t. Let us now put 0.75 into an exponential: this gives exp(t ln(0.75)), which can also be written as exp(-t/tau), with tau = -1/ln(0.75). tau, which is about 3.5, is the convergence time introduced by Werner at the beginning of the tutorial. Let's go back to the original evolution of the probability vector pi_t. It is after a few convergence times tau - let's say 4 or 5 - that the equilibrium is reached and the corrections to the equilibrium disappear for all intents and purposes. To understand our observations, we remark that the initial probability vector can be decomposed on the eigenvectors of the transfer matrix. The component associated with the eigenvalue 1 is conserved in time, because 1^t is equal to 1. The other components, associated with the other eigenvalues lambda, decay exponentially, and the slowest decay is given by the second largest eigenvalue. Before moving on to Vivien's part about rigorous solutions, let's take a moment to download, run and modify the programs that we have discussed in this section. On the Coursera website, you will find pebble_transfer.py, which contains the construction of the transfer matrix and the matrix-vector multiplication using numpy. The eigenvalues of the transfer matrix are computed in pebble_transfer_eigen.py. The graphics and the subtraction of the equilibrium value 1/9 are contained in the program pebble_transfer_sub.py. Alberto has indicated that the evolution of the Monte Carlo dynamics is determined by the decomposition of the probability vector at the initial time into the eigenvectors of the transfer matrix. 
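The subtraction of the equilibrium value and the resulting time scale can be sketched as follows (our own names, in the spirit of pebble_transfer_sub.py):

```python
import math
import numpy as np

neighbor = [[1, 3, 0, 0], [2, 4, 0, 1], [2, 5, 1, 2],
            [4, 6, 3, 0], [5, 7, 3, 1], [5, 8, 4, 2],
            [7, 6, 6, 3], [8, 7, 6, 4], [8, 8, 7, 5]]
transfer = np.zeros((9, 9))
for a in range(9):
    for b in neighbor[a]:
        transfer[b, a] += 0.25
pi = np.zeros(9)
pi[8] = 1.0
deviation = []      # |pi_t[0] - 1/9|, to be plotted on a semilog scale
for t in range(30):
    deviation.append(abs(pi[0] - 1.0 / 9.0))
    pi = np.dot(transfer, pi)
# At large times the deviation shrinks by a factor 0.75 per step, so the
# semilog slope is ln(0.75) and the convergence time is:
tau = -1.0 / math.log(0.75)     # about 3.48
print(tau)
```

Plotting deviation against t with a logarithmic y-axis produces the straight line discussed above.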
All eigenvalues have modulus less than or equal to one; an eigenvalue larger than one would lead to an explosion of the probability at large times. At large times, the dynamics converges to the eigenvector with the largest eigenvalue, equal to one. This is what we will call the probability vector of the equilibrium state, or the steady state. When does this picture really apply? What are the rigorous mathematical conditions ensuring that one converges at large times to a unique steady state? This is what we will find out in the final part of this tutorial. There is a simple physical example in which a strange phenomenon occurs. Consider a system in which you have two copies of the 3x3 pebble game, one in red and one in blue. In each of these games, the pebble jumps from site to site with the same rules as we have seen before. If it is started in the red one, it will remain there forever, and if it is started in the blue one, the same thing will happen. The combined system is described by an 18x18 transfer matrix, because we now have 18 sites. There are a lot of zeros in this matrix, indicating that the pebble cannot jump from the red game to the blue game and vice versa. In fact, we are in an annoying and physically unacceptable situation, where the outcome of the simulation depends on the starting configuration even at infinite times. We are also in a mathematically annoying situation, because the transfer matrix of our dual pebble game has two eigenvalues equal to one. This is described in the program pebble_dual_eigen.py. Mathematicians describe this situation that we must avoid as stemming from the reducibility of the transfer matrix of our dual game. Reducible means that we can split the transfer matrix of our system apart into two subsystems, without affecting the rest. One of the two mathematically rigorous conditions for a Markov chain Monte Carlo algorithm is that this pulling apart is not possible: the transfer matrix must be irreducible. 
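The doubly degenerate eigenvalue one of the reducible dual game can be verified directly (a sketch with our own names; the website's pebble_dual_eigen.py may be organized differently):

```python
import numpy as np

neighbor = [[1, 3, 0, 0], [2, 4, 0, 1], [2, 5, 1, 2],
            [4, 6, 3, 0], [5, 7, 3, 1], [5, 8, 4, 2],
            [7, 6, 6, 3], [8, 7, 6, 4], [8, 8, 7, 5]]
# Transfer matrix of a single 3x3 pebble game.
T = np.zeros((9, 9))
for a in range(9):
    for b in neighbor[a]:
        T[b, a] += 0.25
# Dual game: red sites 0-8, blue sites 9-17. The 18x18 transfer matrix is
# block diagonal; the zeros off the blocks mean the pebble can never change
# from the red game to the blue game or vice versa.
T_dual = np.zeros((18, 18))
T_dual[:9, :9] = T
T_dual[9:, 9:] = T
eigenvalues = np.sort(np.linalg.eigvals(T_dual).real)
print(eigenvalues[-2:])   # two eigenvalues equal to one: the matrix is reducible
```

Each block contributes its own equilibrium eigenvector, one living entirely on the red sites and one on the blue sites, which is exactly the degeneracy discussed above.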
We can in fact render our dual pebble game irreducible, by adding a small probability of transition between the two games, as depicted here. We now observe that the pebble can change its game. This small probability can also be observed in the transfer matrix in the program pebble_dual_eigen.py, where it allows us to recover a unique eigenvector with eigenvalue one, that is to say, a unique steady state. We realize that the transfer matrix must be irreducible, which ensures that the eigenvector with eigenvalue one is unique. In fact, this is not enough. We need a second condition, ensuring that the probability converges to the steady state in the large-time limit. We could face the situation where the second eigenvalue lambda_2 (different from one, and possibly complex) is of unit modulus. In this situation, lambda_2 to the power of t does not converge to zero as t increases. This means that the probability vector oscillates and never converges at infinite times. To illustrate this behavior in a simple example, consider the following 2x1 pebble game, and adopt the following jump rule. The poor pebble is unlucky. Not only does it have only two sites where it can be, but at each time step it must also go to the other accessible position. Starting from the left at time 0, it has to go to the right at time 1, to the left at time 2, and so on, in a deterministic way. The transfer matrix of this dynamics is simple to write. The ones in the matrix indicate that the pebble jumps to its neighboring site at each time step, and the zeros indicate that it cannot stay in the same position. The eigenvectors and eigenvalues take a simple form. The steady state is uniform, and the other eigenvector has eigenvalue -1. This explains why the probability vector does not converge in the infinite-time limit. How general is this phenomenon? Let us remark that every two time steps, the pebble comes back to its original state. 
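The 2x1 pebble game can be written down explicitly (a small sketch; names are ours):

```python
import numpy as np

# 2x1 pebble game: at each time step the pebble must hop to the other site.
T = np.array([[0.0, 1.0],
              [1.0, 0.0]])
eigenvalues = np.sort(np.linalg.eigvals(T).real)
print(eigenvalues)        # [-1, 1]: two eigenvalues of unit modulus
pi = np.array([1.0, 0.0]) # start on the left site
history = []
for t in range(4):
    history.append(pi.copy())
    pi = np.dot(T, pi)
print(history)            # (1, 0), (0, 1), (1, 0), (0, 1): oscillates forever
```

The component on the eigenvector with eigenvalue -1 flips its sign at every step, so pi never settles onto the uniform steady state.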
From a general point of view, we have now arrived at the second of the two mathematical conditions that we require for our Markov chain Monte Carlo algorithm. We must avoid recurring states: situations where, starting from one state, we are sure to find it back after a fixed number of time steps. In our 2x1 pebble game, we have two recurring states, which we are sure to find back after two time steps. They correspond to starting either from the right or from the left state. Fortunately, recurring states can be easily avoided. In our example, imagine that we allow the pebble to stay in the same configuration with a small probability epsilon. Every now and then, the pebble will stop and stay in its position. In the program pebble_recurrent_eigen.py, you will see how, with a small finite epsilon, the steady state becomes the unique eigenvector of unit modulus. What we have seen here has been cast by mathematicians into two conditions that supplement the global or detailed balance condition that Werner discussed during the lecture. The Markov chain must satisfy global balance, and condition 1: it is irreducible, and condition 2: it is aperiodic, that is, there are no recurrent states. Those are, in total, the only three conditions that the Markov chain algorithm must satisfy for it to behave in a proper way. Please take a moment to download, modify and run the programs that we have discussed in this session. On the Coursera website, you will find the two following programs: pebble_dual_eigen.py and pebble_dual_movie.py. In these two programs, you can add a small probability epsilon for the pebble to jump from the red game to the blue game. You will see how this allows the eigenstate with eigenvalue one, that is to say the steady state, to be the unique one. 
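The effect of the small probability epsilon can be checked in a few lines (a sketch in the spirit of pebble_recurrent_eigen.py, with our own names):

```python
import numpy as np

epsilon = 0.1
# With probability epsilon the pebble stays put; otherwise it hops to the
# other site. For epsilon > 0 the chain is aperiodic.
T = np.array([[epsilon, 1.0 - epsilon],
              [1.0 - epsilon, epsilon]])
eigenvalues = np.sort(np.linalg.eigvals(T).real)
print(eigenvalues)   # [2 * epsilon - 1, 1]: a unique eigenvalue of unit modulus
pi = np.array([1.0, 0.0])
for t in range(100):
    pi = np.dot(T, pi)
print(pi)            # converges to the uniform steady state (1/2, 1/2)
```

The second eigenvalue moves from -1 to 2*epsilon - 1, strictly inside the unit circle, so the oscillating component now decays and the steady state becomes unique.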
You will also find the associated programs pebble_recurrent_eigen.py and pebble_recurrent_movie.py. In these two programs, you can add a small probability epsilon allowing the pebble to stay in the same position, and you will discover by yourself how the steady state becomes unique. In conclusion, we have seen that Markov chain Monte Carlo calculations are equivalent to transfer matrix multiplications. They converge to equilibrium after an infinite number of iterations. But this infinite number is not a serious restriction, because of the existence of a time scale, set by the second largest eigenvalue of the transfer matrix. In the 3x3 pebble game, the second eigenvalue gave a time scale tau of about 3.5, and convergence towards equilibrium was reached after a few times tau, like 15 or 20 iterations. The concept of equilibrium is far-reaching, and the interest in Monte Carlo calculations is very strong, because of the existence of this time scale that separates fast and slow things. There is a considerable tragedy attached to this question of the time scale tau, because in usual Monte Carlo calculations we have a hard time estimating tau. But this will be food for thought for another session of Statistical Mechanics: Algorithms and Computations.