Direct sampling for high-dimensional integration
S. Ulam’s original idea at the base of the Monte Carlo method is to “change processes described by certain differential equations into an equivalent form interpretable as a succession of random operations” . As long as a problem can be given a probabilistic interpretation, it can be modeled using random numbers. As statistical physics formulates irregular phenomena as a stochastic ensemble, see Section 1.1, it leads to solving high dimensional integrals Section 1.1.1 or the corresponding differential equations (see Section 1.2.3), which is analytically a hard task. Following Ulam’s idea and relying on Bernoulli’s law of large numbers, a high dimensional integral is now understood as an average over a collection of random configurations, that follow the correct probability distribution.
Weak and strong laws of large numbers and the Monte Carlo method
As demonstrated by the law of large numbers, the average of an observable q over a probability distribution p, e.g. the Boltzmann distribution, written as the integral Eq. 2.1, may be estimated without bias by its empirical average over random and independent samples drawn from the same distribution p, as in Eq. 2.2. Such a collection of samples is said to be independent and identically distributed, i.i.d., in particular they have all the same mean and variance. Obtaining such a collection is commonly referred to as direct sampling or static Monte Carlo. The estimator of the empirical mean is called the Monte Carlo estimator (for a more thorough study of variance and error in Monte Carlo method, see Section 2.3). hqi = Z dx q(x)p(x).
Markov-chain Monte Carlo method
In 1953, Metropolis et al. , and later the theoretical justification of Wood and Parker  and the generalization of Hastings , solves the partition function and high rejection issues, which was encountered by direct sampling. The Metropolis algorithm provides indeed the right probability distribution of the configurations, from the moment one can compute a function f proportional to the density p. Therefore the issue of the computation of the partition function, or any normalizing factor, vanishes.
To do so, a sequence of samples is now obtained by running a Markov chain. Hence the name of Markov-chain Monte Carlo method. Such algorithms can be then understood as a Markov process following its own Master equation, see Section 1.2, with a special choice of transition probabilities, see Section 2.2.2, which depend only on ratios of the form p(x)/p(x0) with x and x0 two configurations and p the Boltzmann probability distribution. A Markov-chain Monte Carlo method is then equivalent to inventing a stochastic time evolution for a given system.
Unlike samples produced by direct sampling, samples originating from a Markovchain Monte Carlo method are thus correlated and assessment of errors asks for more care, see Section 2.3. The time evolution does not need, however, to correspond to any real physical dynamics; rather, the dynamics is to be chosen on the basis of its computational efficiency. The situation is then different than in molecular dynamics simulations . In molecular dynamics, the microcanonical ensemble is sampled by numerically solving Newton’s equations of motion. The extension to averages over the canonical ensemble is ensured by coupling the system to a heat bath [11, p. 139]. Particles of the system can undergo collisions with the heat bath, during which they gain a new velocity, drawn from the Maxwell-Boltzmann distribution, see Section 184.108.40.206. The mixing of Newtonian dynamics and stochastic collisions changes the simulation into a Markov process. It is however possible to keep the simulation deterministic by using an extended Lagrangian instead of introducing a heat bath [11, p. 147].
Principles of the Markov-chain Monte Carlo method
In the Markov-chain Monte Carlo method, a Markov chain is devised on the space state W, such that the chain is irreducible, aperiodic and its unique steady-state distribution is precisely the wanted probability distribution p, see Section 1.3.2.
Under the conditions of irreducibility and aperiodicity, it can be proven that the chain follows the strong law of large numbers, see Section 2.1.1, and the central limit theorem, see Section 2.3.3 : long-time average of any observable f converges with probability 1 to its average on the stationary distribution p and with fluctuations of size proportional to the square-root of the simulation time. Below are reviewed the necessary conditions that the matrix T of the Markov chain has to fulfill, in order to produce the correct stationary distribution.
As explained in Section 1.2, a homogeneous Markov chain is fully determined by the initial probability distribution and the transition probabilities contained in the transfer matrix T. In the Markov-chain Monte Carlo method, the initial probability distribution is in most cases a Dirac distribution centered on a valid initial state.
The transition probabilities need to ensure that the Markov chain is irreducible, see Section 220.127.116.11. Furthermore, the transition probabilities must satisfy the conservation x and its neighbors fxig. Global balance must always be fulfilled in order to retrieve the correct probability distribution. Detailed balance and maximal global balance are particular cases of global balance. Detailed balance imposed a symmetric equilibrium of flows between any two configurations, whereas maximal global balance offers an out-of-equilibrium scheme where all flows are irreversible.
Table of contents :
1 Markov processes
1.1 Fluctuations and statistical ensembles
1.1.1 Maxwell-Boltzmann distribution
1.1.2 Classical statistical mechanics and ensemble averages
1.1.3 Stochastic processes
1.2 Markov processes and master equation
1.2.1 General properties of a Markov process
1.2.2 Stationary and homogeneous Markov processes
1.2.3 Master equation
1.3 Convergence in Markov processes
1.3.1 Ergodic and mixing properties
1.3.2 Mixing in Markov chains
2 Monte Carlo Method
2.1 Direct sampling for high-dimensional integration
2.1.1 Weak and strong laws of large numbers and the Monte Carlo method
2.1.2 Inversion sampling
2.1.3 Acceptance-rejection method
2.1.4 Importance sampling
2.2 Markov-chain Monte Carlo method
2.2.1 Principles of the Markov-chain Monte Carlo method
2.2.2 Metropolis algorithm
2.2.3 A priori probabilities
2.2.4 Heat-bath algorithm
2.3 Convergence and error
2.3.1 Initialization bias
2.3.2 Uncorrelated samples
2.3.3 Correlated samples and autocorrelation times
2.3.4 Scaling of autocorrelation times around a phase transition
2.4 Cluster algorithms
2.4.1 Fortuin-Kasteleyn transformation
2.4.2 Swendsen-Wang algorithm
2.4.3 Wolff algorithm
2.4.4 Cluster algorithms for spin glasses
2.5 Non-local algorithms
2.5.1 Geometric cluster algorithm
2.5.2 Jaster algorithm
3 Irreversible factorized Metropolis algorithm
3.1 The Lifting Framework
3.1.1 Lifting for the unidimensional random walk
3.1.2 General case
3.1.3 Lifting for two particles
3.2 The Factorized Metropolis filter
3.2.1 Definition of the Factorized Metropolis filter
3.2.3 The factorized Metropolis filter in the lifting framework, beyond replicas
3.3 Continuous-time scheme
3.3.1 Maximal global-balance by infinitesimal steps
3.3.2 Event-driven approach: Event sampling
3.3.3 Observable averaging
3.4 Infinite chains
4 Applications of the irreversible factorized Metropolis algorithm
4.1 Soft-sphere systems
4.1.1 Melting in soft spheres
4.2 Pressure computation
4.3 Continuous classical spin systems
4.3.1 Planar rotator spin systems
4.3.2 Heisenberg spin systems on a three-dimensional lattice