Phase Space Overlap and FEP Sampling

P2 concept — synthesised from 2007-chipot-free-energy-calculations-book §6.3.1 (Lu, Woolf, Andricioaei), pp. 235–240. This is the single most useful diagnostic for whether an FEP, NEW (nonequilibrium work), or alchemical free-energy calculation will converge: are the important phase-space regions Γ*₀ of the reference state and Γ*₁ of the target state in a subset relationship? When they are not, no amount of additional sampling rescues the run; only adding intermediates does.

Definition

The book defines Γ* as the set of configurations that contribute non-trivially to the partition function, that is, configurations whose Boltzmann factor exp(−β U(Γ_i)) is large enough to matter. Operationally, Γ* comprises the configurations whose energy U lies below a characteristic energy (the most likely or the average energy of the system).

Thus, in general, Γ* is the set of representative configurations and needs to be sampled well in a simulation to measure correctly the ensemble-averaged properties of the system.

— [Chipot & Pohorille 2007, p. 235, §6.3.1]

The four overlap topologies (Fig. 6.1)

  1. (a) Subset, Γ*₁ ⊆ Γ*₀ — sampling barrier is entropic. Forward FEP from 0 to 1 is reliable: every visit into Γ*₁ contributes; the run fails only if Γ*₁ is so small that random encounter is rare. Reverse FEP fails: from inside Γ*₁ the rest of Γ*₀ is gated by an energetic barrier (high U) which is rarely crossed.
  2. (b) Coincidence, Γ*₁ ≈ Γ*₀ — both forward and reverse FEP work. The single-stage exponential average converges quickly. Rare in practice for any non-trivial perturbation.
  3. (c) Partial overlap — neither forward nor reverse single-stage FEP is reliable. Must stage with intermediate M placed in the intersection (Γ*_M ⊂ Γ*₀ ∩ Γ*₁) and run 0 → M ← 1 (overlap sampling). See MFEP Two-Stage Strategies Table row 3.
  4. (d) Disjoint, Γ*₀ ∩ Γ*₁ = ∅ — single-stage FEP gives the wrong answer. Must construct M with Γ*_M ⊃ Γ*₀ ∪ Γ*₁ (umbrella sampling) and run 0 ← M → 1.
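
The direction dependence of case (a) can be reproduced with a toy model. The sketch below is an illustration only and is not taken from the book: it assumes two one-dimensional harmonic wells, U₀ = ½ k₀ x² (wide, reference) and U₁ = ½ k₁ x² (narrow, target), for which ΔA = ln(k₁/k₀) / (2β) exactly. The function name `harmonic_fep_demo` and all parameter values are hypothetical.

```python
import math
import random


def log_mean_exp(values):
    """Numerically stable log of the arithmetic mean of exp(v)."""
    largest = max(values)
    total = sum(math.exp(v - largest) for v in values)
    return largest + math.log(total / len(values))


def harmonic_fep_demo(k0=1.0, k1=4.0, beta=1.0, n_samples=20000, seed=0):
    """Forward and reverse single-stage FEP between two 1D harmonic wells.

    Assumption (toy model): state 0 is the wide well, state 1 the narrow
    one, so the important region of state 1 lies inside that of state 0.
    Returns (forward_estimate, reverse_estimate, exact_delta_a).
    """
    rng = random.Random(seed)

    def delta_u(x):
        # Perturbation energy U1(x) - U0(x).
        return 0.5 * (k1 - k0) * x * x

    # Exact Boltzmann sampling: each well is a Gaussian in x.
    x0 = [rng.gauss(0.0, 1.0 / math.sqrt(beta * k0)) for _ in range(n_samples)]
    x1 = [rng.gauss(0.0, 1.0 / math.sqrt(beta * k1)) for _ in range(n_samples)]

    # Forward: dA = -(1/beta) ln < exp(-beta dU) >_0
    forward = -log_mean_exp([-beta * delta_u(x) for x in x0]) / beta
    # Reverse: dA = +(1/beta) ln < exp(+beta dU) >_1
    reverse = log_mean_exp([beta * delta_u(x) for x in x1]) / beta

    exact = 0.5 * math.log(k1 / k0) / beta
    return forward, reverse, exact


if __name__ == "__main__":
    fwd, rev, exact = harmonic_fep_demo()
    print(f"forward {fwd:.4f}  reverse {rev:.4f}  exact {exact:.4f}")
```

In this model the forward weights exp(−β ΔU) are bounded by 1, so the forward average is well behaved. The reverse weights exp(+β ΔU) have infinite variance whenever k₁ ≥ 2k₀, so the reverse estimate is dominated by rare samples from the tail of the narrow well and depends strongly on the seed.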

The funnel rule for NEW

For nonequilibrium work / Jarzynski:

To ensure the accuracy of the free energy estimate by sampling the most important set of trajectories, we choose the sequence of systems so that each successive state obeys a phase space subset relationship with the one that preceded it. … We say that a path following such a trajectory moves down the funnel.

— [Chipot & Pohorille 2007, p. 240, §6.3.1]

That is, for a switching protocol λ(t): 0 → 1, the chain Γ*₀ ⊇ Γ*_{λ_1} ⊇ Γ*_{λ_2} ⊇ … ⊇ Γ*₁ is required for low-variance work distributions. The book's diagrammatic name for this is “down the funnel.”
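
A minimal sketch of the Jarzynski estimator shows why a low-variance work distribution matters. It is not taken from the book; the function name `jarzynski_free_energy` and the synthetic Gaussian work samples are assumptions made for illustration. For Gaussian work with variance σ², the mean dissipated work ⟨W⟩ − ΔA equals βσ²/2, so a broad work distribution means the exponential average is carried by rare low-work trajectories.

```python
import math
import random


def jarzynski_free_energy(work_values, beta=1.0):
    """Return (delta_a, mean_dissipated_work) from nonequilibrium work samples.

    delta_a = -(1/beta) * ln < exp(-beta * W) >, evaluated after shifting by
    the smallest work value so that every exponent is <= 0 (no overflow).
    """
    lowest = min(work_values)
    mean_exp = sum(math.exp(-beta * (w - lowest)) for w in work_values)
    mean_exp /= len(work_values)
    delta_a = lowest - math.log(mean_exp) / beta

    mean_work = sum(work_values) / len(work_values)
    # Non-negative by Jensen's inequality; large values signal a protocol
    # whose work distribution is too broad for the sample size.
    return delta_a, mean_work - delta_a


if __name__ == "__main__":
    # Hypothetical work samples: Gaussian with mean 2.0 and std 1.0 (in kT).
    rng = random.Random(1)
    work = [rng.gauss(2.0, 1.0) for _ in range(50000)]
    delta_a, dissipated = jarzynski_free_energy(work)
    print(f"delta_a {delta_a:.3f}  mean dissipated work {dissipated:.3f}")
```

Reporting the dissipated work next to ΔA is one design choice among several; it gives a single number that grows as the protocol leaves the funnel.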

Why this matters operationally

The phase-space picture explains, in one frame, three otherwise-mysterious failure modes:

  1. High precision but wrong answer — symptom of disjoint Γ*₀ and Γ*₁ (case d). Independent simulations agree with each other (low variance) but converge on a biased value because they all miss the same important regions.
  2. Forward and reverse give different ΔA (hysteresis) — symptom of partial overlap (case c). Run overlap sampling.
  3. Convergence depends on direction — symptom of subset (case a). Run from the higher-entropy state as reference.

In conclusion, “to guarantee a reliable free energy estimate the important phase space of the target state should be a subset of that of the reference state.” One way to achieve this is to choose the higher-entropy system as the reference.

— [Chipot & Pohorille 2007, p. 238, §6.3.1]

How to diagnose Γ*₀ vs Γ*₁ in practice (book §6.3.2)

The probability distribution of the perturbation P_0(ΔU) is the practical proxy for the overlap topology:

  • Wide, well-peaked, with significant probability at low ΔU → cases (a) or (b), forward FEP OK.
  • Bimodal, or with the dominant peak at high ΔU and a thin negative tail → case (c) or (d). Stage with intermediates.
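  • The two distributions P_0(ΔU) and P_1(−ΔU) must overlap for the FEP estimate to be trustworthy (Bennett 1976 / Crooks 1999). P_1(−ΔU) is read here as the reverse perturbation energy sampled in state 1 with its sign flipped, so that both curves share the same U₁ − U₀ axis. No crossing → no overlap → no convergence.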
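
The crossing criterion follows from a standard identity, restated here as a worked equation rather than quoted from the book. The assumption is that ΔU = U₁ − U₀ is evaluated on configurations from both ensembles, so that P_0 and P_1 are histograms of the same quantity, and that W is the work along the forward protocol.

```latex
% Equilibrium perturbation energies (both histograms are of U_1 - U_0):
\frac{P_1(\Delta U)}{P_0(\Delta U)} = \exp\!\left[-\beta\,(\Delta U - \Delta A)\right]
\quad\Longrightarrow\quad
P_0(\Delta U) = P_1(\Delta U) \ \text{ at } \ \Delta U = \Delta A .

% Nonequilibrium work (Crooks relation):
\frac{P_f(W)}{P_b(-W)} = \exp\!\left[\beta\,(W - \Delta A)\right]
\quad\Longrightarrow\quad
P_f(W) = P_b(-W) \ \text{ at } \ W = \Delta A .
```

If the two sampled histograms never meet, neither simulation has visited the region around ΔA, which is the region that determines the estimate.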

How to use this in STRC

  • h01 phase5 alchemical mutation E1659A: WT and mutant pockets are “partial overlap” — the carboxylate makes a salt bridge that the alanine cannot, so loop conformations that are “important” in WT are not “important” in mutant. Always stage; don’t expect single-window FEP to work.
  • h09 hydrogel monomer addition: monomer-free and monomer-bound states have very different important regions (case d). PMF along ξ via Recipe — ABF Adaptive Biasing Force Algorithm or umbrella sampling, never single-window FEP.
  • Universal check: every alchemical or NEW STRC script should plot P_f(W) and P_b(−W); if they don’t cross in the converged region, the calculation is incomplete regardless of how long it ran. This check belongs in phase5 outputs alongside the MM-PBSA gate.
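
The universal check above could be quantified with a histogram overlap coefficient. This is a minimal sketch, not an existing STRC utility: the function name `work_distribution_overlap` and the bin count are hypothetical, and the source gives no threshold for how much overlap is enough, so the function only returns the number.

```python
def work_distribution_overlap(forward_work, backward_work, n_bins=50):
    """Overlap coefficient in [0, 1] between P_f(W) and P_b(-W).

    backward_work holds the work values of the reverse protocol; they are
    negated here so that both histograms share the forward work axis.
    1.0 means identical histograms, 0.0 means no shared bin.
    """
    negated_backward = [-w for w in backward_work]

    lowest = min(min(forward_work), min(negated_backward))
    highest = max(max(forward_work), max(negated_backward))
    # Fall back to a unit bin width when all samples coincide.
    width = (highest - lowest) / n_bins or 1.0

    def normalised_counts(samples):
        counts = [0] * n_bins
        for value in samples:
            # Clamp so that the maximum value lands in the last bin.
            index = min(int((value - lowest) / width), n_bins - 1)
            counts[index] += 1
        return [count / len(samples) for count in counts]

    p_forward = normalised_counts(forward_work)
    p_backward = normalised_counts(negated_backward)
    return sum(min(a, b) for a, b in zip(p_forward, p_backward))


if __name__ == "__main__":
    # Hypothetical work values in kT, chosen so the histograms never meet.
    print(work_distribution_overlap([0.0, 0.1, 0.2], [-10.0, -10.1, -10.2]))
```

A value near zero corresponds to the "no crossing" situation, whatever the length of the run. The same function applies to ΔU samples from single-stage FEP once both sets are expressed as U₁ − U₀.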

Connections