Recipe — Receptor-Based Scoring Function Selection
P1 recipe — synthesized from 2014-schneider-de-novo-molecular-design-book §1.6.1 (Schneider & Baringhaus, Eqs. 1.10–1.12) and §16.3 (Westermaier & Hubbard, FE-method choice). When you need to score a receptor–ligand binding pose for de novo design, virtual screening, or fragment-grow, this recipe tells you which of the three scoring families to use and the citation pattern for that family.
This recipe complements Recipe — FEP Point-Mutation Algorithm (alchemical), Recipe — Bennett Acceptance Ratio Estimator (estimator), and Recipe — LRA Method for pKa Shift Calculation (pKa-shift) — those handle binding free energies; this recipe handles binding scores (cheap proxies).
Decision tree
Goal of the score?
├── (1) Quick rank of many poses (VS, fragment-grow scoring loop)
│ → use FORCE-FIELD class (Eq. 1.10) — Vina, AutoDock, GOLD GoldScore
│ O(seconds per pose), 1–2 kcal/mol noise typical
├── (2) Rescore short-list, want activity correlation
│ → use EMPIRICAL class (Eq. 1.11) — Glide SP/XP, ChemScore (regressed)
│ requires regression-trained weights; tied to training-set chemotypes
├── (3) Have rich PDB co-crystal training data, target similar to training set
│ → use KNOWLEDGE-BASED class (Eq. 1.12) — DSX, DrugScore, PMF, SMoG
│ Boltzmann inversion of atom-pair statistics; works best on common scaffolds
├── (4) Late stage, congeneric series, ≤0.5 kcal/mol matters
│ → escalate to FE METHOD per §16.3 — see Recipe—FEP-Point-Mutation
├── (5) Fast continuum-correction layer over force-field score
│ → MM-PBSA / MM-GBSA — Eq. 1.10 + Poisson-Boltzmann or Generalized-Born
│ 5–8 kcal/mol absolute error; useful for ΔΔG within congeneric set
└── (6) Reaction coordinate matters (binding pathway, gated pocket)
→ PMF — see Recipe—ABF-Adaptive-Biasing-Force
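The decision tree above can be encoded as a small lookup so a pipeline script records which branch it took. This is a minimal sketch, not part of the source: the names `Goal`, `RECOMMENDATION`, and `choose_scoring_family` are hypothetical, and the strings only restate the six branches.

```python
from enum import Enum


class Goal(Enum):
    """The six questions at the root of the decision tree (hypothetical names)."""
    QUICK_RANK = 1             # many poses: VS, fragment-grow scoring loop
    RESCORE_SHORTLIST = 2      # short list, want activity correlation
    RICH_COCRYSTAL_DATA = 3    # target resembles the PDB training set
    LATE_STAGE_CONGENERIC = 4  # differences of <= 0.5 kcal/mol matter
    CONTINUUM_CORRECTION = 5   # fast layer over a force-field score
    REACTION_COORDINATE = 6    # binding pathway, gated pocket


# Branch -> recommendation, restating the tree without adding new advice.
RECOMMENDATION = {
    Goal.QUICK_RANK: "force-field class (Eq. 1.10)",
    Goal.RESCORE_SHORTLIST: "empirical class (Eq. 1.11)",
    Goal.RICH_COCRYSTAL_DATA: "knowledge-based class (Eq. 1.12)",
    Goal.LATE_STAGE_CONGENERIC: "FE method per section 16.3",
    Goal.CONTINUUM_CORRECTION: "MM-PBSA / MM-GBSA",
    Goal.REACTION_COORDINATE: "PMF along the binding coordinate",
}


def choose_scoring_family(goal: Goal) -> str:
    """Return the scoring family the decision tree recommends for a goal."""
    return RECOMMENDATION[goal]


if __name__ == "__main__":
    for goal in Goal:
        print(f"({goal.value}) {goal.name}: {choose_scoring_family(goal)}")
```

Keeping the mapping in one table makes the chosen branch easy to log next to the citation for that family.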
Family equations (verbatim from §1.6.1)
Force-field (Eq. 1.10):
E = Σ_{i∈ligand, j∈receptor} [ A_ij/r_ij^12 − B_ij/r_ij^6 + (q_i q_j)/(ε r_ij) ]
Where A_ij, B_ij are vdW repulsion/attraction parameters, q is partial charge, ε is dielectric. Failure mode: ε for ligand pockets is hard to assign and is the dominant systematic error.
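As an illustration of Eq. 1.10, the sketch below sums the 12-6 and Coulomb terms over all ligand-receptor atom pairs. It is a hedged example rather than any tool's implementation: the atom tuple layout, the `A`/`B` lookup dictionaries, and the function name are assumptions. The equation as written carries no Coulomb unit-conversion constant, so the sketch assumes the charges are already scaled to give the desired energy unit. `epsilon` is a required argument so that the dielectric assumption is never buried in the function body.

```python
import math


def force_field_score(ligand, receptor, A, B, epsilon):
    """Pairwise 12-6 plus Coulomb sum following Eq. 1.10.

    ligand, receptor: lists of (atom_type, charge, (x, y, z)) tuples.
    A, B: dicts keyed by (ligand_type, receptor_type) holding the vdW
          repulsion and attraction parameters.
    epsilon: dielectric constant, passed in explicitly by the caller.
    """
    total = 0.0
    for type_i, q_i, pos_i in ligand:
        for type_j, q_j, pos_j in receptor:
            r = math.dist(pos_i, pos_j)  # interatomic distance r_ij
            key = (type_i, type_j)
            total += A[key] / r**12 - B[key] / r**6 + (q_i * q_j) / (epsilon * r)
    return total


if __name__ == "__main__":
    # Toy parameters chosen only to make the arithmetic easy to follow.
    ligand = [("C", 1.0, (0.0, 0.0, 0.0))]
    receptor = [("O", -1.0, (1.0, 0.0, 0.0))]
    A = {("C", "O"): 1.0}
    B = {("C", "O"): 2.0}
    print(force_field_score(ligand, receptor, A, B, epsilon=2.0))
```

Rerunning the same pose with two or three plausible `epsilon` values is a cheap way to see how much of a ranking depends on the dielectric choice.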
Empirical (Eq. 1.11):
ΔG = ΔG_0 + Σ_i [ΔG_i · count_i · penalty_i]
Weights ΔG_i fitted to experimental pIC_50 / pK_d for known complexes. Failure mode: weights overfit the training-set chemotypes; transferability to novel scaffolds is the main concern.
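Eq. 1.11 is a weighted sum, which the following sketch evaluates directly. The weights and counts in the usage example are invented for illustration and do not come from any published scoring function; `empirical_score` is a hypothetical name.

```python
def empirical_score(dg0, terms):
    """Evaluate Eq. 1.11: dG = dG_0 + sum_i(dG_i * count_i * penalty_i).

    dg0: regression intercept dG_0.
    terms: iterable of (dG_i, count_i, penalty_i) tuples, one per
           interaction type (hydrogen bonds, lipophilic contacts, ...).
    """
    return dg0 + sum(weight * count * penalty for weight, count, penalty in terms)


if __name__ == "__main__":
    # Illustrative values only: two ideal interactions of one type and
    # three half-penalized interactions of another.
    terms = [(-0.5, 2, 1.0), (0.3, 3, 0.5)]
    print(empirical_score(-1.0, terms))
```

Because every `dG_i` comes from regression, the function is only as transferable as the complexes used to fit it, which is the failure mode named above.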
Knowledge-based (Eq. 1.12):
E(i,j) = −k_B T · ln [ p_ij^observed(r) / p_ij^expected(r) ]
Atom-pair frequency comparison vs. random-distribution baseline. Failure mode: training set bias — pairs absent from PDB get arbitrary scores.
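The Boltzmann inversion in Eq. 1.12 can be sketched from two distance histograms for one atom-pair type. This is an assumption-laden illustration: the bin layout, the function name, and the choice to return `None` for empty bins are not from the source. Returning `None` makes the training-set-bias failure mode visible instead of hiding it behind an arbitrary number.

```python
import math

K_B = 0.0019872041  # Boltzmann constant in kcal/(mol*K)


def pair_potential(observed_counts, expected_counts, temperature=298.15):
    """Boltzmann-invert per-bin pair frequencies following Eq. 1.12.

    observed_counts: histogram of distances seen in the structural database.
    expected_counts: histogram for the reference (random) distribution,
                     using the same bins.
    Returns one energy per bin in kcal/mol, or None where a bin is empty
    and the logarithm is undefined.
    """
    obs_total = sum(observed_counts)
    exp_total = sum(expected_counts)
    energies = []
    for n_obs, n_exp in zip(observed_counts, expected_counts):
        if n_obs == 0 or n_exp == 0:
            energies.append(None)  # pair never sampled at this distance
            continue
        p_obs = n_obs / obs_total
        p_exp = n_exp / exp_total
        energies.append(-K_B * temperature * math.log(p_obs / p_exp))
    return energies


if __name__ == "__main__":
    # Toy histograms: the middle bin is over-represented, the last is empty.
    print(pair_potential([10, 30, 0], [20, 20, 20]))
```

Bins observed more often than expected come out negative (favorable), and under-represented bins come out positive.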
STRC parameter table
| Pipeline phase | Scoring class | Tool | Citation pattern | Typical noise |
|---|---|---|---|---|
| h01 phase 4b (Vina docking) | force-field | AutoDock Vina | Eberhardt 2021 JCIM (software); Schneider 2014 §1.6.1 (class) | ~0.5–0.8 kcal/mol paired (per pharmacochaperone) |
| h01 phase 4f (MM-GBSA) | force-field + continuum | OpenMM/Amber MMPBSA.py | Genheden & Ryde 2015 (error bands 2.6–3.3 kcal/mol); Schneider 2014 §16.2.5.2 | 2.6–3.3 kcal/mol std error |
| h01 phase 5 (alchemical FEP) | FE method | NAMD / GROMACS | Chipot 2007 §2.8.6; Westermaier-Hubbard §16.2.2 | 1–2 kcal/mol on protein systems, ≤0.5 kcal/mol on toy systems |
| h01 phase 3b (fragment pocket-fit) | composite empirical | in-house score_size + LE | Schneider 2014 §6.4.1 (LE); §1.6.1 (empirical) | descriptor-quality, not energy-quality |
| h26 phase 1d (cys triple-mut FEP) | FE method | FEP point-mutation, dual-topology | Chipot 2007 §2.8.6 + soft-core §2.8.5; Westermaier-Hubbard §16.6 Example 16.3 | dual-topology approach (DTA) preferred for size-changing residues |
Best-practice checklist
- Match scoring family to question. Don’t run MM-PBSA when Vina is enough; don’t trust Vina to rank potency for analogue series.
- Cite class + tool + version. “Vina (Eberhardt 2021) — force-field-class scoring per Schneider 2014 §1.6.1.”
- Document ε assumption. Continuum dielectric (78 water / 1–4 protein) lives in free-energy-methods parameter table; never hardcode in script body.
- Consensus scoring (§1.6.1 / §4.2.4.4): when you don’t trust any single function, take the mean of force-field + empirical + knowledge-based ranks. Reduces systematic-error coupling.
- Don’t pretend a docking score is a binding free energy. Even MM-PBSA/GBSA produce relative energies for congeneric series; absolute affinities require FEP/TI per §16.3.
- STRC h01 audit-2026-04-23 lesson: docstrings must label score thresholds as pipeline-specific empirical gates when they don’t trace to a published universal cutoff (e.g., the Vina −5 kcal/mol gate is a positive-control gate, not a CASF threshold). Schneider 2014 reinforces this: §1.6.1 explicitly notes that empirical-scoring weights “are determined by regression analysis” → they live or die by their training set.
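The consensus-scoring item in the checklist can be sketched as a mean of per-function ranks. This is a minimal illustration under stated assumptions: the function names are hypothetical, the scores are made up, and every input is assumed to follow the lower-is-better convention. A function that reports higher-is-better fitness values would need its scores negated first.

```python
def ranks(scores):
    """Rank poses from 1 (best) to n, assuming lower score = better.

    Ties are broken by input order, which is a simplification.
    """
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    result = [0] * len(scores)
    for rank, index in enumerate(order, start=1):
        result[index] = rank
    return result


def consensus_rank(score_table):
    """Mean rank per pose across scoring functions.

    score_table: dict mapping a function label to a list of scores,
                 all lists in the same pose order.
    """
    per_function = [ranks(scores) for scores in score_table.values()]
    n_poses = len(per_function[0])
    n_functions = len(per_function)
    return [sum(r[i] for r in per_function) / n_functions for i in range(n_poses)]


if __name__ == "__main__":
    # Three poses scored by one function from each family (toy numbers).
    table = {
        "force_field": [-7.0, -5.0, -6.0],
        "empirical": [-30.0, -35.0, -20.0],
        "knowledge_based": [-1.2, -0.8, -1.0],
    }
    print(consensus_rank(table))
```

Averaging ranks rather than raw scores sidesteps the fact that the three families report values on unrelated scales.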
Relation to other STRC recipes
- For binding free energy (ΔΔG by alchemical mutation): Recipe — FEP Point-Mutation Algorithm + Recipe — Soft-Core Potential for Alchemical End Points (Chipot 2007).
- For PMF along a binding-coordinate: Recipe — ABF Adaptive Biasing Force Algorithm (Chipot 2007).
- For pKa shift on binding: Recipe — LRA Method for pKa Shift Calculation (Chipot 2007).
- For fragment efficiency: Ligand Efficiency Metrics Catalog (this book, §6.4.1).
Connections
- [part-of] pharmacochaperone
- [source] 2014-schneider-de-novo-molecular-design-book
- [applies] index
- [see-also] De Novo Design Software Scoring Strategy Catalog
- [see-also] Recipe — FEP Point-Mutation Algorithm
- [see-also] Recipe — Bennett Acceptance Ratio Estimator
- [see-also] Ligand Efficiency Metrics Catalog