2014 Schneider (Ed.) — De novo Molecular Design

Edited monograph, 21 chapters, Wiley-VCH 2014. It complements 2007-chipot-free-energy-calculations-book one layer up the design stack: where Chipot is canonical for binding-free-energy computation, Schneider is canonical for compound construction (de novo, fragment-based, peptide), scoring (receptor-based, ligand-based, multiobjective), and peptide / protein redesign (SME, PSO, ACO; DEE; bioisosterism).

Citation

Schneider, G. (Ed.). De novo Molecular Design. Wiley-VCH Verlag GmbH & Co. KGaA, Weinheim, 2014. ISBN-13 978-3-527-33461-2. xxiv + 545 pp.

TL;DR

Twenty-one expert chapters covering the algorithmic side of computational compound construction. Three load-bearing themes for STRC:

Scoring families and their valid use. Receptor-based scoring divides cleanly into force-field (Eq. 1.10), empirical (Eq. 1.11) and knowledge-based (Eq. 1.12) classes. Each has a distinct error mode; consensus or method-choice based on system properties is now standard. Ligand-based scoring uses pharmacophore / pseudoreceptor / shape similarity. Free-energy methods (Ch. 16) are reserved for late-stage congeneric optimization.
Fragment-based discovery as a property-headroom strategy. Lipinski rule of 5 (drugs) and Congreve rule of 3 (fragments) — verbatim in Table 5.1. Ligand efficiency (LE) is the core metric: optimization should aim at ≥0.3 kcal/mol per heavy atom; a plethora of derived metrics (FQ, SILE, GE, LLE, LELP, LLE_AT, KE) handle context (size, lipophilicity, kinetics). A diverse 1000–20000-member fragment library samples fragment-like chemical space better than millions of HTS compounds sample drug-like space.
Sequence-space search for peptide design. For h09’s peptide-hydrogel hypothesis, Ch. 18 is the methods spine: Shannon-entropy library diversity (Eqs. 18.2–18.4), modified Grantham amino-acid distance matrix (Table 18.1), Simulated Molecular Evolution (SME) with Gaussian mutation (Eq. 18.5), Particle Swarm Optimization, and a worked Ant Colony Optimization example (verbatim pseudocode) producing MHC-I octapeptides at 89%/95% accuracy. Peptide stability modifications (cyclization, stapling, end-capping, glycosylation, PASylation) are catalogued in §18.4.
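The library-diversity measure named in the third theme can be illustrated with a minimal sketch. It assumes the standard per-position Shannon entropy, H = −Σ p·log₂(p), which may differ in detail from the book's Eqs. 18.2–18.4; the function name and the toy library are hypothetical.

```python
import math
from collections import Counter


def positional_entropy(sequences):
    """Shannon entropy in bits at each position of an equal-length peptide library."""
    if not sequences:
        raise ValueError("library is empty")
    length = len(sequences[0])
    if any(len(seq) != length for seq in sequences):
        raise ValueError("all sequences must have the same length")
    entropies = []
    for pos in range(length):
        # Residue frequencies observed at this position across the library.
        counts = Counter(seq[pos] for seq in sequences)
        total = sum(counts.values())
        h = -sum((n / total) * math.log2(n / total) for n in counts.values())
        entropies.append(h)
    return entropies


# Toy octapeptide library (illustrative sequences only).
library = ["SIINFEKL", "SIYNFEKL", "AIINFEKL", "SIINFAKL"]
for pos, h in enumerate(positional_entropy(library), start=1):
    print(f"position {pos}: {h:.3f} bit")

# Upper bound for a 20-residue alphabet, log2(20), about 4.32 bit per position.
print(f"maximum: {math.log2(20):.2f} bit")
```

A fully conserved position scores 0 bit, and a position sampling all 20 residues uniformly reaches the log₂(20) ceiling.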

Numbers that matter

The book is methods-heavy; load-bearing numbers are filter cutoffs, illustrative success rates, and one calibrated benchmark. Force-field parameters and binding constants are not tabulated here — cite the original force-field / SAR papers.

Parameter	Value	Units	Source (page/§/table)	Notes
Lipinski rule of 5 — molecular mass	≤500	Da	Table 5.1 (Ch.5)	drug-like cutoff
Lipinski rule of 5 — H-bond acceptors	≤10	—	Table 5.1	drug-like
Lipinski rule of 5 — H-bond donors	≤5	—	Table 5.1	drug-like
Lipinski rule of 5 — partition coefficient (clogP)	≤5.0	—	Table 5.1	drug-like
Congreve rule of 3 — molecular mass	≤300	Da	Table 5.1	fragment-like
Congreve rule of 3 — H-bond acceptors	≤3	—	Table 5.1	fragment-like
Congreve rule of 3 — H-bond donors	≤3	—	Table 5.1	fragment-like
Congreve rule of 3 — clogP	≤3.0	—	Table 5.1	fragment-like
Congreve rule of 3 — rotatable bonds	≤3	—	Table 5.1	fragment-like (optional)
Congreve rule of 3 — polar surface area	≤60	Å²	Table 5.1	fragment-like (optional)
LE optimization target (Hopkins 2004 convention)	≥0.3	kcal/mol per heavy atom	§6.4.1	typical for fragments selected for elaboration
LE plateau onset (Kuntz max-affinity ceiling)	≈ −1.5 (LE per HA at 15+ HA)	kcal/mol per heavy atom	§6.4.1 [221, 226]	binding-energy contribution levels off
Optimized-drug LE expectation (MW 500 Da, IC50 10 nM)	0.3	kcal/mol per heavy atom	§6.4.1	implies ~38 heavy atoms
Mean per-atom binding-affinity contribution during optimization	0.29	kcal/mol per non-H atom	§6.4.1 [146, 222]	linearity assumption baseline
Target LLE (LiPE) range	5–7 (or higher)	log units	§6.4.1 (4)	optimization goal
LELP “Lipinski-compliant” ceiling	<16.5	—	§6.4.1 (5)	log P / LE ceiling
LELP lead range	−10 to +10	—	§6.4.1 (5)	optimize toward 0
LE_Scale formula (for FQ = LE / LE_Scale)	LE_Scale = −0.064 + 0.873·exp(−0.026·HAC)	—	§6.4.1 (1) [226]	size-corrected LE rescaling
SILE formula	affinity / HAC^0.3	—	§6.4.1 (2) [227]	size-independent LE
LLE_AT formula	(0.11·ln10·RT·(LogP − Log(activity)))/HAC	kcal/mol per HA	§6.4.1 (6) [230]	Astex size-corrected LLE
KE formula	t½ / (0.693·HAC)	(time per HA)	§6.4.1 [233]	kinetic efficiency
Astex generic fragment library size	327	compounds	§6.2.1 [53]	drug-fragment library
Mazanetz et al. biochemical fragment library size	20,000	compounds	§6.2.5	high-concentration FCS+plus screen
Aqueous solubility cutoff (Vernalis fragment library)	≥2	mM	§6.2.2 [66, 69]	removes >50% of vendor fragments
Aqueous solubility cutoff (Mazanetz library)	≥1	mM	§6.2.2 [68]	in-house QSAR filter
Vemurafenib starting library size	20,000	fragments	§6.1 [31, 39–41]	screened at 200 µM
Vemurafenib hit-call threshold	≥30% inhibition at 200 µM	—	§6.1	initial 7-azaindole
Drug-like compound count, MW 300–500 Da (Bohacek estimate)	10²⁰–10²⁰⁰	molecules	§6.2.5 [86]	combinatorial estimate
Reymond chemical universe (GDB-17, ≤17 heavy atoms)	166 × 10⁹	molecules	§6.2.5 [87], §1.5	enumerated
GDB-13 virtual library	970 × 10⁶	molecules	§6.2.1 [64, 65]	for MPO de novo design
Drug-like 30-atom space (Durrant ch.5)	10⁶³	molecules	§5.2.2 [19]	combinatorial estimate
Fragment 12-atom-or-less space	~10⁷	molecules	§5.3.1.1 [66]	combinatorial estimate
Fragment hit rate (LE > 0.3 in TSA / functional screen)	~5	%	§5.3.3 [8]	typical
HTS hit rate (Pilzulkil case study)	0.1	%	§5.2.3 [26]	typical, low
Fragment soaking concentration (X-ray crystallography)	25–100	mM	§5.3.2.6 [18, 64]	cocktail soaking
SPR fragment detectability lower bound	≥100	Da	§5.3.2.3 [8]	mass change limit
NMR fragment detectability protein-mass upper bound	≤40	kDa	§5.3.2.5 [23]	protein-detected NMR
MS fragment detectability protein-mass upper bound	≤100	kDa	§5.3.2.4 [23]	electrospray
Receptor amount for protein-NMR fragment screen	>2	mg	§5.3.2.5 [23, 73]	unless cryoprobes are used
Ligand concentration for ligand-detected NMR	1–5	mM	§5.3.2.5 [1, 18]	cocktail screening
Phenprocoumon as Astex-rule-of-3 starting fragment	MW≤300, clogP≤3, HBD≤3	—	§6.1 [28, 30]	first FBDD-derived drug (Tipranavir, 2005)
Vemurafenib (PLX4032) reached market	2011	year	§6.1 [31]	first drug developed de novo from fragment screen
HCV helicase inhibitor 5 IC50	260	nM	§1.7 [206]	LigBuilder optimization
TOPAS CB1 inverse agonist 7 Ki	4	nM	§1.7 [207, 209]	from Ki=1500 nM design 6
Plk1 inhibitor compound 16 EC50	4	µM	§1.7 [217, 218]	LE = 0.66 (Eq. 1.8) — DOGS scaffold-hop
Aurora A inhibitor 18 IC50	3	µM	§1.7 [219]	DOGS molecule-grow from 17 (~10 µM)
ER de novo design — explicit-solvent FE hit-recovery rate	83.3	%	§16.6 Example 16.1 [4]	5/6 actives top-ranked (vs 37.5% for de novo scoring)
ER FE std-dev (explicit / implicit solvent)	1.0 / 0.7	kcal/mol	§16.6 Example 16.1 [4]	comparable accuracy
HIV-RT optimization endpoint potency	55	pM	§16.6 Example 16.2 [261]	from 5 µM docking hit (Jorgensen group)
FEP precision in well-localized perturbation (STA vs DTA)	STA 8–10× more precise	—	§16.6 Example 16.3 [268]	for congeneric series
T4 lysozyme L99A/M102Q absolute-FE RMS error	1.8	kcal/mol	§16.6 [219]	ITC reference
T4 lysozyme L99A/M102Q relative-FE RMS error (catechol)	1.1	kcal/mol	§16.6 [219]	most accurate class
Predicted-vs-experimental pose RMSD (catechol class)	1.2	Å	§16.6 [219]	post-hoc X-ray check
TI relative-FE precision (Westermaier outlook)	<0.1	kcal/mol	§16.9 [101]	maximum reachable
Shannon entropy maximum per sequence position	4.32	bit (20-residue alphabet)	§18.2.1 Fig. 18.7	log₂(20)
MHC-I octapeptide ACO sequence-space	25.6 × 10⁹ (= 20⁸)	sequences	§18.3	decision space
ACO H-2K^b stabilizing accuracy	89	%	§18.3	designed peptides confirmed
ACO H-2K^b nonstabilizing accuracy	95	%	§18.3	designed peptides confirmed
ACO pheromone initialization	0.05	per residue position	§18.3 pseudocode	uniform prior
ACO pheromone bounds	[0.1, 0.9]	—	§18.3 pseudocode	escape early convergence
ACO update factor formula	(Fitness − 0.5)/100	—	§18.3 pseudocode	linear with fitness
α-conotoxin MII cyclic-derivative plasma stability gain	+15–20	%	§18.4.1 [76]	EndoGluc protease test
α-conotoxin cyclic distance to bridge (N-to-C terminus)	~11 (≤15)	Å	§18.4.1	cyclization geometry constraint
α-conotoxin cMII-6/7 IC50 (nicotinic acetylcholine receptor)	~1	µM	§18.4.1 [76]	activity preserved post-cyclization
Bioster database transformation count (v12.1, 2014)	~26,000	bioisosteric pairs	§17.3.1.1	bioisostere knowledge base
Cambridge Structure Database (CSD) entries (2014)	541,748	crystal structures	§17.3.1.2 [13]	drug-like subset ~60,000
ChEMBL distinct compounds (2014)	1,213,239	compounds	§17.3.1.3 [16]	bioactivity database
ChEMBL bioactivity measurements (2014)	10,129,256	data points	§17.3.1.3	over 9,003 targets
CATS pharmacophore-pair vector length	150	bits	§17.3.2.2 [25]	1–10 bond distances × 15 type pairs
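The efficiency metrics in the table can be checked numerically with a short sketch. It assumes LE = −RT·ln(affinity)/HAC at 298.15 K with IC50 used as a stand-in for Kd, LLE = pIC50 − logP, LELP = logP / LE, and FQ = LE / LE_Scale; the function names and the logP value in the example are hypothetical and not taken from the book.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)


def ligand_efficiency(affinity_molar, heavy_atoms, temperature=298.15):
    """LE in kcal/mol per heavy atom; affinity (Kd, Ki or IC50) in mol/L."""
    delta_g = R_KCAL * temperature * math.log(affinity_molar)
    return -delta_g / heavy_atoms


def lipophilic_ligand_efficiency(p_activity, logp):
    """LLE (LiPE) = pIC50 (or pKi) minus logP, in log units."""
    return p_activity - logp


def lelp(logp, le):
    """LELP = logP / LE."""
    return logp / le


def le_scale(heavy_atoms):
    """Size-dependent LE reference curve, using the formula from the table row."""
    return -0.064 + 0.873 * math.exp(-0.026 * heavy_atoms)


def fit_quality(le, heavy_atoms):
    """FQ = LE / LE_Scale (assumed definition, see lead-in)."""
    return le / le_scale(heavy_atoms)


# Optimized-drug case from the table: IC50 = 10 nM, about 38 heavy atoms.
le = ligand_efficiency(10e-9, 38)
logp = 3.0  # hypothetical value, for illustration only
print(f"LE   = {le:.2f} kcal/mol per heavy atom")
print(f"LLE  = {lipophilic_ligand_efficiency(8.0, logp):.1f}")
print(f"LELP = {lelp(logp, le):.1f}")
print(f"FQ   = {fit_quality(le, 38):.2f}")
```

For the 10 nM / 38-heavy-atom case the LE comes out at about 0.29 kcal/mol per heavy atom, in line with the optimized-drug row and the 0.29 mean per-atom contribution.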

Method essentials

Per-chapter takeaways (only what STRC needs to either use or cite):

Ch. 1 (Schneider, Baringhaus): receptor-based scoring decomposes into three classes — physically motivated FFs (Eq. 1.10 LJ + Coulomb), empirical regression scoring (Eq. 1.11 weighted sum of interaction-type counts), and knowledge-based scoring (Eq. 1.12 Boltzmann inversion of atom-pair frequencies). Ligand-based scoring uses pharmacophore/pseudoreceptor/shape descriptors. Table 1.3 catalogs ~40 named de novo design programs by year and scoring class — useful provenance for any “we used X-style scoring” claim. Fragment-based assembly relies on additivity (linker bond ≈ free) — but Fig. 1.22 documents non-additivity to −14 kJ/mol (factor Xa Ki=2 nM). Reaction-driven assembly (RECAP, DOGS, SYNOPSIS) suggests synthesis routes alongside structures.
Ch. 5 (Durrant, Amaro): Pilzulkil/Goode tutorial. Distills HTS vs FBDD into protocols. Three fragment-optimization strategies: linking (rare success — linker rigidity is hard), merging (requires overlap), and growing (most reliable — anchor fragment + medicinal-chemistry expansion). Click-chemistry (azide-alkyne Huisgen → 1,2,3-triazole) is the prototypical synthesis route for combinatorial fragment-grow. Detection-method matrix in §5.3.2 covers six biophysical assays with sensitivity/protein-consumption/MW-limit profiles.
Ch. 6 (Mazanetz, Law, Whittaker): the FBDD primer. Library-design constraints: rule of 3, ≥1–2 mM solubility, REOS substructure filter (PAINS), 2D fingerprint diversity selection (hole-filling or iterative removal). Screening-method choice: X-ray (highest information, ≤1000 fragments), NMR (40-kDa protein limit), SPR (KD + binding kinetics), thermal-shift (cheap, noisy), ITC (full thermodynamics, expensive in protein), high-concentration biochemical (functional readout, false-positive-prone, ≥20,000 compounds). Efficiency-metrics catalog §6.4.1 is the cross-cutting decision tool — captured verbatim in Ligand Efficiency Metrics Catalog.
Ch. 16 (Westermaier, Hubbard): decision matrix for FE methods. MM-PBSA / MM-GBSA: VS rescoring, tolerates large structural changes, 5–8 kcal/mol error band. LIE: empirical, depends on training set, treats electrostatics well. TI / FEP: best for congeneric small-modification series, ≤0.5 kcal/mol achievable on toy systems, ~1–2 kcal/mol on protein systems. PMF: needed when the reaction coordinate matters (binding pathway, induced fit). Best practices §16.8: use soft-core potentials (Shirts–Pande parameters); never leave partial charges on an atom whose LJ interactions are being turned off; transform electrostatics and LJ separately; inserting or deleting atoms is less efficient than mutating them. The single-topology approach (STA) is 8–10× more precise than the dual-topology approach (DTA) for small congeneric perturbations; DTA is mandatory when geometries differ (e.g., size-changing mutations).
Ch. 17 (Firth, Blagg, Brown): bioisostere = “groups or molecules with chemical and physical similarities producing broadly similar biological properties” (Thornber 1979; Burger). Three replacement classes: knowledge-based (Bioster, CSD, ChEMBL/MMP, SwissBioisostere); descriptor-based (CATS pharmacophore pairs §17.3.2.2; Hammett σ + Hansch π Craig plot §17.3.2.1); shape-based. Drug Guru fully enumerates SMIRKS-rule space; IADE iteratively searches.
Ch. 18 (Hiss, Schneider): the spine for h09 peptide design. Shannon entropy (Eqs. 18.2–18.4) quantifies library diversity in bits; max for 20-residue alphabet is log₂(20) = 4.32 bit per position. Diversity vs hit-rate is monotonic — Fig. 18.4(b) shows 10 antibody-binding libraries, increasing entropy → decreasing hit count. Three nature-inspired algorithms: SME (Gaussian-distance mutation, Eq. 18.5, Table 18.1 distance matrix), PSO (social + personal memory; not yet applied to peptides), ACO (verbatim pseudocode in §18.3, MHC-I octapeptide design with ANN fitness — accuracies 89% / 95%). Peptide stability mods §18.4: backbone cyclization (≤15 Å termini distance, +15–20% plasma stability), all-hydrocarbon stapling (Verdine; (i,i+3) cross-links least helix-distorting), end-capping (N-acetylation / C-amidation; PASylation as PEG alternative), glycosylation (GlcNAc enzymatic transglycosylation).
Ch. 19 (Saven): sequence-search algorithms for protein design, namely Monte Carlo, Dead-End Elimination (DEE; Desmet 1992), Self-Consistent Mean Field (SCMF; Koehl–Delarue 1994), and probabilistic / FASTER (Allen–Mayo 2010). Application zoo: Baker's Top7 fold (Kuhlman 2003), DeGrado's Due Ferri / four-helix metalloproteins, Baker's Kemp eliminase / Diels-Alder enzyme / α-gliadin peptidase, water-soluble KcsA / nicotinic-AChR analogs. Useful as a method survey for h26 cysteine engineering and any future de novo bridge-domain design.
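The Ch. 16 error bands listed above are easier to judge when expressed as fold-uncertainty in the predicted binding constant. The sketch below uses the standard relation ΔΔG = RT·ln(fold) at an assumed 298.15 K; the function name and the chosen example errors are illustrative, not taken from the book.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)


def fold_error(ddg_error_kcal, temperature=298.15):
    """Multiplicative uncertainty in Kd implied by a free-energy error in kcal/mol."""
    return math.exp(ddg_error_kcal / (R_KCAL * temperature))


# Error magnitudes quoted in the Ch. 16 summary above.
examples = [
    ("TI / FEP, toy systems", 0.5),
    ("TI / FEP, protein systems (lower end)", 1.0),
    ("TI / FEP, protein systems (upper end)", 2.0),
    ("MM-PBSA / MM-GBSA (lower end)", 5.0),
]
for label, err in examples:
    print(f"{label}: {err} kcal/mol -> {fold_error(err):.3g}-fold in Kd")
```

At 298 K, 1 kcal/mol corresponds to roughly a 5.4-fold change in Kd and about 1.36 kcal/mol to one order of magnitude, which is why a 5–8 kcal/mol band suits coarse rescoring rather than ranking congeneric analogs.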

Limitations

Methods anthology, not a parameter handbook. For force-field constants, water models, and FF benchmarks, cite the original force-field papers.
Pre-AlphaFold (2014); zero coverage of structure prediction by deep learning. For h26 / h01 modern work, AF3 / RoseTTAFold / Boltz are NOT in this book — use 2024+ literature.
Chapter 1 software catalog (Table 1.3) ends in 2013. Modern de novo programs (REINVENT, Bidd, MolGAN, Pocket2Mol, DiffDock, etc.) absent.
Free-energy chapter (16) cites Chipot 2007 as reference and is complementary, not redundant. Use both: Chipot for theory, Westermaier-Hubbard for the LO-stage decision matrix and pharmaceutical case studies.
No quantitative benchmark tables for binding-affinity prediction methods. RMS errors quoted are illustrative single-system values (e.g., T4 lysozyme L99A/M102Q absolute-FE RMS error = 1.8 kcal/mol).
Bioster, CSD, ChEMBL counts are 2014 snapshots. ChEMBL is now ChEMBL34+ (2024) with millions more activities.

Relevance to STRC

index — primary consumer.
- Phase 3c v4 fragment-grow on 3-amino-benzofuran-2-COOH scaffold should follow §6.5–6.7 (linking/merging/growing) decision tree → see Recipe — Fragment Optimization Linking Merging Growing. Growing is the right strategy for a single-anchor pharmacophore.
- Phase 4 receptor-based scoring spans two families: AutoDock 4 uses a semi-empirical force field (Eq. 1.10 terms with regression-fitted weights), while Vina (Phase 4b) is an empirical scoring function (Eq. 1.11 family). Phase 4f MM-GBSA is a force-field energy with a continuum-solvent correction, and is therefore distinct from Phase 4b Vina. Citing Schneider 2014 §1.6.1 + Westermaier-Hubbard §16.3 in phase4f.py docstring would close a long-standing literature gap.
- Phase 3b fragment filter pipeline already uses Congreve rule-of-3 (per pharmacochaperone row 11, audit “fixed”). Now the citation backs it: Schneider-Baringhaus Ch. 1 + Durrant-Amaro Ch. 5 Table 5.1.
- Hit-prioritization from v4 fragment library should track LE and LLE_AT (Astex size-corrected LiPE) per §6.4.1. See Ligand Efficiency Metrics Catalog — LE alone biases toward small fragments; LLE filters out lipophilic-promiscuity traps.
- Phase 7 / 8 cross-target panel: §17.4 Drug Guru + IADE workflow templates are useful for proposing bioisostere replacements when off-target hits force scaffold modification.
index — primary consumer for peptide design.
- For Phase 2c WH2-bundling and any RADA16 / EAK16 sequence variation, ACO with ANN fitness (Ch. 18.3) is a literature-first approach — captured verbatim in Recipe — Ant Colony Optimization for Peptide Sequence Design.
- Library-diversity decisions (e.g., how many sequences to test for self-assembly) should cite the Shannon entropy framework (Ch. 18, Eqs. 18.2–18.4) and the diversity discussion in the ACO recipe. See also Fragment Additivity Assumption and Superadditivity: peptide-blocked self-assembly is non-additive, so additive diversity arguments need careful treatment.
- Therapeutic-stability path: any peptide entering rabbit / mouse delivery experiments needs §18.4 modifications — see Recipe — Therapeutic Peptide Stability Modifications.
- Amino-acid mutation severity for SME / ACO uses the modified Grantham matrix from Table 18.1 — verbatim in Amino Acid Physicochemical Distance Matrix Grantham Modified. Useful for h09 ablation series (Glu → Gln, Lys → Arg conservative swaps).
index — secondary consumer.
- Cysteine-engineering bioisostere search: §17.3.2 descriptor methods (CATS, Wagener-Lommerse fragment pharmacophore) provide a literature-first basis for proposing alternative crosslinking residues if the AF3-predicted A1078C/S1080C/S1579C disulfides do not stabilize the dimer.
- Side-chain repacking strategy for the cysteine triple mutant: §19.2.2 + §19.2.6 (DEE / SCMF / probabilistic search) — useful method-list when justifying which Rosetta protocol to use.
STRC Computational Scripts Inventory — every receptor-based scoring script (phase4*, phase5*) can now claim a textbook citation for its scoring-function class via this paper note.
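The rule-of-3 filter cited for the Phase 3b pipeline above can be sketched directly from the Table 5.1 cutoffs listed under Numbers that matter. This is a minimal illustration on precomputed descriptors, not the Phase 3b implementation; the class, function names, and example descriptor values are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Descriptors:
    """Precomputed molecular descriptors (how they are computed is out of scope here)."""
    mw: float           # molecular mass, Da
    hba: int            # H-bond acceptors
    hbd: int            # H-bond donors
    clogp: float        # calculated partition coefficient
    rot_bonds: int = 0  # rotatable bonds
    psa: float = 0.0    # polar surface area, square angstroms


def passes_rule_of_5(d):
    """Lipinski drug-like cutoffs as listed for Table 5.1."""
    return d.mw <= 500 and d.hba <= 10 and d.hbd <= 5 and d.clogp <= 5.0


def passes_rule_of_3(d, strict=False):
    """Congreve fragment-like cutoffs; strict=True adds the two optional criteria."""
    core = d.mw <= 300 and d.hba <= 3 and d.hbd <= 3 and d.clogp <= 3.0
    if not strict:
        return core
    return core and d.rot_bonds <= 3 and d.psa <= 60.0


# Hypothetical descriptor sets, for illustration only.
fragment = Descriptors(mw=177.2, hba=3, hbd=2, clogp=1.4, rot_bonds=1, psa=76.0)
lead = Descriptors(mw=452.5, hba=7, hbd=2, clogp=3.8, rot_bonds=6, psa=95.0)

for name, d in [("fragment", fragment), ("lead", lead)]:
    print(name, passes_rule_of_5(d), passes_rule_of_3(d), passes_rule_of_3(d, strict=True))
```

Keeping the optional rotatable-bond and polar-surface-area criteria behind a flag mirrors their "optional" status in the table: a fragment can pass the core rule of 3 and still fail the strict variant.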

Connections

[part-of] pharmacochaperone
[part-of] free-energy-methods
[part-of] rada16-geometry
[source] 2014-schneider-baringhaus-de-novo-molecular-design-book
[applies] index
[applies] index
[applies] index
[see-also] 2007-chipot-free-energy-calculations-book
[see-also] Lipinski Rule of Fives vs Congreve Rule of Threes Reference Table
[see-also] De Novo Design Software Scoring Strategy Catalog
[see-also] Ligand Efficiency Metrics Catalog
[see-also] Amino Acid Physicochemical Distance Matrix Grantham Modified
[see-also] Recipe — Receptor-Based Scoring Function Selection
[see-also] Recipe — Fragment Library Filtering Pipeline
[see-also] Recipe — Fragment Optimization Linking Merging Growing
[see-also] Recipe — Ant Colony Optimization for Peptide Sequence Design
[see-also] Recipe — Therapeutic Peptide Stability Modifications
[see-also] Fragment Additivity Assumption and Superadditivity
[see-also] Druggability vs Ligandability Distinction
