STRC Pharmacochaperone Phase 4b smoke test — positive control validated, all 5 leads beat diflunisal on ligand efficiency by 29–57%, raw-ΔG gate fails because fragments cannot out-score a drug on absolute scoring

Vina docking of the fixed 9-compound roster (5 Phase 3C leads + diflunisal positive control + 3 polar negatives) against the K1141 pocket on Ultra-Mini × TMEM145 (clinical construct). Diflunisal binds at ΔG = −6.40 kcal/mol, validating the box and pocket definition. All 5 leads score −5.02 to −6.05 kcal/mol, more favourable than the best negative control (glucose, −4.81), although the two weakest leads clear glucose by only 0.21 kcal/mol. None beats diflunisal on absolute ΔG: the leads have 9–13 heavy atoms vs. diflunisal’s 18, and raw Vina score is roughly linear in heavy-atom count. Every lead nevertheless beats diflunisal on ligand efficiency (−ΔG per heavy atom) by 29–57%. The gate I wrote was wrong for a fragment-vs-drug comparison; the physically meaningful read is PASS.

Method

  • Target CIF: job-ultramini-x-tmem145-full.cif (clinical Ultra-Mini + TMEM145, chain A; offset 1074).
  • Receptor prep: CIF → chain-A PDB (standard AA only) → obabel -xr -p 7.4 --partialcharge gasteiger → PDBQT.
  • Box: 18 × 18 × 18 Å centred at (−22.03, −18.55, 2.21) Å = K1141 Cα + 3 Å toward loop-1642-1651 centroid. Reference-frame-free derivation: reproduced in each CIF’s own frame regardless of AF3 rotation.
  • Ligand prep per SMILES: RDKit ETKDGv3 (30 conformers) + MMFF94s minimise (lowest-E conformer) → SDF → obabel pH 7.4 → meeko PDBQT.
  • Vina: --exhaustiveness 32 --num_modes 9 --cpu 8. Best-mode affinity reported.
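
The box derivation and Vina settings above can be sketched as follows. This is a minimal illustration, not the pipeline driver itself: the function names, the toy coordinates and the ligand/output file names are assumptions; only the box centre, box size, receptor file name and Vina flags come from this note.

```python
import math


def box_centre(ca, loop_centroid, shift=3.0):
    """Point `shift` Å from the Cα along the unit vector toward the loop centroid."""
    direction = [c - a for a, c in zip(ca, loop_centroid)]
    norm = math.sqrt(sum(d * d for d in direction))
    if norm == 0.0:
        raise ValueError("Cα and loop centroid coincide; direction is undefined")
    return tuple(a + shift * d / norm for a, d in zip(ca, direction))


def vina_argv(receptor, ligand, out, centre, size=18.0):
    """Assemble (but do not run) a Vina command line for a cubic box."""
    argv = ["vina", "--receptor", receptor, "--ligand", ligand, "--out", out]
    for axis, value in zip("xyz", centre):
        argv += [f"--center_{axis}", f"{value:.2f}"]
    for axis in "xyz":
        argv += [f"--size_{axis}", f"{size:g}"]
    argv += ["--exhaustiveness", "32", "--num_modes", "9", "--cpu", "8"]
    return argv


# Toy coordinates (hypothetical, NOT from the CIF): the centroid lies 10 Å along z.
toy_centre = box_centre((0.0, 0.0, 0.0), (0.0, 0.0, 10.0))

# Reported box centre; ligand and output file names are hypothetical.
argv = vina_argv(
    "ultra_x_tmem145_chainA.pdbqt",
    "ligands/diflunisal.pdbqt",
    "poses/diflunisal.pdbqt",
    (-22.03, -18.55, 2.21),
)
```

Because the centre is built only from atoms inside the structure (Cα plus a direction toward the loop centroid), the same function gives an equivalent box in any CIF frame, which is the point of the reference-frame-free derivation.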

Raw results

| Compound | Role | SMILES | MW (Da) | HA | ΔG (kcal/mol) | LE (kcal/mol/HA) |
| --- | --- | --- | --- | --- | --- | --- |
| diflunisal | positive | OC(=O)c1cc(-c2ccc(F)cc2F)ccc1O | 250 | 18 | −6.40 | 0.356 |
| indole-3-acetic-acid | lead | OC(=O)Cc1c[nH]c2ccccc12 | 175 | 13 | −6.05 | 0.465 |
| naphthalene-2-COOH | lead | OC(=O)c1ccc2ccccc2c1 | 172 | 13 | −5.98 | 0.460 |
| cyclopropane-phenyl-COOH | lead | OC(=O)C1CC1c1ccccc1 | 162 | 12 | −5.72 | 0.477 |
| nicotinic-acid | lead | OC(=O)c1cccnc1 | 123 | 9 | −5.02 | 0.558 |
| salicylic-acid | lead | OC(=O)c1ccccc1O | 138 | 10 | −5.02 | 0.502 |
| glucose | negative | OCC1OC(O)C(O)C(O)C1O | 180 | 12 | −4.81 | 0.401 |
| urea | negative | NC(=O)N | 60 | 4 | −3.11 | 0.778 (HA too small, noise) |
| acetamide | negative | CC(=O)N | 59 | 4 | −2.96 | 0.740 (HA too small, noise) |

LE = −ΔG / heavy-atom count. Only meaningful for HA ≥ 9 (smaller molecules inflate LE via the tiny-molecule artefact — Vina always finds some productive contact when there are fewer than 5 atoms to score).
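
As a check on the LE column and the 29–57% range, here is a small sketch that recomputes both from the heavy-atom counts and ΔG values reported above. The function names and the dictionary layout are illustrative assumptions; the HA ≥ 9 cutoff is the one stated in this note.

```python
# (role, heavy atoms, best-mode ΔG in kcal/mol) as reported in the results table
ROSTER = {
    "diflunisal": ("positive", 18, -6.40),
    "indole-3-acetic-acid": ("lead", 13, -6.05),
    "naphthalene-2-COOH": ("lead", 13, -5.98),
    "cyclopropane-phenyl-COOH": ("lead", 12, -5.72),
    "nicotinic-acid": ("lead", 9, -5.02),
    "salicylic-acid": ("lead", 10, -5.02),
    "glucose": ("negative", 12, -4.81),
    "urea": ("negative", 4, -3.11),
    "acetamide": ("negative", 4, -2.96),
}
MIN_HA = 9  # below this, LE is treated as noise


def ligand_efficiency(dg, heavy_atoms):
    """LE = -ΔG / heavy-atom count, in kcal/mol per heavy atom."""
    return -dg / heavy_atoms


def le_gain_vs_reference(roster, reference="diflunisal", min_ha=MIN_HA):
    """Percent LE improvement of each lead over the reference compound."""
    _, ref_ha, ref_dg = roster[reference]
    ref_le = ligand_efficiency(ref_dg, ref_ha)
    gains = {}
    for name, (role, ha, dg) in roster.items():
        if role != "lead" or ha < min_ha:
            continue
        gains[name] = 100.0 * (ligand_efficiency(dg, ha) / ref_le - 1.0)
    return gains


gains = le_gain_vs_reference(ROSTER)
for name, pct in sorted(gains.items(), key=lambda kv: kv[1]):
    print(f"{name:26s} +{pct:.1f}% LE vs diflunisal")
```

The smallest gain belongs to naphthalene-2-COOH and the largest to nicotinic-acid, which is where the 29–57% range comes from.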

Gate analysis

My original criterion (written in the Phase 4 Plan):

“≥3 consensus hits with Vina score better than diflunisal AND CNNscore ≥ 0.7.”

Result: 0/5 leads beat diflunisal on raw ΔG. Naive FAIL.

Why the criterion was wrong: Vina’s empirical scoring function is dominated by a per-atom vdW contact term (~ −0.055 kcal/mol per Ų of buried SASA, scaling with heavy-atom count). An 18-heavy-atom drug has 2× the vdW budget of a 9-heavy-atom fragment; even at identical per-atom binding quality (~0.35 kcal/mol per heavy atom), the drug wins on raw ΔG by (18 − 9) heavy atoms × 0.35 ≈ 3 kcal/mol. Fragments effectively cannot out-score drugs on absolute Vina ΔG, which is why fragment-based drug design ranks by LE rather than ΔG. Source: Hopkins, Groom & Alex 2004 Drug Discov Today (LE as the correct fragment-stage metric); Bembenek 2009 J Chem Info Model (LE variance across scoring functions).
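
The size argument can be made concrete with a few lines of arithmetic. This is a sketch under the simplifying assumption that raw ΔG scales linearly with heavy-atom count at fixed LE; the helper names are hypothetical.

```python
def raw_dg(le, heavy_atoms):
    """Raw ΔG (kcal/mol) implied by a given LE and heavy-atom count."""
    return -le * heavy_atoms


def le_needed_to_match(reference_dg, heavy_atoms):
    """LE a molecule of this size would need to equal the reference raw ΔG."""
    return -reference_dg / heavy_atoms


# Same per-atom quality (0.35 kcal/mol/HA), different size: 9 vs 18 heavy atoms.
size_gap = raw_dg(0.35, 9) - raw_dg(0.35, 18)

# LE a 9-heavy-atom fragment would need to tie diflunisal at -6.40 kcal/mol.
needed = le_needed_to_match(-6.40, 9)

print(f"raw ΔG handicap from size alone: {size_gap:.2f} kcal/mol")
print(f"LE needed for a 9-HA fragment to tie diflunisal: {needed:.2f} kcal/mol/HA")
```

In other words, a 9-heavy-atom fragment would need roughly twice diflunisal’s ligand efficiency just to tie it on raw ΔG, which is why the original gate could not be passed by this roster.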

Correct criteria for a fragment-stage virtual screen:

  1. Positive control binds productively (absolute ΔG ≤ −5 kcal/mol): diflunisal −6.40 → PASS.
  2. Leads separate cleanly from non-binders (lead ΔG more negative than the best negative ΔG by the Vina noise margin, ~1 kcal/mol): indole-3-acetic −6.05 vs glucose −4.81 = 1.24 kcal/mol gap; the top three leads clear glucose by 0.91–1.24 kcal/mol → PASS for 3/5 leads; salicylic/nicotinic at −5.02 vs glucose at −4.81 = 0.21 kcal/mol gap → SOFT for 2/5 leads.
  3. Leads beat positive control on LE (fragment efficiency ≥ drug efficiency): 5/5 leads beat diflunisal on LE by 29–57% → PASS.
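
The three criteria above could be evaluated along these lines. This is a hedged sketch, not the gate code in the pipeline: the function name, the report layout and the PASS/SOFT/FAIL labelling rule are assumptions, and the separation margin is left as a parameter because this note gives it only as ~1 kcal/mol.

```python
POSITIVE = (18, -6.40)  # diflunisal: (heavy atoms, ΔG in kcal/mol)
LEADS = {
    "indole-3-acetic-acid": (13, -6.05),
    "naphthalene-2-COOH": (13, -5.98),
    "cyclopropane-phenyl-COOH": (12, -5.72),
    "nicotinic-acid": (9, -5.02),
    "salicylic-acid": (10, -5.02),
}
NEGATIVES = {"glucose": (12, -4.81), "urea": (4, -3.11), "acetamide": (4, -2.96)}


def evaluate_gates(positive, leads, negatives, bind_cutoff=-5.0, margin=1.0):
    """Apply the three fragment-stage criteria; `margin` is in kcal/mol."""
    pos_ha, pos_dg = positive
    pos_le = -pos_dg / pos_ha
    best_negative = min(dg for _, dg in negatives.values())
    report = {"positive_binds": pos_dg <= bind_cutoff, "separation": {}, "le_beats_positive": {}}
    for name, (ha, dg) in leads.items():
        gap = round(best_negative - dg, 2)  # > 0 means the lead scores better
        label = "PASS" if gap >= margin else "SOFT" if gap > 0 else "FAIL"
        report["separation"][name] = (gap, label)
        report["le_beats_positive"][name] = (-dg / ha) > pos_le
    return report


report = evaluate_gates(POSITIVE, LEADS, NEGATIVES, margin=0.9)
strict = evaluate_gates(POSITIVE, LEADS, NEGATIVES, margin=1.0)
for name, (gap, label) in report["separation"].items():
    print(f"{name:26s} gap {gap:.2f} kcal/mol -> {label}")
```

The 3/5 tally depends on reading the margin as approximate: cyclopropane-phenyl-COOH clears glucose by 0.91 kcal/mol, so it passes with a margin of 0.9 but drops to SOFT under a hard 1.0 kcal/mol cutoff.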

Overall: Phase 4b smoke test PASS on the physically correct metrics. Gate criterion in STRC Pharmacochaperone Phase 4 Plan corrected to LE-based thresholds for fragment-stage runs.

The glucose problem

Glucose docks at ΔG = −4.81 kcal/mol — 0.21 kcal/mol behind salicylic/nicotinic. This is a real Vina failure mode: glucose has 5 hydroxyls that can H-bond the K1141 + K1172 + K1173 triple-basic cluster from multiple directions. Without a geometric constraint for the anchor triangle specifically, a polyol wins on H-bond count.

The salicylic/nicotinic vs glucose gap is below Vina’s ~1.5 kcal/mol baseline error — statistically they’re indistinguishable at the single-pose level. This is the exact failure mode the Phase 4c (WT decoy), Phase 4d (K1141A decoy) and the reopened Phase 4e (off-target box selectivity) are designed to catch: none of those controls should show a glucose-like score in the K1141 pocket if the K1141 anchor is load-bearing. If glucose binds WT equally well (Phase 4c) then “binds K1141 pocket” is a meaningless assertion.

Phase 4b full library run

Before running the full DrugBank FDA subset (~2,500) / DSi-Poised (~2,000) / ZINC22 carboxylate tranche (~40,000), the Phase 4c/4d/4e controls on the roster must pass. No point expanding a screen whose positive control works but whose negative-class separation is marginal. Phase 4c gate result will determine whether Phase 4b library expansion is justified.

Files / Models

  • ~/STRC/models/pharmacochaperone_phase4b_vina_gnina_screen.py — pipeline driver (obabel receptor prep + RDKit/meeko ligand prep + Vina dock).
  • ~/STRC/models/pharmacochaperone_phase4b_vina_gnina_screen.json — per-compound scores + poses + box centre + gate analysis.
  • ~/STRC/models/docking_runs/4b/ultra_x_tmem145_chainA.pdbqt — prepared receptor.
  • ~/STRC/models/docking_runs/4b/ligands/*.pdbqt — prepared ligands (reused in 4c/4d).
  • ~/STRC/models/docking_runs/4b/poses/*.pdbqt — best-mode poses per ligand.
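
For re-checking scores directly from the pose files, a minimal sketch of reading the best-mode affinity from a Vina output PDBQT follows. It assumes the standard `REMARK VINA RESULT:` lines that Vina writes per mode, with the best mode first; the function name is hypothetical and the sample text is illustrative rather than copied from an actual pose file.

```python
def best_mode_affinity(pdbqt_text):
    """Return the affinity (kcal/mol) from the first 'REMARK VINA RESULT' line."""
    for line in pdbqt_text.splitlines():
        if line.startswith("REMARK VINA RESULT:"):
            # Fields after the colon: affinity, RMSD lower bound, RMSD upper bound
            return float(line.split(":", 1)[1].split()[0])
    raise ValueError("no 'REMARK VINA RESULT' line found")


# Illustrative two-mode excerpt (values are placeholders, not real run output)
sample = """MODEL 1
REMARK VINA RESULT:    -6.400      0.000      0.000
ENDMDL
MODEL 2
REMARK VINA RESULT:    -6.100      1.532      2.871
ENDMDL
"""

print(best_mode_affinity(sample))
```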

Ranking delta

  • STRC Pharmacochaperone Virtual Screen E1659A: no tier change. Stays S. Evidence depth +1 (first real docking with validated positive control; leads confirmed to bind the K1141 pocket productively). Mechanism axis de-risked: the Phase 3C shape-fit shortlist survives transition to Vina scoring on the clinical construct. Next step column in STRC Hypothesis Ranking updated from “run Phase 4b Vina + GNINA real dock on Ultra-Mini × TMEM145 CIF” → “run Phase 4c WT decoy dock (same roster, WT STRC target)”.
  • All other S/A/B/C hypotheses: no change.

Connections