STRC h01 Pharmacochaperone Parameter Provenance Audit 2026-04-25

Extends STRC h01 Parameter Provenance Audit 2026-04-23. Scope: citation-tightening only — closes all 6 ⚠ rows in literature/pharmacochaperone.md. No mechanistic changes, no score changes, no new computational results. No Ranking delta.

Prior state

After the 2026-04-23 audit, lit_audit: fixed was set but 6 rows remained marked ⚠:

1. Vina −5 kcal/mol gate: no citation
2. MM-PBSA −6 kcal/mol gate: no citation
3. TPSA 40-90 Å² (CNS) vs 70-100 Å² (hypothesis note): inconsistency
4. MW 180-350 Da (script) vs 200-500 Da (hypothesis note): inconsistency
5. Druggability v_opt 250 vs 300: inconsistency across phases
6. Druggability composite weights: three undocumented schemes

Papers retrieved

Eberhardt 2021 — AutoDock Vina 1.2.0 (PMC open access)

Full text retrieved via PMC10683950. Key finding: Eberhardt 2021 documents new features (macrocycle sampling, hydrated docking, AD4.2 scoring, Python bindings) benchmarked on DUD-E (102 targets, NOT CASF-2016). Vina scoring function itself unchanged from v1.1.2.

Benchmark results (DUD-E): 68% top-1 pose success at 2 Å RMSD; AUC 0.72 ± 0.12. No kcal/mol score threshold for hit identification is defined. The −5 kcal/mol gate in Phase 4b is pipeline-specific.

Paper note: 2021-eberhardt-autodock-vina-1-2

Genheden & Ryde 2015 — MM/PBSA review (PMC open access)

Full text retrieved via PMC4487606. Critical review: standard error of mean for ΔG_bind with 20 snapshots is 2.6–3.3 kcal/mol; MAD vs experiment is 2.4–10.3 kcal/mol across solvation methods. Method cannot reliably discriminate binders differing by <2.9 kcal/mol (~12 kJ/mol). No universal kcal/mol threshold for hit classification exists.

The −6 kcal/mol gate in Phase 5 provides ~2σ separation from the non-binder baseline given the ~2.6–3.3 kcal/mol standard error — this is the correct framing for a pipeline-specific gate.
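The ~2σ figure can be checked with simple arithmetic. The sketch below is an illustration only, not pipeline code: it assumes the non-binder baseline sits at 0 kcal/mol (this note does not state the baseline value) and treats the 20-snapshot standard error quoted above as σ. All names are hypothetical.

```python
# Hedged arithmetic check of the "~2 sigma" framing; not taken from the h01 scripts.
KCAL_TO_KJ = 4.184  # 1 kcal = 4.184 kJ (thermochemical calorie)

GATE_KCAL = -6.0             # Phase 5 MM-PBSA gate
SEM_RANGE_KCAL = (2.6, 3.3)  # standard error with 20 snapshots (Genheden & Ryde 2015)
BASELINE_KCAL = 0.0          # ASSUMPTION: non-binder baseline, not stated in the note


def sigma_separation(gate, baseline, sem):
    """Distance between gate and baseline, expressed in units of the standard error."""
    return abs(gate - baseline) / sem


# Separation at the low and high ends of the standard-error range.
separations = [sigma_separation(GATE_KCAL, BASELINE_KCAL, s) for s in SEM_RANGE_KCAL]

# Unit check for the quoted resolution limit of 2.9 kcal/mol.
resolution_kj = 2.9 * KCAL_TO_KJ

print([round(z, 2) for z in separations])  # separation in sigma units
print(round(resolution_kj, 1))             # resolution limit in kJ/mol
```

Under these assumptions the gate sits between roughly 1.8σ and 2.3σ from the baseline, which is what "~2σ" summarizes.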

Paper note: 2015-genheden-ryde-mmpbsa-mmpgbsa-review

Halgren 2009 — SiteMap (Anna’s Archive + MinerU)

DOI 10.1021/ci800324m, PMID 19434839. Paywalled at ACS. Not found in PMC. Retrieved JCIM 2009 Vol 49 Iss 2 (158 MB) via Anna’s Archive (MD5 1dc7427ca009ffbc2912963e27254127). PDF is scanned images; MinerU parsing in progress at time of audit completion. Abstract confirms: SiteMap identifies known site as top-1 in 86% of cases (>98% for subnanomolar binders). Formula details (SiteScore components, v_opt, Dscore weights) require full-text extraction — paper note currently marked status: abstract-only. Will be updated to status: read once MinerU completes.

Paper note: 2009-halgren-sitemap-druggability

Inconsistencies resolved

TPSA (gap 3)

Decision: TPSA is descriptor-only; no filter range applied.

The 2026-04-23 audit already established this correctly: the “CNS 40-90” framing was removed from the phase3b docstring, and TPSA is computed and stored but does not enter the composite score. The hypothesis note’s “70-100 Å²” is a qualitative citation of Salt & Plontke 2018 (no primary otic/RWM TPSA range exists). No filter was reinstated. The ⚠ row in the literature table is now marked ✅.

No script change required — phase3b already corrected 2026-04-23.

MW (gap 4)

Decision: two distinct contexts; both valid; labeled separately.

score_size() in pharmacochaperone_phase3b_virtual_screen.py: MW 180-350 Da is the pocket-fit scoring range, matched to the 159 Å³ pocket. Fragment-to-lead sizing is appropriate here.
Hypothesis note STRC Pharmacochaperone Virtual Screen E1659A.md line 64: MW 200-500 Da is the RWM delivery PK envelope (Salt & Plontke 2018), not the pocket-fit range.

These are not in conflict; they describe different physical constraints. Fixed: added clarifying comment to score_size() docstring. Literature table row updated to ✅.
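One way to keep the two ranges from being conflated is to name them separately in code. This is a minimal sketch under stated assumptions: the constant and function names are hypothetical, and the hard-bounds check only shows which envelope a given MW falls in; it is not the scoring logic of score_size().

```python
# Hypothetical constants; only the numeric ranges come from this audit.
POCKET_FIT_MW_DA = (180.0, 350.0)    # score_size() pocket-fit range (159 Å³ pocket)
RWM_DELIVERY_MW_DA = (200.0, 500.0)  # RWM delivery PK envelope (Salt & Plontke 2018)


def in_range(mw_da, bounds):
    """Inclusive range check for a molecular weight in Da."""
    low, high = bounds
    return low <= mw_da <= high


def mw_contexts(mw_da):
    """Report, per physical constraint, whether a molecular weight satisfies it."""
    return {
        "pocket_fit": in_range(mw_da, POCKET_FIT_MW_DA),
        "rwm_delivery": in_range(mw_da, RWM_DELIVERY_MW_DA),
    }


# Illustrative molecular weights, not real screening compounds.
for mw in (190.0, 250.0, 420.0):
    print(mw, mw_contexts(mw))
```

The two envelopes overlap only between 200 and 350 Da, so a compound can satisfy one constraint without satisfying the other.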

Druggability v_opt (gap 5)

Decision: v_opt 300 (phase2 full-pocket) vs 250 (phase2b subpocket) is physically justified.

Phase 2 scans full binding-pocket clusters; phase 2b scores subpockets within a cluster. A larger optimal volume for full-pocket scoring vs subpocket scoring is internally consistent. Halgren 2009 cited as conceptual source for the composite scoring approach. Docstrings in phase2 and phase2b updated to note this justification and add Halgren 2009 PMID. Literature table row updated to ✅.
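To illustrate why the two v_opt values give non-comparable scores, the sketch below evaluates one pocket volume under both settings. The Gaussian form and the 100 Å³ width are placeholders chosen purely for illustration; this note does not state the functional form used in phase2 or phase2b, and all names here are hypothetical.

```python
import math

V_OPT_FULL_POCKET = 300.0  # phase2, full-pocket clusters (Å³)
V_OPT_SUBPOCKET = 250.0    # phase2b, subpockets within a cluster (Å³)
WIDTH = 100.0              # ASSUMPTION: placeholder tolerance, not from the scripts


def volume_term(volume, v_opt, width=WIDTH):
    """Placeholder volume term: 1.0 at v_opt, decaying as volume moves away from it."""
    return math.exp(-(((volume - v_opt) / width) ** 2))


pocket_volume = 159.0  # the 159 Å³ pocket mentioned in this audit
full_pocket_score = volume_term(pocket_volume, V_OPT_FULL_POCKET)
subpocket_score = volume_term(pocket_volume, V_OPT_SUBPOCKET)
print(round(full_pocket_score, 3), round(subpocket_score, 3))
```

Whatever the real functional form, the same volume maps to different term values once v_opt differs, which is consistent with treating each phase's score as meaningful only within that phase.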

Druggability weights (gap 6)

Decision: three-scheme inconsistency is acceptable given different feature sets; cross-phase comparison remains declared invalid.

Phase 1: 3-feature scheme (vol/hydro/hb) — basic pocket characterization
Phase 2: 4-feature scheme (vol/hydro/nres/hb) — adds lining-residue count
Phase 2b: 5-feature scheme (vol/hydro/nres/hb/depth) — adds burial depth for subpocket ranking

Each scheme is fit-for-purpose within its phase. Collapsing to one function would require either dropping the depth term from phase2b (loses information) or adding it to phases 1/2 where depth isn’t computed. Correct action: keep three schemes, declare cross-phase comparison invalid (already done 2026-04-23), add Halgren 2009 PMID to all three docstrings. ✅
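The "cross-phase comparison invalid" declaration could also be enforced in code instead of living only in docstrings. The following is a sketch of one possible guard, assuming scores are wrapped in a small value type; PhaseScore and PHASE_FEATURES are hypothetical names, not part of the h01 scripts, and the weights themselves are deliberately left out.

```python
from dataclasses import dataclass

# Feature sets per phase, as listed in this audit.
PHASE_FEATURES = {
    "phase1": ("vol", "hydro", "hb"),
    "phase2": ("vol", "hydro", "nres", "hb"),
    "phase2b": ("vol", "hydro", "nres", "hb", "depth"),
}


@dataclass(frozen=True)
class PhaseScore:
    """A druggability score tagged with the phase whose scheme produced it."""
    phase: str
    value: float

    def __post_init__(self):
        if self.phase not in PHASE_FEATURES:
            raise ValueError(f"unknown phase: {self.phase!r}")

    def __lt__(self, other):
        if not isinstance(other, PhaseScore):
            return NotImplemented
        if other.phase != self.phase:
            # Different feature sets, so ordering across phases is not meaningful.
            raise TypeError(f"cannot compare {self.phase} score with {other.phase} score")
        return self.value < other.value


# Ranking within one phase works as usual (illustrative values).
ranked = sorted([PhaseScore("phase2", 0.7), PhaseScore("phase2", 0.4)])
print([s.value for s in ranked])
```

Attempting `PhaseScore("phase1", 0.9) < PhaseScore("phase2b", 0.1)` raises TypeError, so an accidental cross-phase ranking fails loudly instead of silently producing an order.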

Vina threshold (gap 1)

Decision: labeled pipeline-specific positive-control gate; Eberhardt 2021 cited as software-version reference.

The −5 kcal/mol threshold is NOT derived from Eberhardt 2021, which reports no CASF-2016 benchmark and defines no score threshold. It is a pipeline-specific gate: diflunisal (known TTR pharmacochaperone, ~−12 kcal/mol experimental) must score ≤ −5 to validate the pocket setup. The Phase4b docstring was updated with this framing and cites Eberhardt 2021 for the software version only, not for the threshold. ✅
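The gate logic itself is a one-line comparison. The sketch below assumes the positive-control score is already available as a float in kcal/mol; the function name, constant name, and example scores are hypothetical and do not come from the Phase 4b script.

```python
# Pipeline-specific positive-control gate; NOT a threshold from Eberhardt 2021.
VINA_GATE_KCAL = -5.0


def positive_control_passes(control_score_kcal, gate_kcal=VINA_GATE_KCAL):
    """Return True if the control docks at or below the gate.

    Vina scores are more favorable when more negative, so passing means
    score <= gate.
    """
    return control_score_kcal <= gate_kcal


# Illustrative scores only, not real docking output.
for score in (-7.3, -5.0, -4.9):
    print(score, positive_control_passes(score))
```

A failing control would indicate a problem with the pocket setup, not with the ligand, which is why the gate is framed as a validation step and not as a hit-identification cutoff.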

MM-PBSA gate (gap 2)

Decision: labeled pipeline-specific empirical gate; Genheden 2015 cited for error-band context.

The −6 kcal/mol threshold is NOT a universal MM/PBSA cutoff (Genheden 2015 explicitly states no such universal cutoff exists). It is a pipeline-specific gate providing ~2σ separation from the non-binder baseline given the method’s typical 2.6–3.3 kcal/mol standard error. Phase5 docstring updated. ✅

Script docstrings updated

Script	Change
`pharmacochaperone_phase4b_vina_gnina_screen.py`	Docstring: gate relabeled pipeline-specific; Eberhardt 2021 PMID 34278794 cited as software version
`pharmacochaperone_phase5_md.py`	Docstring: MM-PBSA gate relabeled pipeline-specific; Genheden & Ryde 2015 PMC4487606 cited for error bands
`pharmacochaperone_phase3b_virtual_screen.py`	`score_size()` docstring: MW 180-350 labeled pocket-fit, distinct from RWM delivery 200-500 Da (Salt & Plontke 2018)
`pharmacochaperone_phase1_mutant_pocket.py`	`druggability_score()` docstring: Halgren 2009 PMID 19434839 added as conceptual source
`pharmacochaperone_phase2_pocket_scan.py`	`druggability()` docstring: Halgren 2009 PMID 19434839 added; v_opt=300 for full-pocket justified
`pharmacochaperone_phase2b_subpockets.py`	`druggability()` docstring: Halgren 2009 PMID 19434839 added; v_opt=250 for subpockets justified

Residual

Halgren 2009 paper note status: abstract-only. Full formula details (exact SiteScore/Dscore components and v_opt as given in the paper) are not yet extracted. MinerU is still processing the JCIM 2009 Vol 49 Iss 2 PDF (158 MB, scanned) downloaded from Anna’s Archive. Once MinerU completes, extract the Halgren paper pages and update papers/2009-halgren-sitemap-druggability.md from status: abstract-only to status: read, with a full ## Numbers that matter section. This does not affect the audit verdict: the h01 scripts already carry the honest “SiteMap-inspired in-house approximation” flag with the Halgren PMID, so the formula extraction is an enrichment, not a blocker.

Ranking delta

#1 Pharmacochaperone: A held (no axis change). This is a metadata and citation audit only. No mechanistic finding, no threshold change, no new computational result. lit_audit: fixed confirmed, lit_audit_date updated to 2026-04-25.

Connections

[part-of] STRC Hypothesis Ranking
[part-of] STRC h01 Parameter Provenance Audit 2026-04-23
[see-also] 2021-eberhardt-autodock-vina-1-2
[see-also] 2015-genheden-ryde-mmpbsa-mmpgbsa-review
[see-also] 2009-halgren-sitemap-druggability
[about] STRC Pharmacochaperone Virtual Screen E1659A
