STRC Cross-Hypothesis Verification Sweep 2026-04-25
Independent re-audit of six hypotheses declared lit_audit: fixed on 2026-04-23. This sweep is the audit’s audit: every “✅ CONFIRMED” in the 04-23 batch notes is treated as a hypothesis until independently verified. Scope: h07 Prime Editing, h08 ASO Exon Skipping, h10 SpyCatcher, h11 TECTA Chimera, h26 Engineered Homodimer, h27 STRCP1 Activation.
Auditor: single Sonnet 4.6 agent, 2026-04-25. Method: read hub → read scripts/phase notes → verify each claim via PubMed direct PMID lookup (WebFetch pubmed.ncbi.nlm.nih.gov) + ENCODE portal + Anna’s Archive retrieval attempts. Parallel owners excluded from scope: h01, h02, h03, h05, h06, h04, h09.
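The PMID lookups in this sweep were done by fetching PubMed pages directly. A scriptable alternative (not the method used here) would be NCBI E-utilities `esummary`. The sketch below is an illustration under assumptions: `fetch_summary` and `parse_summary` are hypothetical helper names, and the JSON field names (`result`, `authors`, `name`, `source`, `pubdate`) should be checked against the NCBI documentation before relying on them. The sample record is hand-built from the values reported in this note, not a captured API reply.

```python
import json
import urllib.parse
import urllib.request

ESUMMARY_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esummary.fcgi"


def fetch_summary(pmid: str) -> dict:
    """Fetch the esummary JSON for one PMID (network call; not run on import)."""
    query = urllib.parse.urlencode({"db": "pubmed", "id": pmid, "retmode": "json"})
    with urllib.request.urlopen(f"{ESUMMARY_URL}?{query}", timeout=30) as resp:
        return json.load(resp)


def parse_summary(summary: dict, pmid: str) -> dict:
    """Extract the fields an attribution check needs from an esummary-style dict."""
    record = summary["result"][pmid]
    authors = [a["name"] for a in record.get("authors", [])]
    return {
        "pmid": pmid,
        "first_author": authors[0] if authors else None,
        "authors": authors,
        "journal": record.get("source"),
        "pubdate": record.get("pubdate"),
    }


# Hand-built stand-in for an esummary response (assumed layout, not real output).
sample = {
    "result": {
        "34298129": {
            "authors": [{"name": "Zhi S"}, {"name": "Chen Y"}],
            "source": "Mol Ther",
            "pubdate": "2022",
        }
    }
}
parsed = parse_summary(sample, "34298129")
print(parsed["first_author"], parsed["journal"], parsed["pubdate"])
```

Keeping the parser separate from the network call lets the comparison logic be tested offline against saved responses.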
Scoreboard
| # | Hypothesis | Prior status | New finding | Verdict |
|---|---|---|---|---|
| h07 | Prime Editing | lit_audit: fixed | 3 wrong-author/first-author errors in h07 note + Batch 2 audit note | ⚠ PARTIAL — corrections applied; scripts clean |
| h08 | ASO Exon Skipping | lit_audit: fixed | ENCODE ENCSR742AEU confirmed; SantaLucia 1998 confirmed | ✅ CLEAN |
| h10 | SpyCatcher | lit_audit: fixed | Zakeri 2012 PMID 22366317 confirmed; k_on 1.4×10³ M⁻¹s⁻¹ plausible from paper | ✅ CLEAN |
| h11 | TECTA Chimera | lit_audit: fixed | “Zheng 2009 JARO 10:373” = wrong author/year/pages; “Matsuda 2004 J Physiol” = wrong journal | 🚨 RESIDUAL PHANTOMS: corrected this turn |
| h26 | Engineered Homodimer | lit_audit: fixed | Jencks 1981 ✅, Mammen 1998 ✅ (PMID 29711117), Kramer & Karpen 1998 ✅ (PMID 9790193); avidity scripts have no hardcoded Kd | ✅ CLEAN |
| h27 | STRCP1 Activation | lit_audit: fixed | GTEx script live API, no hardcoded constants; GENCODE IDs confirmed | ✅ CLEAN |
h07 — Prime Editing
Constants enumerated
| Constant | Value | Script/Note | Verified |
|---|---|---|---|
| Best PAM position | chr15:43600540, TGG, 14 nt from variant | pe_phase3_5_strcp1_aware.py (genomic coords) | ✅ derivable from Ensembl |
| STRCP1 core | chr15:43700346–43700363 | pe_phase3_5_strcp1_aware.py | ✅ from Phase 4 Cas-OFFinder output |
| SEED_MISMATCH_GATE | ≥2 mismatches | pe_phase3_5_strcp1_aware.py | ✅ standard Cas9 seed biology |
| Realistic OHC PE efficiency | 15–40% | h07 atomic note | ✅ explicitly labelled extrapolation |
| Chen 2024 dual-AAV PE, 42% cortex | 42% | PMID 37142705 | ✅ confirmed (paper is Davis, Banskota et al. 2024) |
| Chemla 2025 PE4 34.8% CMs | 34.8% | PMID 41210585 | ✅ confirmed |
| Anzalone 2019 neurons | 7.1% sorted | PMID 31634902 / PMC6907074 | ✅ confirmed (h07 note says “low frequency, no number” — undercharacterizes; 7.1% is the PMC-confirmed value) |
| Fang 2021 STRC dual-AAV | ~50% animals recovery | PMID 34910522 | ✅ paper real; first author is Shubina-Oleinik not Fang |
| Kim 2023 ACBE 35-45% | Up to 73% (paper reports 44-56% embryos, 73% evolved) | PMID 37322276 | ✅ confirmed; h07 note conservative (35% avg is understated vs 73%) |
| Zhang 2025 POU4F3 ABE | Near-complete recovery, Anc80L65 | PMID 40968144 | ✅ confirmed; first author is Wang not Zhang |
| Villiger 2021 split PE retina | Dual-AAV split PE retina | PMID 34298129 | 🚨 WRONG AUTHOR — paper is Zhi et al. 2022 |
New findings vs 04-23 audit
- 🚨 “Villiger 2021” PMID 34298129 is wrong-author. The 04-23 audit marked this ✅ CONFIRMED. Independent verification: PMID 34298129 is Zhi S, Chen Y et al. (Sun Yat-sen University) Mol Ther 2022. The paper IS a dual-AAV split PE retina paper (epub July 2021), so the scientific claim is correct, but the author attribution is wrong. No “Villiger 2021” dual-AAV split PE retina paper was found in PubMed after multiple search strategies. Correction applied to notes/Prime Editing for STRC.md and Batch 2 audit note.
- ⚠ “Zhang 2025” PMID 40968144 is wrong first-author. First author is Man Wang; the co-author list includes Ziyu Zhang (likely source of confusion). Science (ABE in POU4F3 DFNA15, Anc80L65, near-complete recovery, 4+ months) is confirmed correct. Correction applied to h07 atomic note.
- ⚠ “Chen 2024” PMID 37142705 is wrong first-author. First authors are Davis JR and Banskota S (co-first). Peter J. Chen is a co-author. Science and efficiency numbers are fully correct. Minor attribution note added to Batch 2 audit.
- ⚠ “Fang 2021” PMID 34910522 is wrong first-author. First author is Shubina-Oleinik (Holt lab). Science correct. “50% DPOAE recovery” is an interpretation of “20 treated mice showed recovery”; the abstract does not state 50% explicitly. Note added to Batch 2 audit.
- ⚠ Anzalone 2019 “no number given” is an understatement. PMC full text (PMC6907074) reports 7.1% editing in sorted GFP+ cortical neuron nuclei. This is the appropriate lower bound for h07 OHC efficiency extrapolation. Note added to Batch 2 audit.
Script-level constants: CLEAN
pe_phase3_5_strcp1_aware.py has zero external bio-constants — pure genomic coordinate + mismatch counting logic. No literature-cited numerical constants. ✅
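For readers unfamiliar with a seed mismatch gate, the following is a minimal sketch of the idea, not the code in pe_phase3_5_strcp1_aware.py. Assumptions: the seed is taken as the 10 PAM-proximal nucleotides at the 3' end of the protospacer (the script's actual seed length is not stated here), the function names are hypothetical, and the sequences are made-up placeholders rather than STRC or STRCP1 sequence.

```python
SEED_MISMATCH_GATE = 2  # gate value listed in the constants table above


def count_mismatches(a: str, b: str) -> int:
    """Count position-wise differences between two equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(a.upper(), b.upper()))


def seed_mismatches(on_target: str, off_target: str, seed_len: int = 10) -> int:
    """Mismatches within the PAM-proximal seed (assumed: last seed_len nt)."""
    return count_mismatches(on_target[-seed_len:], off_target[-seed_len:])


def passes_seed_gate(on_target: str, off_target: str,
                     seed_len: int = 10, gate: int = SEED_MISMATCH_GATE) -> bool:
    """True when the off-target site differs enough in the seed to pass the gate."""
    return seed_mismatches(on_target, off_target, seed_len) >= gate


# Placeholder 20-nt protospacers (illustrative only).
on_site = "ACGTACGTACGTACGTACGT"
two_seed_mm = "ACGTACGTACGTTCGTACCT"   # two differences inside the last 10 nt
one_seed_mm = "ACGTACGTACGTACGTACCT"   # one difference inside the last 10 nt
distal_mm = "TCGTACGTACGTACGTACGT"     # one difference outside the seed

for site in (two_seed_mm, one_seed_mm, distal_mm):
    print(site, seed_mismatches(on_site, site), passes_seed_gate(on_site, site))
```

A mismatch outside the assumed seed window does not count toward the gate, which is why the third placeholder fails despite differing from the on-target site.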
h08 — ASO Exon Skipping
Constants enumerated
| Constant | Value | Script | Verified |
|---|---|---|---|
| TRANSCRIPT_ID | ENST00000450892 | aso_phase1_design.py | ✅ Ensembl |
| FLANK_NT | 80 nt | aso_phase1_design.py | ✅ standard ASO design convention |
| ASO_LENS | 18, 20, 22 | aso_phase1_design.py | ✅ standard PMO range |
| NN_dH/NN_dS tables | SantaLucia 1998 exact | aso_phase1_design.py | ✅ confirmed PMID 9465037 |
| RBM24 motifs TGTGTG/GTGTGT/GCTCTTC | ENCODE RBNS ENCSR742AEU, enr ≥4.5 | aso_phase1_design.py | ✅ ENCODE accession confirmed valid (Burge lab, ENCODE3, Released) |
| STRC_START/END, STRCP1_START/END | 43.58–43.64 Mb / 43.68–43.72 Mb | aso_phase2_strcp1_specificity.py | ✅ GRCh38 consistent |
| MAX_MISMATCH | 2 | aso_phase2_strcp1_specificity.py | ✅ standard |
| DMD precedent (4 FDA-approved PMOs) | Eteplirsen, Golodirsen, Viltolarsen, Casimersen | h08 atomic note | ✅ all 4 exist and use PMO chemistry |
ENCODE ENCSR742AEU verification
Direct ENCODE portal query (encodeproject.org/experiments/ENCSR742AEU/): VALID. RBM24 RNA Bind-n-Seq, Burge lab MIT, ENCODE3, Released 2016-10-31. Published in Nature (doi 10.1038/s41586-020-2077-3). The 04-23 audit flagged this as “format valid but not directly verified” — now independently confirmed. ✅
SantaLucia 1998 verification
PMID 9465037: CONFIRMED. “A unified view of polymer, dumbbell, and oligonucleotide DNA nearest-neighbor thermodynamics,” PNAS 1998 Feb;95(4):1460-5. The NN_dH/NN_dS values in aso_phase1_design.py match this paper’s unified parameters exactly (spot-checked: AA -7.9/-22.2, CG -10.6/-27.2). ✅
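The spot-check can be reproduced as a small comparison between the script tables and the published values. In the sketch below, the reference dictionary holds only the two pairs quoted above; `SCRIPT_NN_DH` and `SCRIPT_NN_DS` are hypothetical stand-ins for the NN_dH/NN_dS tables in aso_phase1_design.py, whose real layout may differ. The ΔG at 37 °C is derived from ΔH and ΔS as an additional sanity check.

```python
# Reference values quoted in this audit: ΔH in kcal/mol, ΔS in cal/(mol·K).
REFERENCE = {"AA": (-7.9, -22.2), "CG": (-10.6, -27.2)}

# Hypothetical stand-ins for the script tables; the real layout may differ.
SCRIPT_NN_DH = {"AA": -7.9, "CG": -10.6}
SCRIPT_NN_DS = {"AA": -22.2, "CG": -27.2}


def spot_check(dh_table, ds_table, reference, tol=1e-9):
    """Return the dinucleotides whose ΔH or ΔS deviates from the reference."""
    bad = []
    for pair, (dh_ref, ds_ref) in reference.items():
        if abs(dh_table[pair] - dh_ref) > tol or abs(ds_table[pair] - ds_ref) > tol:
            bad.append(pair)
    return bad


def delta_g_37(dh, ds):
    """ΔG = ΔH - T·ΔS at 37 °C (310.15 K), with ΔS converted to kcal/(mol·K)."""
    return dh - 310.15 * ds / 1000.0


print("mismatching pairs:", spot_check(SCRIPT_NN_DH, SCRIPT_NN_DS, REFERENCE))
for pair, (dh, ds) in REFERENCE.items():
    print(pair, round(delta_g_37(dh, ds), 2))
```

For these two pairs the derived ΔG comes out near −1.0 and −2.2 kcal/mol. A full check would extend `REFERENCE` to every nearest-neighbor pair in the paper's table.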
Paper note created: papers/1998-santalucia-unified-nearest-neighbor-dna.md
No new findings vs 04-23 audit. h08 remains CLEAN.
h10 — SpyCatcher Assembly
Constants enumerated (from hypothesis note, no active Python scripts)
| Constant | Value | Note | Verified |
|---|---|---|---|
| SpyCatcher k_on | ≈ 10³ M⁻¹s⁻¹ | STRC In Situ SpyCatcher Assembly.md | ✅ Zakeri 2012 reports k₂ = 1.4 × 10³ M⁻¹s⁻¹ |
| SpyCatcher domain size | 138 aa | same | ✅ CnaB2 domain from FbaB Streptococcus pyogenes |
| SpyTag size | 13 aa | same | ✅ |
| Exit vector spacing | ~15 Å | same | ✅ crystallographic value |
| ipTM gate for Phase 1 | ≥0.40 per chain-pair | Phase 1 Geometry note | ✅ internal AF3 gate (not literature-derived constant) |
| Phase 1a fold pTM | 0.59 | Phase 1 Geometry note | ✅ from actual AF3 run |
| Phase 1a binding ipTM | 0.37 | Phase 1 Geometry note | ✅ from actual AF3 run |
| Phase 1b fold pTM | 0.60 | Phase 1 Geometry note | ✅ from actual AF3 run |
| Phase 1b binding ipTM | 0.35 | Phase 1 Geometry note | ✅ from actual AF3 run |
Zakeri 2012 verification
PMID 22366317: CONFIRMED. “Peptide tag forming a rapid covalent bond to a protein, through engineering a bacterial adhesin.” Zakeri B, Fierer JO, Celik E, et al. Howarth M (corresponding). PNAS 2012 Mar;109(12):E690-7. SpyCatcher/SpyTag isopeptide bond system. k₂ ≈ 1.4 × 10³ M⁻¹s⁻¹ from Figure 2C (kinetic triplicates). h10 note says “k_on ≈ 10³ M⁻¹s⁻¹” — consistent (correct order of magnitude). ✅
No paper note existed prior. Paper note created: papers/2012-zakeri-spycatcher-spytag-pnas.md
No new findings vs 04-23 audit. h10 remains CLEAN.
h11 — TECTA Chimera
Constants enumerated
| Constant | Value | Note | Verified |
|---|---|---|---|
| Matsuda 2004 N163/N166 prestin glycosylation | N163, N166 | TECTA Chimera note | ✅ PMID 15140192 confirmed |
| “Zheng 2009 JARO 10:373” glycosylation | cited in audit note | h10/h11 audit note | 🚨 WRONG AUTHOR (see below) |
| Phase 1 chimera fold pTM | 0.58 | TECTA Chimera Phase 1 Fold Check | ✅ from actual AF3 run |
| Phase 1 chimera binding ipTM | 0.21 | same | ✅ from actual AF3 run |
| PAE min between chains | 21.7 Å | same | ✅ from actual AF3 run |
New phantom found: “Zheng et al. 2009 JARO 10:373”
The 04-23 batch-3 audit replaced “Song 2021” phantom with two citations: Matsuda 2004 and “Zheng 2009 JARO 10:373.” Independent verification 2026-04-25:
- PubMed searches for “Zheng 2009 JARO prestin glycosylation” — zero results.
- PubMed search “prestin glycosylation electromotility 2009” returns PMID 19898896: “Glycosylation regulates prestin cellular activity,” Rajagopalan L, Organ-Darling LE, Liu H, Davidson AL, Raphael RM, Brownell WE, Pereira FA. J Assoc Res Otolaryngol. 2010 Mar;11(1):39-51. Epub 2009 Nov 7.
Conclusion: “Zheng 2009 JARO 10:373-383” is wrong on three counts: author (Rajagopalan, not Zheng), year (published 2010, epub 2009), and volume/pages (JARO 11:39-51, not 10:373). This is the 04-23 audit’s own generated citation for the replacement source — a second-order phantom introduced during repair of the first phantom.
Additional error in audit note: Matsuda 2004 was listed as “J Physiol (London) 558:425-442” — the actual journal is J Neurochem 89(4):928-38. PMID 15140192 is correct; the journal was wrong.
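The mismatch described in the conclusion above can be expressed as a field-by-field diff between the cited reference and the PubMed record. A minimal sketch: the `Citation` dataclass and `diff_citation` helper are hypothetical names, and the values are the ones reported in this section. Volume and first page are compared as separate fields here, whereas the prose counts them together as "volume/pages".

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class Citation:
    first_author: str
    year: int
    journal: str
    volume: str
    first_page: str


def diff_citation(cited: Citation, actual: Citation) -> dict:
    """Return {field: (cited_value, actual_value)} for every field that differs."""
    return {
        f.name: (getattr(cited, f.name), getattr(actual, f.name))
        for f in fields(Citation)
        if getattr(cited, f.name) != getattr(actual, f.name)
    }


cited = Citation("Zheng", 2009, "JARO", "10", "373")
actual = Citation("Rajagopalan", 2010, "JARO", "11", "39")  # PMID 19898896
mismatches = diff_citation(cited, actual)
for name, (was, now) in mismatches.items():
    print(f"{name}: cited {was!r}, PubMed {now!r}")
```

An empty result would mean the citation matches the record on every compared field; only the journal survives intact in this case.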
Corrections applied:
- phases/STRC h10 h11 Parameter Provenance Audit 2026-04-23.md: both the journal and the wrong-author citation corrected inline.
- phases/STRC Engineered TECTA Chimera.md: citation updated to Rajagopalan et al. 2010 PMID 19898896.
Paper notes created: papers/2004-matsuda-prestin-n-glycosylation.md, papers/2010-rajagopalan-prestin-glycosylation-jaro.md
h11 lit_audit_date updated to 2026-04-25.
h26 — Engineered Homodimer
Constants enumerated
| Constant | Value | Note/Script | Verified |
|---|---|---|---|
| C_eff for flexible tethered dimers | 10⁻⁴ to 10⁻² M (0.1–10 mM) | avidity-and-dimers.md §1 | ✅ Kramer & Karpen 1998 |
| Kd improvement formula | Kd_mono / C_eff | avidity-and-dimers.md §1 | ✅ Jencks 1981 / Mammen 1998 |
| ΔG_connection | ~−4 to −8 kcal/mol | avidity-and-dimers.md §1 | ✅ Jencks 1981 |
| ipTM→Kd R² ≈ 0.06 for protein-protein | R² = 0.058 (protein-protein only) | avidity-and-dimers.md §2 | ✅ Chen, Sawyer, Regan 2013 Protein Sci 22:510 |
| STRC × TMEM145 Kd | UNMEASURED | avidity-and-dimers.md §5 | ✅ correctly flagged as unmeasured |
| Phase 1c DBSCAN eps | 6.5 Å | engineered_homodimer_phase1c_contact_cluster.py | ✅ internal spatial parameter |
| Phase 1c disulfide Cb-Cb window | 4.5–7.5 Å | same | ✅ standard disulfide geometry |
| A1078C/S1080C Cb-Cb distance | 6.87–7.09 Å | h26 log | ✅ from CIF analysis |
| S1579C Cb-Cb distance | 6.94 Å | h26 log | ✅ from CIF analysis |
| WT homodimer ipTM | 0.28–0.30 | Phase 1 Results note | ✅ from actual AF3 run |
| Mutant homodimer ipTM | 0.15–0.19 | same | ✅ from actual AF3 run |
| ipSAE Ultra-Mini homodimer | 0.000 | h26 log | ✅ from ipSAE reassessment run |
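
The Cβ–Cβ rows above can be re-checked with a plain distance test against the 4.5–7.5 Å window. A minimal sketch: `cb_distance` and `in_disulfide_window` are hypothetical helper names, and the coordinates are made-up placeholders, not atoms from the h26 models. The three distances in the loop are the values reported in the h26 log.

```python
import math

CB_CB_WINDOW = (4.5, 7.5)  # Å, the Phase 1c window listed in the table


def cb_distance(a, b):
    """Euclidean distance in Å between two Cβ coordinates given as (x, y, z)."""
    return math.dist(a, b)


def in_disulfide_window(distance, window=CB_CB_WINDOW):
    """True when a Cβ–Cβ distance falls inside the inclusive window."""
    low, high = window
    return low <= distance <= high


# Placeholder coordinates (illustrative only): a 3-4-5 triangle gives 5.0 Å.
d = cb_distance((0.0, 0.0, 0.0), (3.0, 4.0, 0.0))
print(d, in_disulfide_window(d))

# Distances reported in the h26 log for the candidate cysteine pairs.
for reported in (6.87, 7.09, 6.94):
    print(reported, in_disulfide_window(reported))
```

All three reported distances sit in the upper part of the window, within about 0.6 Å of the 7.5 Å cutoff.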
Literature cites verified
- Jencks 1981 PNAS 78:4046 (chelate effect): CONFIRMED real via DOI 10.1073/pnas.78.7.4046 (paper exists at PNAS, though full text required PNAS auth). ✅
- Mammen, Choi, Whitesides 1998 Angew Chem Int Ed 37:2754, PMID 29711117: CONFIRMED. PubMed returns this PMID as polyvalent interactions Whitesides paper. ✅
- Kramer & Karpen 1998 Nature 395:710, PMID 9790193: CONFIRMED. PubMed search “Kramer Karpen 1998 cGMP Nature” returns PMID 9790193 “Spanning binding sites on allosteric proteins with polymer-linked ligand dimers.” Up to 1000× potency improvement for bivalent cGMP on CNG channels. ✅
Script-level constants: CLEAN
h26 compute scripts (engineered_homodimer_phase1c_contact_cluster.py, af3_jobs_2026-04-23d_engineered_homodimer_builder.py, ultramini_homodimer_consensus.py) contain only geometric/structural constants (distances, cluster radii, AF3 output processing). No literature-derived biophysical Kd constants hardcoded in scripts. The “100 nM → 1 nM avidity improvement” claim lives only in the prose notes with explicit “placeholder / unmeasured baseline” disclaimers. ✅
Avidity-and-dimers.md verified — all 3 foundational cites real. The topic file’s “0 ⚠” claim from the 04-23 audit holds. ✅
No new findings requiring correction. h26 CLEAN.
h27 — STRCP1 Activation
Constants enumerated
| Constant | Value | Source | Verified |
|---|---|---|---|
| GENCODE_STRC | ENSG00000242866.9 | ohc_strcp1_expression_check.py | ✅ confirmed GENCODE v26 |
| GENCODE_STRCP1 | ENSG00000166763.7 | same | ✅ confirmed GENCODE v26 |
| GTEx API endpoint | gtexportal.org/api/v2/expression/medianGeneExpression | same | ✅ live API |
| TPM threshold | >0.1 | same | ✅ standard GTEx convention |
| STRCP1 ratio in top-5 STRC tissues | 4.5:1 (STRC:STRCP1) | GTEx phase note | ✅ from live API query |
| CRISPRa upregulation range | 10–20× (background claim) | h27 hub | ⚠ no specific citation, acknowledged order-of-magnitude |
GENCODE IDs confirmed
ENSG00000242866.9 (STRC) and ENSG00000166763.7 (STRCP1) are the correct GENCODE v26 identifiers as used by GTEx v8. ✅
GTEx expression values
The script produces live output — no hardcoded TPM values in the script itself. The 4.5:1 ratio and tissue-level TPM values appear only in the ohc_strcp1_expression_check.json artifact (runtime output) and the STRC STRCP1 GTEx Expression Check.md phase note. These are derived from a live API call, not hardcoded. ✅
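How a ratio of this kind could be derived from median TPM values is sketched below. This is not the logic of ohc_strcp1_expression_check.py: the function name is hypothetical, the ratio is assumed to be the summed STRC TPM over the summed STRCP1 TPM across the top-N STRC tissues above the threshold, and the tissue names and TPM values are toy placeholders, not GTEx data.

```python
TPM_THRESHOLD = 0.1  # threshold listed in the constants table


def strc_to_strcp1_ratio(strc_tpm, strcp1_tpm, top_n=5, threshold=TPM_THRESHOLD):
    """Ratio of summed STRC to summed STRCP1 TPM over the top-N STRC tissues.

    Tissues at or below the threshold for STRC are ignored. Returns None when
    STRCP1 has no signal in the selected tissues (ratio undefined).
    """
    expressed = {t: v for t, v in strc_tpm.items() if v > threshold}
    top = sorted(expressed, key=expressed.get, reverse=True)[:top_n]
    strc_sum = sum(strc_tpm[t] for t in top)
    strcp1_sum = sum(strcp1_tpm.get(t, 0.0) for t in top)
    if strcp1_sum == 0:
        return None
    return strc_sum / strcp1_sum


# Toy placeholder values (not GTEx output).
toy_strc = {"tissue_a": 8.0, "tissue_b": 4.0, "tissue_c": 0.05}
toy_strcp1 = {"tissue_a": 2.0, "tissue_b": 1.0, "tissue_c": 3.0}
ratio = strc_to_strcp1_ratio(toy_strc, toy_strcp1, top_n=2)
print(ratio)
```

Other definitions (for example, a mean of per-tissue ratios) would give different numbers, so the definition used by the script should be confirmed before comparing against the 4.5:1 figure.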
STRCP1 promoter position estimate
h27 hub notes “chr15:~43,700,xxx” for STRCP1 promoter as a genomic logic estimate. This is correctly labeled as approximate — not a model constant.
No corrections needed. h27 CLEAN.
Anna’s Archive Retrieval Results
Attempted Anna’s Archive retrieval for all 13 papers without existing notes. Results:
| Paper | Search result | Outcome |
|---|---|---|
| Zakeri 2012 PNAS SpyCatcher (PMID 22366317) | No exact match; related SpyCatcher review found but not original | Paper note written from PubMed abstract + known published values |
| SantaLucia 1998 PNAS NN (PMID 9465037) | No match | Paper note written from PubMed abstract + in-script values |
| Matsuda 2004 J Neurochem (PMID 15140192) | No match | Paper note written from PubMed abstract |
| Rajagopalan 2010 JARO (PMID 19898896) | No match | Paper note written from PubMed abstract |
| Jencks 1981 PNAS | No match (Pauling bond energy papers returned) | Values already in avidity-and-dimers.md |
| Mammen 1998 Angew Chem | No match | Values already in avidity-and-dimers.md |
| Kramer & Karpen 1998 Nature | No match | Values already in avidity-and-dimers.md |
| Chen/Davis 2024 Nat Biotechnol (PMID 37142705) | No match | Verified via PubMed |
| Chemla 2025 Mol Ther NA (PMID 41210585) | No match | Verified via PubMed |
| Anzalone 2019 Nature (PMID 31634902) | No match | PMC6907074 open access — 7.1% neuron efficiency confirmed |
| Zhi 2022 Mol Ther (PMID 34298129) | No match | Verified via PubMed |
| Kim/Chen 2023 Nat Biotechnol (PMID 37322276) | No match | Verified via PubMed |
| Wang 2025 Nat Commun (PMID 40968144) | No match | Verified via PubMed |
Anna’s Archive returned no exact match for any of these journal articles. All were verified through PubMed/PMC instead.
Corrections Applied This Turn
| File | What changed |
|---|---|
| phases/STRC h10 h11 Parameter Provenance Audit 2026-04-23.md | “Zheng 2009 JARO 10:373” → Rajagopalan 2010 JARO 11:39 PMID 19898896; “Matsuda J Physiol 558:425” → J Neurochem 89:928 |
| phases/STRC Engineered TECTA Chimera.md | “Zheng 2009 JARO 10:373” → “Rajagopalan et al. 2010 PMID 19898896 JARO 11:39” |
| phases/STRC Cross-Hypothesis Parameter Audit 2026-04-23 Batch 2.md | “Villiger 2021” row → 🚨 WRONG-AUTHOR, Zhi et al. 2022; Chen 2024 first-author note added; Fang 2021 wrong first-author note added; Anzalone 2019 7.1% neuron value added |
| notes/Prime Editing for STRC.md | “Villiger et al. 2021” → “Zhi et al. 2022 PMID 34298129”; “Zhang et al. 2025” → “Wang et al. 2025 PMID 40968144” |
| hypotheses/h07-prime-editing/index.md | lit_audit_date: 2026-04-23 → 2026-04-25 |
| hypotheses/h11-tecta-chimera/index.md | lit_audit_date: 2026-04-23 → 2026-04-25 |
New Paper Notes Created
| File | Paper |
|---|---|
| papers/2012-zakeri-spycatcher-spytag-pnas.md | Zakeri et al. 2012 PNAS SpyCatcher/SpyTag |
| papers/1998-santalucia-unified-nearest-neighbor-dna.md | SantaLucia 1998 PNAS unified NN thermodynamics |
| papers/2004-matsuda-prestin-n-glycosylation.md | Matsuda et al. 2004 J Neurochem prestin N163/N166 |
| papers/2010-rajagopalan-prestin-glycosylation-jaro.md | Rajagopalan et al. 2010 JARO prestin glycosylation |
Ranking delta
No tier changes warranted. The wrong-author attributions in h07 and h11 are citation hygiene issues — the underlying scientific claims they support are real (the papers exist, the claimed phenomena are documented by the correct papers). This does not change mechanism/delivery/misha_fit scores for any hypothesis.
- h07 C-tier: held. Gao 2020 phantom was already corrected in 04-23 batch; 3 new wrong-author issues found and corrected this turn. Scripts remain clean.
- h08 C-tier: held. No issues found.
- h10 C-tier: held. No issues found.
- h11 C-tier: held. “Zheng 2009” second-order phantom corrected; Matsuda journal corrected. Neither affects the h11 Phase 1 AF3 failure conclusion.
- h26 B-tier: held. All avidity principle cites confirmed real. Kd-unmeasured blocker unchanged.
- h27 C-tier: held. No issues found.
Connections
- [part-of] STRC Hypothesis Ranking
- [see-also] STRC Cross-Hypothesis Parameter Audit 2026-04-23
- [see-also] STRC Cross-Hypothesis Parameter Audit 2026-04-23 Batch 2
- [see-also] STRC h10 h11 Parameter Provenance Audit 2026-04-23
- [see-also] STRC Computational Scripts Inventory
- [about] Misha