STRC Prime Editing Phase 4 — STRCP1 paralog catches every PE3b discriminating pegRNA
Question
Prime Editing for STRC Phase 3 produced 61 PE3b allele-discriminating pegRNAs across 5 PAM variants (SpCas9-NGG, SpG-NGN, enCas9-NGN, SpRY-NRN, SpCas9NG-NG) for the maternal E1659A allele. Each was scored to discriminate edited (E1659A) vs mutation-mut (WT-A) — but not against the STRCP1 pseudogene paralog ~100 kb downstream. Phase 4 asks: how many of the 61 designs survive an off-target check on chr15 (which contains both STRC and STRCP1)? Gate: ≤5 off-targets at ≤3 mismatches AND 0 perfect off-targets outside the on-target locus.
Method
pe_phase4_cas_offinder.py driver against hg38 chr15 (~/STRC/genomes/hg38_chr15.fa, 102 Mb). Cas-OFFinder failed on Apple Silicon (clGetDeviceIDs Failed: -30 — bioconda build of cas-offinder cannot enumerate macOS OpenCL devices), so the script fell back to a pure-numpy off-target search (uint8-encoded genome, vectorised PAM-mask + Hamming-distance counter on both strands). Identical output schema to cas-offinder TSV; total 158 s for 61 guides at 4-mismatch tolerance. SaCas9-NNGRRT skipped (template length 21 nt vs 20 nt for the others).
On-target locus exclusion: chr15:43,600,543-43,600,572 ± 5 nt (the SpCas9-NGG strong-discrimination protospacer at variant pos 17).
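The fallback described above can be sketched as follows. This is a minimal illustration, not the actual contents of pe_phase4_cas_offinder.py: the function names (`encode`, `scan_strand`, `find_hits`) and the 0-based coordinate convention are assumptions made for the example. It encodes the sequence as uint8, accumulates the Hamming distance one spacer position at a time to keep memory low, applies an IUPAC PAM mask, and scans the reverse strand by reverse-complementing the input.

```python
import numpy as np

# IUPAC symbols used by the PAMs in this note, as base codes (A,C,G,T -> 0..3).
IUPAC = {"A": [0], "C": [1], "G": [2], "T": [3], "R": [0, 2], "N": [0, 1, 2, 3]}


def encode(seq):
    """Map a DNA string to uint8 codes; anything other than ACGT becomes 4."""
    lut = np.full(256, 4, dtype=np.uint8)
    for code, base in enumerate("ACGT"):
        lut[ord(base)] = lut[ord(base.lower())] = code
    return lut[np.frombuffer(seq.encode("ascii"), dtype=np.uint8)]


def revcomp(seq):
    """Reverse complement; non-ACGT characters are left unchanged."""
    return seq.translate(str.maketrans("ACGTacgt", "TGCAtgca"))[::-1]


def scan_strand(seq, spacer, pam, max_mm):
    """Yield (start, mismatches) for each spacer+PAM window on one strand."""
    g, s = encode(seq), encode(spacer)
    k = len(s)
    n_win = len(g) - (k + len(pam)) + 1
    if n_win <= 0:
        return
    # Hamming distance, accumulated per spacer position (one pass per base).
    mm = np.zeros(n_win, dtype=np.uint8)
    for i in range(k):
        mm += g[i:i + n_win] != s[i]
    ok = mm <= max_mm
    # PAM mask: each PAM position must fall in the allowed IUPAC set.
    for j, sym in enumerate(pam):
        ok &= np.isin(g[k + j:k + j + n_win], IUPAC[sym])
    for start in np.flatnonzero(ok):
        yield int(start), int(mm[start])


def find_hits(seq, spacer, pam, max_mm):
    """Return sorted (forward_start_0based, strand, mismatches) on both strands."""
    site_len = len(spacer) + len(pam)
    hits = []
    for strand, strand_seq in (("+", seq), ("-", revcomp(seq))):
        for start, mm in scan_strand(strand_seq, spacer, pam, max_mm):
            pos = start if strand == "+" else len(seq) - start - site_len
            hits.append((pos, strand, mm))
    return sorted(hits)
```

Reverse-complementing a whole chromosome doubles the memory footprint; a production version would more likely search for the reverse-complemented pattern on the forward strand instead.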
Result — every guide hits the same single off-target locus
0 / 61 designs pass the gate. Every one of the 61 discriminating pegRNAs has exactly 1 perfect off-target outside the on-target locus, plus 1-9 hits at 3-4 mismatches.
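For reference, the gate can be written as a small predicate. This is a sketch under stated assumptions: the hit representation (position, mismatch count) is hypothetical, a hit is attributed to the on-target locus by its start coordinate, and perfect off-targets are counted among the "≤3 mismatches" hits. The window coordinates and thresholds are the ones given in this note.

```python
ON_TARGET = (43_600_543, 43_600_572)  # chr15 on-target window from the note
PAD = 5                               # ± 5 nt exclusion margin


def outside_on_target(pos, window=ON_TARGET, pad=PAD):
    """True if a hit position lies outside the padded on-target window."""
    return pos < window[0] - pad or pos > window[1] + pad


def passes_gate(hits, max_hits=5, max_mm=3):
    """hits: iterable of (position, mismatches) pairs for one guide."""
    off = [(p, m) for p, m in hits if outside_on_target(p)]
    n_perfect = sum(1 for _, m in off if m == 0)
    n_close = sum(1 for _, m in off if m <= max_mm)
    return n_perfect == 0 and n_close <= max_hits
```

A guide whose only extra hit is a perfect match inside the STRCP1 window (for example at chr15:43,700,353) fails on the first condition regardless of how few distal hits it has, which is the pattern reported for all 61 designs.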
The 61 perfect off-targets cluster into a single 18-bp window on chr15:
| Statistic | Position |
|---|---|
| min position | chr15:43,700,346 |
| max position | chr15:43,700,363 |
| median | chr15:43,700,353 |
| 10-kb bin containing all 61 hits | chr15:43,700,000–43,710,000 |
This is STRCP1, the STRC pseudogene paralog, ~100 kb downstream of STRC on chr15q15.3. STRCP1 is a high-identity pseudogene (per literature, ≥97 % identity over the entire STRC gene region, including the variant position) and it harbours the same nucleotide that Phase 3 designs as “edited” at the variant position. This means:
- The PE3b allele-discrimination check in Phase 3 was valid for STRC vs STRC but blind to STRCP1.
- Every pegRNA that discriminates between WT-A and E1659A-G on STRC also matches STRCP1 perfectly at the same 17 nt around the variant locus, because STRCP1 has the same flanking sequence.
- Off-target editing on STRCP1 would convert the silenced pseudogene into a putative new functional/quasi-functional STRC allele, with unknown phenotypic consequences: it could rescue, could create a hypomorphic interfering allele, or could do nothing. STRCP1 is not transcribed in OHCs, but prime editing leaves the chromatin context unchanged.
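The coordinates reported in this section can be sanity-checked with a few lines of arithmetic. The sketch below uses only numbers stated in this note (on-target window start, min and max perfect off-target positions); the variable names are illustrative.

```python
on_target_start = 43_600_543                 # start of the on-target window
hit_min, hit_max = 43_700_346, 43_700_363    # perfect off-target range

window_bp = hit_max - hit_min + 1            # inclusive width of the cluster
bin_start = hit_min // 10_000 * 10_000       # 10-kb bin holding the minimum
same_bin = hit_max // 10_000 * 10_000 == bin_start
offset_bp = hit_min - on_target_start        # distance from STRC on-target

print(window_bp, bin_start, same_bin, offset_bp)
```

The offset of 99,803 bp is consistent with the "~100 kb downstream" placement of STRCP1, and the inclusive width matches the 18-bp window.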
Per-PAM breakdown
| PAM | n_guides | n_pass | mean off-targets at ≤3 mm | notes |
|---|---|---|---|---|
| SpCas9_NGG | 5 | 0 | 1.0 | strong-discrimination class; cleanest pegRNAs except for STRCP1 |
| SpG_NGN | 12 | 0 | 2.2 | adds 1-3 distal off-targets per guide |
| enCas9_NGN | 12 | 0 | 2.2 | identical PAM to SpG; same hit profile |
| SpRY_NRN | 20 | 0 | 3.3 | broadest PAM = most off-targets; STRCP1 still dominant |
| SpCas9NG_NG | 12 | 0 | 2.2 | NG PAM permissive; same STRCP1 hit |
Distal off-targets (the 4-mismatch hits at chr15:28M, 45M, 56M, 77M positions) are scattered chance matches with no functional concern at standard prime-editing efficiencies.
Implications for the hypothesis
This is not a kill for Prime Editing for STRC; it is a Phase 3 design oversight that was caught in Phase 4. The fix is straightforward in principle: redesign pegRNAs to include positions where STRC and STRCP1 differ (~3-5 % of nucleotides over the relevant ~100 nt PE3b window). The redesign needs:
- Pull the STRCP1 gDNA region around the locus equivalent to STRC E1659.
- For each Phase 3 candidate, score how many positions in the 20-nt protospacer plus 3 nt PAM differ from STRCP1.
- Keep only candidates with ≥2-3 STRC-vs-STRCP1 mismatches in the seed region (positions 1-12 from PAM, where PE/Cas9 is most sensitive).
- Re-run this Phase 4 off-target check. Expected pass rate: 0–10 % of original candidates survive both discriminations (E1659A-vs-WT AND STRC-vs-STRCP1).
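The scoring and filtering steps above could look roughly like the sketch below. It assumes each candidate site and its STRCP1 counterpart are already aligned, equal-length strings written 5'→3' with the PAM at the 3' end; the function names and the default of 2 seed mismatches are illustrative choices, not the Phase 3.5 implementation.

```python
def strcp1_mismatches(strc_site, strcp1_site, pam_len=3, seed_len=12):
    """Return (total, seed) mismatch counts between two aligned sites.

    total counts differences over protospacer + PAM; seed counts only the
    seed_len protospacer positions closest to the PAM.
    """
    if len(strc_site) != len(strcp1_site):
        raise ValueError("sites must be aligned and of equal length")
    proto_len = len(strc_site) - pam_len
    diffs = [i for i, (a, b) in enumerate(zip(strc_site.upper(),
                                              strcp1_site.upper())) if a != b]
    seed_start = proto_len - seed_len
    seed = sum(1 for i in diffs if seed_start <= i < proto_len)
    return len(diffs), seed


def discriminates_strcp1(strc_site, strcp1_site, min_seed_mm=2, **kwargs):
    """Keep a candidate only if enough mismatches fall in the seed region."""
    _, seed = strcp1_mismatches(strc_site, strcp1_site, **kwargs)
    return seed >= min_seed_mm
```

For the NG PAM the caller would pass `pam_len=2`. A candidate whose differences from STRCP1 all sit in the PAM-distal end scores zero seed mismatches and is dropped, even if its total mismatch count looks acceptable.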
Realistic outcome: the SpCas9-NGG PAM (which has the smallest design space) may yield zero STRCP1-discriminating PE3b pegRNAs for E1659A. In that case, the engineered relaxed-PAM variants (SpG, enCas9, SpRY, SpCas9NG) become the only options, and SpRY-NRN’s 7+ usable candidates per the original Phase 3 analysis are most likely to survive.
Why this matters more than typical off-target checks
For most disease loci, “1 perfect off-target on chr15” would be a yellow flag. For STRC specifically, the off-target IS the gene’s own pseudogene, with biological behaviour that depends on whether STRCP1 ever gets reactivated by PE-induced sequence changes. The literature on pseudogene reactivation by gene-editing-induced point mutations is sparse but the safety case for clinical translation requires resolving this. Holt’s lab is exactly the contact to ask whether STRCP1 has been an issue in the experimental gene-therapy pipelines (his group runs STRC mouse models — they will know if STRCP1 mutates concurrently).
Files
- Driver: ~/STRC/models/pe_phase4_cas_offinder.py (with pure-numpy fallback for macOS)
- Output JSON: ~/STRC/models/pe_phase4_cas_offinder.json
- Genome: ~/STRC/genomes/hg38_chr15.fa (UCSC hg38 chr15 single-chromosome FASTA, 99.2 MiB)
- Logs: ~/STRC/logs/pe-offtarget-20260422-083058.log
Limitations
- chr15-only scan. The remaining ~96 % of the genome is unscanned, so perfect-match off-targets on other chromosomes cannot be ruled out by this run. For clinical translation, a full hg38 scan is required (~30× slower; trivially scriptable by swapping GENOME_PATH to a multi-FASTA).
- Pure-numpy fallback computes Hamming distance only; it does not score insertions or deletions. Cas-OFFinder also defaults to mismatch-only, so this is a fair substitute, but real Cas9/PE off-targets can include 1-2 nt bulges that this won’t catch. A ~10× more sensitive scan would use cas-offinder-bulge.
- No DNA accessibility filter: many of the chr15 off-target loci may be in inaccessible chromatin in OHCs. Without ATAC-seq from cochlear hair cells, all chr15 hits are treated as equally weighted. The STRCP1 hit would be in accessible chromatin (it’s in the same q15.3 locus as STRC).
- No STRCP1 transcription/translation check: the literature says STRCP1 is silenced in cochlea, but PE-induced sequence changes could in principle relieve that silencing. This is a wet-lab question.
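Extending the scan from chr15 to a whole-genome multi-FASTA mostly comes down to iterating over records instead of reading one sequence. The sketch below is one possible way to do that with the standard library only; `iter_fasta` is a hypothetical helper, not a function from the driver, and it holds one full chromosome in memory at a time.

```python
def iter_fasta(lines):
    """Yield (record_name, sequence) pairs from an iterable of FASTA lines.

    Works on an open file handle or any list of strings. The record name is
    the first whitespace-delimited token after '>'.
    """
    name, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                yield name, "".join(chunks)
            parts = line[1:].split()
            name, chunks = (parts[0] if parts else ""), []
        elif name is not None:
            chunks.append(line)
    if name is not None:
        yield name, "".join(chunks)
```

With this in place, a per-chromosome loop such as `with open(GENOME_PATH) as fh: for chrom, seq in iter_fasta(fh): ...` would let the existing single-sequence search run once per record and tag each hit with its chromosome.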
Ranking delta
- Prime Editing for STRC: Tier A → B. Mechanism unchanged at 4/5 (PE works on E1659A — the chemistry was correct in Phase 3). Misha-fit unchanged at 2/5 (this issue is universal to all E1659A patients, not Misha-specific). Delivery unchanged at 2/5. The downgrade reflects the Phase 3 design oversight: every existing pegRNA candidate hits STRCP1 perfectly. The hypothesis is not killed — it’s blocked on a redesign step that was always implicit but now is explicit. Evidence depth +1 (Phase 4 STRCP1 paralog hit confirmed across all 61 designs across 5 PAM variants). Status remains “active” but tier reflects that the candidate pool is now zero pending Phase 3.5 STRCP1-aware redesign. Next step changed from “maternal-allele only; Cas-OFFinder off-target scan” → “Phase 3.5 STRCP1-discriminating pegRNA redesign: pull STRCP1 sequence at the variant-equivalent locus, filter Phase 3 candidates by ≥2 mismatches against STRCP1 seed region, re-run Phase 4”.
- STRC ASO Exon Skipping: flag for re-check. ASOs may have the same STRCP1 problem — STRCP1 has paralogous splice sites. The Phase 1 ASO design (STRC ASO Phase1 Splice-Switch Design) was scored on STRC alone, not STRC-vs-STRCP1 discrimination. Add to next-step list: “STRCP1 paralog cross-hybridization check via NUPACK/BLAST”. No tier change yet — pending the check.
- STRC Mini-STRC Single-Vector Hypothesis: no change. Mini-STRC is a de novo protein delivery, not editing — STRCP1 is irrelevant to the AAV strategy.
- STRC mRNA-LNP Strategy B Full-Length: no change. Same reasoning as Mini-STRC — exogenous mRNA, no editing.
- STRC Pharmacochaperone Virtual Screen E1659A: no change. Small-molecule binder, no editing.
- All other S/A/B/C tier hypotheses: no change.
Connections
- [part-of] Prime Editing for STRC
- STRC pe_phase3_allele_discrimination: Phase 3 design output is the input to this proof
- Prime Editing for STRC: kills the current candidate pool, not the hypothesis itself
- STRC ASO Exon Skipping: same paralog risk applies, needs check
- [see-also] STRC Hypothesis Ranking
- Jeffrey Holt: natural follow-up question for him (does STRCP1 mutate in PE/CRISPR experiments?)
- [applies] Misha: maternal allele PE pipeline is the path this proof gates