STRC Prime Editing Phase 4 — STRCP1 paralog catches every PE3b discriminating pegRNA

Question

Prime Editing for STRC Phase 3 produced 61 PE3b allele-discriminating pegRNAs across 5 PAM variants (SpCas9-NGG, SpG-NGN, enCas9-NGN, SpRY-NRN, SpCas9NG-NG) for the maternal E1659A allele. Each was scored to discriminate edited (E1659A) vs mutation-mut (WT-A) — but not against the STRCP1 pseudogene paralog ~100 kb downstream. Phase 4 asks: how many of the 61 designs survive an off-target check on chr15 (which contains both STRC and STRCP1)? Gate: ≤5 off-targets at ≤3 mismatches AND 0 perfect off-targets outside the on-target locus.

Method

pe_phase4_cas_offinder.py driver against hg38 chr15 (~/STRC/genomes/hg38_chr15.fa, 102 Mb). Cas-OFFinder failed on Apple Silicon (clGetDeviceIDs Failed: -30 — bioconda build of cas-offinder cannot enumerate macOS OpenCL devices), so the script fell back to a pure-numpy off-target search (uint8-encoded genome, vectorised PAM-mask + Hamming-distance counter on both strands). Identical output schema to cas-offinder TSV; total 158 s for 61 guides at 4-mismatch tolerance. SaCas9-NNGRRT skipped (template length 21 nt vs 20 nt for the others).
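A minimal sketch of the kind of pure-numpy fallback described above, not the actual driver code. It assumes a one-hot bit encoding per base, an IUPAC bit mask for the PAM, and a sliding-window Hamming count; the names `encode`, `revcomp` and `scan` are hypothetical. The minus strand is handled by searching the plus strand for the reverse complement of protospacer + PAM. For a whole chromosome the windows would need to be processed in chunks to bound memory.

```python
import numpy as np

# 4-bit masks over (A, C, G, T); IUPAC codes cover the PAMs named in this note.
IUPAC = {"A": 1, "C": 2, "G": 4, "T": 8, "R": 5, "Y": 10, "N": 15}
_COMP = str.maketrans("ACGTRYN", "TGCAYRN")

# Byte -> one-hot bit lookup. Any byte that is not A/C/G/T (e.g. genomic N)
# maps to 0 and therefore never matches, not even an N in the PAM.
_LUT = np.zeros(256, dtype=np.uint8)
for base in "ACGT":
    _LUT[ord(base)] = IUPAC[base]
    _LUT[ord(base.lower())] = IUPAC[base]


def encode(seq: str) -> np.ndarray:
    """Encode a DNA string as a uint8 array of one-hot base bits."""
    return _LUT[np.frombuffer(seq.encode("ascii"), dtype=np.uint8)]


def revcomp(seq: str) -> str:
    """Reverse complement, including the IUPAC codes R/Y/N."""
    return seq.upper().translate(_COMP)[::-1]


def scan(genome: np.ndarray, protospacer: str, pam: str, max_mm: int):
    """Return sorted (start, strand, mismatches) for sites within max_mm.

    start is the 0-based plus-strand index of the leftmost base of the site.
    """
    n_sp, k = len(protospacer), len(protospacer) + len(pam)
    windows = np.lib.stride_tricks.sliding_window_view(genome, k)
    layouts = (
        # strand, pattern as read on the plus strand, PAM columns, spacer columns
        ("+", (protospacer + pam).upper(), slice(n_sp, k), slice(0, n_sp)),
        ("-", revcomp(protospacer + pam), slice(0, len(pam)), slice(len(pam), k)),
    )
    hits = []
    for strand, pattern, pam_cols, sp_cols in layouts:
        pbits = np.array([IUPAC[c] for c in pattern], dtype=np.uint8)
        # PAM mask first, so mismatches are only counted where the PAM fits.
        pam_ok = ((windows[:, pam_cols] & pbits[pam_cols]) != 0).all(axis=1)
        cand = np.flatnonzero(pam_ok)
        mm = ((windows[cand][:, sp_cols] & pbits[sp_cols]) == 0).sum(axis=1)
        hits += [(int(p), strand, int(m)) for p, m in zip(cand, mm) if m <= max_mm]
    return sorted(hits)
```

Masking on the PAM before counting mismatches keeps the Hamming step small, since only a fraction of windows carry a valid PAM. Mismatches are counted in the protospacer only; the PAM must match exactly.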

On-target locus exclusion: chr15:43,600,543-43,600,572 ± 5 nt (the SpCas9-NGG strong-discrimination protospacer at variant pos 17).
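The gate and the on-target exclusion can be written down as a small predicate. This is a sketch under stated assumptions: hits are (position, mismatches) pairs on chr15, a hit is excluded when its start coordinate falls inside the padded on-target interval, and the "≤5 off-targets at ≤3 mismatches" count includes perfect matches. The function name and parameters are hypothetical.

```python
from typing import Iterable, Tuple

ON_TARGET = (43_600_543, 43_600_572)  # chr15 interval quoted in this note
PAD = 5                               # the "± 5 nt" margin


def passes_gate(
    hits: Iterable[Tuple[int, int]],
    on_target: Tuple[int, int] = ON_TARGET,
    pad: int = PAD,
    max_near: int = 5,
    near_mm: int = 3,
) -> bool:
    """True if a guide has 0 perfect and at most max_near near off-targets."""
    lo, hi = on_target[0] - pad, on_target[1] + pad
    # Assumption: a hit counts as on-target if its start lies in [lo, hi].
    off = [mm for pos, mm in hits if not lo <= pos <= hi]
    perfect = sum(mm == 0 for mm in off)
    near = sum(mm <= near_mm for mm in off)
    return perfect == 0 and near <= max_near
```

Under this rule a single perfect match anywhere outside the padded interval fails the guide, regardless of how few other hits it has.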

Result — every guide hits the same single off-target locus

0 / 61 designs pass the gate. Every one of the 61 discriminating pegRNAs has exactly 1 perfect off-target outside the on-target locus, plus 1-9 hits at 3-4 mismatches.

The 61 perfect off-targets cluster into a single 18-bp window on chr15:

Statistic                          Position
min position                       chr15:43,700,346
max position                       chr15:43,700,363
median                             chr15:43,700,353
10-kb bin containing all 61 hits   chr15:43,700,000–43,710,000
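The window statistics above can be reproduced from a list of hit positions with a few lines. This is a sketch; `summarise` is a hypothetical helper, and the three reported values (min, median, max) stand in for the full list of 61 positions, which is not reproduced here.

```python
from statistics import median


def summarise(positions, bin_size=10_000):
    """Min, max, median, inclusive span and occupied bins for hit positions."""
    lo, hi = min(positions), max(positions)
    bins = sorted({p // bin_size * bin_size for p in positions})
    return {
        "min": lo,
        "max": hi,
        "median": median(positions),
        "span_bp": hi - lo + 1,  # inclusive window width
        "bins": bins,            # start coordinate of each occupied bin
    }


# Stand-in input: only the three values reported in this note.
reported = [43_700_346, 43_700_353, 43_700_363]
stats = summarise(reported)
# Distance from the on-target interval start (chr15:43,600,543) to the cluster.
offset_bp = stats["min"] - 43_600_543
```

With these inputs the inclusive span is 18 bp, all positions fall in the bin starting at 43,700,000, and the cluster sits 99,803 bp from the on-target start, consistent with the "~100 kb" figure.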

This is STRCP1, the STRC pseudogene paralog, ~100 kb downstream of STRC on chr15q15.3. STRCP1 is a high-identity pseudogene (per literature, ≥97 % identity over the entire STRC gene region, including the variant position) and it harbours the same nucleotide that Phase 3 designs as “edited” at the variant position. This means:

  1. The PE3b allele-discrimination check in Phase 3 was valid for STRC vs STRC but blind to STRCP1.
  2. Every pegRNA that discriminates between WT-A and E1659A-G on STRC also matches STRCP1 perfectly at the same 17 nt around the variant locus, because STRCP1 has the same flanking sequence.
  3. Off-target editing on STRCP1 would convert the silenced pseudogene into a putative new functional or quasi-functional STRC allele, with unknown phenotypic consequences: it could rescue, could create a hypomorphic, interfering allele, or could do nothing (STRCP1 is not transcribed in OHCs, and prime editing leaves the chromatin context unchanged).

Per-PAM breakdown

PAM           n_guides   n_pass   mean off-targets at ≤3 mm   notes
SpCas9_NGG    5          0        1.0                         strong-discrimination class; cleanest pegRNAs except for STRCP1
SpG_NGN       12         0        2.2                         adds 1-3 distal off-targets per guide
enCas9_NGN    12         0        2.2                         identical PAM to SpG; same hit profile
SpRY_NRN      20         0        3.3                         broadest PAM = most off-targets; STRCP1 still dominant
SpCas9NG_NG   12         0        2.2                         NG PAM permissive; same STRCP1 hit

Distal off-targets (the 4-mismatch hits at chr15:28M, 45M, 56M, 77M positions) are scattered chance matches with no functional concern at standard PE editing efficiencies.

Implications for the hypothesis

This is not a kill for Prime Editing for STRC; it is a Phase 3 design oversight that Phase 4 caught. The fix is straightforward in principle: redesign pegRNAs so that they span positions where STRC and STRCP1 differ (~3-5 % of nucleotides over the relevant ~100 nt PE3b window). The redesign needs:

  1. Pull the STRCP1 gDNA region around the locus equivalent to STRC E1659.
  2. For each Phase 3 candidate, score how many positions in the 20-nt protospacer plus 3 nt PAM differ from STRCP1.
  3. Keep only candidates with ≥2-3 STRC-vs-STRCP1 mismatches in the seed region (positions 1-12 from PAM, where PE/Cas9 is most sensitive).
  4. Re-run this Phase 4 off-target check. Expected pass rate: 0–10 % of original candidates survive both discriminations (E1659A-vs-WT AND STRC-vs-STRCP1).
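Steps 2 and 3 of the redesign can be sketched as a position-wise comparison. Assumptions: each candidate site is given as protospacer + PAM read 5'→3', the STRCP1 site is the same length and aligned without gaps, and "positions 1-12 from PAM" means the 12 PAM-proximal protospacer bases. The function names and the `min_seed_mm` default are hypothetical.

```python
def paralog_mismatches(strc_site: str, strcp1_site: str,
                       pam_len: int = 3, seed_len: int = 12) -> dict:
    """Count STRC-vs-STRCP1 differences over protospacer + PAM.

    Both sites are protospacer followed by PAM, 5'->3', gaplessly aligned.
    """
    if len(strc_site) != len(strcp1_site):
        raise ValueError("sites must be the same length (gapless alignment)")
    spacer_len = len(strc_site) - pam_len
    seed_start = spacer_len - seed_len  # first index of the PAM-proximal seed
    diffs = [i for i, (a, b) in
             enumerate(zip(strc_site.upper(), strcp1_site.upper())) if a != b]
    return {
        "total": len(diffs),
        "seed": sum(seed_start <= i < spacer_len for i in diffs),
        "pam": sum(i >= spacer_len for i in diffs),
    }


def keep_candidate(strc_site: str, strcp1_site: str, min_seed_mm: int = 2) -> bool:
    """Step 3: keep only sites with enough seed-region differences."""
    return paralog_mismatches(strc_site, strcp1_site)["seed"] >= min_seed_mm
```

PAM differences are reported separately because a paralog site that lacks a usable PAM is discriminated by a different mechanism than seed mismatches; whether to count that toward the filter is a design decision this sketch leaves open.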

Realistic outcome: the SpCas9-NGG PAM (which has the smallest design space) may yield zero STRCP1-discriminating PE3b pegRNAs for E1659A. In that case, the engineered relaxed-PAM variants (SpG, enCas9, SpRY, SpCas9NG) become the only options, and SpRY-NRN’s 7+ usable candidates per the original Phase 3 analysis are the most likely to survive.

Why this matters more than typical off-target checks

For most disease loci, “1 perfect off-target on chr15” would be a yellow flag. For STRC specifically, the off-target IS the gene’s own pseudogene, with biological behaviour that depends on whether STRCP1 ever gets reactivated by PE-induced sequence changes. The literature on pseudogene reactivation by gene-editing-induced point mutations is sparse but the safety case for clinical translation requires resolving this. Holt’s lab is exactly the contact to ask whether STRCP1 has been an issue in the experimental gene-therapy pipelines (his group runs STRC mouse models — they will know if STRCP1 mutates concurrently).

Files

  • Driver: ~/STRC/models/pe_phase4_cas_offinder.py (with pure-numpy fallback for macOS)
  • Output JSON: ~/STRC/models/pe_phase4_cas_offinder.json
  • Genome: ~/STRC/genomes/hg38_chr15.fa (UCSC hg38 chr15 single-chromosome FASTA, 99.2 MiB)
  • Logs: ~/STRC/logs/pe-offtarget-20260422-083058.log

Limitations

  • chr15-only scan. The remaining ~96 % of the genome is unscanned. Distal (off-chromosome) perfect-match off-targets cannot be ruled out by this run. For clinical translation, a full hg38 scan is required (~30× slower; trivially scriptable by swapping GENOME_PATH to a multi-FASTA).
  • Pure-numpy fallback computes Hamming distance only and does not score insertions or deletions. Cas-OFFinder also defaults to mismatch-only, so this is a fair substitute, but real Cas9/PE off-targets can include 1-2 nt bulges that this won’t catch. A scan roughly 10× more sensitive would use cas-offinder-bulge.
  • No DNA accessibility filter: many of the chr15 off-target loci may be in inaccessible chromatin in OHCs. Without ATAC-seq from cochlear hair cells, all chr15 hits are treated as equally weighted. The STRCP1 hit is plausibly in accessible chromatin (it sits in the same q15.3 locus as STRC), but this is unverified.
  • No STRCP1 transcription/translation check: the literature says STRCP1 is silenced in cochlea, but PE-induced sequence changes could in principle relieve that silencing. This is a wet-lab question.
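For the chr15-only limitation, extending the scan to a multi-FASTA mostly means iterating over records one chromosome at a time. A minimal reader sketch follows; `iter_fasta` is a hypothetical helper, not part of the existing driver, and it loads each sequence fully into memory.

```python
from typing import Iterator, Tuple


def iter_fasta(path: str) -> Iterator[Tuple[str, str]]:
    """Yield (record_name, sequence) for each record in a FASTA file."""
    name, chunks = None, []
    with open(path) as fh:
        for line in fh:
            line = line.strip()
            if line.startswith(">"):
                if name is not None:
                    yield name, "".join(chunks)
                # Record name is the first whitespace-delimited token.
                name, chunks = (line[1:].split() or [""])[0], []
            elif line and name is not None:
                chunks.append(line)
    if name is not None:
        yield name, "".join(chunks)
```

Each yielded sequence could then be passed to whatever per-chromosome search the driver already runs, with hit coordinates reported together with the record name.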

Ranking delta

  • Prime Editing for STRC: Tier A → B. Mechanism unchanged at 4/5 (PE works on E1659A; the chemistry was correct in Phase 3). Misha-fit unchanged at 2/5 (this issue is universal to all E1659A patients, not Misha-specific). Delivery unchanged at 2/5. The downgrade reflects the Phase 3 design oversight: every existing pegRNA candidate hits STRCP1 perfectly. The hypothesis is not killed; it is blocked on a redesign step that was always implicit and is now explicit. Evidence depth +1 (Phase 4 STRCP1 paralog hit confirmed across all 61 designs and all 5 PAM variants). Status remains “active”, but the tier reflects that the candidate pool is now zero pending the Phase 3.5 STRCP1-aware redesign. Next step changed from “maternal-allele only; Cas-OFFinder off-target scan” to “Phase 3.5 STRCP1-discriminating pegRNA redesign: pull STRCP1 sequence at the variant-equivalent locus, filter Phase 3 candidates by ≥2 mismatches against STRCP1 seed region, re-run Phase 4”.
  • STRC ASO Exon Skipping: flag for re-check. ASOs may have the same STRCP1 problem — STRCP1 has paralogous splice sites. The Phase 1 ASO design (STRC ASO Phase1 Splice-Switch Design) was scored on STRC alone, not STRC-vs-STRCP1 discrimination. Add to next-step list: “STRCP1 paralog cross-hybridization check via NUPACK/BLAST”. No tier change yet — pending the check.
  • STRC Mini-STRC Single-Vector Hypothesis: no change. Mini-STRC is a de novo protein delivery, not editing — STRCP1 is irrelevant to the AAV strategy.
  • STRC mRNA-LNP Strategy B Full-Length: no change. Same reasoning as Mini-STRC — exogenous mRNA, no editing.
  • STRC Pharmacochaperone Virtual Screen E1659A: no change. Small-molecule binder, no editing.
  • All other S/A/B/C tier hypotheses: no change.

Connections