STRC PE Phase 3 Allele Discrimination
Every PE3b-spanning nicker found in STRC PE Phase2 PAM Expansion was audited for true allele discrimination — does the sgRNA match the EDITED genome and mismatch the MUTANT genome at the variant position? All 35 spanning candidates across 5 Cas9 variants pass letter-match (A on − strand = edited base). They split sharply by mismatch region: only candidates with the variant in the SEED (positions 13–20) give strong discrimination; the rest are distal and largely tolerated by SpCas9. The Phase 2 lead ACTGAAATTGGCACCATAGC is position-5 distal → WEAK discrimination. True strong-discrimination lead for SpCas9 NGG: CCTGAGATCTTCACTGAAAT (PAM TGG, position 17 seed, nick 0.5 nt from edit). With SpG NGN, a balanced compromise exists: TTCACTGAAATTGGCACCAT (position 8 mid-region, 9.5 nt). Phase 2’s conservative framing was wrong; real PE3b discrimination is a SEED-vs-distal distinction, not just “spans the variant.”
Method
- Loaded pe_phase2_pam_expansion.json: PE3b candidate list per Cas9 variant.
- Computed 1-indexed variant position within each 20-nt protospacer (read 5′→3′). For − strand protospacer with + coords [a, b], variant position = (b − 43600551) + 1; + strand: (43600551 − a) + 1.
- Classified: SEED (13–20, mismatches strongly block SpCas9; Doench 2016), MID (8–12, moderate), DISTAL (1–7, tolerated).
- Checked letter at variant position = A (edited − strand base) vs C (mutant − strand base).
- Ranked two ways — aggressive (discrimination-first) and conservative (safety-first: prefers nick 10–80 nt to avoid concurrent-DSB risk).
- Reference bases: + strand 43600551 MUT=G / WT=T / EDITED=T; − strand MUT=C / WT=A / EDITED=A; pegRNA edits + strand, so useful PE3b nickers are on − strand.
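The position, region, and letter checks listed above can be sketched as follows. This is a minimal illustration, not the code in pe_phase3_allele_discrimination.py: the function names are hypothetical, and the example coordinates are back-computed from the stated formula so that the variant lands at position 17, rather than taken from the Phase 2 JSON.

```python
VARIANT_POS = 43600551  # + strand coordinate of the variant (from the Method notes)


def variant_position(start, end, strand):
    """1-indexed variant position in a protospacer read 5'->3'.

    start, end: inclusive + strand coordinates [a, b] of the protospacer.
    """
    if not start <= VARIANT_POS <= end:
        raise ValueError("protospacer does not span the variant")
    if strand == "-":
        # - strand protospacers read from the high + coordinate downward
        return (end - VARIANT_POS) + 1
    if strand == "+":
        return (VARIANT_POS - start) + 1
    raise ValueError("strand must be '+' or '-'")


def classify_region(pos):
    """Region bins as defined in the Method list."""
    if 13 <= pos <= 20:
        return "seed"
    if 8 <= pos <= 12:
        return "mid"
    if 1 <= pos <= 7:
        return "distal"
    raise ValueError("position outside a 20-nt protospacer")


def matches_edited(protospacer, pos, edited_base="A"):
    """True if the sgRNA carries the edited - strand base at the variant."""
    return protospacer[pos - 1] == edited_base


# Illustrative coordinates only (assumption): chosen so that b - variant + 1 = 17.
a, b = VARIANT_POS - 3, VARIANT_POS + 16
pos = variant_position(a, b, "-")
print(pos, classify_region(pos), matches_edited("CCTGAGATCTTCACTGAAAT", pos))
```

A sgRNA that passes `matches_edited` but lands in the distal bin still spans the variant; it is the region bin, not the letter match alone, that sets the discrimination grade.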
Results
All useful + discriminating PE3b candidates (− strand nickers, letter A at variant = edited match)
| Cas9 variant | Protospacer (5′→3′) | PAM | Pos in proto | Region | Grade | Nick-to-edit (nt) |
|---|---|---|---|---|---|---|
| SpCas9 NGG | CCTGAGATCTTCACTGAAAT | TGG | 17 | seed | strong | 0.5 |
| SpCas9 NGG | ACTGAAATTGGCACCATAGC | AGG | 5 | distal | weak | 12.5 |
| SpCas9 NGG | GAAATTGGCACCATAGCAGG | TGG | 2 | distal | weak | 15.5 |
| SpCas9 NGG | AAATTGGCACCATAGCAGGT | GGG | 1 | distal | weak | 16.5 |
| SpG NGN | CCTGAGATCTTCACTGAAAT | TGG | 17 | seed | strong | 0.5 |
| SpG NGN | CTGAGATCTTCACTGAAATT | GGC | 16 | seed | strong | 1.5 |
| SpG NGN | TTCACTGAAATTGGCACCAT | AGC | 8 | mid | moderate | 9.5 |
| SpG NGN | ACTGAAATTGGCACCATAGC | AGG | 5 | distal | weak | 12.5 |
| SpG NGN | CTGAAATTGGCACCATAGCA | GGT | 4 | distal | weak | 13.5 |
(Full list with SpRY/SpCas9-NG/enCas9 in pe_phase3_allele_discrimination.json; SaCas9 NNGRRT finds zero PE3b spanners.)
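The rows above can be cross-checked mechanically. The sketch below re-derives letter, region, and nick distance for each listed row. It assumes the nick falls between protospacer positions 17 and 18 (3 nt upstream of the PAM) and that nick-to-edit distance is measured as |17.5 − position|; that distance convention is inferred from the table values, not confirmed against the audit script.

```python
# (Cas9 variant, protospacer 5'->3', position in protospacer, region, nick-to-edit nt)
ROWS = [
    ("SpCas9 NGG", "CCTGAGATCTTCACTGAAAT", 17, "seed", 0.5),
    ("SpCas9 NGG", "ACTGAAATTGGCACCATAGC", 5, "distal", 12.5),
    ("SpCas9 NGG", "GAAATTGGCACCATAGCAGG", 2, "distal", 15.5),
    ("SpCas9 NGG", "AAATTGGCACCATAGCAGGT", 1, "distal", 16.5),
    ("SpG NGN", "CCTGAGATCTTCACTGAAAT", 17, "seed", 0.5),
    ("SpG NGN", "CTGAGATCTTCACTGAAATT", 16, "seed", 1.5),
    ("SpG NGN", "TTCACTGAAATTGGCACCAT", 8, "mid", 9.5),
    ("SpG NGN", "ACTGAAATTGGCACCATAGC", 5, "distal", 12.5),
    ("SpG NGN", "CTGAAATTGGCACCATAGCA", 4, "distal", 13.5),
]


def region(pos):
    """Region bins from the Method section."""
    if pos >= 13:
        return "seed"
    if pos >= 8:
        return "mid"
    return "distal"


def nick_distance(pos):
    """Assumed convention: nick sits between positions 17 and 18."""
    return abs(17.5 - pos)


def audit(rows):
    """Return a list of (protospacer, problem) tuples; empty means consistent."""
    problems = []
    for _variant, seq, pos, reg, dist in rows:
        if len(seq) != 20:
            problems.append((seq, "length"))
        if seq[pos - 1] != "A":  # edited - strand base
            problems.append((seq, "letter"))
        if region(pos) != reg:
            problems.append((seq, "region"))
        if nick_distance(pos) != dist:
            problems.append((seq, "distance"))
    return problems


print(audit(ROWS))
```

An empty list means every row is internally consistent under the stated assumptions.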
Phase 2 lead re-classified
Phase 2 recommended ACTGAAATTGGCACCATAGC (− strand, AGG PAM, nick 12.5 nt from edit). Phase 3 places its variant position at 5, which is DISTAL. Doench et al. 2016 showed SpCas9 tolerates distal single-base mismatches with near-WT activity, so the Phase 2 lead has weak true discrimination. The “PE3b ON!” framing was correct that discrimination is possible; the error was that the specific lead was not prioritized by discrimination grade.
Dual ranking
| Variant | Aggressive top (discrimination-first) | Conservative top (safe-distance: 10–80 nt) |
|---|---|---|
| SpCas9 NGG | CCTGAGATCTTCACTGAAAT (seed, 0.5 nt) | ACTGAAATTGGCACCATAGC (weak, 12.5 nt) |
| SpG NGN | CCTGAGATCTTCACTGAAAT (seed, 0.5 nt) | ACTGAAATTGGCACCATAGC (weak, 12.5 nt) |
| SpRY NRN | CTGAGATCTTCACTGAAATT (seed, 1.5 nt) | CACTGAAATTGGCACCATAG (weak, 11.5 nt) |
Conservative ranking’s default (distance 10–80 nt) excludes the close-nick seed candidates but also excludes the balanced SpG MID pick. With 10 ≤ d ≤ 80 expanded to 9 ≤ d ≤ 80, the conservative top for SpG becomes TTCACTGAAATTGGCACCAT at 9.5 nt — the true balanced pick (mid region, moderate discrimination, safe distance). The MID candidate is the best single-lead choice if SpG is available.
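The effect of the distance window can be reproduced with a small sketch over the SpG rows from the Results table. The sort key used here (discrimination grade first, then shorter nick distance) is an assumption that happens to reproduce the tops reported above; the actual tie-breaking in pe_phase3_allele_discrimination.py may differ.

```python
GRADE_RANK = {"strong": 0, "moderate": 1, "weak": 2}

# (protospacer, grade, nick-to-edit nt), SpG NGN rows from the Results table
SPG = [
    ("CCTGAGATCTTCACTGAAAT", "strong", 0.5),
    ("CTGAGATCTTCACTGAAATT", "strong", 1.5),
    ("TTCACTGAAATTGGCACCAT", "moderate", 9.5),
    ("ACTGAAATTGGCACCATAGC", "weak", 12.5),
    ("CTGAAATTGGCACCATAGCA", "weak", 13.5),
]


def sort_key(cand):
    """Assumed ordering: best grade first, then the closer nick."""
    _seq, grade, dist = cand
    return (GRADE_RANK[grade], dist)


def aggressive_top(cands):
    """Discrimination-first: no distance filter."""
    return min(cands, key=sort_key)[0]


def conservative_top(cands, d_min=10.0, d_max=80.0):
    """Safety-first: keep only nicks with d_min <= d <= d_max, then rank."""
    in_window = [c for c in cands if d_min <= c[2] <= d_max]
    if not in_window:
        return None
    return min(in_window, key=sort_key)[0]


print(aggressive_top(SPG))
print(conservative_top(SPG))             # default 10-80 nt window
print(conservative_top(SPG, d_min=9.0))  # widened 9-80 nt window
```

Lowering `d_min` from 10 to 9 is the only change needed for the 9.5-nt mid-region candidate to enter the window and outrank the weak distal candidates.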
Interpretation for Misha
- Aggressive strategy (SpCas9 NGG, off-the-shelf): pegRNA GCCCAGCTCCCCACCTGCTA + PE3b nicker CCTGAGATCTTCACTGAAAT. Seed-position mismatch at variant gives near-complete discrimination, so the 0.5-nt nick distance does NOT drive concurrent-DSB risk: the nicker cannot engage the unedited allele. Trades: close nick sometimes causes unintended MMR fixation toward unedited template; literature for 0–5 nt PE3b nicks is thin.
- Balanced strategy (SpG NGN, engineered Cas9): pegRNA TGGGGGCCTGAGATCTTCAC + PE3b nicker TTCACTGAAATTGGCACCAT. Position 8 mid-region gives moderate discrimination; 9.5 nt nick is in the usable distance range. Requires SpG enzyme; clinical viability lags slightly behind SpCas9 NGG but is published in multiple in-vivo settings (Walton et al. 2020).
- Revised OHC efficiency estimate: SpCas9 + aggressive PE3b: 10–30% (discrimination boost offsets 14-nt geometry penalty). SpG + balanced PE3b: 20–40% (PAM-optimal geometry + moderate discrimination + stronger by-position match than Phase 2 assumed).
- Decision: if a PE-competent lab will work with engineered SpG, the SpG balanced lead is the clinical candidate. If restricted to SpCas9, the aggressive seed-17 nicker is the single best choice — despite the close nick, discrimination is the dominant safety factor.
Limitations
- Discrimination grade is based on SpCas9 mismatch-tolerance profiles; engineered variants (SpG, SpRY) have broader PAM compatibility and are assumed to retain SpCas9-like mismatch sensitivity (Walton 2020). This assumption is not tested directly here.
- No off-target scan for any PE3b candidate — Cas-OFFinder run still mandatory (listed under Phase 2 next steps).
- Close PE3b nicks (<5 nt) are computationally attractive but empirically rare in published PE3b designs; the 0.5-nt SpCas9 lead is a bet on discrimination absolutism. A mouse rescue experiment is required to validate.
- pegRNA fold integrity for SpG design still not run (ViennaRNA pending).
Next steps
- Cas-OFFinder for the three short-listed nickers + both pegRNAs (SpCas9 NGG and SpG NGN spacers).
- ViennaRNA fold for the full SpG pegRNA + PE3b nicker pair.
- SpG vs SpCas9 decision memo — efficiency × availability × off-target tradeoff. Independent of wet lab.
- Move on to next un-modeled hypothesis: mRNA-LNP PK for RBM24-exon-4 skip rescue, or Sonogenetics Phase 3 robustness.
Replication
cd ~/STRC/models
/opt/miniconda3/bin/python3 pe_phase3_allele_discrimination.py
# outputs: pe_phase3_allele_discrimination.json
Files / Models
- ~/STRC/models/pe_phase3_allele_discrimination.py: full audit, two-way ranking
- ~/STRC/models/pe_phase3_allele_discrimination.json: per-candidate audit, summary, and ranked lists per Cas9 variant
Connections
- [part-of] Prime Editing for STRC: parent hypothesis
- STRC PE Phase2 PAM Expansion: Phase 2 lead re-classified as weak discrimination; true discrimination lead identified
- [see-also] STRC PE Phase1 pegRNA E1659A: original pegRNA design; the 14-nt nick-to-edit constraint still applies for the SpCas9 path
- [see-also] STRC Electrostatic Analysis E1659A: the target variant
- [see-also] STRC Mini-STRC Single-Vector Hypothesis: parallel track for paternal deletion; PE + mini-STRC additive
- [about] Misha