Ensembl REST API
Programmatic access to Ensembl genomic data. Essential for coordinate lookups, variant effect prediction (VEP), gene annotations, and cross-species comparisons.
What It Does
- Variant Effect Predictor (VEP) — predict consequences of variants
- Gene/transcript/exon coordinates
- Cross-species comparative genomics
- Regulatory features (promoters, enhancers)
- Sequence retrieval (DNA and protein)
How to Use
VEP (Most Important for STRC)
# Predict variant consequences
curl "https://rest.ensembl.org/vep/human/hgvs/NM_153700.2:c.4976A>C" \
  -H "Content-Type: application/json"
Gene Lookup
# STRC gene info
curl "https://rest.ensembl.org/lookup/symbol/homo_sapiens/STRC?expand=1" \
  -H "Content-Type: application/json"
Sequence
# Get genomic sequence around STRC
curl "https://rest.ensembl.org/sequence/region/human/15:43599563..43618800:1" \
  -H "Content-Type: application/json"
Python
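import requests

# VEP query for the STRC variant, addressed by its HGVS notation
r = requests.get(
    "https://rest.ensembl.org/vep/human/hgvs/NM_153700.2:c.4976A>C",
    headers={"Content-Type": "application/json"},
)
r.raise_for_status()  # stop here on HTTP errors such as 400 or 429
vep = r.json()
# The response is a list with one record per input variant;
# each record holds one consequence entry per overlapping transcript
for csq in vep[0]['transcript_consequences']:
    print(f"{csq['gene_symbol']}: {csq['consequence_terms']}")
Verified Status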
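VERIFIED: the STRC gene lookup works, and VEP returns missense_variant for c.4976A>C (p.Glu1659Ala, E1659A) with SIFT = deleterious (0) and PolyPhen = probably_damaging (0.991). Coordinates confirmed: chr15:43,891,390-43,907,636 (GRCh38, minus strand). This range does not overlap the 15:43599563..43618800 region used in the Sequence query or the ~43.60 Mb positions listed under Results, so either the coordinates or the assembly label here needs re-checking.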
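The SIFT and PolyPhen calls quoted above can be read straight from the VEP JSON. A minimal sketch, assuming the response carries the usual per-transcript keys (sift_prediction, sift_score, polyphen_prediction, polyphen_score). Here summarize_consequences is a hypothetical helper, and the sample payload is hand-written from the values in this note, not captured API output.

```python
def summarize_consequences(vep_response):
    """Return one summary dict per transcript consequence in a VEP response."""
    rows = []
    for record in vep_response:
        # Records without transcript overlap have no transcript_consequences key.
        for csq in record.get("transcript_consequences", []):
            rows.append({
                "transcript": csq.get("transcript_id"),
                "gene": csq.get("gene_symbol"),
                "terms": csq.get("consequence_terms", []),
                # SIFT/PolyPhen fields are typically absent for non-missense changes.
                "sift": (csq.get("sift_prediction"), csq.get("sift_score")),
                "polyphen": (csq.get("polyphen_prediction"), csq.get("polyphen_score")),
            })
    return rows


# Hand-written sample (assumption) mirroring the values reported in this note
sample = [{
    "transcript_consequences": [{
        "transcript_id": "ENST00000450892",
        "gene_symbol": "STRC",
        "consequence_terms": ["missense_variant"],
        "sift_prediction": "deleterious",
        "sift_score": 0,
        "polyphen_prediction": "probably_damaging",
        "polyphen_score": 0.991,
    }]
}]

for row in summarize_consequences(sample):
    print(row["gene"], row["terms"], row["sift"], row["polyphen"])
```

Using .get() throughout keeps the loop from failing on transcripts that lack a prediction, which matters once the same code runs over non-missense variants.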
STRC Research Usage
- STRC Gene — coordinate reference
- STRC Variant c.4976A>C — Misha — VEP annotation
- Used for cDNA → genomic coordinate conversion
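Because STRC is transcribed from the minus strand, a cDNA-level substitution has to be complemented before it is written in plus-strand genomic terms. The sketch below covers only that base conversion, not the coordinate lookup itself; transcript_to_plus_strand is a hypothetical helper name, and the strand argument follows Ensembl's 1 / -1 convention.

```python
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Reverse-complement a DNA string (A<->T, C<->G)."""
    return seq.translate(_COMPLEMENT)[::-1]


def transcript_to_plus_strand(ref, alt, strand):
    """Express a transcript-oriented substitution as plus-strand alleles."""
    if strand == 1:
        return ref, alt
    if strand == -1:
        # Minus-strand genes: both alleles flip to their complement.
        return reverse_complement(ref), reverse_complement(alt)
    raise ValueError("strand must be 1 or -1")


# c.4976A>C on a minus-strand gene reads as T>G on the plus strand
print(transcript_to_plus_strand("A", "C", -1))
```

For single bases the reversal is a no-op, but using a full reverse complement keeps the helper correct for multi-base substitutions as well.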
Critical Notes
- STRC is on MINUS strand — be careful with position/base conversions
- Pseudogene — Ensembl annotates both STRC and STRCP1, verify you’re looking at the right one
- Rate limits — 15 requests/second, use POST for batch queries
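The rate-limit note above can be sketched as a small batching helper. It assumes the POST form of the VEP HGVS endpoint (a JSON body with an hgvs_notations list) and that a 429 response carries a Retry-After header. The function names, the retry count, and the batch size passed to chunked are illustrative choices, so the actual per-request maximum should be checked against the endpoint documentation. Only the standard library is used so the sketch runs without extra installs.

```python
import json
import time
import urllib.error
import urllib.request

VEP_HGVS_URL = "https://rest.ensembl.org/vep/human/hgvs"


def chunked(items, size):
    """Split a list into consecutive batches of at most `size` items."""
    if size < 1:
        raise ValueError("size must be >= 1")
    return [items[i:i + size] for i in range(0, len(items), size)]


def post_vep_batch(notations, max_retries=3):
    """POST one batch of HGVS strings to VEP and return the parsed JSON."""
    body = json.dumps({"hgvs_notations": list(notations)}).encode("utf-8")
    request = urllib.request.Request(
        VEP_HGVS_URL,
        data=body,
        headers={"Content-Type": "application/json", "Accept": "application/json"},
        method="POST",
    )
    for attempt in range(max_retries + 1):
        try:
            with urllib.request.urlopen(request, timeout=60) as response:
                return json.loads(response.read().decode("utf-8"))
        except urllib.error.HTTPError as err:
            # Re-raise anything that is not a rate-limit response,
            # and give up once the retry budget is spent.
            if err.code != 429 or attempt == max_retries:
                raise
            # Wait for as long as the server asks before trying again.
            time.sleep(float(err.headers.get("Retry-After", 1)))
```

A caller would loop over chunked(all_notations, batch_size) and extend a results list with each post_vep_batch(batch) return value; no request is sent until post_vep_batch is called.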
Results (April 2026)
- Regulatory features DONE: 3 features in STRC region — 1 promoter (43611040-43611264) + 2 enhancers (43602390-43602603, 43622231-43622692). E1659A at 43600551 is near the first enhancer.
- Transcript isoforms DONE: 13 STRC transcripts — 3 protein-coding (main: ENST00000450892, 29 exons), 3 NMD, 7 retained intron. E1659A is missense in the canonical 29-exon transcript.
- Next: batch VEP all STRC CDS variants, comparative genomics alignment blocks
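The "near the first enhancer" statement above can be checked with plain interval arithmetic. This sketch uses only the coordinates listed in this section; the feature labels (promoter, enhancer_1, enhancer_2) and the helper name are made up for illustration.

```python
def distance_to_interval(pos, start, end):
    """Distance in bp from pos to the closed interval [start, end]; 0 if inside."""
    if pos < start:
        return start - pos
    if pos > end:
        return pos - end
    return 0


# Regulatory features and variant position as listed in the results above
features = {
    "promoter": (43611040, 43611264),
    "enhancer_1": (43602390, 43602603),
    "enhancer_2": (43622231, 43622692),
}
variant_pos = 43600551  # E1659A

distances = {
    name: distance_to_interval(variant_pos, start, end)
    for name, (start, end) in features.items()
}
nearest = min(distances, key=distances.get)
print(nearest, distances[nearest])
```

With these coordinates the variant lies 1,839 bp from the first enhancer, 10,489 bp from the promoter, and 21,680 bp from the second enhancer, so it is close to the first enhancer but outside all three features.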
Connections
- UCSC Genome Browser [see-also] — visual genome browser
- AlphaGenome [depends-on] — for coordinate lookups
- STRC Gene [used-in]
- SpliceAI [see-also] — VEP plugin