Ensembl REST API

Programmatic access to Ensembl genomic data. Essential for coordinate lookups, variant effect prediction (VEP), gene annotations, and cross-species comparisons.

What It Does

  • Variant Effect Predictor (VEP) — predict consequences of variants
  • Gene/transcript/exon coordinates
  • Cross-species comparative genomics
  • Regulatory features (promoters, enhancers)
  • Sequence retrieval (DNA and protein)

How to Use

VEP (Most Important for STRC)

# Predict variant consequences
curl "https://rest.ensembl.org/vep/human/hgvs/NM_153700.2:c.4976A>C?" \
  -H "Content-Type: application/json"

Gene Lookup

# STRC gene info
curl "https://rest.ensembl.org/lookup/symbol/homo_sapiens/STRC?expand=1" \
  -H "Content-Type: application/json"
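
With expand=1 the lookup response also lists the gene's transcripts. The following is a minimal sketch of pulling out the canonical one using only the standard library. It assumes the response has a `Transcript` list whose entries carry `id` and `is_canonical` fields; check those names against a live response. `fetch_gene` and `canonical_transcript` are hypothetical helper names.

```python
import json
import urllib.request


def fetch_gene(symbol, species="homo_sapiens"):
    """GET the expanded gene record for a symbol from the lookup endpoint."""
    url = f"https://rest.ensembl.org/lookup/symbol/{species}/{symbol}?expand=1"
    req = urllib.request.Request(url, headers={"Content-Type": "application/json"})
    with urllib.request.urlopen(req, timeout=30) as resp:
        return json.load(resp)


def canonical_transcript(gene):
    """Return the transcript flagged canonical, or None if no entry is flagged."""
    # Assumption: transcripts sit under "Transcript" and carry "is_canonical".
    for tx in gene.get("Transcript", []):
        if tx.get("is_canonical"):
            return tx
    return None
```

Calling `canonical_transcript(fetch_gene("STRC"))` would then return the transcript record to compare against the transcript used in the VEP query.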

Sequence

# Get genomic sequence of the STRC region (forward strand as written; use :-1 for the gene's minus strand)
curl "https://rest.ensembl.org/sequence/region/human/15:43599563..43618800:1?" \
  -H "Content-Type: application/json"

Python

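import requests

# VEP query for the STRC variant, addressed by HGVS notation
r = requests.get(
    "https://rest.ensembl.org/vep/human/hgvs/NM_153700.2:c.4976A>C",
    headers={"Content-Type": "application/json"},
    timeout=30,
)
r.raise_for_status()  # fail loudly on HTTP errors such as 400 or 429
vep = r.json()
# transcript_consequences can be absent, and not every entry has a gene_symbol
for csq in vep[0].get("transcript_consequences", []):
    print(f"{csq.get('gene_symbol', '?')}: {csq['consequence_terms']}")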
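
The same consequence entries can also carry the SIFT and PolyPhen calls. The sketch below formats one entry as a single line. It assumes the keys are named `transcript_id`, `sift_prediction`, `sift_score`, `polyphen_prediction`, and `polyphen_score`, and that they are simply missing for entries without a prediction. `summarize_consequence` is a hypothetical helper name.

```python
def summarize_consequence(csq):
    """Format one transcript_consequences entry as a one-line summary."""
    terms = ",".join(csq.get("consequence_terms", []))
    parts = [f"{csq.get('transcript_id', '?')}: {terms}"]
    # Assumption: predictions use these key names and are absent when not applicable.
    for tool in ("sift", "polyphen"):
        prediction = csq.get(tool + "_prediction")
        score = csq.get(tool + "_score")
        if prediction is not None:
            parts.append(f"{tool}={prediction}({score})")
    return " ".join(parts)
```

In the loop above, `print(summarize_consequence(csq))` could replace the existing print call to show predictions next to the consequence terms.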

Verified Status

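VERIFIED: STRC gene lookup works. VEP returns missense_variant for c.4976A>C (p.Glu1659Ala, E1659A), with SIFT = deleterious (0) and PolyPhen = probably_damaging (0.991). Coordinates recorded: chr15:43,891,390-43,907,636 (GRCh38, minus strand). This range does not match the 43.60-43.62 Mb region used in the sequence query and the regulatory results in this note, so the assembly label should be rechecked.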
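
Because the gene is on the minus strand, the alleles in c.4976A>C are written on the coding strand and have to be reverse-complemented before they are compared with forward-strand (genomic) alleles. A small sketch of that conversion follows; `reverse_complement` and `coding_to_forward` are hypothetical helper names.

```python
# Translation table for complementing DNA bases (upper and lower case).
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def coding_to_forward(ref, alt):
    """Convert coding-strand alleles of a minus-strand gene to forward-strand alleles."""
    return reverse_complement(ref), reverse_complement(alt)


print(coding_to_forward("A", "C"))  # ('T', 'G')
```

So the A>C change in the transcript corresponds to T>G on the forward strand.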

STRC Research Usage

Critical Notes

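  • STRC is on the MINUS strand: coding positions increase as genomic coordinates decrease, and coding-strand alleles must be reverse-complemented for forward-strand (genomic) notation
  • Pseudogene: Ensembl annotates both STRC and STRCP1, so verify that results refer to the right one
  • Rate limits: 15 requests/second; use POST for batch queries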
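
A rough sketch of a batched VEP query that respects the rate limit, using only the standard library. It assumes the POST form of the HGVS endpoint accepts a JSON body with an `hgvs_notations` list, and that a rate-limited request returns HTTP 429 with a `Retry-After` header in seconds. The batch size of 100 is an arbitrary illustrative value, so check the endpoint documentation for the real maximum. All function names are hypothetical.

```python
import json
import time
import urllib.error
import urllib.request

VEP_HGVS_URL = "https://rest.ensembl.org/vep/human/hgvs"


def chunked(items, size):
    """Yield successive lists of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def post_vep_batch(notations, max_retries=3):
    """POST one batch of HGVS strings; wait and retry when rate-limited (HTTP 429)."""
    body = json.dumps({"hgvs_notations": notations}).encode("utf-8")
    headers = {"Content-Type": "application/json", "Accept": "application/json"}
    for _ in range(max_retries):
        req = urllib.request.Request(VEP_HGVS_URL, data=body, headers=headers, method="POST")
        try:
            with urllib.request.urlopen(req, timeout=60) as resp:
                return json.load(resp)
        except urllib.error.HTTPError as err:
            if err.code != 429:
                raise
            # Assumption: the server says how long to wait, in seconds.
            time.sleep(float(err.headers.get("Retry-After", 1)))
    raise RuntimeError("still rate-limited after retries")


def vep_all(notations, batch_size=100):
    """Run VEP over all notations, one POST per batch, and concatenate the results."""
    results = []
    for batch in chunked(notations, batch_size):
        results.extend(post_vep_batch(batch))
    return results
```

Batching keeps the request count far below the per-second limit, and the retry loop covers the case where the limit is hit anyway.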

Results (April 2026)

  • Regulatory features DONE: the STRC region has 3 features, 1 promoter (43611040-43611264) and 2 enhancers (43602390-43602603, 43622231-43622692). E1659A at 43600551 lies about 1.8 kb from the first enhancer and outside all three features.
  • Transcript isoforms DONE: 13 STRC transcripts — 3 protein-coding (main: ENST00000450892, 29 exons), 3 NMD, 7 retained intron. E1659A is missense in the canonical 29-exon transcript.
  • Next: batch VEP all STRC CDS variants, comparative genomics alignment blocks
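
The distance between the variant and each regulatory feature can be checked with a few lines of arithmetic. The coordinates below are the ones listed in the results; the labels `enhancer_1` and `enhancer_2` and the helper name are hypothetical.

```python
def distance_to_interval(pos, start, end):
    """Distance in bp from pos to the closed interval [start, end]; 0 if inside."""
    if pos < start:
        return start - pos
    if pos > end:
        return pos - end
    return 0


# Feature coordinates as recorded in the results above.
FEATURES = {
    "promoter": (43611040, 43611264),
    "enhancer_1": (43602390, 43602603),
    "enhancer_2": (43622231, 43622692),
}
VARIANT_POS = 43600551  # E1659A

distances = {
    name: distance_to_interval(VARIANT_POS, start, end)
    for name, (start, end) in FEATURES.items()
}
nearest = min(distances, key=distances.get)
print(nearest, distances[nearest])  # enhancer_1 1839
```

The variant sits 1,839 bp from the closest feature, so it does not overlap any annotated regulatory element.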

Connections