Amazon Bio Discovery and AWS HealthOmics
Two separate AWS services. Egor opened an account on April 15, 2026.
Amazon Bio Discovery
Launched April 14, 2026. Agentic AI for antibody drug discovery. No-code interface: you describe goals in natural language, and AI agents orchestrate model selection and pipeline creation.
Site: https://aws.amazon.com/biodiscovery/ Pricing: https://aws.amazon.com/biodiscovery/pricing/
What’s inside
40+ specialized biology models:
- BioPhi (humanization scoring)
- Boltz models (3D complex structure prediction + binding affinity)
- Diffusion models for de novo nanobody design
- Partners: Apheris, Boltz (now); Biohub, Profluent (soon)
- Upload your own custom models
Pricing (50% early access discount through October 15, 2026)
| Plan | Monthly | Experiment Units |
|---|---|---|
| Academic | Free | 5 |
| Starter | $180 | 5 |
| Pro | $486 | 15 |
| Pro+ | $2,142 | 70 |
First month: 5 free Experiment Units (EUs), no credit card required.
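As a quick sanity check on the table, the effective price per Experiment Unit can be worked out from the listed figures. This is only a sketch: it assumes the monthly prices above are the amounts actually charged (the note does not say whether the 50% discount is already included), and the name `PLANS` is illustrative.

```python
# Listed monthly price (USD) and included Experiment Units, copied from the table above.
PLANS = {
    "Starter": (180, 5),
    "Pro": (486, 15),
    "Pro+": (2142, 70),
}


def price_per_eu(monthly_usd: float, units: int) -> float:
    """Effective USD per Experiment Unit for one plan."""
    return monthly_usd / units


per_eu = {name: round(price_per_eu(usd, eu), 2) for name, (usd, eu) in PLANS.items()}

for name, value in per_eu.items():
    print(f"{name}: ${value:.2f} per EU")
```

On these numbers the per-unit price falls from $36.00 (Starter) to $30.60 (Pro+), so the larger plans only pay off if the units are actually used.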
Relevance to STRC: LOW
Bio Discovery is antibody/nanobody-only. It is not useful for variant effect prediction, structure prediction of arbitrary proteins, or gene therapy vector design, so it only becomes relevant if we pivot to antibody-based therapeutics for hearing loss.
AWS HealthOmics
The actual platform for our work. Managed infrastructure for bioinformatics workflows at scale.
Site: https://aws.amazon.com/healthomics/ Docs: https://docs.aws.amazon.com/omics/ Tutorials: https://github.com/aws-samples/aws-healthomics-tutorials
Available Models and Tools
Evo 2 (7B and 40B parameters)
- Published in Nature 2026. 1M token context window at single-nucleotide resolution.
- AWS sample repo: https://github.com/aws-samples/genomic-language-model-pretraining-with-healthomics-seq-store
- Requires A100 GPUs (p4 instances)
- This is the same model powering EVEE Evo Variant Effect Explorer
AlphaFold 2
- Ready2Run workflow on HealthOmics. Submit sequence, get structure.
- Two steps: MSA generation + structure prediction
- Pay only for compute runtime
- Directly useful for STRC Mini-STRC Single-Vector Hypothesis: submit mini-STRC sequences, compare to our existing AF3 results
ESM3 (EvolutionaryScale)
- 1.4B open-source on SageMaker JumpStart
- 7B and 98B proprietary models “coming soon”
- Reasons jointly over sequence, structure, and function
- GitHub: https://github.com/aws-samples/esm3-on-amazon-sagemaker
- Useful for: understanding which domains are dispensable in mini-STRC truncation design
ESMFold
- Faster than AlphaFold (no MSA required)
- Already tested locally for STRC: STRC ESMFold Disorder Validation
VEP (Variant Effect Predictor)
- Automated annotation workflows on HealthOmics
- Processes VCF files with functional predictions + ClinVar annotations
Pricing
- Free tier: $200 in credits for new customers (6-month window)
- Compute: per-second billing. Examples: omics.c.4xlarge = 0.259/hr
- Ready2Run AlphaFold: flat per-run fee
- Sequence store: $0.005769/gigabase/month
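To see how far the credits might stretch, the per-second billing and sequence store rates above can be combined into a rough estimate. This is a sketch under assumptions: the workload (one 90-minute task, 100 gigabases stored for 6 months) is hypothetical, the hourly figure recorded above is read as USD, and the function names are illustrative. Ready2Run flat fees are not included.

```python
SEQUENCE_STORE_USD_PER_GIGABASE_MONTH = 0.005769  # rate quoted in this note
FREE_CREDITS_USD = 200.0                           # new-customer credits quoted in this note


def compute_cost(runtime_seconds: float, usd_per_hour: float) -> float:
    """Per-second billing: pay for the exact runtime at the hourly rate."""
    return runtime_seconds * usd_per_hour / 3600.0


def storage_cost(gigabases: float, months: float) -> float:
    """Sequence store cost for a given volume held for a number of months."""
    return gigabases * months * SEQUENCE_STORE_USD_PER_GIGABASE_MONTH


# Hypothetical workload; the hourly rate is the figure from the note, assumed to be USD.
run = compute_cost(90 * 60, usd_per_hour=0.259)
store = storage_cost(100, months=6)
total = run + store

print(f"compute ${run:.4f} + storage ${store:.4f} = ${total:.4f}")
print(f"credits left: ${FREE_CREDITS_USD - total:.2f}")
```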
Potential Experiments for STRC
1. Evo 2 variant effect prediction (HIGH PRIORITY)
Run Misha’s variant (chr15:43600551:T:G) through Evo 2 directly.
- Get a zero-shot pathogenicity signal (log-likelihood difference between the alternate and reference alleles) independent of ClinVar/EVEE
- Get full disruption profile (splice, conservation, protein structure)
- DNA-level model bypasses STRC Pseudogene Problem
- Can’t run on Mac (CUDA-only deps, see EVEE Evo Variant Effect Explorer)
- Cheapest option: RunPod/Vast.ai A100 for 3.
- AWS HealthOmics p4 instance also works but may be more expensive for a single run
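Zero-shot variant scoring with a genomic language model is commonly framed as the difference in sequence log-likelihood between the alternate and reference alleles in the same window. The sketch below shows only that bookkeeping. The scorer `toy_log_likelihood` is a stand-in so the example runs without a GPU; it is not Evo 2 and not its API, and the window is a toy string rather than the real sequence around chr15:43600551.

```python
from typing import Callable


def make_ref_alt(window: str, offset: int, ref: str, alt: str) -> tuple[str, str]:
    """Return (reference, alternate) windows that differ only at `offset` (0-based)."""
    if window[offset] != ref:
        raise ValueError(f"expected {ref!r} at offset {offset}, found {window[offset]!r}")
    return window, window[:offset] + alt + window[offset + 1:]


def delta_score(window: str, offset: int, ref: str, alt: str,
                log_likelihood: Callable[[str], float]) -> float:
    """Alternate minus reference log-likelihood; more negative means less expected."""
    ref_seq, alt_seq = make_ref_alt(window, offset, ref, alt)
    return log_likelihood(alt_seq) - log_likelihood(ref_seq)


def toy_log_likelihood(seq: str) -> float:
    """Hypothetical stand-in scorer (penalises G); replace with a real model call."""
    return -float(seq.count("G"))


# Toy window with a T at offset 4; a real run would fetch the genomic context instead.
toy_window = "ACGTTACGA"
ref_seq, alt_seq = make_ref_alt(toy_window, 4, "T", "G")
score = delta_score(toy_window, 4, "T", "G", toy_log_likelihood)
print(alt_seq, score)
```

The reference-base check is worth keeping in a real run: it catches off-by-one and strand mistakes before any GPU time is spent.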
2. AlphaFold for mini-STRC constructs
Run AlphaFold 2 on our truncation candidates:
- Mini-STRC conservative (res 616-1775)
- Shorter mini-STRC (res 700-1775)
- Compare with our AF3 results (pTM 0.81 and 0.86)
- AF2 might give different confidence scores, so it is worth cross-validating
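Preparing the two constructs for submission comes down to slicing the full-length protein with 1-based, inclusive residue numbers and writing FASTA. A minimal sketch, assuming the full-length protein is 1775 residues (as the ranges above imply); the placeholder sequence, construct names, and helper names are illustrative.

```python
import textwrap


def truncate(sequence: str, start: int, end: int) -> str:
    """Slice residues start..end (1-based, inclusive) from a protein sequence."""
    if not 1 <= start <= end <= len(sequence):
        raise ValueError("residue range outside the sequence")
    return sequence[start - 1:end]


def to_fasta(name: str, sequence: str, width: int = 60) -> str:
    """Format one record as FASTA with fixed-width sequence lines."""
    return f">{name}\n" + "\n".join(textwrap.wrap(sequence, width)) + "\n"


# Placeholder of the assumed length; replace with the real STRC protein sequence.
full_length = "A" * 1775

constructs = {
    "mini_STRC_616_1775": truncate(full_length, 616, 1775),
    "mini_STRC_700_1775": truncate(full_length, 700, 1775),
}

for name, seq in constructs.items():
    print(name, len(seq))

fasta = "".join(to_fasta(name, seq) for name, seq in constructs.items())
```

The expected lengths are 1160 and 1076 residues, which is a cheap check that the slicing is not off by one.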
3. ESM3 for domain dispensability analysis
Use ESM3 to reason about which STRC domains are functionally critical:
- Feed full STRC sequence, ask about domain boundaries
- Compare with our pLDDT cut point analysis
- Might find better truncation points than res 616 or 700
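For the comparison with pLDDT-based cut points, one simple way to propose candidates from a per-residue pLDDT profile is to smooth it and report where it crosses a confidence threshold. This is an illustrative sketch and not necessarily the method behind the existing cut point analysis: the window size, the threshold of 70 (the conventional lower bound for "confident" pLDDT), and the toy profile are all assumptions.

```python
def smoothed(values: list[float], window: int) -> list[float]:
    """Centered moving average; edges use whatever neighbours are available."""
    half = window // 2
    out = []
    for i in range(len(values)):
        chunk = values[max(0, i - half): i + half + 1]
        out.append(sum(chunk) / len(chunk))
    return out


def candidate_cut_points(plddt: list[float], threshold: float = 70.0,
                         window: int = 5) -> list[int]:
    """1-based residues where smoothed pLDDT rises from below to at-or-above threshold."""
    s = smoothed(plddt, window)
    return [i + 1 for i in range(1, len(s)) if s[i - 1] < threshold <= s[i]]


# Toy profile: 10 low-confidence residues followed by 10 high-confidence ones.
toy_plddt = [40.0] * 10 + [90.0] * 10
cuts = candidate_cut_points(toy_plddt)
print(cuts)
```

Candidates found this way could then be set against the current choices (res 616 and 700) and against whatever ESM3 suggests.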
4. VEP annotation pipeline
Run standard VEP on c.4976A>C for clean automated annotation.
- Not new data, but standardized pipeline output
- Good for documentation and submission to ClinVar
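The identifiers used in this note for the variant (c.4976A>C, E1659A, chr15:43600551:T:G) can be cross-checked with simple arithmetic before relying on the pipeline output. The sketch assumes the gene is read from the reverse genomic strand, which is what a genomic T>G for a coding A>C implies, and that residue 1659 is glutamate, as the name E1659A states. The tiny codon table only covers the codons needed here.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}
# Subset of the standard genetic code: glutamate (E) and the alanine (A) codons it can reach.
CODONS = {"GAA": "E", "GAG": "E", "GCA": "A", "GCG": "A"}


def codon_position(cds_pos: int) -> tuple[int, int]:
    """Map a 1-based coding position to (codon number, position within codon 1..3)."""
    return (cds_pos + 2) // 3, (cds_pos - 1) % 3 + 1


codon, within = codon_position(4976)

# The coding-strand change A>C as seen on the opposite genomic strand.
genomic_change = (COMPLEMENT["A"], COMPLEMENT["C"])

# Apply A>C at that codon position to each glutamate codon.
changes = {c: c[:within - 1] + "C" + c[within:] for c in ("GAA", "GAG")}

print(codon, within, genomic_change, {c: CODONS[m] for c, m in changes.items()})
```

Position 4976 falls on the second base of codon 1659, both glutamate codons become alanine codons, and the complement of A>C is T>G, so the three notations agree with one another.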
Next Steps
- Check what’s accessible under the free tier / $200 credits
- Try Evo 2 inference on the variant (needs p4 GPU instance)
- Run AlphaFold Ready2Run on mini-STRC 700-1775
- Document results in STRC E1659A Computational Tool Audit
Connections
- STRC E1659A Computational Tool Audit — AWS tools as new entries
- STRC Mini-STRC Single-Vector Hypothesis — AlphaFold and ESM3 experiments
- STRC Pseudogene Problem — Evo 2 DNA-level approach bypasses this
- EVEE Evo Variant Effect Explorer — powered by Evo 2, same model available on AWS
- STRC ESMFold Disorder Validation — ESMFold also available on HealthOmics
- STRC AlphaFold3 Computational Experiments — compare AF2 on HealthOmics vs AF3 results