Recipe — Ant Colony Optimization for Peptide Sequence Design
P1 recipe — verbatim pseudocode and parameter set from 2014-schneider-de-novo-molecular-design-book §18.3 (Hiss & Schneider, MHC-I octapeptide design, pp. ~450–456). ACO is the most h09-relevant nature-inspired algorithm because it (a) handles discrete sequence space directly, (b) explores sequence diversity before exploiting hits, (c) tolerates noise in the fitness function, and (d) can be coupled to either an ANN or to a physical fitness (RADA16-like β-sheet propensity, charge complementarity, hydrogel stiffness predictor).
The book’s worked example designed octapeptides binding mouse MHC-I H-2K^b at 89% accuracy for stabilizers / 95% accuracy for non-stabilizers — strong evidence of practical viability.
Verbatim algorithm (§18.3, Hiss & Schneider)
Step 1. Initialize the pheromone matrix with an equal pheromone concentration of 0.05 for each amino acid at each residue position. Step 2. Generate a new amino acid sequence by choosing a residue for each position guided by the pheromone concentration (first round: random selection). Step 3. Evaluate the resulting sequence (apply fitness function). Step 4. Update the pheromone matrix at the respective residue positions proportional to the fitness value of the peptide generated and evaluated in Step 3. Step 5. Evaporate pheromone in all sequence positions (optional). Step 6. Go to Step 2 or terminate.
— Schneider 2014 §18.3.
Verbatim pseudocode (§18.3)
Length_of_peptide = 8
Seq_pos = 1
Prob_for_aa = {8 × 20} matrix, all elements initialized with 0.05
While (Seq_pos ≤ Length_of_peptide):
Selection(Seq_pos) = select residue aa based on distribution in Prob_for_aa
Seq_pos = Seq_pos + 1
Fitness = Evaluate_with_ANN(Selection)
UpdateFactor = (Fitness − 0.5) / 100
# Matrix pheromone update
For i = 1 to Length_of_peptide:
If Prob_for_aa(i, aa_in Selection(i)) + UpdateFactor ≤ 0.9:
Prob_for_aa(i, aa_in Selection(i)) =
Prob_for_aa(i, aa_in Selection(i)) + UpdateFactor
Else:
Prob_for_aa(i, aa_in Selection(i)) = 0.9
# If pheromone is treated as probability (sums to 1) the increase of a 'winning' aa
# (+UpdateFactor) must be compensated by subtracting (UpdateFactor / 19) from the
# remaining 19 amino acids; never let the residual fall below 0.1.
# Evaporation (only if pheromone is NOT a probability summing to 1):
For i = 1 to (all positions in Prob_for_aa):
If Prob_for_aa(i) − 0.005 ≥ 0.1:
Prob_for_aa(i) = Prob_for_aa(i) − 0.005
Else:
Prob_for_aa(i) = 0.1— Schneider 2014 §18.3 verbatim pseudocode.
Diversity-pushing variant (§18.3 Eq. 18.6)
To prevent over-convergence on already-chosen residues, scale the update factor by selection frequency:
UpdateFactor = (Fitness − 0.5) / (100 + AA_count · 10⁻³)
— Schneider 2014 Eq. 18.6.
Used when broader sequence diversity is more valuable than fastest convergence.
Parameter table (verbatim from §18.3)
| Parameter | Default | Notes |
|---|---|---|
| Pheromone matrix shape | {8 × 20} for octapeptide × 20 residues | Generalize: {N × 20} for N-mer |
| Initial pheromone | 0.05 | uniform prior over residues |
| Lower-bound clip | 0.1 | escape early convergence |
| Upper-bound clip | 0.9 | escape early convergence |
| UpdateFactor numerator offset | 0.5 | (Fitness − 0.5) — fitness < 0.5 contracts probability |
| UpdateFactor denominator | 100 (or 100 + AA_count·10⁻³) | controls step size |
| Evaporation rate | 0.005 per iteration (optional) | speeds convergence |
| Probability-sum compensation | UpdateFactor / 19 subtracted from non-winning aa | maintains Σ p = 1 |
| Colony size (number of agents) | 1–10 (study used range) | larger colony → faster convergence |
| Convergence test | 10⁴ iterations without pheromone change | empirical |
| Sequence-space cardinality | 20⁸ ≈ 25.6 × 10⁹ | for octapeptides |
How to use in STRC h09
h09 RADA16-like assembly motif design
- Choose peptide length N. RADA16 is 16-mer; (RADA)₃ is 12-mer; (EAK)₄ is 12-mer. Use N appropriate to the assembly geometry.
- Build the fitness function. Three options, in order of computational cost:
- Cheap: descriptor-based predictor of β-sheet propensity (Chou-Fasman, AGADIR, AmyloGram, Tango). ANN trained on RADA16 family analogues if you have data.
- Medium: AlphaFold3 dimer prediction with
pae_pred_dimeras fitness signal (see STRC AF3 Static Pocket Blindness to Loop Dynamics for caveats). - Expensive: all-atom MD with secondary-structure metric — only for top-5 candidates.
- Run ACO with the verbatim pseudocode above, swapping
Evaluate_with_ANN(...)for your chosen fitness. - Convergence diagnosis: track
Σ |Δp_i|across the matrix per iteration. Two-phase pattern (Hiss-Schneider Fig. 18.11): exploration phase (rising total |Δp|) then exploitation phase (falling). Stop after 10⁴ stable iterations. - Output: top-K sequences ranked by their fitness during the exploitation phase, NOT just the final converged sequence.
- Validate experimentally: synthesize top 10–20 sequences, test for self-assembly (CD spectroscopy for β-sheet, AFM for fiber morphology, rheology for hydrogel modulus).
h09 mutation-tolerance for active sequences
To find the minimal-mutation set that breaks self-assembly (a falsifier for a residue’s load-bearing role):
- Use SME (Eq. 18.5, Gaussian-distance mutation) instead of ACO.
- Distance matrix: Amino Acid Physicochemical Distance Matrix Grantham Modified.
- σ = 0.10–0.15 for conservative mutations; σ = 0.5 for broad mutation.
- Generate offspring library (~20–100 sequences), score each with the same fitness function, plot fitness vs. Hamming distance from parent.
h09 stability post-design
After ACO produces a candidate self-assembling peptide, apply Recipe — Therapeutic Peptide Stability Modifications for protease-resistance / serum-half-life improvements before in vivo testing.
Citation pattern for h09 ACO script docstring
# Sequence-space exploration via Ant Colony Optimization (ACO) per
# Hiss & Schneider 2014 (Schneider Ed., De novo Molecular Design, Ch. 18).
# Pheromone matrix {N × 20}, initialized at 0.05, bounded [0.1, 0.9].
# UpdateFactor = (Fitness − 0.5)/100; probability-sum compensation per §18.3.
# Worked validation: MHC-I octapeptide design, 89%/95% accuracy on stabilizing/
# non-stabilizing classes (Schneider 2014 §18.3 [57]).
Connections
[part-of]rada16-geometry[source]2014-schneider-de-novo-molecular-design-book[applies]index[see-also]Amino Acid Physicochemical Distance Matrix Grantham Modified[see-also]Recipe — Therapeutic Peptide Stability Modifications[see-also]Fragment Additivity Assumption and Superadditivity