Recipe — Ant Colony Optimization for Peptide Sequence Design

P1 recipe — verbatim pseudocode and parameter set from 2014-schneider-de-novo-molecular-design-book §18.3 (Hiss & Schneider, MHC-I octapeptide design, pp. ~450–456). ACO is the most h09-relevant nature-inspired algorithm because it (a) handles discrete sequence space directly, (b) explores sequence diversity before exploiting hits, (c) tolerates noise in the fitness function, and (d) can be coupled to either an ANN or to a physical fitness (RADA16-like β-sheet propensity, charge complementarity, hydrogel stiffness predictor).

The book’s worked example designed octapeptides binding mouse MHC-I H-2K^b at 89% accuracy for stabilizers / 95% accuracy for non-stabilizers — strong evidence of practical viability.

Verbatim algorithm (§18.3, Hiss & Schneider)

Step 1. Initialize the pheromone matrix with an equal pheromone concentration of 0.05 for each amino acid at each residue position. Step 2. Generate a new amino acid sequence by choosing a residue for each position guided by the pheromone concentration (first round: random selection). Step 3. Evaluate the resulting sequence (apply fitness function). Step 4. Update the pheromone matrix at the respective residue positions proportional to the fitness value of the peptide generated and evaluated in Step 3. Step 5. Evaporate pheromone in all sequence positions (optional). Step 6. Go to Step 2 or terminate.

— Schneider 2014 §18.3.

Verbatim pseudocode (§18.3)

Length_of_peptide = 8
Seq_pos = 1
Prob_for_aa = {8 × 20} matrix, all elements initialized with 0.05
 
While (Seq_pos ≤ Length_of_peptide):
    Selection(Seq_pos) = select residue aa based on distribution in Prob_for_aa
    Seq_pos = Seq_pos + 1
 
Fitness = Evaluate_with_ANN(Selection)
UpdateFactor = (Fitness − 0.5) / 100
 
# Matrix pheromone update
For i = 1 to Length_of_peptide:
    If Prob_for_aa(i, aa_in Selection(i)) + UpdateFactor ≤ 0.9:
        Prob_for_aa(i, aa_in Selection(i)) =
            Prob_for_aa(i, aa_in Selection(i)) + UpdateFactor
    Else:
        Prob_for_aa(i, aa_in Selection(i)) = 0.9
    # If pheromone is treated as probability (sums to 1) the increase of a 'winning' aa
    # (+UpdateFactor) must be compensated by subtracting (UpdateFactor / 19) from the
    # remaining 19 amino acids; never let the residual fall below 0.1.
 
# Evaporation (only if pheromone is NOT a probability summing to 1):
For i = 1 to (all positions in Prob_for_aa):
    If Prob_for_aa(i) − 0.005 ≥ 0.1:
        Prob_for_aa(i) = Prob_for_aa(i) − 0.005
    Else:
        Prob_for_aa(i) = 0.1

— Schneider 2014 §18.3 verbatim pseudocode.

Diversity-pushing variant (§18.3 Eq. 18.6)

To prevent over-convergence on already-chosen residues, scale the update factor by selection frequency:

UpdateFactor = (Fitness − 0.5) / (100 + AA_count · 10⁻³)

— Schneider 2014 Eq. 18.6.

Used when broader sequence diversity is more valuable than fastest convergence.

Parameter table (verbatim from §18.3)

Parameter	Default	Notes
Pheromone matrix shape	{8 × 20} for octapeptide × 20 residues	Generalize: {N × 20} for N-mer
Initial pheromone	0.05	uniform prior over residues
Lower-bound clip	0.1	escape early convergence
Upper-bound clip	0.9	escape early convergence
UpdateFactor numerator offset	0.5	(Fitness − 0.5) — fitness < 0.5 contracts probability
UpdateFactor denominator	100 (or 100 + AA_count·10⁻³)	controls step size
Evaporation rate	0.005 per iteration (optional)	speeds convergence
Probability-sum compensation	UpdateFactor / 19 subtracted from non-winning aa	maintains Σ p = 1
Colony size (number of agents)	1–10 (study used range)	larger colony → faster convergence
Convergence test	10⁴ iterations without pheromone change	empirical
Sequence-space cardinality	20⁸ ≈ 25.6 × 10⁹	for octapeptides

How to use in STRC h09

h09 RADA16-like assembly motif design

Choose peptide length N. RADA16 is 16-mer; (RADA)₃ is 12-mer; (EAK)₄ is 12-mer. Use N appropriate to the assembly geometry.
Build the fitness function. Three options, in order of computational cost:
- Cheap: descriptor-based predictor of β-sheet propensity (Chou-Fasman, AGADIR, AmyloGram, Tango). ANN trained on RADA16 family analogues if you have data.
- Medium: AlphaFold3 dimer prediction with pae_pred_dimer as fitness signal (see STRC AF3 Static Pocket Blindness to Loop Dynamics for caveats).
- Expensive: all-atom MD with secondary-structure metric — only for top-5 candidates.
Run ACO with the verbatim pseudocode above, swapping Evaluate_with_ANN(...) for your chosen fitness.
Convergence diagnosis: track Σ |Δp_i| across the matrix per iteration. Two-phase pattern (Hiss-Schneider Fig. 18.11): exploration phase (rising total |Δp|) then exploitation phase (falling). Stop after 10⁴ stable iterations.
Output: top-K sequences ranked by their fitness during the exploitation phase, NOT just the final converged sequence.
Validate experimentally: synthesize top 10–20 sequences, test for self-assembly (CD spectroscopy for β-sheet, AFM for fiber morphology, rheology for hydrogel modulus).

h09 mutation-tolerance for active sequences

To find the minimal-mutation set that breaks self-assembly (a falsifier for a residue’s load-bearing role):

Use SME (Eq. 18.5, Gaussian-distance mutation) instead of ACO.
Distance matrix: Amino Acid Physicochemical Distance Matrix Grantham Modified.
σ = 0.10–0.15 for conservative mutations; σ = 0.5 for broad mutation.
Generate offspring library (~20–100 sequences), score each with the same fitness function, plot fitness vs. Hamming distance from parent.

h09 stability post-design

After ACO produces a candidate self-assembling peptide, apply Recipe — Therapeutic Peptide Stability Modifications for protease-resistance / serum-half-life improvements before in vivo testing.

Citation pattern for h09 ACO script docstring

# Sequence-space exploration via Ant Colony Optimization (ACO) per
# Hiss & Schneider 2014 (Schneider Ed., De novo Molecular Design, Ch. 18).
# Pheromone matrix {N × 20}, initialized at 0.05, bounded [0.1, 0.9].
# UpdateFactor = (Fitness − 0.5)/100; probability-sum compensation per §18.3.
# Worked validation: MHC-I octapeptide design, 89%/95% accuracy on stabilizing/
# non-stabilizing classes (Schneider 2014 §18.3 [57]).

Connections

[part-of] rada16-geometry
[source] 2014-schneider-de-novo-molecular-design-book
[applies] index
[see-also] Amino Acid Physicochemical Distance Matrix Grantham Modified
[see-also] Recipe — Therapeutic Peptide Stability Modifications
[see-also] Fragment Additivity Assumption and Superadditivity

STRC Research

Explorer

Recipe — Ant Colony Optimization for Peptide Sequence Design

Recipe — Ant Colony Optimization for Peptide Sequence Design

Verbatim algorithm (§18.3, Hiss & Schneider)

Verbatim pseudocode (§18.3)

Diversity-pushing variant (§18.3 Eq. 18.6)

Parameter table (verbatim from §18.3)

How to use in STRC h09

h09 RADA16-like assembly motif design

h09 mutation-tolerance for active sequences

h09 stability post-design

Citation pattern for h09 ACO script docstring

Connections

Graph View

Table of Contents

Backlinks