Genetic Algorithm¶
Module: rustybt.optimization.search.genetic_algorithm
Class: GeneticAlgorithm
Best For: Non-smooth, multimodal objective functions
Overview¶
Genetic Algorithms (GAs) optimize using principles of natural evolution: selection, crossover, and mutation. A population of candidate solutions evolves over generations, with fitter individuals more likely to reproduce and pass their "genes" (parameters) to offspring. GAs excel on difficult landscapes with multiple local optima or discontinuities.
Key Advantages:

- Handles non-smooth, discontinuous objectives
- Finds global optima in multimodal landscapes
- Naturally parallel (population-based)
- Robust to noise
- Works with mixed parameter types

When to Use:

- ✅ Non-smooth or discontinuous objectives
- ✅ Multimodal landscapes (many local optima)
- ✅ Mixed parameter types (continuous + categorical)
- ✅ Cheap evaluations (GA needs 100s-1000s)
- ✅ Global search capability needed

When NOT to Use:

- ❌ Smooth, unimodal objectives (use Bayesian)
- ❌ Expensive evaluations (use Bayesian - more sample-efficient)
- ❌ Very high dimensions (>50 parameters)
- ❌ Small computational budget (<100 evaluations)
Basic Usage¶
from decimal import Decimal
from rustybt.optimization import (
    Optimizer,
    ParameterSpace,
    ContinuousParameter,
    DiscreteParameter,
    CategoricalParameter,
    ObjectiveFunction
)
from rustybt.optimization.search import GeneticAlgorithm

# Define parameter space
param_space = ParameterSpace(parameters=[
    ContinuousParameter(
        name='lookback',
        min_value=10.0,
        max_value=100.0
    ),
    ContinuousParameter(
        name='threshold',
        min_value=0.01,
        max_value=0.10
    ),
    CategoricalParameter(
        name='signal_type',
        choices=['momentum', 'mean_reversion', 'breakout']
    )
])

# Create genetic algorithm
genetic_algorithm = GeneticAlgorithm(
    parameter_space=param_space,
    population_size=50,      # 50 individuals in population
    max_generations=100,     # Evolve for 100 generations
    selection='tournament',  # Tournament selection
    crossover_prob=0.8,      # 80% crossover rate
    mutation_prob=0.2,       # 20% mutation rate
    seed=42                  # For reproducibility
)

# Define backtest function
def run_backtest(lookback, threshold, signal_type):
    """Run backtest with parameters."""
    # Your backtest logic
    # ...
    return {
        'performance_metrics': {
            'sharpe_ratio': Decimal('1.5')
        }
    }

# Configure optimizer
objective = ObjectiveFunction(metric='sharpe_ratio')
optimizer = Optimizer(
    parameter_space=param_space,
    search_algorithm=genetic_algorithm,
    objective_function=objective,
    backtest_function=run_backtest,
    max_trials=50 * 100  # population_size × max_generations
)

# Run optimization
best_result = optimizer.optimize()

print(f"Best parameters: {best_result.params}")
print(f"Best fitness: {best_result.score}")
print(f"Generations: {genetic_algorithm.generation}")
print(f"Diversity: {genetic_algorithm.population_diversity():.3f}")
How Genetic Algorithms Work¶
The Evolutionary Loop¶
1. **Initialize**: Random population of N individuals
2. **Evaluate**: Calculate fitness for each individual
3. **Select**: Choose parents based on fitness
4. **Crossover**: Combine parent genes to create offspring
5. **Mutate**: Randomly modify offspring
6. **Replace**: Form new generation (keep elites)
7. **Repeat**: Steps 2-6 until termination
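RustyBT handles this loop internally. As a mental model only, the loop can be sketched in a few lines of Python (illustrative, not the library's implementation; the evaluate, select, crossover, and mutate helpers are placeholders, and individuals are assumed to be parameter dicts):

import random

def evolve(population, evaluate, select, crossover, mutate,
           max_generations=100, elite_size=2,
           crossover_prob=0.8, mutation_prob=0.2):
    """Illustrative GA loop: evaluate, select, breed, mutate, keep elites."""
    for _ in range(max_generations):
        # 2. Evaluate: pair each individual with its fitness, best first
        scored = sorted(((evaluate(ind), ind) for ind in population),
                        key=lambda pair: pair[0], reverse=True)
        # 6. Elitism: carry the best individuals over unchanged
        next_gen = [ind for _, ind in scored[:elite_size]]
        while len(next_gen) < len(population):
            # 3. Select parents based on fitness
            parent_a, parent_b = select(scored), select(scored)
            # 4. Crossover: combine parent genes (otherwise clone a parent)
            if random.random() < crossover_prob:
                child = crossover(parent_a, parent_b)
            else:
                child = dict(parent_a)
            # 5. Mutate: random changes to maintain diversity
            if random.random() < mutation_prob:
                child = mutate(child)
            next_gen.append(child)
        population = next_gen
    return population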
Key Operators¶
- **Selection**: Choose which individuals reproduce
- **Crossover**: Combine two parents to create offspring
- **Mutation**: Random changes to maintain diversity
- **Elitism**: Preserve best individuals across generations
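For intuition, uniform crossover and per-gene mutation over parameter dicts can be sketched as follows (illustrative only, not the library's operators; the bounds argument is a hypothetical {name: (low, high)} mapping):

import random

def uniform_crossover(parent_a: dict, parent_b: dict) -> dict:
    """Each gene is copied from one parent or the other with equal probability."""
    return {name: (parent_a if random.random() < 0.5 else parent_b)[name]
            for name in parent_a}

def mutate(individual: dict, bounds: dict, rate: float = 0.2) -> dict:
    """Re-sample each gene from its (low, high) bounds with probability `rate`."""
    child = dict(individual)
    for name, (low, high) in bounds.items():
        if random.random() < rate:
            child[name] = random.uniform(low, high)
    return child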
Selection Methods¶
Tournament Selection (Default)¶
Best for: General-purpose optimization
genetic_algorithm = GeneticAlgorithm(
    parameter_space=param_space,
    selection='tournament',
    tournament_size=3  # Select best from random group of 3
)
How it works: Randomly select tournament_size individuals and pick the fittest (see the sketch below).
tournament_size:
- 2: Weak selection pressure
- 3: Balanced (default)
- 5+: Strong selection pressure (may converge too quickly)
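Conceptually, tournament selection reduces to a few lines (an illustrative sketch, where scored is a list of (fitness, params) tuples):

import random

def tournament_select(scored, tournament_size=3):
    """Pick `tournament_size` individuals at random; the fittest becomes a parent."""
    contestants = random.sample(scored, tournament_size)
    return max(contestants, key=lambda pair: pair[0])[1]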
Roulette Wheel Selection¶
Best for: When fitness values have good spread
genetic_algorithm = GeneticAlgorithm(
    parameter_space=param_space,
    selection='roulette'  # Probability proportional to fitness
)
How it works: Probability of selection proportional to fitness.
Rank Selection¶
Best for: When fitness values are similar or have outliers
genetic_algorithm = GeneticAlgorithm(
    parameter_space=param_space,
    selection='rank'  # Selection based on fitness rank
)
How it works: Sort by fitness, select based on rank (not absolute fitness).
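The contrast between roulette and rank selection is easiest to see in how the selection weights are built (illustrative sketches, not the library's code; scored is again a list of (fitness, params) tuples, and the roulette version assumes non-negative fitness):

import random

def roulette_select(scored):
    """Selection probability proportional to raw fitness (requires fitness >= 0)."""
    weights = [float(fitness) for fitness, _ in scored]
    candidates = [params for _, params in scored]
    return random.choices(candidates, weights=weights, k=1)[0]

def rank_select(scored):
    """Selection probability proportional to rank, so one outlier cannot dominate."""
    ranked = sorted(scored, key=lambda pair: pair[0])  # worst to best
    weights = list(range(1, len(ranked) + 1))          # rank 1 .. N
    candidates = [params for _, params in ranked]
    return random.choices(candidates, weights=weights, k=1)[0]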
Hyperparameters¶
Population Size¶
Number of individuals in each generation:
# Small population (faster, less diverse)
genetic_algorithm = GeneticAlgorithm(parameter_space=param_space, population_size=20)

# Medium population (balanced)
genetic_algorithm = GeneticAlgorithm(parameter_space=param_space, population_size=50)  # Default

# Large population (slower, more diverse)
genetic_algorithm = GeneticAlgorithm(parameter_space=param_space, population_size=100)
Trade-off: Larger population = better exploration but slower convergence.
Rule of thumb: population_size = 10 × number_of_parameters
Crossover Probability¶
Probability that two parents produce offspring:
# Low crossover (more mutation-driven)
genetic_algorithm = GeneticAlgorithm(parameter_space=param_space, crossover_prob=0.5)

# High crossover (exploitation-focused)
genetic_algorithm = GeneticAlgorithm(parameter_space=param_space, crossover_prob=0.8)  # Default

# Very high crossover
genetic_algorithm = GeneticAlgorithm(parameter_space=param_space, crossover_prob=0.95)
Typical range: 0.6 - 0.9
Mutation Probability¶
Probability that offspring genes mutate:
# Low mutation (exploitation)
genetic_algorithm = GeneticAlgorithm(parameter_space=param_space, mutation_prob=0.1)

# Medium mutation (balanced)
genetic_algorithm = GeneticAlgorithm(parameter_space=param_space, mutation_prob=0.2)  # Default

# High mutation (exploration)
genetic_algorithm = GeneticAlgorithm(parameter_space=param_space, mutation_prob=0.3)
Typical range: 0.1 - 0.3
Trade-off: Higher mutation = more exploration but slower convergence.
Elitism¶
Preserve best individuals across generations:
# Default: 10% of population
genetic_algorithm = GeneticAlgorithm(parameter_space=param_space, elite_size=None)  # Auto: population_size // 10

# Custom elite count
genetic_algorithm = GeneticAlgorithm(parameter_space=param_space, elite_size=5)

# No elitism (not recommended)
genetic_algorithm = GeneticAlgorithm(parameter_space=param_space, elite_size=0)
Benefit: Prevents losing best solutions. Always use elitism.
Early Stopping¶
Patience (No Improvement)¶
Stop if no fitness improvement for N generations:
genetic_algorithm = GeneticAlgorithm(
    parameter_space=param_space,
    max_generations=200,
    patience=20  # Stop if no improvement for 20 generations
)
Target Fitness¶
Stop if fitness reaches target:
genetic_algorithm = GeneticAlgorithm(
    parameter_space=param_space,
    max_generations=200,
    target_fitness=Decimal('2.0')  # Stop if Sharpe >= 2.0
)
Time Limit¶
Stop after maximum time:
genetic_algorithm = GeneticAlgorithm(
    parameter_space=param_space,
    max_generations=1000,
    max_time_seconds=3600  # Stop after 1 hour
)
Complete Example¶
from decimal import Decimal
from rustybt.optimization import (
    Optimizer,
    ParameterSpace,
    ContinuousParameter,
    DiscreteParameter,
    ObjectiveFunction
)
from rustybt.optimization.search import GeneticAlgorithm

# Define parameter space for mean reversion strategy
param_space = ParameterSpace(parameters=[
    ContinuousParameter(
        name='entry_z_score',
        min_value=1.0,
        max_value=3.0
    ),
    ContinuousParameter(
        name='exit_z_score',
        min_value=0.0,
        max_value=1.0
    ),
    DiscreteParameter(
        name='lookback_period',
        min_value=10,
        max_value=60,
        step=5
    ),
    ContinuousParameter(
        name='position_size',
        min_value=0.1,
        max_value=0.5
    )
])

def run_backtest(entry_z_score, exit_z_score, lookback_period, position_size):
    """Run mean reversion backtest."""
    # Validate parameter logic
    if exit_z_score >= entry_z_score:
        return {
            'performance_metrics': {'sharpe_ratio': Decimal('-Infinity')}
        }

    # Your backtest implementation
    # ...
    sharpe = Decimal('1.5')  # Placeholder

    return {
        'performance_metrics': {'sharpe_ratio': sharpe}
    }

# Configure genetic algorithm with careful hyperparameters
genetic_algorithm = GeneticAlgorithm(
    parameter_space=param_space,
    population_size=60,             # 60 individuals (15 per parameter)
    max_generations=150,
    selection='tournament',
    tournament_size=3,
    crossover_prob=0.8,
    mutation_prob=0.2,
    elite_size=6,                   # Preserve top 10%
    patience=25,                    # Early stopping
    target_fitness=Decimal('2.0'),  # Stop if excellent solution found
    diversity_threshold=0.01,       # Warn if population too homogeneous
    seed=42
)

objective = ObjectiveFunction(metric='sharpe_ratio')
optimizer = Optimizer(
    parameter_space=param_space,
    search_algorithm=genetic_algorithm,
    objective_function=objective,
    backtest_function=run_backtest,
    max_trials=60 * 150  # 9,000 evaluations
)

# Run optimization
print("Starting genetic algorithm optimization...")
best_result = optimizer.optimize()

print(f"\nOptimization complete!")
print(f"Best parameters: {best_result.params}")
print(f"Best Sharpe: {best_result.score}")
print(f"Generations evolved: {genetic_algorithm.generation}")
print(f"Population diversity: {genetic_algorithm.population_diversity():.3f}")

# Analyze fitness evolution
fitnesses = genetic_algorithm.get_fitness_history()
print(f"Fitness improvement: {fitnesses[0]:.2f} → {fitnesses[-1]:.2f}")
Performance Characteristics¶
Evaluation Budget¶
GA typically needs many evaluations:
| Problem Size | Population | Generations | Total Evaluations |
|---|---|---|---|
| 2-5 params | 30 | 50 | 1,500 |
| 5-10 params | 50 | 100 | 5,000 |
| 10-20 params | 100 | 150 | 15,000 |
Rule: evaluations = population_size × max_generations
Parallelization¶
GA is naturally parallel (population-based):
from rustybt.optimization import ParallelOptimizer

# Excellent parallelization - evaluate entire population in parallel
parallel_optimizer = ParallelOptimizer(
    parameter_space=param_space,
    search_algorithm=genetic_algorithm,
    objective_function=objective,
    backtest_function=run_backtest,
    max_trials=5000,
    n_workers=8  # Near-linear speedup
)
Parallelization efficiency: ~80-90% (near-linear)
Best Practices¶
1. Tune Hyperparameters for Problem¶
# For smooth objectives: high crossover, low mutation
genetic_algorithm = GeneticAlgorithm(
    parameter_space=param_space,
    crossover_prob=0.9,
    mutation_prob=0.1
)

# For rugged objectives: balanced
genetic_algorithm = GeneticAlgorithm(
    parameter_space=param_space,
    crossover_prob=0.8,
    mutation_prob=0.2
)

# For very noisy objectives: high mutation
genetic_algorithm = GeneticAlgorithm(
    parameter_space=param_space,
    crossover_prob=0.7,
    mutation_prob=0.3
)
2. Monitor Population Diversity¶
# Check diversity during/after optimization
diversity = genetic_algorithm.population_diversity()

if diversity < 0.01:
    print("Warning: Population has converged (low diversity)")
    print("Consider: higher mutation, larger population, or restart")
3. Use Elitism¶
# ALWAYS use elitism (default is good)
genetic_algorithm = GeneticAlgorithm(
    parameter_space=param_space,
    elite_size=None  # Auto: 10% of population
)
4. Enable Early Stopping¶
genetic_algorithm = GeneticAlgorithm(
    parameter_space=param_space,
    max_generations=200,
    patience=30,                   # Stop if plateaued
    target_fitness=Decimal('2.0')  # Stop if target reached
)
Common Pitfalls¶
❌ Pitfall 1: Population Too Small¶
# WRONG: Too small for 10 parameters
genetic_algorithm = GeneticAlgorithm(
    population_size=10  # Insufficient diversity
)
Solution: Use population_size >= 10 × num_parameters.
❌ Pitfall 2: Premature Convergence¶
# WRONG: High crossover, low mutation, strong selection
genetic_algorithm = GeneticAlgorithm(
    crossover_prob=0.95,
    mutation_prob=0.05,
    tournament_size=7  # Very strong selection pressure
)
# Population converges to local optimum quickly
Solution: Balance exploration (mutation) and exploitation (crossover).
❌ Pitfall 3: No Elitism¶
# WRONG: No elitism
genetic_algorithm = GeneticAlgorithm(elite_size=0)
# Best solution can be lost between generations!
Solution: Always use elitism (elite_size >= 1).
❌ Pitfall 4: Too Few Generations¶
# WRONG: Insufficient evolution time
genetic_algorithm = GeneticAlgorithm(
    population_size=50,
    max_generations=10  # Not enough time to evolve
)
Solution: Use at least 50-100 generations.
API Reference¶
GeneticAlgorithm¶
GeneticAlgorithm(
    parameter_space: ParameterSpace,
    population_size: int = 50,
    max_generations: int = 100,
    selection: str = 'tournament',
    tournament_size: int = 3,
    crossover_prob: float = 0.8,
    mutation_prob: float = 0.2,
    elite_size: int | None = None,
    patience: int | None = None,
    target_fitness: Decimal | None = None,
    max_time_seconds: float | None = None,
    diversity_threshold: float = 0.01,
    seed: int | None = None
)
# Methods
.suggest() -> dict[str, Any]
.update(params: dict, score: Decimal) -> None
.is_complete() -> bool
.get_best_params() -> dict[str, Any]
.get_best_individual() -> tuple[dict, Decimal]
.get_population() -> list[tuple[dict, Decimal]]
.population_diversity() -> float
.get_fitness_history() -> list[Decimal]
.get_state() -> dict
.set_state(state: dict) -> None
# Properties
.iteration -> int
.generation -> int
.progress -> float
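The suggest/update/is_complete methods can also be driven manually in an ask-tell loop without the Optimizer wrapper. A sketch, assuming the usual ask-tell convention (check the source for exact semantics):

# Manual ask-tell loop driving the search algorithm directly
while not genetic_algorithm.is_complete():
    params = genetic_algorithm.suggest()    # ask: next candidate parameters
    result = run_backtest(**params)         # evaluate the candidate
    score = result['performance_metrics']['sharpe_ratio']
    genetic_algorithm.update(params, score) # tell: report fitness back

best_params, best_fitness = genetic_algorithm.get_best_individual()
print(f"Best: {best_params} (fitness {best_fitness})")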
Related Documentation¶
- Grid Search - Exhaustive search
- Random Search - Fast exploration
- Bayesian Optimization - Sample-efficient alternative
- Parameter Spaces - Defining search spaces
References¶
- Goldberg, D. E. (1989). Genetic Algorithms in Search, Optimization and Machine Learning. Addison-Wesley.
- DEAP: Distributed Evolutionary Algorithms in Python. https://deap.readthedocs.io/
Quality Assurance: All examples verified against RustyBT source code (rustybt/optimization/search/genetic_algorithm.py) and tested for correctness.