Genetic Algorithm Optimization on Non-Smooth Functions¶
This notebook demonstrates genetic algorithm optimization on non-smooth, multimodal objective functions where GA excels compared to Bayesian optimization.
Contents¶
- Setup and imports
- Define Rastrigin function (non-smooth, multimodal)
- Run Genetic Algorithm optimization
- Compare with Bayesian optimization
- Visualize population evolution
- Diversity tracking analysis
1. Setup and Imports¶
from decimal import Decimal
import matplotlib.pyplot as plt
import numpy as np
from rustybt.optimization.parameter_space import ContinuousParameter, ParameterSpace
from rustybt.optimization.search.bayesian_search import BayesianOptimizer
from rustybt.optimization.search.genetic_algorithm import GeneticAlgorithm
# Set plot style
plt.style.use("seaborn-v0_8-darkgrid")
%matplotlib inline
2. Define Rastrigin Function¶
The Rastrigin function is a non-smooth, highly multimodal function with many local optima. It's a classic benchmark for testing optimization algorithms on difficult landscapes.
$$f(x, y) = 20 + x^2 + y^2 - 10(\cos(2\pi x) + \cos(2\pi y))$$
- Global optimum: $(0, 0)$ with $f(0, 0) = 0$
- Many local optima: Due to cosine terms
- Non-smooth: Discontinuities make gradient-based methods struggle
Why GA excels here:
- Population-based search explores multiple regions
- Doesn't rely on smoothness assumptions
- Crossover helps escape local optima
def rastrigin(params: dict[str, Decimal]) -> Decimal:
"""Rastrigin function (to maximize, so negate).
Args:
params: Dictionary with 'x' and 'y' parameters
Returns:
Negative Rastrigin value (for maximization)
"""
x = float(params["x"])
y = float(params["y"])
A = 10
result = 2 * A + (x**2 - A * np.cos(2 * np.pi * x)) + (y**2 - A * np.cos(2 * np.pi * y))
return Decimal(str(-result)) # Negate for maximization
# Visualize the Rastrigin function
x = np.linspace(-5.12, 5.12, 200)
y = np.linspace(-5.12, 5.12, 200)
X, Y = np.meshgrid(x, y)
A = 10
Z = 2 * A + (X**2 - A * np.cos(2 * np.pi * X)) + (Y**2 - A * np.cos(2 * np.pi * Y))
fig, ax = plt.subplots(figsize=(10, 8))
contour = ax.contour(X, Y, Z, levels=30, cmap="viridis")
ax.clabel(contour, inline=True, fontsize=8)
ax.plot(0, 0, "r*", markersize=20, label="Global optimum (0, 0)")
ax.set_xlabel("x")
ax.set_ylabel("y")
ax.set_title("Rastrigin Function: Non-smooth with Many Local Optima")
ax.legend()
plt.colorbar(contour, ax=ax, label="f(x, y)")
plt.show()
3. Run Genetic Algorithm Optimization¶
# Define parameter space
param_space = ParameterSpace(
parameters=[
ContinuousParameter(name="x", min_value=Decimal("-5.12"), max_value=Decimal("5.12")),
ContinuousParameter(name="y", min_value=Decimal("-5.12"), max_value=Decimal("5.12")),
]
)
# Initialize Genetic Algorithm
ga = GeneticAlgorithm(
parameter_space=param_space,
population_size=50,
max_generations=50,
selection="tournament",
tournament_size=3,
crossover_prob=0.8,
mutation_prob=0.2,
elite_size=5,
seed=42,
)
# Track all evaluated points for visualization
ga_evaluations = []
# Run optimization
iteration = 0
while not ga.is_complete():
params = ga.suggest()
score = rastrigin(params)
ga.update(params, score)
ga_evaluations.append((float(params["x"]), float(params["y"]), float(score)))
iteration += 1
if iteration % 100 == 0:
best_params, best_score = ga.get_best_result()
# Get final results
best_params, best_score = ga.get_best_result()
4. Compare with Bayesian Optimization¶
Let's compare GA with Bayesian optimization on the same problem to see how they differ.
# Initialize Bayesian Optimizer
bo = BayesianOptimizer(
parameter_space=param_space,
n_calls=2500, # Match GA evaluation budget
n_initial_points=50,
seed=42,
)
# Track all evaluated points
bo_evaluations = []
# Run optimization
iteration = 0
while not bo.is_complete():
params = bo.suggest()
score = rastrigin(params)
bo.update(params, score)
bo_evaluations.append((float(params["x"]), float(params["y"]), float(score)))
iteration += 1
if iteration % 100 == 0:
best_params_bo, best_score_bo = bo.get_best_result()
# Get final results
best_params_bo, best_score_bo = bo.get_best_result()
# Comparison
5. Visualize Population Evolution¶
Visualize how the GA population explores the search space over generations.
# Plot search trajectories
fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(16, 6))
# GA trajectory
ga_x = [e[0] for e in ga_evaluations]
ga_y = [e[1] for e in ga_evaluations]
ga_scores = [e[2] for e in ga_evaluations]
contour1 = ax1.contour(X, Y, Z, levels=20, cmap="gray", alpha=0.3)
scatter1 = ax1.scatter(ga_x, ga_y, c=ga_scores, s=20, cmap="viridis", alpha=0.6)
ax1.plot(0, 0, "r*", markersize=20, label="Global optimum")
ax1.plot(float(best_params["x"]), float(best_params["y"]), "g*", markersize=15, label="GA best")
ax1.set_xlabel("x")
ax1.set_ylabel("y")
ax1.set_title("Genetic Algorithm: Exploration Pattern")
ax1.legend()
plt.colorbar(scatter1, ax=ax1, label="Score (negated Rastrigin)")
# Bayesian trajectory
bo_x = [e[0] for e in bo_evaluations]
bo_y = [e[1] for e in bo_evaluations]
bo_scores = [e[2] for e in bo_evaluations]
contour2 = ax2.contour(X, Y, Z, levels=20, cmap="gray", alpha=0.3)
scatter2 = ax2.scatter(bo_x, bo_y, c=bo_scores, s=20, cmap="viridis", alpha=0.6)
ax2.plot(0, 0, "r*", markersize=20, label="Global optimum")
ax2.plot(
float(best_params_bo["x"]),
float(best_params_bo["y"]),
"g*",
markersize=15,
label="Bayesian best",
)
ax2.set_xlabel("x")
ax2.set_ylabel("y")
ax2.set_title("Bayesian Optimization: Exploration Pattern")
ax2.legend()
plt.colorbar(scatter2, ax=ax2, label="Score (negated Rastrigin)")
plt.tight_layout()
plt.show()
6. Analyze Generation Statistics¶
Examine how fitness and diversity evolve over generations.
# Get generation history
history = ga.get_generation_history()
fig, axes = plt.subplots(2, 2, figsize=(14, 10))
# Plot 1: Best fitness over generations
ax1 = axes[0, 0]
ax1.plot(
history["generation"],
[float(f) for f in history["best_fitness"]],
"b-",
linewidth=2,
label="Best fitness",
)
ax1.plot(
history["generation"],
[float(f) for f in history["avg_fitness"]],
"r--",
linewidth=2,
label="Avg fitness",
)
ax1.set_xlabel("Generation")
ax1.set_ylabel("Fitness (negated Rastrigin)")
ax1.set_title("Fitness Evolution")
ax1.legend()
ax1.grid(True)
# Plot 2: Diversity over generations
ax2 = axes[0, 1]
ax2.plot(history["generation"], history["diversity"], "g-", linewidth=2)
ax2.axhline(
y=ga.diversity_threshold,
color="r",
linestyle="--",
label=f"Threshold ({ga.diversity_threshold})",
)
ax2.set_xlabel("Generation")
ax2.set_ylabel("Population Diversity")
ax2.set_title("Diversity Tracking")
ax2.legend()
ax2.grid(True)
# Plot 3: Fitness improvement rate
ax3 = axes[1, 0]
best_fitness = [float(f) for f in history["best_fitness"]]
improvement = [0] + [best_fitness[i] - best_fitness[i - 1] for i in range(1, len(best_fitness))]
ax3.bar(history["generation"], improvement, color="blue", alpha=0.7)
ax3.set_xlabel("Generation")
ax3.set_ylabel("Fitness Improvement")
ax3.set_title("Generation-to-Generation Improvement")
ax3.grid(True)
# Plot 4: Convergence progress
ax4 = axes[1, 1]
# Distance from optimum (0, 0) for best individual per generation
# For visualization, we'll show cumulative evaluations
cumulative_evals = [(i + 1) * ga.population_size for i in history["generation"]]
ax4.plot(cumulative_evals, [float(f) for f in history["best_fitness"]], "b-", linewidth=2)
ax4.set_xlabel("Total Evaluations")
ax4.set_ylabel("Best Fitness")
ax4.set_title("Convergence Curve")
ax4.grid(True)
plt.tight_layout()
plt.show()
# Print statistics
Key Takeaways¶
When to Use Genetic Algorithms¶
Use GA when:
- Non-smooth objectives: Discontinuities, noise, or lack of gradients
- Multimodal landscapes: Many local optima that trap gradient-based methods
- Mixed parameter types: Continuous + discrete + categorical parameters
- Cheap evaluations: GA needs 100s-1000s of evaluations
- Exploration important: Want diverse solutions, not just single optimum
Don't use GA when:
- Smooth, unimodal: Bayesian optimization is more sample-efficient
- Expensive evaluations: Each backtest takes minutes/hours
- Very high dimensions: >50 parameters (curse of dimensionality)
GA Configuration Tips¶
- Population size: 20-100 (larger for harder problems)
- Selection: Tournament (default) is robust; roulette for fitness-proportional
- Crossover prob: 0.7-0.9 (higher = more exploitation)
- Mutation prob: 0.1-0.3 (higher = more exploration)
- Elite size: 10-20% of population (prevents losing best solutions)
- Diversity tracking: Monitor to detect premature convergence
Comparison with Bayesian Optimization¶
On the Rastrigin function (non-smooth, multimodal):
- GA: Better at escaping local optima, explores broadly
- Bayesian: Can get trapped in local optima, focuses too narrowly
On smooth functions (e.g., sphere, Rosenbrock):
- GA: Works but needs many evaluations
- Bayesian: More sample-efficient, converges faster