"""Function optimization benchmarks.
Function optimization benchmarks
================================
These functions are similar to the ones found in BBOB (Real-Parameter
Black-Box Optimization Benchmarking). Code and documentation are based on
the BBOB software. See the following links for more details:
* http://coco.gforge.inria.fr/doku.php
* http://coco.gforge.inria.fr/doku.php?id=downloads
* http://coco.lri.fr/downloads/download13.09/bbobdocfunctions.pdf
Test Functions
==============
Search Domain
-------------
All functions are defined and can be evaluated over :math:`R^D`, while the
actual search domain is given as :math:`[-5,5]^D`. On some functions a
penalty boundary handling is applied (see :func:`f_pen`).
Location of the optimum
-----------------------
The optimum is denoted as :math:`x_{opt}` and :math:`f_{opt} = f(x_{opt})`.
All functions have their global optimum in :math:`[-5,5]^D`. The majority
of functions have their global optimum in :math:`[-4,4]^D`, and for many of
them :math:`x_{opt}` is drawn uniformly from this compact set. The value of
:math:`f_{opt}` is rounded to two decimal places and clipped to
:math:`[-1000, 1000]`. In the function
definitions a transformed variable vector z is often used instead of the
argument x. The vector z has its optimum in :math:`z_{opt} = 0`, if not
stated otherwise.
Linear Transformations
----------------------
Linear transformations of the search space are applied to derive
non-separable functions from separable ones and to control the
conditioning of the function. We use multiplication with orthogonal
(rotation) matrices :math:`Q, R`. Orthogonal matrices are generated
from standard normally distributed entries by Gram-Schmidt
orthonormalization. Columns and rows of an orthogonal matrix form an
orthonormal basis.
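For illustration, here is a minimal sketch (mirroring
:func:`generate_rotation` below) showing that the Gram-Schmidt construction
yields a matrix whose rows are orthonormal::

    import numpy as np
    rs = np.random.RandomState(0)
    B = rs.randn(4, 4)
    for i in range(4):
        for j in range(i):
            B[i] -= B[i].dot(B[j]) * B[j]  # remove component along B[j]
        B[i] /= np.linalg.norm(B[i])       # normalize row to unit length
    assert np.allclose(B.dot(B.T), np.eye(4))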
Non-Linear Transformations and Symmetry Breaking
------------------------------------------------
To make relatively simple but well-understood functions less regular,
non-linear transformations are applied in x- or f-space on some functions.
Both transformations :math:`T_{osc}: R^n \\rightarrow R^n, n \\in \\{1,
D\\}`, and :math:`T_{asy}: R^D \\rightarrow R^D` are defined
coordinate-wise (see :func:`T_asy` and :func:`T_osc`). They are smooth
and have, coordinate-wise, a strictly positive derivative. :math:`T_{osc}`
is oscillating about the identity, where the oscillation is scale invariant
w.r.t. the origin. :math:`T_{asy}` is the identity for negative values. When
:math:`T_{asy}` is applied, a portion of :math:`1 / 2^D` of the search space
remains untransformed.
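As a quick sanity check of these properties (a sketch using :func:`T_asy`
and :func:`T_osc` defined below; the chosen points are arbitrary)::

    import numpy as np
    x = np.array([-3.0, -0.5, -2.0])
    # T_asy is the identity here because no component is positive
    assert np.allclose(T_asy(x, 0.2), x)
    # T_osc maps +-1 to itself because log|x| = 0 there
    assert T_osc(1.0) == 1.0 and T_osc(-1.0) == -1.0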
Function Properties
===================
Ill-Conditioning
----------------
Ill-conditioning is a typical challenge in real-parameter optimization and,
besides multimodality, probably the most common one. The conditioning of a
function can be rigorously formalized in the case of convex quadratic
functions, :math:`\\frac{1}{2} x^T H x` where H is a symmetric positive
definite matrix, as the condition number of the Hessian matrix H. Since
contour lines associated with a convex quadratic function are ellipsoids,
the condition number corresponds to the square of the ratio between the
longest and the shortest axis of the ellipsoid. For more general functions,
conditioning loosely refers to the square of the ratio between the lengths
of the longest and the shortest direction of a contour line. The testbed
contains ill-conditioned functions with a typical conditioning of
:math:`10^6`. We believe this is a realistic requirement; we have seen
practical problems with conditioning as large as :math:`10^{10}`.
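For example, for the convex quadratic function with Hessian
:math:`H = \\text{diag}(1, 10^6)`, the condition number is :math:`10^6`,
while the axis ratio of the elliptic contour lines is only :math:`10^3`
(a minimal sketch)::

    import numpy as np
    H = np.diag([1.0, 1e6])
    print(np.linalg.cond(H))  # ~1e6, ratio of largest to smallest eigenvalue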
Regularity
----------
Functions from simple formulas are often highly regular. We have used a
non-linear transformation, :math:`T_{osc}`, in order to introduce small,
smooth but clearly visible irregularities. Furthermore, the testbed contains
a few highly irregular functions.
Separability
------------
In general, separable functions pose an essentially different search
problem, because the search process can be reduced to D one-dimensional
search procedures. Consequently, non-separable problems must be considered
much more difficult, and most benchmark functions are designed to be
non-separable. The well-established technique to generate non-separable
functions from separable ones is the application of a rotation matrix R,
as sketched below.
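A minimal sketch of this technique (:func:`generate_rotation` and
:func:`rastrigin` are defined below; the rotated variant couples all
coordinates, so coordinate-wise search no longer suffices)::

    import numpy as np
    rs = np.random.RandomState(0)
    R = generate_rotation(rs, 2)
    g = lambda x: rastrigin(R.dot(x))  # non-separable variant of rastrigin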
Symmetry
--------
Stochastic search procedures often rely on Gaussian distributions to
generate new solutions, and it has been argued that symmetric benchmark
functions could be in favor of these operators. To avoid a bias in favor
of operators that exploit this symmetry, we have used a symmetry breaking
transformation, :math:`T_{asy}`. We have also included some highly
asymmetric functions.
Target function value to reach
------------------------------
The typical target function value for all functions is :math:`f_{opt} -
10^{-8}`. On many functions a value of :math:`f_{opt} - 1` is not very
difficult to reach, but the difficulty versus function value is not uniform
for all functions. These properties are not intrinsic, that is
:math:`f_{opt} - 10^{-8}` is not intrinsically "very good". The value mainly
reflects a scalar multiplier in the function definition.
"""
# Based on BBOB code (copyright 2011 The BBOB Team members)
# Author: Alexander Fabisch <afabisch@informatik.uni-bremen.de>
from abc import ABCMeta, abstractmethod
import numpy as np
import matplotlib.pyplot as plt
from mpl_toolkits.mplot3d import Axes3D  # noqa, registers the "3d" projection
from ..utils import check_random_state
from .environment import Environment
def generate_x_opt(random_state, n_dims):
"""Generate random optimum parameters.
    For many functions, :math:`x_{opt}` is drawn uniformly from
    :math:`[-4, 4]^D`. Each component of :math:`x_{opt}` is guaranteed to be
    nonzero and is rounded to 4 decimal digits.
Parameters
----------
random_state : RandomState
Random number generator
n_dims : int
Number of dimensions
Returns
-------
x_opt : array-like, shape (n_dims,)
Optimum parameter vector
"""
x_opt = np.round(random_state.uniform(-4.0, 4.0, (n_dims,)), 4)
x_opt[x_opt == 0] = -1e-5
return x_opt
def generate_f_opt(random_state):
"""Generate random optimal function value.
The value for :math:`f_{opt}` is drawn from a Cauchy distributed random
variable, with zero median and with roughly 50% of the values between
    -100 and 100. The value is rounded to two decimal places and clipped
to :math:`[-1000, 1000]`.
Parameters
----------
random_state : RandomState
        Random number generator

    Returns
    -------
    f_opt : float
        Optimal function value
"""
r = random_state.randn(2)
return np.clip(np.round(100 * r[0] / r[1], 2), -1000.0, 1000.0)
def T_osc(f):
"""Introduce small, smooth, but clearly visible irregularities.
.. math::
T_{osc}(x) = \\text{sign}(x)\\exp(\\hat{x} + 0.049(\\sin(c_1\\hat{x}) +
\\sin(c_2\\hat{x})))\\\\
\\hat{x} = \\begin{cases}
\\log{|x|} & \\text{if } x \\neq 0\\\\
0 & \\text{otherwise}
\\end{cases},
c_1 = \\begin{cases}
10 & \\text{if } x > 0\\\\
5.5 & \\text{otherwise}
\\end{cases},
c_2 = \\begin{cases}
7.9 & \\text{if } x > 0\\\\
3.1 & \\text{otherwise}
\\end{cases}
Parameters
----------
f : array-like, shape (n_dims,) or float
Input
Returns
-------
g : array-like, shape (n_dims,) or float
Output
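
    Examples
    --------
    At :math:`x = \\pm 1` we have :math:`\\hat{x} = 0`, so the oscillation
    term vanishes and the point is mapped to itself:

    >>> T_osc(1.0)
    1.0
    >>> T_osc(-1.0)
    -1.0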
"""
is_scalar = np.isscalar(f)
f = np.atleast_1d(f)
g = f.copy()
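    # The division of log|x| by 0.1 and the final ** 0.1 rescale the exponent
    # by a factor of 10, so 0.49 and the frequencies below reproduce the
    # coefficient 0.049 and c_1, c_2 = 10, 7.9 (x > 0) or 5.5, 3.1 (x < 0)
    # from the formula above.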
idx = f > 0
g[idx] = np.log(f[idx]) / 0.1
g[idx] = np.exp(g[idx] + 0.49 * (np.sin(g[idx]) +
np.sin(0.79 * g[idx]))) ** 0.1
idx = f < 0
g[idx] = np.log(-f[idx]) / 0.1
g[idx] = -np.exp(g[idx] + 0.49 * (np.sin(0.55 * g[idx]) +
np.sin(0.31 * g[idx]))) ** 0.1
if is_scalar:
        g = float(g[0])
return g
def T_asy(x, beta):
"""Symmetry breaking transformation.
.. math::
        T_{asy}^{\\beta}(x_i) = \\begin{cases}
        x_i^{1 + \\beta \\frac{i-1}{D-1}\\sqrt{x_i}} & \\text{if } x_i > 0\\\\
        x_i & \\text{otherwise}
        \\end{cases}
Parameters
----------
    x : array-like, shape (n_dims,)
        Input
    beta : float
        Strength of the asymmetric distortion
Returns
-------
z : array-like, shape (n_dims,)
Output
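
    Examples
    --------
    Negative components stay untouched, positive components are raised to a
    power that grows with the coordinate index:

    >>> z = T_asy(np.array([-2.0, 4.0]), 0.5)
    >>> float(z[0]), float(z[1])
    (-2.0, 16.0)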
"""
exponent = np.linspace(0, beta, len(x))
idx = np.where(x > 0)
z = x.copy()
z[idx] **= 1.0 + exponent[idx] * np.sqrt(x[idx])
return z
def Lambda(n_dims, alpha):
"""Diagonal matrix with exponentially increasing values.
.. math::
\\lambda_{ii} = \\alpha^{0.5 \\frac{i-1}{D-1}}
Parameters
----------
    n_dims : int
        Number of dimensions
    alpha : float
        Scaling base; the diagonal entries range from :math:`1` to
        :math:`\\sqrt{\\alpha}`
Returns
-------
Lambda : array-like, shape (n_dims,)
Diagonal part of the matrix
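
    Examples
    --------
    For :math:`\\alpha = 100` the diagonal ranges from :math:`1` to
    :math:`\\sqrt{100} = 10`:

    >>> d = Lambda(3, 100.0)
    >>> float(d[0]), float(d[-1])
    (1.0, 10.0)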
"""
return alpha ** np.linspace(0.0, 0.5, n_dims)
def generate_rotation(random_state, n_dims):
"""Returns an orthogonal basis.
The rotation is used in several ways and in combination with
non-linear transformations. Search space rotation invariant
algorithms are not expected to be invariant under this rotation.
Parameters
----------
random_state : RandomState
Random number generator
n_dims : int
Number of dimensions
Returns
-------
    B : array-like, shape (n_dims, n_dims)
        Orthogonal matrix whose rows and columns form an orthonormal basis
"""
B = random_state.randn(n_dims, n_dims)
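    # Gram-Schmidt orthonormalization of the rows of B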
for i in range(n_dims):
for j in range(i):
B[i] -= B[i].dot(B[j]) * B[j]
B[i] /= np.linalg.norm(B[i])
return B
def f_pen(x):
"""Penalty outside of the boundaries.
.. math::
f_{pen} = \\sum_{i=1}^D \\max (0, |x_i| - 5)^2
Parameters
----------
x : array-like, shape (n_dims,)
Input
Returns
-------
f_pen : float
Penalty
"""
return np.sum(np.maximum(0.0, np.abs(x) - 5.0) ** 2)
def linear_transformation(random_state, n_dims, alpha, return_rotations=False):
"""Generate a linear transformation.
The linear transformation has the form
    .. math::

        Q \\Lambda^{\\alpha} R

    where :math:`Q` and :math:`R` are orthogonal matrices and
    :math:`\\Lambda^{\\alpha}` is the diagonal matrix from :func:`Lambda`.
Parameters
----------
random_state : RandomState
Random number generator
n_dims : int
Number of dimensions
alpha : float
See :func:`Lambda`
    return_rotations : bool, optional (default: False)
        Whether to additionally return the rotation matrices Q and R

    Returns
    -------
    lin_trans : array-like, shape (n_dims, n_dims)
        The matrix :math:`Q \\Lambda^{\\alpha} R`
    Q : array-like, shape (n_dims, n_dims), optional
        First rotation, only returned if return_rotations is True
    R : array-like, shape (n_dims, n_dims), optional
        Second rotation, only returned if return_rotations is True
"""
Q = generate_rotation(random_state, n_dims)
R = generate_rotation(random_state, n_dims)
lin_trans = Q.dot(Lambda(n_dims, alpha)[:, np.newaxis] * R)
if return_rotations:
return lin_trans, Q, R
else:
return lin_trans
def rastrigin(x, penalize_norm=True):
"""Original Rastrigin function.
.. math::
f(x) = 10 \\left(D - \\sum_{i=1}^D \\cos ( 2 \\pi x_i) \\right) + ||x||^2
Parameters
----------
x : array-like, shape (n_dims,)
Input
penalize_norm : bool (default: True)
Add squared norm term to result
Returns
-------
f : float
Function value
"""
f = 10.0 * (x.shape[0] - np.sum(np.cos(2.0 * np.pi * x)))
if penalize_norm:
f += x.dot(x)
return f
def rosenbrock(x):
"""Original Rosenbrock function.
.. math::
        f(x) = \\sum_{i=1}^{D-1} 100 (x_i^2 - x_{i+1})^2 + (x_i - 1)^2
Parameters
----------
x : array-like, shape (n_dims,)
Input
Returns
-------
f : float
Function value
"""
return np.sum(100 * (x[:-1] ** 2 - x[1:]) ** 2 + (x[:-1] - 1) ** 2)
class ObjectiveFunctionBase(object):
"""Base of objective functions for black-box optimization.
    The default constructor of an objective function will initialize the
    optimal
function value :math:`f_{opt}` according to :func:`generate_f_opt` and
the optimal parameter vector :math:`x_{opt}` according to
:func:`generate_x_opt`.
Parameters
----------
random_state : RandomState or int
Random number generator or seed
n_dims : int
Number of dimensions
Attributes
----------
`x_opt` : array-like, shape (n_dims,)
Optimum parameter vector
`f_opt` : float
Maximum function value
"""
__metaclass__ = ABCMeta
def __init__(self, random_state, n_dims):
self.random_state = check_random_state(random_state)
self.n_dims = n_dims
self.x_opt = generate_x_opt(self.random_state, self.n_dims)
self.f_opt = generate_f_opt(self.random_state)
@abstractmethod
def feedback(self, x):
"""Evaluate function at a given point.
Parameters
----------
x : array-like, shape (n_dims,)
Parameters
Returns
-------
f : float
Function value
"""
class Sphere(ObjectiveFunctionBase):
"""Sphere function.
Properties
* unimodal
    * highly symmetric, in particular rotationally invariant and scale
      invariant
Information gained from this function
* What is the optimal convergence rate of an algorithm?
.. plot::
from contextual_optimization.objective_functions import plot_function
plot_function("Sphere")
"""
def __init__(self, random_state, n_dims):
super(Sphere, self).__init__(random_state, n_dims)
def feedback(self, x, x_opt_offset=None):
x_opt = self.x_opt
if x_opt_offset is not None:
x_opt = x_opt + x_opt_offset
return -(np.linalg.norm(x - x_opt) ** 2) + self.f_opt
class Ellipsoidal(ObjectiveFunctionBase):
"""Separable ellipsoid with monotone transformation.
Properties
* unimodal
* conditioning is about :math:`10^6`
Information gained from this function
* In comparison to Sphere: Is symmetry exploited?
* In comparison to EllipsoidalRotated: Is separability exploited?
.. plot::
from contextual_optimization.objective_functions import plot_function
plot_function("Ellipsoidal")
"""
def __init__(self, random_state, n_dims, condition=1e6):
super(Ellipsoidal, self).__init__(random_state, n_dims)
self.condition = condition
self.scales = self.condition ** np.linspace(0, 1, self.n_dims)
def feedback(self, x):
z = T_osc(x - self.x_opt)
return -np.dot(self.scales, z ** 2) + self.f_opt
class Rastrigin(ObjectiveFunctionBase):
"""Rastrigin with monotone transformation separable condition 10.
The implementation is based on :func:`rastrigin`.
Properties
Highly multimodal function with a comparatively regular structure for the
placement of the optima. The transformations T_asy and T_osc alleviate the
symmetry and regularity of the original Rastrigin function
    * roughly :math:`10^D` local optima
* conditioning is about 10
Information gained from this function
* In comparison to Ellipsoidal: What is the effect of multimodality?
.. plot::
from contextual_optimization.objective_functions import plot_function
plot_function("Rastrigin")
"""
def __init__(self, random_state, n_dims, condition=10.0):
super(Rastrigin, self).__init__(random_state, n_dims)
self.condition = condition
self.Lambda = Lambda(self.n_dims, self.condition)
def feedback(self, x, x_opt_offset=None):
x_opt = self.x_opt
if x_opt_offset is not None:
x_opt = x_opt + x_opt_offset
z = self.Lambda * T_asy(T_osc(x - x_opt), 0.2)
return -rastrigin(z) + self.f_opt
class BuecheRastrigin(ObjectiveFunctionBase):
"""Skew Rastrigin-Bueche, condition 10, skew-"condition" 100.
The implementation is based on :func:`rastrigin`.
Properties
Highly multimodal function with a structured but highly asymmetric
placement of the optima. Constructed as a deceptive function for
symmetrically distributed search operators.
* roughly :math:`10^D` local optima, conditioning is about 10, skew factor
is about 10 in x-space and 100 in f-space
Information gained from this function
* In comparison to Rastrigin: What is the effect of asymmetry?
.. plot::
from contextual_optimization.objective_functions import plot_function
plot_function("BuecheRastrigin")
"""
def __init__(self, random_state, n_dims, condition=10.0):
super(BuecheRastrigin, self).__init__(random_state, n_dims)
self.condition = condition
self.Lambda = Lambda(self.n_dims, self.condition)
def feedback(self, x):
z = T_osc(x - self.x_opt)
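        # tmp_z is a view into z, so every second component (odd index in the
        # 1-based BBOB notation) with positive value is scaled in place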
tmp_z = z[::2]
tmp_z[tmp_z > 0] *= 10.0
z *= self.Lambda
return -(rastrigin(z) + 100.0 * f_pen(x)) + self.f_opt
class LinearSlope(ObjectiveFunctionBase):
"""Linear slope.
Properties
Purely linear function testing whether the search can go outside the
initial convex hull of solutions right into the domain boundary.
* x_opt is on the domain boundary
Information gained from this function
* Can the search go outside the initial convex hull of solutions into the
domain boundary? Can the step size be increased accordingly?
.. plot::
from contextual_optimization.objective_functions import plot_function
plot_function("LinearSlope")
"""
def __init__(self, random_state, n_dims):
super(LinearSlope, self).__init__(random_state, n_dims)
self.x_opt = 5 * np.sign(self.x_opt)
self.scales = np.sign(self.x_opt) * Lambda(self.n_dims, 100)
self.offset = 5 * np.sum(np.abs(self.scales))
    def feedback(self, x):
        z = x.copy()
        # Clamp components beyond the optimum to the boundary value, i.e.
        # z_i = x_opt_i (= +-5) whenever x_opt_i * x_i > 25
        idx = self.x_opt * x > 25
        z[idx] = self.x_opt[idx]
        return -(self.offset - np.dot(self.scales, z)) + self.f_opt
class AttractiveSector(ObjectiveFunctionBase):
"""Attractive sector function.
Properties
Only a "hypercone" (with angular base area) with a volume of roughly
:math:`0.5^D` yields low function values. The optimum is located at the
tip of this cone. This function can be deceptive for cumulative step size
adaptation.
* highly asymmetric
* unimodal
Information gained from this function
* In comparison to Sphere: What is the effect of a highly asymmetric
landscape?
.. plot::
from contextual_optimization.objective_functions import plot_function
plot_function("AttractiveSector")
"""
def __init__(self, random_state, n_dims, alpha=100.0):
super(AttractiveSector, self).__init__(random_state, n_dims)
self.alpha = alpha
self.lin_trans = linear_transformation(self.random_state, self.n_dims,
self.alpha)
def feedback(self, x):
z = np.dot(self.lin_trans, x - self.x_opt)
z[z * self.x_opt > 0] *= self.alpha
return -T_osc(z.dot(z)) ** 0.9 + self.f_opt
class StepEllipsoidal(ObjectiveFunctionBase):
"""Step ellipsoidal function.
Properties
The function consists of many plateaus of different sizes. Apart from a
small area close to the global optimum, the gradient is zero almost
everywhere.
* unimodal, non-separable, conditioning is about 100
Information gained from this function
* Does the search get stuck on plateaus?
.. plot::
from contextual_optimization.objective_functions import plot_function
plot_function("StepEllipsoidal")
"""
def __init__(self, random_state, n_dims, condition=100.0, alpha=10.0):
super(StepEllipsoidal, self).__init__(random_state, n_dims)
self.condition = condition
self.alpha = alpha
R = generate_rotation(self.random_state, self.n_dims)
self.lin_trans = Lambda(self.n_dims, self.condition / 10)[:, np.newaxis] * R
self.Q = generate_rotation(self.random_state, self.n_dims)
self.scales = self.condition ** np.linspace(0, 1, self.n_dims)
def feedback(self, x):
z_hat = np.dot(self.lin_trans, x - self.x_opt)
z_sim = np.empty(self.n_dims)
        idx = np.abs(z_hat) > 0.5
z_sim[idx] = np.round(z_hat[idx])
idx = np.logical_not(idx)
z_sim[idx] = np.round(self.alpha * z_hat[idx]) / self.alpha
z = self.Q.dot(z_sim)
return -(0.1 * np.maximum(1e-4 * np.abs(z_hat[0]),
np.dot(self.scales, z ** 2)) +
100.0 * f_pen(z)) + self.f_opt
class Rosenbrock(ObjectiveFunctionBase):
"""Rosenbrock function.
The implementation is based on :func:`rosenbrock`.
Properties
So-called banana function due to its 2D contour lines as a bent ridge (or
valley). In the beginning, the prominent first term of the function
    definition attracts the search to the point z = 0. Then, a long bending
    valley needs to
be followed to reach the global optimum. The ridge changes its orientation
D-1 times.
    * partially separable (tri-band structure)
* in larger dimensions the function has a local optimum with an attraction
volume of about 25%
Information gained from this function
* Can the search follow a long path with D-1 changes in the direction?
.. plot::
from contextual_optimization.objective_functions import plot_function
plot_function("Rosenbrock")
"""
def __init__(self, random_state, n_dims):
super(Rosenbrock, self).__init__(random_state, n_dims)
self.scale = np.maximum(1, np.sqrt(self.n_dims) / 8)
def feedback(self, x, x_opt_offset=None):
x_opt = self.x_opt
if x_opt_offset is not None:
x_opt = x_opt + x_opt_offset
        z = self.scale * (x - x_opt) + 1
return -rosenbrock(z) + self.f_opt
class RosenbrockRotated(ObjectiveFunctionBase):
"""Rotated Rosenbrock function.
The implementation is based on :func:`rosenbrock`.
Properties
Rotated version of the Rosenbrock function
Information gained from this function
* In comparison to Rosenbrock: Can the search follow a long path with D-1
changes in the direction without exploiting partial separability?
.. plot::
from contextual_optimization.objective_functions import plot_function
plot_function("RosenbrockRotated")
"""
def __init__(self, random_state, n_dims):
super(RosenbrockRotated, self).__init__(random_state, n_dims)
scale = np.maximum(1, np.sqrt(self.n_dims) / 8)
self.R = scale * generate_rotation(self.random_state, self.n_dims)
self.x_opt = np.dot(0.5 * np.ones(self.n_dims), self.R.T) / scale ** 2
def feedback(self, x):
z = x.dot(self.R) + 0.5
return -rosenbrock(z) + self.f_opt
class EllipsoidalRotated(ObjectiveFunctionBase):
"""Rotated ellipsoid with monotone transformation.
Properties
Globally quadratic ill-conditioned function with smooth local
irregularities, non-separable counterpart to Ellipsoidal
* unimodal
* conditioning is about :math:`10^6`
Information gained from this function
* In comparison to Ellipsoidal: What is the effect of rotation
(non-separability)?
.. plot::
from contextual_optimization.objective_functions import plot_function
plot_function("EllipsoidalRotated")
"""
def __init__(self, random_state, n_dims, condition=1e6):
super(EllipsoidalRotated, self).__init__(random_state, n_dims)
self.condition = condition
self.scales = self.condition ** np.linspace(0, 1, self.n_dims)
self.R = generate_rotation(self.random_state, self.n_dims)
def feedback(self, x):
z = T_osc(np.dot(self.R, x - self.x_opt))
f = np.dot(self.scales, z ** 2)
return -f + self.f_opt
class Discus(ObjectiveFunctionBase):
"""Discus (tablet) with monotone transformation.
Properties
Globally quadratic function with irregularities. A single direction
in search space is a thousand times more sensitive than all others.
* conditioning is about :math:`10^6`
Information gained from this function
* In comparison to EllipsoidalRotated: What is the effect of constraints?
.. plot::
from contextual_optimization.objective_functions import plot_function
plot_function("Discus")
"""
def __init__(self, random_state, n_dims, condition=1e6):
super(Discus, self).__init__(random_state, n_dims)
self.condition = condition
self.R = generate_rotation(self.random_state, self.n_dims)
def feedback(self, x):
z = T_osc((x - self.x_opt).dot(self.R))
return -((self.condition - 1) * z[0] ** 2 + z.dot(z)) + self.f_opt
class BentCigar(ObjectiveFunctionBase):
"""Bent cigar with asymmetric space distortion.
Properties
A ridge defined as :math:`\\sum_{i=2}^D z_i^2 = 0` needs to be followed.
The ridge is smooth but very narrow. Due to :math:`T_{asy}^{0.5}` the overall
shape deviates remarkably from being quadratic.
* conditioning is about :math:`10^6`
* rotated
* unimodal
Information gained from this function
* Can the search continuously change its search direction?
.. plot::
from contextual_optimization.objective_functions import plot_function
plot_function("BentCigar")
"""
def __init__(self, random_state, n_dims, condition=1e6, beta=0.5):
super(BentCigar, self).__init__(random_state, n_dims)
self.condition = condition
self.beta = beta
self.R = generate_rotation(self.random_state, self.n_dims)
def feedback(self, x):
z = T_asy((x - self.x_opt).dot(self.R), self.beta).dot(self.R)
return -(self.condition * z.dot(z) +
(1 - self.condition) * z[0] ** 2) + self.f_opt
class SharpRidge(ObjectiveFunctionBase):
"""Sharp ridge.
Properties
    As for BentCigar, a ridge defined as :math:`\\sum_{i=2}^D
    z_i^2 = 0` must be followed. The ridge is sharp (non-differentiable) and
    the gradient remains constant when the ridge is approached from a given
    point. Approaching the ridge is initially effective, but becomes
    ineffective close to the ridge, where the ridge needs to be followed in
    the :math:`z_1`-direction to its optimum. The necessary change in "search
    behavior" close to the ridge is difficult to diagnose, because the
    gradient towards the ridge does not flatten out.
Information gained from this function
* In comparison to BentCigar: What is the effect of non-smooth,
non-differentiable ridge?
.. plot::
from contextual_optimization.objective_functions import plot_function
plot_function("SharpRidge")
"""
def __init__(self, random_state, n_dims):
super(SharpRidge, self).__init__(random_state, n_dims)
self.lin_trans = linear_transformation(self.random_state, self.n_dims,
10)
def feedback(self, x):
z = np.dot(self.lin_trans, x - self.x_opt)
return -(z[0] ** 2 + 100.0 * np.sqrt(np.sum(z[1:] ** 2))) + self.f_opt
class DifferentPowers(ObjectiveFunctionBase):
"""Abstract Sum of different powers.
Properties
    Due to the different exponents, the sensitivities of the
    :math:`z_i`-variables become more and more different when approaching the
    optimum.
* unimodal
* small solution volume
* rotated
Information gained from this function
* In comparison to EllipsoidalRotated: What is the effect of missing
self-similarity?
.. plot::
from contextual_optimization.objective_functions import plot_function
plot_function("DifferentPowers")
"""
def __init__(self, random_state, n_dims):
super(DifferentPowers, self).__init__(random_state, n_dims)
self.R = generate_rotation(self.random_state, self.n_dims)
def feedback(self, x):
z = np.dot(self.R, x - self.x_opt)
return -(np.sum(np.abs(z) ** np.linspace(2, 6, self.n_dims))) + self.f_opt
class RastriginRotated(ObjectiveFunctionBase):
"""Rastrigin with asymmetric non-linear distortion.
The implementation is based on :func:`rastrigin`.
Properties
Prototypical highly multimodal function which has originally a very
regular and symmetric structure for the placement of the optima. The
transformations T_asy and T_osc alleviate the symmetry and regularity
of the original Rastrigin function.
* non-separable less regular counterpart of Rastrigin
* roughly 10^D local optima
* conditioning is about 10
* global amplitude large compared to local amplitudes
Information gained from this function
* In comparison to Rastrigin: What is the effect of non-separability
for a highly multimodal function?
.. plot::
from contextual_optimization.objective_functions import plot_function
plot_function("RastriginRotated")
"""
def __init__(self, random_state, n_dims):
super(RastriginRotated, self).__init__(random_state, n_dims)
self.lin_trans, _, self.R = linear_transformation(
self.random_state, self.n_dims, 10, return_rotations=True)
def feedback(self, x):
z = np.dot(self.lin_trans, T_asy(T_osc(np.dot(self.R, x - self.x_opt)),
0.2))
return -rastrigin(z) + self.f_opt
class Weierstrass(ObjectiveFunctionBase):
"""Weierstrass.
Properties
Highly rugged and moderately repetitive landscape, where the global
optimum is not unique.
    * the term :math:`\\sum_k 0.5^k \\cos(2 \\pi 3^k \\dots)` introduces the
ruggedness, where lower frequencies have a larger weight :math:`0.5^k`
* rotated
* locally irregular
* non-unique global optimum
Information gained from this function
* In comparison to SchaffersF7: Does ruggedness or a repetitive landscape
      deter the search behavior?
.. plot::
from contextual_optimization.objective_functions import plot_function
plot_function("Weierstrass")
"""
def __init__(self, random_state, n_dims):
super(Weierstrass, self).__init__(random_state, n_dims)
self.lin_trans, self.R, _ = linear_transformation(
self.random_state, self.n_dims, 0.01, return_rotations=True)
self.k_range = np.arange(12)
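        # f0 is the value of the inner oscillation sum at z_i = 0; it is
        # subtracted in feedback() so that the optimum has value zero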
self.f0 = np.sum(0.5 ** self.k_range * np.cos(np.pi * 3 ** self.k_range))
def feedback(self, x):
z = np.dot(self.lin_trans, T_osc(np.dot(self.R, x - self.x_opt)))
return -(10.0 * (np.sum([0.5 ** self.k_range *
np.cos(2 * np.pi * 3 ** self.k_range *
(zi + 0.5)) / self.n_dims
for zi in z]) - self.f0) ** 3 +
10.0 / self.n_dims * f_pen(x)) + self.f_opt
class SchaffersF7(ObjectiveFunctionBase):
"""Schaffers F7 with asymmetric non-linear transformation.
Properties
A highly multimodal function where frequency and amplitude of the
modulation vary.
* asymmetric
* rotated
* conditioning is low
Information gained from this function
* In comparison to RastriginRotated: What is the effect of multimodality
on a less regular function?
.. plot::
from contextual_optimization.objective_functions import plot_function
plot_function("SchaffersF7")
"""
def __init__(self, random_state, n_dims, condition=10.0):
super(SchaffersF7, self).__init__(random_state, n_dims)
self.condition = condition
self.R = generate_rotation(self.random_state, self.n_dims)
self.Q = generate_rotation(self.random_state, self.n_dims)
self.Lambda = Lambda(self.n_dims, self.condition)
def feedback(self, x):
z = np.diag(self.Lambda).dot(self.Q).dot(T_asy(self.R.dot(x - self.x_opt), 0.5))
s = np.sqrt(z[:-1] ** 2 + z[1:] ** 2)
f = np.mean(np.sqrt(s) + np.sqrt(s) * np.sin(50 * s ** 0.2) ** 2) ** 2
return -(f + 10.0 * f_pen(x)) + self.f_opt
class SchaffersF7Ill(SchaffersF7):
"""Schaffers F7 with asymmetric non-linear transformation, ill-conditioned.
Properties
Moderately ill-conditioned counterpart to SchaffersF7.
* conditioning is about 1000
Information gained from this function
* In comparison to SchaffersF7: What is the effect of ill-conditioning?
.. plot::
from contextual_optimization.objective_functions import plot_function
plot_function("SchaffersF7Ill")
"""
def __init__(self, random_state, n_dims):
super(SchaffersF7Ill, self).__init__(random_state, n_dims, 1000.0)
class CompositeGriewankRosenbrockF8F2(ObjectiveFunctionBase):
"""F8F2 sum of Griewank-Rosenbrock 2-D blocks.
    The 2-D blocks are the summands of :func:`rosenbrock`.
Properties
Resembling the Rosenbrock function in a highly multimodal way.
Information gained from this function
* In comparison to RosenbrockRotated: What is the effect of high
signal-to-noise ratio?
.. plot::
from contextual_optimization.objective_functions import plot_function
plot_function("CompositeGriewankRosenbrockF8F2")
"""
def __init__(self, random_state, n_dims):
super(CompositeGriewankRosenbrockF8F2, self).__init__(
random_state, n_dims)
self.scale = np.maximum(1.0, np.sqrt(self.n_dims) / 8.0)
self.R = generate_rotation(self.random_state, self.n_dims)
self.x_opt = np.dot(0.5 / self.scale * np.ones(self.n_dims), self.R)
    def feedback(self, x):
        z = self.scale * np.dot(self.R, x) + 0.5
        # Per-pair Rosenbrock terms (the s_i of the BBOB definition); summing
        # them up front via rosenbrock() would collapse them to a scalar
        s = 100.0 * (z[:-1] ** 2 - z[1:]) ** 2 + (z[:-1] - 1) ** 2
        f = 10.0 + 10.0 * np.mean(s / 4000.0 - np.cos(s))
        return -f + self.f_opt
class Schwefel(ObjectiveFunctionBase):
"""Schwefel with tridiagonal variable transformation.
Properties
The most prominent 2^D minima are located comparatively close to the
corners of the unpenalized search area.
* the penalization is essential, as otherwise more and better optima
occur further away from the search space origin
* diagonal structure
    * partially separable
* combinatorial problem
* two search regimes
Information gained from this function
* In comparison to e.g. SchaffersF7: What is the effect of a weak
global structure?
.. plot::
from contextual_optimization.objective_functions import plot_function
plot_function("Schwefel")
"""
def __init__(self, random_state, n_dims):
super(Schwefel, self).__init__(random_state, n_dims)
self.x_signs = np.sign(self.x_opt)
self.x_opt = 0.5 * 4.2096874633 * self.x_signs
self.Lambda = Lambda(self.n_dims, 10.0)
    def feedback(self, x):
        x_hat = 2 * self.x_signs * x
        z_hat = x_hat.copy()
        # BBOB uses 2 |x_opt_i| (= 4.2096874633) as the shift in both steps
        z_hat[1:] += 0.25 * (x_hat[:-1] - 2 * np.abs(self.x_opt[:-1]))
        z = 100.0 * (self.Lambda * (z_hat - 2 * np.abs(self.x_opt)) +
                     2 * np.abs(self.x_opt))
        fpen = 100.0 * f_pen(z / 100.0)
        f = 4.189828872724339 - 0.01 * np.mean(z * np.sin(np.sqrt(np.abs(z))))
        return -(f + fpen) + self.f_opt
class GallaghersGaussian101mePeaks(ObjectiveFunctionBase):
"""Gallagher with 101 Gaussian peaks, condition 30, one global rotation.
Properties
    * 101 optima with position and height being unrelated and randomly chosen
(different for each instantiation of the function).
* the conditioning around the global optimum is about 30
Information gained from this function
* Is the search effective without any global structure?
.. plot::
from contextual_optimization.objective_functions import plot_function
plot_function("GallaghersGaussian101mePeaks")
"""
def __init__(self, random_state, n_dims, conditioning=1000.0 ** 0.5, n_peaks=101):
# TODO how did they generate different rotations for different peaks?
super(GallaghersGaussian101mePeaks, self).__init__(random_state, n_dims)
self.alpha_base = conditioning ** 2
self.n_peaks = n_peaks
self.w = np.hstack([10.0, 1.1 + np.linspace(0, 8, self.n_peaks - 1)])
self.R = generate_rotation(self.random_state, self.n_dims)
self.alpha = np.empty(self.n_peaks)
self.alpha[0] = 1000.0
self.alpha[1:] = self.alpha_base ** np.linspace(0, 2, self.n_peaks - 1)
self.random_state.shuffle(self.alpha)
self.C = np.empty((self.n_peaks, self.n_dims))
for i in range(self.n_peaks):
self.C[i] = (Lambda(self.n_dims, self.alpha[i]) /
self.alpha[i] ** 0.25)
self.Y = np.empty((self.n_peaks, self.n_dims))
self.Y[0] = self.random_state.uniform(-5, 5, (self.n_dims,))
self.Y[1:] = self.random_state.uniform(-4, 4, (self.n_peaks - 1,
self.n_dims))
self.x_opt = self.Y[0]
def feedback(self, x):
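        # rows of D are the offsets of x to the peak centres y_i, expressed
        # in the rotated coordinate frame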
D = (x - self.Y).dot(self.R.T)
p = np.exp(-0.5 / self.n_dims *
np.array([D[i].T.dot(np.diag(self.C[i])).dot(D[i])
for i in range(self.n_peaks)]))
return -(T_osc(10.0 - np.max(self.w * p)) ** 2 + f_pen(x)) + self.f_opt
class GallaghersGaussian21hiPeaks(GallaghersGaussian101mePeaks):
"""Gallagher with 21 Gaussian peaks, condition 1000, one global rotation.
Properties
* 21 optima with position and height being unrelated and randomly chosen
(different for each instantiation of the function).
* the conditioning around the global optimum is about 1000
Information gained from this function
* In comparison to GallaghersGaussian101mePeaks: What is the effect of
higher condition?
.. plot::
from contextual_optimization.objective_functions import plot_function
plot_function("GallaghersGaussian21hiPeaks")
"""
def __init__(self, random_state, n_dims):
super(GallaghersGaussian21hiPeaks, self).__init__(
random_state, n_dims, 1000.0, 21)
class Katsuura(ObjectiveFunctionBase):
"""Katsuura function.
Properties
Focus on global search behavior.
* highly rugged
* highly repetitive
* more than 10^D global optima
Information gained from this function
* What is the effect of regular local structure on the global search?
.. plot::
from contextual_optimization.objective_functions import plot_function
plot_function("Katsuura")
"""
def __init__(self, random_state, n_dims):
super(Katsuura, self).__init__(random_state, n_dims)
R = generate_rotation(self.random_state, self.n_dims)
Q = generate_rotation(self.random_state, self.n_dims)
self.lin_trans = Q.dot(Lambda(self.n_dims, 100)[:, np.newaxis] * R)
self.scale = 10.0 / self.n_dims ** 2
    def feedback(self, x):
        z = self.lin_trans.dot(x - self.x_opt)
        # Sum over j separately for each dimension (axis=0) before taking the
        # product of the per-dimension factors
        return -(self.scale * np.prod(
            1 + np.arange(1, self.n_dims + 1) *
            np.sum([np.abs(2 ** j * z - np.round(2 ** j * z)) / 2 ** j
                    for j in range(1, 33)], axis=0)
            ) ** (10.0 / self.n_dims ** 1.2) -
            self.scale + f_pen(x)) + self.f_opt
class LunacekBiRastrigin(ObjectiveFunctionBase):
"""Lunacek bi-Rastrigin.
The implementation is based on :func:`rastrigin`.
Properties
* highly multimodal function
    * two funnels, around :math:`\\mu_0 \\mathbf{1}_\\pm` and
      :math:`-\\mu_1 \\mathbf{1}_\\pm`, being superimposed by the cosine
    * the funnel of the local optimum at :math:`-\\mu_1 \\mathbf{1}_\\pm` has
      roughly 70% of the search space volume within :math:`[-5, 5]^D`

    Presumably different approaches need to be used for "selecting the funnel"
    and for searching the highly multimodal function "within" the funnel. The
    function was constructed to be deceptive for some evolutionary algorithms
    with large population size.
Information gained from this function
* Can the search behavior be local on the global scale but global on the
local scale?
.. plot::
from contextual_optimization.objective_functions import plot_function
plot_function("LunacekBiRastrigin")
"""
def __init__(self, random_state, n_dims):
super(LunacekBiRastrigin, self).__init__(random_state, n_dims)
R = generate_rotation(self.random_state, self.n_dims)
Q = generate_rotation(self.random_state, self.n_dims)
self.lin_trans = Q.dot(Lambda(self.n_dims, 100)[:, np.newaxis] * R)
self.mu_0 = 2.5
self.d = 1.0
self.s = 1 - 1.0 / (2 * np.sqrt(self.n_dims + 20) - 8.2)
self.mu_1 = -np.sqrt((self.mu_0 ** 2 - self.d) / self.s)
self.x_opt = 0.5 * np.sign(self.x_opt) * self.mu_0
def feedback(self, x):
x_hat = 2 * np.sign(self.x_opt) * x
z = self.lin_trans.dot(x_hat - self.mu_0)
d_mu_0 = x_hat - self.mu_0
d_mu_1 = x_hat - self.mu_1
return -(np.minimum(d_mu_0.dot(d_mu_0), self.d * self.n_dims +
self.s * d_mu_1.dot(d_mu_1)) +
rastrigin(z, penalize_norm=False) + 1e4 * f_pen(x)) + self.f_opt
SEPARABLE_FUNCTIONS = [
"Sphere",
"Ellipsoidal",
"Rastrigin",
"BuecheRastrigin",
"LinearSlope",
]
MODERATE_FUNCTIONS = [
"AttractiveSector",
"StepEllipsoidal",
"Rosenbrock",
"RosenbrockRotated",
]
ILL_CONDITIONED_FUNCTIONS = [
"EllipsoidalRotated",
"Discus",
"BentCigar",
"SharpRidge",
"DifferentPowers",
]
MULTIMODAL_STRUCTURED_FUNCTIONS = [
"RastriginRotated",
"Weierstrass",
"SchaffersF7",
"SchaffersF7Ill",
"CompositeGriewankRosenbrockF8F2",
]
MULTIMODAL_WEAKLY_STRUCTURED_FUNCTIONS = [
"Schwefel",
"GallaghersGaussian101mePeaks",
"GallaghersGaussian21hiPeaks",
"Katsuura",
"LunacekBiRastrigin",
]
NON_SMOOTH_FUNCTIONS = [
"StepEllipsoidal",
"Weierstrass",
"Katsuura",
]
UNIMODAL_FUNCTIONS = [
"Sphere",
"Ellipsoidal",
"LinearSlope",
"AttractiveSector",
"StepEllipsoidal",
"Rosenbrock",
"RosenbrockRotated",
"EllipsoidalRotated",
"Discus",
"BentCigar",
"SharpRidge",
"DifferentPowers",
]
FUNCTIONS = {
"Sphere": Sphere,
"Ellipsoidal": Ellipsoidal,
"Rastrigin": Rastrigin,
"BuecheRastrigin": BuecheRastrigin,
"LinearSlope": LinearSlope,
"AttractiveSector": AttractiveSector,
"StepEllipsoidal": StepEllipsoidal,
"Rosenbrock": Rosenbrock,
"RosenbrockRotated": RosenbrockRotated,
"EllipsoidalRotated": EllipsoidalRotated,
"Discus": Discus,
"BentCigar": BentCigar,
"SharpRidge": SharpRidge,
"DifferentPowers": DifferentPowers,
"RastriginRotated": RastriginRotated,
"Weierstrass": Weierstrass,
"SchaffersF7": SchaffersF7,
"SchaffersF7Ill": SchaffersF7Ill,
"CompositeGriewankRosenbrockF8F2": CompositeGriewankRosenbrockF8F2,
"Schwefel": Schwefel,
"GallaghersGaussian101mePeaks": GallaghersGaussian101mePeaks,
"GallaghersGaussian21hiPeaks": GallaghersGaussian21hiPeaks,
"Katsuura": Katsuura,
"LunacekBiRastrigin": LunacekBiRastrigin,
}
def plot_function(name, random_state=None, fig=None, contour=False):
"""Plot function values of area [-5, 5] x [-5, 5].
This will show two plots, one with linear scale and one with logarithmic
scale.
Parameters
----------
name : string
Function name
random_state : RandomState or int
Random number generator or seed
fig : Figure, optional (default: None)
Matplotlib figure
contour : bool, optional (default: False)
Plot function as contour plot (in 3D otherwise)
Returns
-------
ax_lin : Axis
Subplot with linear scale
ax_log : Axis
Subplot with logarithmic scale
f : ObjectiveFunction
Objective function
"""
if name not in FUNCTIONS:
raise ValueError("Could not find function %r, available functions are "
"%r" % (name, FUNCTIONS.keys()))
f = FUNCTIONS[name](random_state, 2)
X = np.arange(-5, 5, 0.2)
Y = np.arange(-5, 5, 0.2)
X, Y = np.meshgrid(X, Y)
    F = np.empty(X.shape)
for i in range(X.shape[0]):
for j in range(X.shape[1]):
F[i, j] = f.feedback(np.array([X[i, j], Y[i, j]]))
if fig is None:
fig = plt.figure(figsize=(12, 5))
if contour:
ax_lin = fig.add_subplot(121)
ax_lin.set_title(name)
_plot_function_lin_contour(f, ax_lin, X, Y, F)
ax_log = fig.add_subplot(122)
_plot_function_log_contour(f, ax_log, X, Y, F)
else:
ax_lin = fig.add_subplot(121, projection="3d")
ax_lin.set_title(name)
_plot_function_lin_3d(f, ax_lin, X, Y, F)
ax_log = fig.add_subplot(122, projection="3d")
_plot_function_log_3d(f, ax_log, X, Y, F)
return ax_lin, ax_log, f
def _plot_function_lin_contour(f, ax, X, Y, F):
    ax.set_xlabel("$x_1$")
    ax.set_ylabel("$x_2$")
    # rstride/cstride are surface-plot arguments; contourf does not take them
    ax.contourf(X, Y, F, cmap=plt.cm.jet)
def _plot_function_log_contour(f, ax, X, Y, F):
    F = -np.log(-(F - f.f_opt) + 0.1)
    ax.set_xlabel("$x_1$")
    ax.set_ylabel("$x_2$")
    ax.contourf(X, Y, F, cmap=plt.cm.jet)
def _plot_function_lin_3d(f, ax, X, Y, F):
ax.set_xlabel("$x_1$")
ax.set_ylabel("$x_2$")
ax.set_zlabel("$f(x)$")
ax.plot_surface(X, Y, F, rstride=1, cstride=1, cmap=plt.cm.jet, lw=0)
def _plot_function_log_3d(f, ax, X, Y, F):
F = -np.log(-(F - f.f_opt) + 0.1)
ax.set_xlabel("$x_1$")
ax.set_ylabel("$x_2$")
    ax.set_zlabel("$-\\log(-(f - f_{opt}) + 0.1)$")
ax.plot_surface(X, Y, F, rstride=1, cstride=1, cmap=plt.cm.jet, lw=0)
class ObjectiveFunction(Environment):
"""Artificial benchmark function.
Parameters
----------
name : string, optional (default: 'Sphere')
Name of the objective function. Possible options are:
'Sphere', 'Ellipsoidal', 'Rastrigin', 'BuecheRastrigin', 'LinearSlope',
'AttractiveSector', 'StepEllipsoidal', 'Rosenbrock',
'RosenbrockRotated', 'EllipsoidalRotated', 'Discus', 'BentCigar',
'SharpRidge', 'DifferentPowers', 'RastriginRotated', 'Weierstrass',
'SchaffersF7', 'SchaffersF7Ill', 'CompositeGriewankRosenbrockF8F2',
'Schwefel', 'GallaghersGaussian101mePeaks',
'GallaghersGaussian21hiPeaks', 'Katsuura', 'LunacekBiRastrigin'
n_params : int, optional (default: 2)
Number of dimensions
random_state : RandomState or int, optional (default: None)
Random number generator or seed
"""
    def __init__(self, name="Sphere", n_params=2, random_state=None):
self.name = name
self.n_params = n_params
self.random_state = random_state
def init(self):
if self.n_params <= 0:
raise ValueError("Number of parameters (%d) must be > 0"
% self.n_params)
        if self.name not in FUNCTIONS:
            raise ValueError("Unknown function '%s' requested, select one of "
                             "%s instead" % (self.name, FUNCTIONS.keys()))
self.objective = FUNCTIONS[self.name](self.random_state, self.n_params)
self.params = np.empty(self.n_params)
self.f = np.nan
def reset(self):
self.done = False
def get_num_inputs(self):
return self.n_params
def get_num_outputs(self):
return 0
def get_outputs(self, _):
pass
def set_inputs(self, values):
self.params[:] = values[:]
self.done = True
def step_action(self):
self.f = self.objective.feedback(self.params)
def is_evaluation_done(self):
return self.done
def get_feedback(self):
return np.array([self.f])
def is_behavior_learning_done(self):
return False
def get_maximum_feedback(self):
return self.objective.f_opt
def plot(self, fig):
plot_function(self.name, self.random_state, fig)
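# Example usage (a minimal sketch of the Environment workflow implemented
# above; the function name and dimension are arbitrary choices):
#
#     env = ObjectiveFunction("Sphere", n_params=2, random_state=0)
#     env.init()
#     env.reset()
#     env.set_inputs(np.zeros(2))
#     env.step_action()
#     print(env.get_feedback())  # one-element array, f(0) relative to f_opt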