Optimizer using Gradient Descent
GradientDescentOptimizer
- class l2l.optimizers.gradientdescent.optimizer.GradientDescentOptimizer(traj, optimizee_create_individual, optimizee_fitness_weights, parameters, optimizee_bounding_func=None)[source]
Bases: Optimizer
Class for a generic gradient descent solver. In pseudocode, the algorithm does the following:
- For n iterations do:
  - Explore the fitness of individuals in the close vicinity of the current one.
  - Calculate the gradient based on these fitnesses.
  - Create the new 'current individual' by taking a step in parameter space along the direction of steepest ascent of the plane.
NOTE: This expects all parameters of the system to be floating-point values.
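The loop described above can be sketched in plain NumPy. This is a minimal illustration under stated assumptions, not the library's implementation: `estimate_gradient`, `gradient_ascent` and `toy_fitness` are hypothetical names, and fitting a plane by least squares to randomly sampled neighbours is only one plausible way to calculate the gradient from the explored fitnesses.

```python
import numpy as np


def estimate_gradient(fitness_fn, current, step_size, n_random_steps, rng):
    """Estimate the fitness gradient at `current` from random nearby samples."""
    # Random offsets around the current individual, one row per sample
    dx = rng.normal(0.0, step_size, size=(n_random_steps, current.size))
    # Fitness change of each neighbour relative to the current individual
    df = np.array([fitness_fn(current + d) for d in dx]) - fitness_fn(current)
    # Least-squares fit of a plane: df is approximately dx @ gradient
    gradient, *_ = np.linalg.lstsq(dx, df, rcond=None)
    return gradient


def gradient_ascent(fitness_fn, start, learning_rate=0.1, step_size=0.01,
                    n_random_steps=20, n_iteration=100, seed=0):
    """Repeat explore / estimate / step for a fixed number of iterations."""
    rng = np.random.default_rng(seed)
    current = np.asarray(start, dtype=float)  # parameters must be floats
    for _ in range(n_iteration):
        gradient = estimate_gradient(fitness_fn, current, step_size,
                                     n_random_steps, rng)
        # Step uphill, since the fitness is being maximised
        current = current + learning_rate * gradient
    return current


def toy_fitness(x):
    """Hypothetical fitness with a single maximum at x = (1, 1)."""
    return -np.sum((x - 1.0) ** 2)


best = gradient_ascent(toy_fitness, [0.0, 0.0])
```

On this toy problem `best` ends up close to the maximum at (1, 1). The argument names mirror the documented fields `learning_rate`, `exploration_step_size` (here `step_size`), `n_random_steps`, `n_iteration` and `seed`.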
- Parameters:
traj (Trajectory) – Use this trajectory to store the parameters of the specific runs. The parameters should be initialized based on the values in parameters
optimizee_create_individual – Function that creates a new individual
optimizee_fitness_weights – Fitness weights. The fitness returned by the Optimizee is multiplied by these values (one for each element of the fitness vector)
parameters – Instance of the namedtuple ClassicGDParameters, StochasticGDParameters, RMSPropParameters or AdamParameters containing the parameters needed by the Optimizer. The type of this parameter is used to select one of the GD variants.
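Selecting a variant from the type of `parameters` could look roughly like the sketch below. The two namedtuples are local stand-ins that copy the documented field names so the example runs without the library; `VARIANTS` and `select_variant` are hypothetical and say nothing about how the optimizer performs the dispatch internally.

```python
from collections import namedtuple

# Stand-ins mirroring the documented fields of the real parameter classes
ClassicGDParameters = namedtuple(
    'ClassicGDParameters',
    ['learning_rate', 'exploration_step_size', 'n_random_steps',
     'n_iteration', 'stop_criterion', 'seed'])
RMSPropParameters = namedtuple(
    'RMSPropParameters',
    ['learning_rate', 'exploration_step_size', 'n_random_steps',
     'momentum_decay', 'n_iteration', 'stop_criterion', 'seed'])

# Hypothetical lookup table from parameter type to variant name
VARIANTS = {
    ClassicGDParameters: 'classic_gd',
    RMSPropParameters: 'rmsprop',
}


def select_variant(parameters):
    """Return the variant name for a parameter tuple, based on its type."""
    try:
        return VARIANTS[type(parameters)]
    except KeyError:
        raise TypeError(
            'Unsupported parameter type: %s' % type(parameters).__name__)


classic = ClassicGDParameters(0.1, 0.01, 5, 10, 1e-6, 0)
rmsprop = RMSPropParameters(0.1, 0.01, 5, 0.9, 10, 1e-6, 0)
chosen = [select_variant(classic), select_variant(rmsprop)]
```

Keying on the exact type keeps the variants apart even though all four parameter classes are plain tuples underneath.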
- post_process(traj, fitnesses_results)[source]
See post_process()
- init_classic_gd(parameters, traj)[source]
Classic Gradient Descent-specific initialization.
- Parameters:
traj (Trajectory) – The trajectory on which the parameters should get stored.
- Returns:
- init_rmsprop(parameters, traj)[source]
RMSProp-specific initialization.
- Parameters:
traj (Trajectory) – The trajectory on which the parameters should get stored.
- Returns:
- init_adam(parameters, traj)[source]
ADAM-specific initialization.
- Parameters:
traj (Trajectory) – The trajectory on which the parameters should get stored.
- Returns:
- init_ada_max(parameters, traj)[source]
ADAMAX-specific initialization.
- Parameters:
traj (Trajectory) – The trajectory on which the parameters should get stored.
- Returns:
- init_stochastic_gd(parameters, traj)[source]
Stochastic Gradient Descent-specific initialization.
- Parameters:
traj (Trajectory) – The trajectory on which the parameters should get stored.
- Returns:
- classic_gd_update(traj, gradient)[source]
Updates the current individual using the classic Gradient Descent algorithm.
- Parameters:
traj (Trajectory) – The trajectory which contains the parameters required by the update algorithm
gradient (ndarray) – The gradient of the fitness curve, evaluated at the current individual
- Returns:
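As a rough sketch of the classic update, assuming the step is taken uphill because the optimizer follows the direction of ascent: `classic_gd_step` is a hypothetical free function, whereas the real method works on the trajectory and the optimizer's internal state.

```python
import numpy as np


def classic_gd_step(individual, gradient, learning_rate):
    """Return the individual moved one step along the gradient (ascent)."""
    return individual + learning_rate * gradient


x = np.array([0.0, 2.0])
g = np.array([1.0, -1.0])
x_new = classic_gd_step(x, g, learning_rate=0.5)
```

Each coordinate moves by `learning_rate` times its gradient component, so larger gradients produce proportionally larger steps.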
- rmsprop_update(traj, gradient)[source]
Updates the current individual using the RMSProp algorithm.
- Parameters:
traj (Trajectory) – The trajectory which contains the parameters required by the update algorithm
gradient (ndarray) – The gradient of the fitness curve, evaluated at the current individual
- Returns:
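The textbook RMSProp rule can be sketched as follows. This is an assumption about the general algorithm, not a copy of the library code: `rmsprop_step`, `sq_avg` and `eps` are illustrative names, and the sign is chosen for ascent.

```python
import numpy as np


def rmsprop_step(individual, gradient, sq_avg, learning_rate,
                 momentum_decay, eps=1e-8):
    """One RMSProp ascent step; returns the new individual and new average."""
    # Exponentially decaying average of the squared gradient
    sq_avg = momentum_decay * sq_avg + (1.0 - momentum_decay) * gradient ** 2
    # Scale each coordinate's step by the root of its running average
    individual = individual + learning_rate * gradient / (np.sqrt(sq_avg) + eps)
    return individual, sq_avg


x = np.zeros(2)
s = np.zeros(2)
x, s = rmsprop_step(x, np.array([3.0, -4.0]), s,
                    learning_rate=0.1, momentum_decay=0.9)
```

Because each coordinate is divided by its own running magnitude, both coordinates move by the same amount here even though their gradients differ.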
- adam_update(traj, gradient)[source]
Updates the current individual using the ADAM algorithm.
- Parameters:
traj (Trajectory) – The trajectory which contains the parameters required by the update algorithm
gradient (ndarray) – The gradient of the fitness curve, evaluated at the current individual
- Returns:
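A minimal sketch of the standard Adam rule with bias correction, written for ascent. The names `adam_step`, `m`, `v`, `t` and `eps` are illustrative assumptions; only `learning_rate`, `first_order_decay` and `second_order_decay` correspond to documented parameters.

```python
import numpy as np


def adam_step(individual, gradient, m, v, t, learning_rate,
              first_order_decay, second_order_decay, eps=1e-8):
    """One Adam ascent step; `t` is the 1-based step counter."""
    # Decaying averages of the gradient and of the squared gradient
    m = first_order_decay * m + (1.0 - first_order_decay) * gradient
    v = second_order_decay * v + (1.0 - second_order_decay) * gradient ** 2
    # Bias correction compensates for the zero initialisation of m and v
    m_hat = m / (1.0 - first_order_decay ** t)
    v_hat = v / (1.0 - second_order_decay ** t)
    individual = individual + learning_rate * m_hat / (np.sqrt(v_hat) + eps)
    return individual, m, v


x = np.zeros(2)
m = np.zeros(2)
v = np.zeros(2)
x, m, v = adam_step(x, np.array([3.0, -4.0]), m, v, t=1,
                    learning_rate=0.1, first_order_decay=0.9,
                    second_order_decay=0.999)
```

On the first step the corrected moments reduce to the gradient and its square, so each coordinate moves by roughly `learning_rate` in the direction of its sign.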
- ada_max_update(traj, gradient)[source]
Updates the current individual using the ADAMAX algorithm.
- Parameters:
traj (Trajectory) – The trajectory which contains the parameters required by the update algorithm (in this case: first and second order decay)
gradient (ndarray) – The gradient of the fitness curve, evaluated at the current individual
- Returns:
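Adamax replaces Adam's squared-gradient average with a decayed running maximum of the gradient magnitude. The sketch below follows that textbook form, written for ascent; `adamax_step`, `m`, `u`, `t` and `eps` are hypothetical names rather than the library's attributes.

```python
import numpy as np


def adamax_step(individual, gradient, m, u, t, learning_rate,
                first_order_decay, second_order_decay, eps=1e-8):
    """One Adamax ascent step; `t` is the 1-based step counter."""
    # Decaying average of the gradient (first moment)
    m = first_order_decay * m + (1.0 - first_order_decay) * gradient
    # Decayed running maximum of the absolute gradient (infinity norm)
    u = np.maximum(second_order_decay * u, np.abs(gradient))
    # Only the first moment needs bias correction
    step = learning_rate / (1.0 - first_order_decay ** t)
    individual = individual + step * m / (u + eps)
    return individual, m, u


x = np.zeros(2)
m = np.zeros(2)
u = np.zeros(2)
x, m, u = adamax_step(x, np.array([3.0, -4.0]), m, u, t=1,
                      learning_rate=0.1, first_order_decay=0.9,
                      second_order_decay=0.999)
```

The small `eps` is an added safeguard against division by zero when a gradient component has been exactly zero so far.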
- stochastic_gd_update(traj, gradient)[source]
Updates the current individual using a stochastic version of the gradient descent algorithm.
- Parameters:
traj (Trajectory) – The trajectory which contains the parameters required by the update algorithm
gradient (ndarray) – The gradient of the fitness curve, evaluated at the current individual
- Returns:
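One way to picture the stochastic variant is a classic step on a gradient perturbed by Gaussian noise whose influence shrinks over time. The sketch assumes the noise is scaled by `stochastic_decay ** (step_index + 1)`, a form chosen because it makes a decay of 0 disable the perturbation as the parameter documentation describes; the library may scale the noise differently. `stochastic_gd_step` and `step_index` are hypothetical names.

```python
import numpy as np


def stochastic_gd_step(individual, gradient, step_index, rng, learning_rate,
                       stochastic_deviation, stochastic_decay):
    """One noisy ascent step; `step_index` is the 0-based iteration number."""
    # Random vector with the configured standard deviation
    noise = rng.normal(0.0, stochastic_deviation, size=gradient.shape)
    # Assumed schedule: the noise weight shrinks geometrically with the step
    noisy_gradient = gradient + noise * stochastic_decay ** (step_index + 1)
    return individual + learning_rate * noisy_gradient


rng = np.random.default_rng(0)
x = np.array([0.0, 2.0])
g = np.array([1.0, -1.0])
# With stochastic_decay=0 the step reduces to classic gradient ascent
x_plain = stochastic_gd_step(x, g, 0, rng, learning_rate=0.5,
                             stochastic_deviation=1.0, stochastic_decay=0.0)
# With a non-zero decay the step is perturbed
x_noisy = stochastic_gd_step(x, g, 0, rng, learning_rate=0.5,
                             stochastic_deviation=1.0, stochastic_decay=0.9)
```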
ClassicGDParameters
- class l2l.optimizers.gradientdescent.optimizer.ClassicGDParameters(learning_rate, exploration_step_size, n_random_steps, n_iteration, stop_criterion, seed)
Bases: tuple
- Parameters:
learning_rate – The rate of learning per step of gradient descent
exploration_step_size – The standard deviation of the random steps used for the finite-difference gradient estimate
n_random_steps – The number of random steps used to estimate the gradient
n_iteration – Number of iterations to perform
stop_criterion – Stop if the change in fitness is below this value
seed – The random seed used for random number generation in the optimizer
- exploration_step_size
- learning_rate
- n_iteration
- n_random_steps
- seed
- stop_criterion
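Since the parameter classes are namedtuples, they can be built with keyword arguments, read by field name, and copied with `_replace`. The sketch below uses a local stand-in with the documented fields so that it runs without the library; the numeric values are arbitrary illustrations, not recommended settings.

```python
from collections import namedtuple

# Stand-in mirroring the documented signature of ClassicGDParameters
ClassicGDParameters = namedtuple(
    'ClassicGDParameters',
    ['learning_rate', 'exploration_step_size', 'n_random_steps',
     'n_iteration', 'stop_criterion', 'seed'])

parameters = ClassicGDParameters(
    learning_rate=0.01,
    exploration_step_size=0.01,
    n_random_steps=5,
    n_iteration=100,
    stop_criterion=1e-6,
    seed=42)

# Tuples are immutable, so _replace returns a modified copy
faster = parameters._replace(learning_rate=0.1)
```

Keyword construction avoids mistakes from the positional order, which differs between the four parameter classes.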
StochasticGDParameters
- class l2l.optimizers.gradientdescent.optimizer.StochasticGDParameters(learning_rate, stochastic_deviation, stochastic_decay, exploration_step_size, n_random_steps, n_iteration, stop_criterion, seed)
Bases: tuple
- Parameters:
learning_rate – The rate of learning per step of gradient descent
stochastic_deviation – The standard deviation of the random vector used to perturb the gradient
stochastic_decay – The decay of the influence of the random vector that is added to the gradient (set to 0 to disable stochastic perturbation)
exploration_step_size – The standard deviation of the random steps used for the finite-difference gradient estimate
n_random_steps – The number of random steps used to estimate the gradient
n_iteration – Number of iterations to perform
stop_criterion – Stop if the change in fitness is below this value
seed – The random seed used for random number generation in the optimizer
- exploration_step_size
- learning_rate
- n_iteration
- n_random_steps
- seed
- stochastic_decay
- stochastic_deviation
- stop_criterion
AdamParameters
- class l2l.optimizers.gradientdescent.optimizer.AdamParameters(learning_rate, exploration_step_size, n_random_steps, first_order_decay, second_order_decay, n_iteration, stop_criterion, seed)
Bases: tuple
- Parameters:
learning_rate – The rate of learning per step of gradient descent
exploration_step_size – The standard deviation of the random steps used for the finite-difference gradient estimate
n_random_steps – The number of random steps used to estimate the gradient
first_order_decay – Specifies the amount of decay of the historic first order momentum per gradient descent step
second_order_decay – Specifies the amount of decay of the historic second order momentum per gradient descent step
n_iteration – Number of iterations to perform
stop_criterion – Stop if the change in fitness is below this value
seed – The random seed used for random number generation in the optimizer
- exploration_step_size
- first_order_decay
- learning_rate
- n_iteration
- n_random_steps
- second_order_decay
- seed
- stop_criterion
RMSPropParameters
- class l2l.optimizers.gradientdescent.optimizer.RMSPropParameters(learning_rate, exploration_step_size, n_random_steps, momentum_decay, n_iteration, stop_criterion, seed)
Bases: tuple
- Parameters:
learning_rate – The rate of learning per step of gradient descent
exploration_step_size – The standard deviation of the random steps used for the finite-difference gradient estimate
n_random_steps – The number of random steps used to estimate the gradient
momentum_decay – Specifies the decay of the historic momentum at each gradient descent step
n_iteration – Number of iterations to perform
stop_criterion – Stop if the change in fitness is below this value
seed – The random seed used for random number generation in the optimizer
- exploration_step_size
- learning_rate
- momentum_decay
- n_iteration
- n_random_steps
- seed
- stop_criterion