Mutation

Mutation is applied periodically to explore the hyperparameter space, so that different hyperparameter combinations are trialled during training. If an agent's hyperparameters prove relatively beneficial to training, that agent is more likely to be preserved in the next generation, and so those characteristics are more likely to remain in the population.

The Mutations() class mutates agents with pre-set relative probabilities. The mutations currently implemented are:
  • No mutation

  • Network architecture mutation - adding layers or nodes. Trained weights are reused and new weights are initialized randomly.

  • Network parameters mutation - mutating weights with Gaussian noise.

  • Network activation layer mutation - changing the activation layer.

  • RL algorithm mutation - mutating a learning hyperparameter, such as learning rate or batch size.

Mutations.mutation() returns a mutated population.

Tournament selection and mutation should be applied sequentially to fully evolve a population between evaluation and learning cycles.
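To make the order of operations concrete, here is a minimal, library-free sketch of one evolution step: selection first, then mutation. The agents are plain dicts, and tournament_select and mutate are hypothetical helpers written for this illustration; they are not the AgileRL implementations, and the 0.8/1.2 scaling factors are assumed values.

```python
import copy
import random

def tournament_select(population, tournament_size, rng):
    """Keep a copy of the fittest agent, then fill the rest with tournament winners."""
    elite = max(population, key=lambda agent: agent["fitness"])
    new_population = [copy.deepcopy(elite)]
    while len(new_population) < len(population):
        contenders = rng.sample(population, tournament_size)
        winner = max(contenders, key=lambda agent: agent["fitness"])
        new_population.append(copy.deepcopy(winner))
    return elite, new_population

def mutate(population, rng):
    """Toy mutation: scale each agent's learning rate up or down (assumed factors)."""
    for agent in population:
        agent["lr"] *= rng.choice([0.8, 1.2])
    return population

rng = random.Random(1)

# Pretend these fitness values came from an evaluation cycle.
population = [{"lr": 0.001 * (i + 1), "fitness": float(i)} for i in range(4)]

# Step 1: selection. Step 2: mutation. Learning would resume afterwards.
elite, population = tournament_select(population, tournament_size=2, rng=rng)
population = mutate(population, rng)
```

Selection decides which agents survive, and mutation then perturbs the survivors, so skipping either step leaves the population only partly evolved.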

from agilerl.hpo.mutation import Mutations
import torch

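device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Network configuration shared with the agents
# (minimal placeholder; only the 'arch' key is used below)
NET_CONFIG = {'arch': 'mlp'}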

mutations = Mutations(algo='DQN',                           # Algorithm
                      no_mutation=0.4,                      # No mutation
                      architecture=0.2,                     # Architecture mutation
                      new_layer_prob=0.2,                   # New layer mutation
                      parameters=0.2,                       # Network parameters mutation
                      activation=0,                         # Activation layer mutation
                      rl_hp=0.2,                            # Learning HP mutation
                      rl_hp_selection=['lr', 'batch_size'], # Learning HPs to choose from
                      mutation_sd=0.1,                      # Mutation strength
                      arch=NET_CONFIG['arch'],              # Network architecture
                      rand_seed=1,                          # Random seed
                      device=device)
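The probabilities passed to the constructor are relative weights. The sketch below shows one way such weights could be normalised and sampled to pick a mutation type for each agent. It is an illustration of the idea only: normalise and sample_mutations are hypothetical helpers, and the library's own sampling code may differ.

```python
import random

# Relative weights, mirroring the constructor arguments in the example above.
mutation_weights = {
    "no_mutation": 0.4,
    "architecture": 0.2,
    "parameters": 0.2,
    "activation": 0.0,
    "rl_hp": 0.2,
}

def normalise(weights):
    """Scale relative weights so that they sum to 1."""
    total = sum(weights.values())
    if total <= 0:
        raise ValueError("at least one mutation weight must be positive")
    return {name: weight / total for name, weight in weights.items()}

def sample_mutations(weights, population_size, rng):
    """Draw one mutation type per agent according to the relative weights."""
    names = list(weights)
    return rng.choices(names, weights=[weights[n] for n in names], k=population_size)

rng = random.Random(1)
probabilities = normalise(mutation_weights)
chosen = sample_mutations(mutation_weights, population_size=6, rng=rng)
```

Because the weights are relative, doubling every value would leave the sampling distribution unchanged, and a weight of 0 (as for activation here) means that mutation type is never drawn.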

Parameters

class agilerl.hpo.mutation.Mutations(algo, no_mutation, architecture, new_layer_prob, parameters, activation, rl_hp, rl_hp_selection, mutation_sd, min_lr=0.0001, max_lr=0.01, min_learn_step=1, max_learn_step=120, min_batch_size=8, max_batch_size=1024, agent_ids=None, arch='mlp', mutate_elite=True, rand_seed=None, device='cpu', accelerator=None)

The Mutations class for evolutionary hyperparameter optimization.

Parameters:
  • algo (str or dict) – RL algorithm. Use a str, e.g. ‘DQN’, if using AgileRL algorithms, or provide a dict with the names of the agent networks

  • no_mutation (float) – Relative probability of no mutation

  • architecture (float) – Relative probability of architecture mutation

  • new_layer_prob (float) – Relative probability of new layer mutation (type of architecture mutation)

  • parameters (float) – Relative probability of network parameters mutation

  • activation (float) – Relative probability of activation layer mutation

  • rl_hp (float) – Relative probability of learning hyperparameter mutation

  • rl_hp_selection (list[str]) – Learning hyperparameter mutations to choose from

  • mutation_sd (float) – Mutation strength

  • min_lr (float, optional) – Minimum learning rate in the hyperparameter search space, defaults to 0.0001

  • max_lr (float, optional) – Maximum learning rate in the hyperparameter search space, defaults to 0.01

  • min_learn_step (int, optional) – Minimum learn step in the hyperparameter search space, defaults to 1

  • max_learn_step (int, optional) – Maximum learn step in the hyperparameter search space, defaults to 120

  • min_batch_size (int, optional) – Minimum batch size in the hyperparameter search space, defaults to 8

  • max_batch_size (int, optional) – Maximum batch size in the hyperparameter search space, defaults to 1024

  • agent_ids (list[str], optional) – List of agent IDs for multi-agent algorithms, defaults to None

  • arch (str, optional) – Network architecture type. ‘mlp’ or ‘cnn’, defaults to ‘mlp’

  • mutate_elite (bool, optional) – Mutate elite member of population, defaults to True

  • rand_seed (int, optional) – Random seed for repeatability, defaults to None

  • device (str, optional) – Device for accelerated computing, ‘cpu’ or ‘cuda’, defaults to ‘cpu’

  • accelerator (accelerate.Accelerator(), optional) – Accelerator for distributed computing, defaults to None

activation_mutation(individual)

Returns individual from population with activation layer mutation.

Parameters:

individual (object) – Individual agent from population

architecture_mutate(individual)

Returns individual from population with network architecture mutation.

Parameters:

individual (object) – Individual agent from population
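As a rough illustration of how new_layer_prob could split architecture mutations between adding a layer and adding nodes, the sketch below mutates a plain list of hidden layer sizes. The helper mutate_hidden_sizes, the nodes_to_add value, and the choice to copy the last layer's width are all assumptions made for this example. It does not show how trained weights are carried over, and the library's actual behaviour may differ.

```python
import random

def mutate_hidden_sizes(hidden_sizes, new_layer_prob, rng, nodes_to_add=16):
    """Return a new list of hidden layer sizes after one architecture mutation."""
    sizes = list(hidden_sizes)  # copy so the caller's list is left untouched
    if rng.random() < new_layer_prob:
        # Add a layer: here the new layer copies the width of the last one (assumed).
        sizes.append(sizes[-1])
    else:
        # Add nodes: widen one randomly chosen existing layer.
        index = rng.randrange(len(sizes))
        sizes[index] += nodes_to_add
    return sizes

rng = random.Random(1)
always_new_layer = mutate_hidden_sizes([32, 32], new_layer_prob=1.0, rng=rng)
always_new_nodes = mutate_hidden_sizes([32, 32], new_layer_prob=0.0, rng=rng)
```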

classic_parameter_mutation(network)

Returns network with mutated weights.

Parameters:

network – Neural network to mutate
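The basic idea of mutating weights with Gaussian noise can be sketched without any deep learning framework. In the example below, gaussian_weight_mutation is a hypothetical helper that perturbs every weight in a flat list, with mutation_sd as the noise standard deviation. This is a simplification: the library method operates on a neural network and may perturb only some weights or use different noise scales.

```python
import random

def gaussian_weight_mutation(weights, mutation_sd, rng):
    """Return a new list with zero-mean Gaussian noise added to every weight."""
    return [w + rng.gauss(0.0, mutation_sd) for w in weights]

rng = random.Random(1)
original = [0.5, -1.2, 0.0, 2.0]
mutated = gaussian_weight_mutation(original, mutation_sd=0.1, rng=rng)
```

A larger mutation_sd produces larger perturbations, which explores more aggressively but is also more likely to undo what the network has learned.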

get_algo_nets(algo)

Returns dictionary with agent network names.

Parameters:

algo (str) – RL algorithm

mutation(population, pre_training_mut=False)

Returns mutated population.

Parameters:
  • population (list[object]) – Population of agents

  • pre_training_mut (bool, optional) – Whether the mutation is applied before the training loop, defaults to False

no_mutation(individual)

Returns individual from population without mutation.

Parameters:

individual (object) – Individual agent from population

parameter_mutation(individual)

Returns individual from population with network parameters mutation.

Parameters:

individual (object) – Individual agent from population

rl_hyperparam_mutation(individual)

Returns individual from population with RL hyperparameter mutation.

Parameters:

individual (object) – Individual agent from population
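The min_* and max_* constructor arguments bound the hyperparameter search space. A minimal sketch of a bounded hyperparameter mutation follows, assuming the value is scaled up or down by a fixed factor and then clamped to the bounds. The helpers and the factor of 1.2 are assumptions for illustration, not the library's confirmed behaviour. The default bounds are taken from the class signature above.

```python
import random

def mutate_learning_rate(lr, rng, min_lr=0.0001, max_lr=0.01, factor=1.2):
    """Scale lr up or down by `factor` (assumed value), then clamp to the bounds."""
    scaled = lr * factor if rng.random() < 0.5 else lr / factor
    return min(max(scaled, min_lr), max_lr)

def mutate_batch_size(batch_size, rng, min_batch_size=8, max_batch_size=1024, factor=1.2):
    """Scale batch_size up or down, round to an integer, then clamp to the bounds."""
    scaled = batch_size * factor if rng.random() < 0.5 else batch_size / factor
    return int(min(max(round(scaled), min_batch_size), max_batch_size))

rng = random.Random(1)
new_lr = mutate_learning_rate(0.001, rng)
new_batch_size = mutate_batch_size(64, rng)
```

Clamping keeps repeated mutations from drifting outside the search space, so an agent already at max_lr can only stay there or move down.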