Mutation

Mutation is periodically used to explore the hyperparameter space, allowing different hyperparameter combinations to be trialled during training. If an agent's hyperparameters prove beneficial to training, that agent is more likely to be preserved in the next generation, so those characteristics are more likely to remain in the population.

The Mutations class is used to mutate agents with pre-set probabilities. The available mutations currently implemented are:

  • No mutation: An “identity” mutation, whereby the agent is returned unchanged.

  • Network architecture mutations: Currently involves adding layers or nodes. Trained weights are reused and new weights are initialized randomly.

  • Network parameters mutation: Mutating weights with Gaussian noise.

  • Network activation layer mutation: Change of activation layer.

  • RL hyperparameter mutation: Mutation of a learning hyperparameter (e.g. learning rate or batch size).

Mutations.mutation(population) returns a mutated population.

Tournament selection and mutation should be applied sequentially to fully evolve a population between evaluation and learning cycles.

from agilerl.hpo.mutation import Mutations
import torch

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

mutations = Mutations(
  no_mutation=0.4,                      # No mutation
  architecture=0.2,                     # Architecture mutation
  new_layer_prob=0.2,                   # New layer mutation
  parameters=0.2,                       # Network parameters mutation
  activation=0,                         # Activation layer mutation
  rl_hp=0.2,                            # Learning HP mutation
  mutation_sd=0.1,                      # Mutation strength
  rand_seed=1,                          # Random seed
  device=device
)
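
To illustrate why selection and mutation are applied one after the other, here is a minimal toy sketch that does not use AgileRL at all. Agents are plain dictionaries, and `tournament_select`, `mutate_lr` and `evolve` are hypothetical helpers written for this example only; the real library operates on EvolvableAlgorithm agents.

```python
import random


def tournament_select(population, fitnesses, k, rng):
    """Return a copy of the fittest of k randomly drawn agents (hypothetical helper)."""
    contenders = rng.sample(range(len(population)), k)
    winner = max(contenders, key=lambda i: fitnesses[i])
    return dict(population[winner])  # copy so mutation cannot alter the parent


def mutate_lr(agent, rng, sd=0.1):
    """Perturb the toy agent's learning rate by a Gaussian factor (assumption)."""
    agent["lr"] = agent["lr"] * (1.0 + rng.gauss(0.0, sd))
    return agent


def evolve(population, fitnesses, rng, k=2):
    """One generation: select first, then mutate the selected agents."""
    selected = [
        tournament_select(population, fitnesses, k, rng)
        for _ in range(len(population))
    ]
    return [mutate_lr(agent, rng) for agent in selected]


rng = random.Random(1)
population = [{"lr": 1e-3}, {"lr": 5e-4}, {"lr": 1e-4}, {"lr": 1e-2}]
fitnesses = [10.0, 3.0, 7.0, 1.0]
next_generation = evolve(population, fitnesses, rng)
```

Selection decides which agents survive and mutation then perturbs the survivors, so the next generation explores around configurations that already performed well.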

Parameters

class agilerl.hpo.mutation.Mutations(no_mutation: float, architecture: float, new_layer_prob: float, parameters: float, activation: float, rl_hp: float, mutation_sd: float = 0.1, activation_selection: list[str] | None = None, mutate_elite: bool = True, rand_seed: int | None = None, device: str = 'cpu', accelerator: Accelerator | None = None)

Allow performing mutations on a population of EvolvableAlgorithm agents. Calling Mutations.mutation() on a population of agents will return a mutated population of agents. The type of mutation applied to each agent is sampled randomly from the probabilities given by the user. The supported types of mutations that can be applied to an agent are:

  • No mutation

  • Network architecture mutation - adding layers or nodes. Trained weights are reused and new weights are initialized randomly.

  • Network parameters mutation - mutating weights with Gaussian noise.

  • Network activation layer mutation - change of activation layer.

  • RL algorithm mutation - mutation of a learning hyperparameter (e.g. learning rate or batch size).

See Evolutionary Hyperparameter Optimization for more details.
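
The phrase "relative probability" means the values need not sum to one. A minimal sketch of how a mutation type could be drawn from such weights is shown here, using only the standard library. The function names `normalize` and `sample_mutation` are hypothetical, and treating the five mutation types as the sampled options is an assumption; this is not the library's internal code.

```python
import random


def normalize(relative_probs):
    """Scale relative weights so that they sum to 1."""
    total = sum(relative_probs.values())
    if total <= 0:
        raise ValueError("at least one mutation must have a positive weight")
    return {name: weight / total for name, weight in relative_probs.items()}


def sample_mutation(relative_probs, rng):
    """Draw one mutation name in proportion to its relative weight."""
    names = list(relative_probs)
    weights = [relative_probs[name] for name in names]
    return rng.choices(names, weights=weights, k=1)[0]


# Weights mirror the constructor example earlier in this page.
relative_probs = {
    "no_mutation": 0.4,
    "architecture": 0.2,
    "parameters": 0.2,
    "activation": 0,
    "rl_hp": 0.2,
}
rng = random.Random(1)
probs = normalize(relative_probs)
draws = [sample_mutation(relative_probs, rng) for _ in range(1000)]
```

A weight of 0, as used for `activation` here, means that mutation type is never drawn.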

Parameters:
  • no_mutation (float) – Relative probability of no mutation

  • architecture (float) – Relative probability of architecture mutation

  • new_layer_prob (float) – Relative probability of new layer mutation (type of architecture mutation)

  • parameters (float) – Relative probability of network parameters mutation

  • activation (float) – Relative probability of activation layer mutation

  • rl_hp (float) – Relative probability of learning hyperparameter mutation

  • rl_hp_selection (list[str]) – Learning hyperparameter mutations to choose from

  • mutation_sd (float) – Mutation strength

  • activation_selection (list[str], optional) – Activation functions to choose from, defaults to [“ReLU”, “ELU”, “GELU”]

  • mutate_elite (bool, optional) – Mutate elite member of population, defaults to True

  • rand_seed (int, optional) – Random seed for repeatability, defaults to None

  • device (str, optional) – Device for accelerated computing, ‘cpu’ or ‘cuda’, defaults to ‘cpu’

  • accelerator (accelerate.Accelerator(), optional) – Accelerator for distributed computing, defaults to None

activation_mutation(individual: IndividualType) → IndividualType

Perform a random mutation of the activation layer of the evaluation networks of an agent.

Note

This is currently not supported for LLMAlgorithm agents.

Parameters:

individual (RLAlgorithm or MultiAgentRLAlgorithm) – Individual agent from population

Returns:

Individual from population with activation layer mutation

Return type:

RLAlgorithm or MultiAgentRLAlgorithm
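
As a rough illustration of the activation mutation described above, the sketch below picks a new activation name from a selection list. `mutate_activation` is a hypothetical helper, and always choosing a name different from the current one is an assumption rather than documented behaviour.

```python
import random


def mutate_activation(current, selection, rng):
    """Pick an activation name from `selection` other than `current`."""
    candidates = [name for name in selection if name != current]
    if not candidates:
        # Nothing else to switch to, so keep the current activation.
        return current
    return rng.choice(candidates)


rng = random.Random(1)
selection = ["ReLU", "ELU", "GELU"]  # the documented default selection
new_activation = mutate_activation("ReLU", selection, rng)
```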

architecture_mutate(individual: IndividualType) → IndividualType

Perform a random mutation to the architecture of the policy network of an agent. The way in which we apply an architecture mutation to single and multi-agent RL algorithms inherently differs given the nested nature of the networks in the latter.

  • Single-agent: A mutation method is sampled from the policy network and then applied to the rest of the evaluation modules (e.g. critics). This can be done generally because all of the networks in a single-agent algorithm share the same architecture (given there is only one observation space).

  • Multi-agent: A sub-agent is sampled and the mutation is performed on its policy. We then iterate over the remaining sub-agent policies and apply the same mutation to those that share the same observation space. The remaining evaluation networks (e.g. critics) may be centralized, in which case their underlying architecture differs from the policy and the mutation methods won’t match exactly. In such cases, we try to find an analogous mutation method to apply.

Note

This is currently not supported for LLMAlgorithm agents.

Parameters:

individual (RLAlgorithm or MultiAgentRLAlgorithm) – Individual agent from population

Returns:

Individual from population with network architecture mutation

Return type:

RLAlgorithm or MultiAgentRLAlgorithm
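
The idea that trained weights are reused while new weights are initialized randomly can be sketched for the "add nodes" case as follows. This is a toy using nested lists instead of tensors; `add_nodes` and `init_scale` are hypothetical names, and uniform initialization is an assumption made for the example.

```python
import random


def add_nodes(weights, extra, rng, init_scale=0.1):
    """Widen a layer: keep existing rows, append randomly initialized rows.

    `weights` holds one row per output node and one column per input.
    """
    n_inputs = len(weights[0])
    kept = [row[:] for row in weights]  # trained weights are copied unchanged
    new_rows = [
        [rng.uniform(-init_scale, init_scale) for _ in range(n_inputs)]
        for _ in range(extra)
    ]
    return kept + new_rows


rng = random.Random(1)
trained = [[0.5, -0.2, 0.1], [0.3, 0.8, -0.4]]  # 2 nodes, 3 inputs
widened = add_nodes(trained, extra=2, rng=rng)
```

Because the original rows are preserved, the outputs of the existing nodes are unchanged immediately after the mutation; only the newly added nodes start from random values.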

mutation(population: list[IndividualType], pre_training_mut: bool = False) → list[IndividualType]

Return a mutated population of agents. See Evolutionary Hyperparameter Optimization for more details.

Parameters:
  • population (list[EvolvableAlgorithm]) – Population of agents

  • pre_training_mut (bool, optional) – Boolean flag indicating whether the mutation is applied before the training loop, defaults to False

Returns:

Mutated population

Return type:

list[EvolvableAlgorithm]

no_mutation(individual: IndividualType) → IndividualType

Return individual from population without mutation.

Parameters:

individual – Individual agent from population

parameter_mutation(individual: IndividualType) → IndividualType

Perform a random mutation to the weights of the policy network of an agent through the addition of Gaussian noise.

Note

This is currently not supported for LLMAlgorithm agents.

Parameters:

individual (RLAlgorithm or MultiAgentRLAlgorithm) – Individual agent from population

Returns:

Individual from population with network parameters mutation

Return type:

RLAlgorithm or MultiAgentRLAlgorithm
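
The Gaussian-noise weight mutation described above can be sketched on a flat list of weights as follows. `gaussian_mutate` is a hypothetical helper; using `mutation_sd` as the standard deviation of zero-mean noise added to every weight is an assumption about what "mutation strength" means, not a description of the library's exact procedure.

```python
import random


def gaussian_mutate(weights, mutation_sd, rng):
    """Return a new list with zero-mean Gaussian noise added to each weight."""
    return [w + rng.gauss(0.0, mutation_sd) for w in weights]


weights = [0.5, -0.2, 0.1, 0.9]
mutated = gaussian_mutate(weights, mutation_sd=0.1, rng=random.Random(1))
# Reusing the same seed reproduces the same mutation.
repeat = gaussian_mutate(weights, mutation_sd=0.1, rng=random.Random(1))
```

A larger `mutation_sd` moves the weights further from their trained values, trading stability for exploration.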

rl_hyperparam_mutation(individual: IndividualType) → IndividualType

Perform a random mutation of a learning hyperparameter of an agent. A hyperparameter is sampled from those specified through the HyperparameterConfig passed during initialization of the agent. It is then mutated, and the optimizer is reinitialized if the learning rate was mutated.

Parameters:

individual (EvolvableAlgorithm) – Individual agent from population

Returns:

Individual from population with RL hyperparameter mutation

Return type:

EvolvableAlgorithm
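
The steps described for rl_hyperparam_mutation (sample a hyperparameter, mutate it, reinitialize the optimizer if the learning rate changed) can be sketched with plain dictionaries. Everything here is hypothetical: `mutate_hyperparameter`, the `limits` mapping, the key name `"lr"`, and the choice of scaling by 0.8 or 1.2 are assumptions for illustration and do not reflect how HyperparameterConfig is implemented.

```python
import random


def mutate_hyperparameter(hps, limits, rng):
    """Mutate one hyperparameter within its limits.

    Returns the new hyperparameters and whether the optimizer would need
    to be reinitialized (assumed true only when "lr" was mutated).
    """
    name = rng.choice(sorted(limits))       # sample which hyperparameter to mutate
    low, high = limits[name]
    factor = rng.choice([0.8, 1.2])         # shrink or grow (assumed scheme)
    value = min(max(hps[name] * factor, low), high)  # clamp to the allowed range
    if isinstance(hps[name], int):
        value = int(round(value))           # keep integer hyperparameters integral
    mutated = dict(hps)
    mutated[name] = value
    return mutated, name == "lr"


hps = {"lr": 1e-3, "batch_size": 64}
limits = {"lr": (1e-4, 1e-2), "batch_size": (8, 512)}
mutated, reinit_optimizer = mutate_hyperparameter(hps, limits, random.Random(1))
```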