Algorithms Mutations Registry¶

Parameters¶

class agilerl.algorithms.core.registry.RLParameter(min: float, max: float, shrink_factor: float = 0.8, grow_factor: float = 1.2, dtype: type[float] | type[int] | type[numpy.ndarray] = float)¶

Dataclass for storing the configuration of a hyperparameter that will be mutated during training. The hyperparameter is defined by a range of values that it can take, and the shrink and grow factors that will be used to mutate the hyperparameter value.

Parameters:

min (float) – The minimum value that the hyperparameter can take. For numpy arrays, this will be broadcast.
max (float) – The maximum value that the hyperparameter can take. For numpy arrays, this will be broadcast.
shrink_factor (float) – The factor by which the hyperparameter will be shrunk during mutation. Default is 0.8.
grow_factor (float) – The factor by which the hyperparameter will be grown during mutation. Default is 1.2.
dtype (type[float] | type[int] | type[np.ndarray]) – The data type of the hyperparameter. Default is float.
value (Number | np.ndarray | None) – The current value of the hyperparameter. Default is None.

dtype¶: alias of float

mutate() → Number | ndarray¶

Mutate the hyperparameter value by either growing or shrinking it.

For scalar values (int/float), the mutation applies the grow/shrink factor uniformly. For numpy arrays, the mutation is applied element-wise, with proper broadcasting of min/max constraints and preservation of the original array’s dtype.

Returns:: The mutated hyperparameter value.
Return type:: Number | np.ndarray
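
The grow-or-shrink step for a scalar can be pictured with a small stand-alone sketch. It does not import AgileRL: `SketchParameter` is a hypothetical stand-in, and both the equal-probability choice between growing and shrinking and the clamping of the result to [min, max] are assumptions made for illustration, not documented behavior. The array case is not shown.

```python
import random
from dataclasses import dataclass
from typing import Optional


@dataclass
class SketchParameter:
    """Hypothetical stand-in for a mutable hyperparameter (not AgileRL code)."""

    min: float
    max: float
    shrink_factor: float = 0.8
    grow_factor: float = 1.2
    value: Optional[float] = None

    def mutate(self, rng: random.Random) -> float:
        if self.value is None:
            raise ValueError("value must be set before mutating")
        # Assumption: grow or shrink with equal probability.
        factor = rng.choice([self.shrink_factor, self.grow_factor])
        new_value = self.value * factor
        # Assumption: the result is clamped to the configured range.
        if new_value < self.min:
            new_value = self.min
        elif new_value > self.max:
            new_value = self.max
        self.value = new_value
        return self.value


lr = SketchParameter(min=1e-4, max=1e-2, value=1e-3)
mutated_lr = lr.mutate(random.Random(0))
```

After one call, `mutated_lr` is the starting value scaled by either 0.8 or 1.2, and repeated calls can never leave the configured range.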

class agilerl.algorithms.core.registry.HyperparameterConfig(**kwargs: dict[str, RLParameter])¶

Stores the RL hyperparameters that will be mutated during training. For each hyperparameter, we store the name of the attribute where the hyperparameter is stored, and the range of values that the hyperparameter can take.

sample() → tuple[str, RLParameter]¶

Sample a hyperparameter from the configuration.

Returns:: The name of the hyperparameter and its configuration.
Return type:: tuple[str, RLParameter]
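
As a rough illustration of the name-to-configuration mapping and of `sample()`, the sketch below uses a hypothetical `SketchConfig` class with plain dictionaries in place of `RLParameter` objects. It does not import AgileRL, and uniform sampling over the registered names is an assumption.

```python
import random


class SketchConfig:
    """Hypothetical stand-in: maps attribute names to parameter specs."""

    def __init__(self, **kwargs: dict) -> None:
        # Each keyword is the name of the attribute holding the hyperparameter.
        self.config = dict(kwargs)

    def sample(self, rng: random.Random) -> tuple[str, dict]:
        # Assumption: every registered hyperparameter is equally likely.
        name = rng.choice(list(self.config))
        return name, self.config[name]


hp_config = SketchConfig(
    lr={"min": 1e-4, "max": 1e-2},
    batch_size={"min": 8, "max": 512},
)
name, spec = hp_config.sample(random.Random(0))
```

Here `name` is one of the keyword names passed to the constructor and `spec` is the entry stored under it, mirroring the `tuple[str, RLParameter]` return type.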

class agilerl.algorithms.core.registry.NetworkGroup¶

Dataclass for storing a group of networks. This consists of an evaluation network (i.e. a network that is optimized during training) and, optionally, some other networks that share parameters with the evaluation network (e.g. the target network in DQN). If the networks are passed as an agilerl.modules.base.ModuleDict object, we assume that the networks are part of a multi-agent setting.

Parameters:

eval_network (NetworkType) – The evaluation network.
shared_networks (NetworkType | None) – The list of shared networks.
policy (bool) – Whether the network is a policy (e.g. the network used to get the actions of the agent). There must be one network group in an algorithm which sets this to True. Default is False.

class agilerl.algorithms.core.registry.MutationRegistry(hp_config: HyperparameterConfig | None = None)¶

Registry that keeps track of the components of an algorithm that may evolve during training. It is interpreted by a Mutations object when performing evolutionary hyperparameter optimization. The registry holds:

The hyperparameter configuration of the algorithm.
The network groups of the algorithm.
The optimizers of the algorithm.
The mutation hooks of the algorithm (i.e. functions that are called after a mutation is performed).

Parameters:: hp_config (HyperparameterConfig) – The hyperparameter configuration of the algorithm.

all_registered() → list[str]¶

Return all of the members in the registry.

Returns:: A list of all the members in the registry.
Return type:: list[str]

networks() → list[NetworkConfig]¶

Get a list of network configurations in the registry.

Returns:: A list of network configurations in the registry. This includes the evaluation and shared networks.
Return type:: list[NetworkConfig]

property optimizer_networks: dict[str, list[str]]¶

Get a dictionary of optimizer names and the network attribute names that they update.

Returns:: A dictionary of optimizer names and the network attribute names that they update.
Return type:: dict[str, list[str]]
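
The shape of the returned dictionary can be sketched without AgileRL as follows. `SketchOptimizerConfig` and its `name` and `networks` fields are hypothetical names chosen for this example; the real OptimizerConfig may be laid out differently.

```python
from dataclasses import dataclass


@dataclass
class SketchOptimizerConfig:
    """Hypothetical stand-in for an optimizer configuration."""

    name: str            # attribute name of the optimizer
    networks: list[str]  # attribute names of the networks it updates


def optimizer_networks(optimizers: list[SketchOptimizerConfig]) -> dict[str, list[str]]:
    # Map each optimizer name to the network attribute names it updates.
    return {opt.name: list(opt.networks) for opt in optimizers}


mapping = optimizer_networks(
    [
        SketchOptimizerConfig("actor_optimizer", ["actor"]),
        SketchOptimizerConfig("critic_optimizer", ["critic_1", "critic_2"]),
    ]
)
```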

policy(return_group: bool = False) → str | NetworkGroup | None¶

Get the name of the policy network in the registry.

Parameters:: return_group (bool) – Whether to return the network group instead of just the name.
Returns:: The name of the policy network in the registry, or its network group if return_group is True.
Return type:: str | NetworkGroup | None
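
The lookup described here can be sketched as a search for the group flagged as the policy. The code below does not import AgileRL: `SketchGroup` is a hypothetical stand-in that stores the network's attribute name as a string, and returning None when no group is flagged is an assumption based on the return type.

```python
from dataclasses import dataclass
from typing import Optional, Union


@dataclass
class SketchGroup:
    """Hypothetical stand-in for a network group."""

    eval_network: str     # attribute name of the evaluation network
    policy: bool = False  # True for the group that selects actions


def find_policy(
    groups: list[SketchGroup], return_group: bool = False
) -> Optional[Union[str, SketchGroup]]:
    for group in groups:
        if group.policy:
            # Return the whole group or only the network name, as requested.
            return group if return_group else group.eval_network
    # Assumption: None signals that no policy group has been registered.
    return None


groups = [SketchGroup("critic"), SketchGroup("actor", policy=True)]
policy_name = find_policy(groups)
policy_group = find_policy(groups, return_group=True)
```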

register_group(group: NetworkGroup) → None¶

Register a network group in the registry.

Parameters:: group (NetworkGroup) – The network group to be registered.

register_hook(hook: Callable) → None¶

Register a hook in the registry by its name. The registry stores the names of the mutation hooks that will be applied after a mutation is performed.

Parameters:: hook (Callable) – The hook to be registered.
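
Storing a hook by name rather than as a callable can be illustrated with the stand-alone sketch below. `SketchRegistry`, `SketchAgent`, and `run_hooks` are hypothetical, and resolving the stored names back to methods with `getattr` is an assumption about how such names could be used, not a description of AgileRL's internals.

```python
from typing import Callable


class SketchRegistry:
    """Hypothetical stand-in that records hook names only."""

    def __init__(self) -> None:
        self.hooks: list[str] = []

    def register_hook(self, hook: Callable) -> None:
        # Keep the function name, not the callable itself.
        self.hooks.append(hook.__name__)


class SketchAgent:
    """Hypothetical algorithm object owning a registry and one hook."""

    def __init__(self) -> None:
        self.registry = SketchRegistry()
        self.hook_calls = 0
        self.registry.register_hook(self.after_mutation)

    def after_mutation(self) -> None:
        self.hook_calls += 1

    def run_hooks(self) -> None:
        # Assumption: names are resolved back to bound methods via getattr.
        for name in self.registry.hooks:
            getattr(self, name)()


agent = SketchAgent()
agent.run_hooks()
```

One consequence of this design is that the list of hooks is made of plain strings, so it can be copied along with the rest of the registry and re-bound to whichever object later owns it.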

register_optimizer(optimizer: OptimizerConfig) → None¶

Register an optimizer configuration in the registry.

Parameters:: optimizer (OptimizerConfig) – The optimizer configuration to be registered.