OptimizerWrapper¶

Parameters¶

Wrapper that initializes an optimizer and stores the metadata needed for evolutionary hyperparameter optimization. In AgileRL algorithms, all optimizers should be initialized through this wrapper. The stored metadata lets Mutations look up the networks each optimizer updates, so that the optimizer can be reinitialized after an individual is mutated.

Parameters:

optimizer_cls (type[torch.optim.Optimizer]) – The optimizer class to be initialized.
networks (EvolvableModule, ModuleDict) – The network(s) that the optimizer will update.
lr (float) – The learning rate of the optimizer.
optimizer_kwargs (dict[str, Any]) – The keyword arguments to be passed to the optimizer.
network_names (list[str]) – The attribute names of the networks in the parent container.
lr_name (str | tuple[str, str] | None) – Attribute name(s) of the learning rate(s) in the parent container: a single str, or a 2-tuple such as ("lr_actor", "lr_critic") when is_llm_optimizer is True.
is_llm_optimizer (bool) – If True, build actor/critic param groups via init_llm_optimizer() (single module only). Requires network_names, lr_name as a 2-tuple, and lr_critic.
lr_critic (float | None) – Learning rate for the critic/value-head group when is_llm_optimizer is True.
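
The role of network_names and lr_name can be illustrated with a dependency-free sketch. This is not AgileRL's implementation: ToySGD, ToyWrapper, ToyAgent, and the reinit method are hypothetical stand-ins, and plain lists of floats stand in for networks. Only the idea of storing attribute names so the optimizer can be rebuilt from the parent after a mutation is taken from the description above.

```python
from typing import Any, Optional


class ToySGD:
    """Hypothetical stand-in for a torch.optim.Optimizer subclass."""

    def __init__(self, networks: list, lr: float, **kwargs: Any) -> None:
        self.networks = networks  # the parameter lists this optimizer would update
        self.lr = lr
        self.kwargs = kwargs


class ToyWrapper:
    """Hypothetical wrapper: builds the optimizer and remembers how to rebuild it."""

    def __init__(
        self,
        optimizer_cls: type,
        networks: list,
        lr: float,
        optimizer_kwargs: Optional[dict] = None,
        network_names: Optional[list] = None,
        lr_name: Optional[str] = None,
    ) -> None:
        self.optimizer_cls = optimizer_cls
        self.optimizer_kwargs = optimizer_kwargs or {}
        self.network_names = network_names or []
        self.lr_name = lr_name
        self.optimizer = optimizer_cls(networks, lr=lr, **self.optimizer_kwargs)

    def reinit(self, parent: Any) -> None:
        """Rebuild the optimizer from the parent's current networks and learning rate."""
        networks = [getattr(parent, name) for name in self.network_names]
        lr = getattr(parent, self.lr_name)
        self.optimizer = self.optimizer_cls(networks, lr=lr, **self.optimizer_kwargs)


class ToyAgent:
    """Hypothetical parent container owning a network, a learning rate, and the wrapper."""

    def __init__(self) -> None:
        self.lr = 0.1
        self.actor = [0.0, 0.0]  # a "network" with two parameters
        self.optimizer = ToyWrapper(
            ToySGD, [self.actor], lr=self.lr, network_names=["actor"], lr_name="lr"
        )


agent = ToyAgent()

# Simulate a mutation: the network is replaced and the learning rate changes.
agent.actor = [0.0, 0.0, 0.0]
agent.lr = 0.05

# The stored attribute names are enough to point a fresh optimizer at the new network.
agent.optimizer.reinit(agent)
```

Storing names rather than object references is what makes the rebuild possible in this sketch: after the mutation, the old list object is no longer the agent's network, but the name "actor" still resolves to the current one.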

checkpoint_dict(name: str) → dict[str, Any]¶

Return a dictionary of the optimizer’s state and parameters.

Parameters: name (str) – The name of the optimizer.
Returns: A dictionary of the optimizer’s state and parameters.
Return type: dict[str, Any]

load_state_dict(state_dict: dict[str, Any] | dict[str, dict[str, Any]] | list[dict[str, Any]]) → None¶

Load the state of the optimizer from the passed state dictionary.

Parameters: state_dict (dict[str, Any] | dict[str, dict[str, Any]] | list[dict[str, Any]]) – State dictionary of the optimizer.

optimizer_cls_names() → str | dict[str, str]¶: Return the class name(s) of the optimizer(s).

state_dict() → dict[str, Any] | dict[str, dict[str, Any]] | list[dict[str, Any]]¶

Return the state of the optimizer as a dictionary.

Returns: State dictionary of the optimizer.
Return type: StateDict
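
A save-and-restore round trip through state_dict() and load_state_dict() might look like the sketch below. ToyOptimizer and ToyWrapper are hypothetical, dependency-free stand-ins rather than AgileRL classes, and the sketch assumes that the nested dict[str, dict[str, Any]] form holds one state dictionary per named optimizer. The "agent_0" and "agent_1" keys are illustrative only.

```python
import copy
from typing import Any, Union


class ToyOptimizer:
    """Hypothetical optimizer that only tracks a learning rate and a step counter."""

    def __init__(self, lr: float) -> None:
        self.lr = lr
        self.steps = 0

    def step(self) -> None:
        self.steps += 1

    def state_dict(self) -> dict:
        return {"lr": self.lr, "steps": self.steps}

    def load_state_dict(self, state: dict) -> None:
        self.lr = state["lr"]
        self.steps = state["steps"]


class ToyWrapper:
    """Hypothetical wrapper holding one optimizer, or one optimizer per name."""

    def __init__(self, optimizer: Union[ToyOptimizer, dict]) -> None:
        self.optimizer = optimizer

    def state_dict(self) -> Any:
        # Assumed behavior: the returned shape mirrors how the optimizers are held.
        if isinstance(self.optimizer, dict):
            return {name: opt.state_dict() for name, opt in self.optimizer.items()}
        return self.optimizer.state_dict()

    def load_state_dict(self, state: Any) -> None:
        if isinstance(self.optimizer, dict):
            for name, opt in self.optimizer.items():
                opt.load_state_dict(state[name])
        else:
            self.optimizer.load_state_dict(state)


source = ToyWrapper({"agent_0": ToyOptimizer(0.1), "agent_1": ToyOptimizer(0.01)})
source.optimizer["agent_0"].step()

# Copy the state so later steps on `source` cannot alter the snapshot.
snapshot = copy.deepcopy(source.state_dict())

target = ToyWrapper({"agent_0": ToyOptimizer(1.0), "agent_1": ToyOptimizer(1.0)})
target.load_state_dict(snapshot)
```

In this sketch the state passed to load_state_dict() must have the same shape as the one produced by state_dict(), which matches the identical union types in the two signatures above.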

step() → None¶: Perform a single optimization step.

zero_grad() → None¶: Zero the gradients of the parameters managed by the optimizer.
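
The usual ordering of zero_grad() and step() inside an update loop can be sketched without any dependencies. ToyParam and ToySGD below are hypothetical stand-ins for a parameter and a gradient-descent optimizer, and the gradient of the loss (w - 3)^2 is written out by hand in place of a backward pass.

```python
class ToyParam:
    """Hypothetical parameter holding a value and an accumulated gradient."""

    def __init__(self, value: float) -> None:
        self.value = value
        self.grad = 0.0


class ToySGD:
    """Hypothetical plain gradient-descent optimizer."""

    def __init__(self, params: list, lr: float) -> None:
        self.params = params
        self.lr = lr

    def zero_grad(self) -> None:
        # Clear gradients left over from the previous iteration.
        for p in self.params:
            p.grad = 0.0

    def step(self) -> None:
        # Move each parameter against its gradient.
        for p in self.params:
            p.value -= self.lr * p.grad


w = ToyParam(0.0)
optimizer = ToySGD([w], lr=0.1)

for _ in range(100):
    optimizer.zero_grad()             # 1. reset accumulated gradients
    w.grad += 2.0 * (w.value - 3.0)   # 2. accumulate d/dw of (w - 3)^2
    optimizer.step()                  # 3. apply the update
```

Because gradients accumulate in this sketch, skipping zero_grad() would add each new gradient to the previous ones and distort the update.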