EvolvableNetwork

Parameters

class agilerl.networks.base.EvolvableNetwork(*args, **kwargs)

Base class for evolvable networks, i.e., evolvable modules that are configured in a specific way for a reinforcement learning algorithm, similar to how CNNs are used as building blocks in ResNet, VGG, etc. An evolvable network automatically inspects the passed observation space to determine the appropriate encoder to build through the AgileRL evolvable modules, inheriting the mutation methods of any underlying evolvable module.

Note

Currently, an evolvable network should expose only two evolvable components: the encoder (built automatically from the observation space unless specified by the user) and a ‘head_net’ attribute that maps the latent encodings to the desired number of outputs. For example, in RainbowQNetwork, mutations are disabled for the advantage net, which instead receives the same mutations as the ‘value’ net, the network head in this case. Users should follow the same philosophy.
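The two-component layout described in the note can be pictured with a toy, framework-free sketch. `TwoStageSketch`, `toy_encoder`, and `toy_head` are hypothetical names invented for this example; they use plain Python lists instead of tensors and are not part of AgileRL.

```python
from typing import Callable, List

Vector = List[float]


class TwoStageSketch:
    """Toy stand-in for a network made of an encoder and a head (hypothetical)."""

    def __init__(
        self,
        encoder: Callable[[Vector], Vector],
        head_net: Callable[[Vector], Vector],
    ) -> None:
        self.encoder = encoder    # maps an observation to a latent encoding
        self.head_net = head_net  # maps the latent encoding to the outputs

    def forward(self, x: Vector) -> Vector:
        latent = self.encoder(x)
        return self.head_net(latent)


def toy_encoder(x: Vector) -> Vector:
    # Two-dimensional "latent" made of the sum and the maximum of the input.
    return [sum(x), max(x)]


def toy_head(z: Vector) -> Vector:
    # Single output computed from the latent encoding.
    return [z[0] - z[1]]


net = TwoStageSketch(toy_encoder, toy_head)
output = net.forward([1.0, 2.0, 3.0])
```

In this picture, mutations would touch only `encoder` and `head_net`; any further component would mirror the head's changes rather than mutate on its own.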

Parameters:
  • observation_space (spaces.Space) – Observation space of the environment.

  • encoder_cls (Optional[Union[str, Type[EvolvableModule]]]) – Encoder class to use for the network. Defaults to None, whereby it is automatically built using an AgileRL module according to the observation space.

  • encoder_config (Optional[ConfigType]) – Configuration of the encoder. Defaults to None.

  • action_space (Optional[spaces.Space]) – Action space of the environment. Defaults to None.

  • min_latent_dim (int) – Minimum dimension of the latent space representation. Defaults to 8.

  • max_latent_dim (int) – Maximum dimension of the latent space representation. Defaults to 128.

  • n_agents (Optional[int]) – Number of agents in the environment. Defaults to None, which corresponds to single-agent environments.

  • encoder_mutations (bool) – If True, allow mutations to the encoder. Defaults to False.

  • latent_dim (int) – Dimension of the latent space representation. Defaults to 32.

  • simba (bool) – If True, use a SimBa network for the encoder for vector spaces. Defaults to False.

  • device (DeviceType) – Device to use for the network. Defaults to “cpu”.
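As a rough illustration of how an encoder might be selected from the observation space when encoder_cls is None, the sketch below dispatches on a simplified observation spec. Everything in it is hypothetical: `pick_encoder_kind`, the spec format (a shape tuple, or a dict or list of shapes), and the returned labels are stand-ins chosen for this example. The mapping of flat vectors, image-shaped arrays, and composite observations to different encoder kinds is an assumption, not the library's actual logic.

```python
from collections.abc import Mapping


def pick_encoder_kind(obs_spec) -> str:
    """Return an illustrative encoder label for a simplified observation spec.

    obs_spec is a hypothetical stand-in for an observation space:
    a shape tuple such as (4,) or (3, 84, 84), or a dict/list of shapes
    for composite observations.
    """
    if isinstance(obs_spec, (Mapping, list)):
        # Composite observations: assume one sub-encoder per entry.
        return "multi_input"
    if len(obs_spec) == 3:
        # Three dimensions are treated as an image (channels, height, width).
        return "cnn"
    # Anything else is treated as a flat vector.
    return "mlp"


vector_kind = pick_encoder_kind((4,))
image_kind = pick_encoder_kind((3, 84, 84))
dict_kind = pick_encoder_kind({"image": (3, 84, 84), "state": (4,)})
```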

property activation: str

Activation function of the network.

Returns:

Activation function.

Return type:

str

add_latent_node(numb_new_nodes: int | None = None) → Dict[str, Any]

Add a latent node to the network.

Parameters:

numb_new_nodes (int, optional) – Number of new nodes to add, defaults to None

Returns:

Configuration for adding a latent node.

Return type:

Dict[str, Any]

build_network_head(*args, **kwargs) → None

Build the head of the network.

change_activation(activation: str, output: bool = False) → None

Change the activation function for the network.

Parameters:
  • activation (str) – Activation function to use.

  • output (bool, optional) – If True, change the output activation function, defaults to False

create_mlp(num_inputs: int, num_outputs: int, name: str, net_config: Dict[str, Any]) → EvolvableMLP

Builds the head of the network based on the passed configuration.

Parameters:
  • num_inputs (int) – Number of inputs to the network head.

  • num_outputs (int) – Number of outputs of the network head.

  • name (str) – Name of the network head.

  • net_config (Dict[str, Any]) – Configuration of the network head.

Returns:

Network head.

Return type:

EvolvableMLP

property encoder_config: Dict[str, Any]

Net configuration for encoder.

Returns:

Initial dictionary for the network.

Return type:

Dict[str, Any]

forward(x: Tensor | Dict[str, Tensor] | Tuple[Tensor, ...]) → Tensor

Forward pass of the network.

Parameters:

x (TorchObsType) – Input to the network.

Returns:

Output of the network.

Return type:

torch.Tensor

property head_config: Dict[str, Any]

Net configuration for head.

Returns:

Initial dictionary for the network.

Return type:

Dict[str, Any]

init_weights_gaussian(std_coeff: float = 4.0, output_coeff: float = 2.0) → None

Initialize the weights of the network with a Gaussian distribution.

Parameters:
  • std_coeff (float, optional) – Coefficient for the standard deviation of the Gaussian distribution, defaults to 4.0

  • output_coeff (float, optional) – Coefficient for the standard deviation of the Gaussian distribution for the output layer, defaults to 2.0

static modify_multi_agent_config(net_config: Dict[str, Any], observation_space: Space) → Dict[str, Any]

In multi-agent settings, the shape of the input to the encoder cannot be determined unambiguously from the passed observation space. If kernel sizes are passed as integers, a depth dimension of 1 is added for all layers. Note that for value functions, for example, the first layer should have a depth equal to the number of agents, so that the network produces a single output rather than self.n_agents outputs.
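The kernel-size adjustment described above can be sketched in plain Python. This is a minimal illustration, not the library's implementation: the helper name `add_depth_to_kernels`, the `kernel_size` config key, and the depth-first `(1, k, k)` layout are assumptions, and the sketch ignores the observation_space argument that the real method receives.

```python
from copy import deepcopy
from typing import Any, Dict


def add_depth_to_kernels(net_config: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of net_config whose integer kernel sizes gain a depth of 1.

    Hypothetical helper: assumes kernel sizes live under the "kernel_size" key.
    """
    config = deepcopy(net_config)  # leave the caller's config untouched
    kernels = config.get("kernel_size")
    if kernels and all(isinstance(k, int) for k in kernels):
        # An integer k becomes a 3D kernel with depth 1: (1, k, k).
        config["kernel_size"] = [(1, k, k) for k in kernels]
    return config


original = {"channel_size": [16, 32], "kernel_size": [3, 5]}
modified = add_depth_to_kernels(original)

# Kernel sizes that are already tuples are left as they are.
already_3d = add_depth_to_kernels({"kernel_size": [(2, 3, 3)]})
```

To follow the note about value functions, a caller would pass an explicit tuple whose depth equals the number of agents for the first layer, which this sketch leaves unchanged.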

modules() → Dict[str, EvolvableModule]

Modules of the network.

Returns:

Modules of the network.

Return type:

Dict[str, EvolvableModule]

recreate_encoder() → None

Recreate the encoder of the network.

remove_latent_node(numb_new_nodes: int | None = None) → Dict[str, Any]

Remove a latent node from the network.

Parameters:

numb_new_nodes (int, optional) – Number of nodes to remove, defaults to None

Returns:

Configuration for removing a latent node.

Return type:

Dict[str, Any]
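How add_latent_node and remove_latent_node might interact with latent_dim, min_latent_dim, and max_latent_dim can be sketched as follows. The class `LatentDimSketch` is hypothetical, and several behaviours are assumptions rather than documented facts: the candidate step sizes used when numb_new_nodes is None, the choice to skip a mutation that would leave the bounds, and the key of the returned dictionary. Only the default values (32, 8, 128) come from the parameter list above.

```python
import random
from typing import Any, Dict, Optional


class LatentDimSketch:
    """Toy model of latent-dimension mutations (not the AgileRL class)."""

    def __init__(
        self,
        latent_dim: int = 32,
        min_latent_dim: int = 8,
        max_latent_dim: int = 128,
    ) -> None:
        self.latent_dim = latent_dim
        self.min_latent_dim = min_latent_dim
        self.max_latent_dim = max_latent_dim

    def add_latent_node(self, numb_new_nodes: Optional[int] = None) -> Dict[str, Any]:
        if numb_new_nodes is None:
            # Assumed candidate step sizes, sampled at random.
            numb_new_nodes = random.choice([8, 16, 32])
        # Assumed policy: apply the change only if it stays within the upper bound.
        if self.latent_dim + numb_new_nodes <= self.max_latent_dim:
            self.latent_dim += numb_new_nodes
        return {"numb_new_nodes": numb_new_nodes}

    def remove_latent_node(self, numb_new_nodes: Optional[int] = None) -> Dict[str, Any]:
        if numb_new_nodes is None:
            numb_new_nodes = random.choice([8, 16, 32])
        # Assumed policy: apply the change only if it stays within the lower bound.
        if self.latent_dim - numb_new_nodes >= self.min_latent_dim:
            self.latent_dim -= numb_new_nodes
        return {"numb_new_nodes": numb_new_nodes}


net = LatentDimSketch()
net.add_latent_node(16)       # 32 + 16 = 48, within bounds
after_add = net.latent_dim
net.add_latent_node(100)      # 48 + 100 exceeds 128, so nothing changes
after_rejected_add = net.latent_dim
net.remove_latent_node(32)    # 48 - 32 = 16, within bounds
after_remove = net.latent_dim
net.remove_latent_node(16)    # 16 - 16 falls below 8, so nothing changes
after_rejected_remove = net.latent_dim
```

After a real mutation changes the latent dimension, the encoder and head would need rebuilding so that their shapes match; recreate_encoder is the documented hook for the encoder side.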