Evolvable Multi-Input Neural Network (Dict / Tuple Observations)

Parameters

class agilerl.modules.multi_input.EvolvableMultiInput(*args: Any, **kwargs: Any)

Evolvable multi-input network for Tuple or Dict observation spaces. It inspects the observation space to decide which feature extractor to build for each key: an EvolvableCNN for image subspaces and an nn.Flatten() for all other types. The flattened vector observations are concatenated with the extracted features and passed through a final EvolvableMLP to produce the output tensor. Optionally, an additional EvolvableMLP can be applied to the concatenated vector observations before they are joined with the extracted features.
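The per-key routing described above can be sketched in plain Python. This is an illustration only, not the library's implementation: the rule "a rank-3 shape is an image" and the key names camera, position, and velocity are assumptions made for the example.

```python
from typing import Dict, Tuple


def choose_extractor(shape: Tuple[int, ...]) -> str:
    """Pick an extractor name from a subspace shape (illustrative rule only)."""
    # Assumption: a rank-3 shape such as (channels, height, width) is an image.
    return "EvolvableCNN" if len(shape) == 3 else "nn.Flatten"


# Hypothetical Dict observation space, described only by its subspace shapes.
observation_shapes: Dict[str, Tuple[int, ...]] = {
    "camera": (3, 64, 64),
    "position": (4,),
    "velocity": (2,),
}

# Map every observation key to the kind of extractor it would receive.
extractors = {
    key: choose_extractor(shape) for key, shape in observation_shapes.items()
}
print(extractors)
```

Under these assumptions the image key is routed to a CNN and the two vector keys are simply flattened.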

Supports the following types of architecture mutations during training:

  • Adding or removing latent nodes

  • Inherits the mutation methods of any nested EvolvableModule objects used in the network

Parameters:
  • observation_space (spaces.Dict or spaces.Tuple) – Dictionary or Tuple space of observations.

  • num_outputs (int) – Dimension of the output tensor.

  • latent_dim (int, optional) – Dimension of the latent space representation. Default is 16.

  • vector_space_mlp (bool, optional) – Whether to use an MLP for the vector spaces. This is done by concatenating the flattened observations and passing them through an EvolvableMLP. Default is False, whereby the observations are concatenated directly to the feature encodings before the final MLP.

  • cnn_config (MultiInputConfigType, optional) – Configuration for the CNN feature extractor. Default is None.

  • mlp_config (MultiInputConfigType, optional) – Configuration for the MLP feature extractor. Default is None.

  • init_dicts (dict[str, dict[str, Any]], optional) – Dictionary of initialization dictionaries for the feature extractors. Default is {}.

  • output_activation (str | None, optional) – Activation function for the output layer. Default is None.

  • min_latent_dim (int, optional) – Minimum dimension of the latent space. Default is 8.

  • max_latent_dim (int, optional) – Maximum dimension of the latent space. Default is 128.

  • device (str, optional) – Device to use for the network. Default is “cpu”.

  • name (str, optional) – Name of the network. Default is “multi_input”.

  • random_seed (int | None, optional) – Random seed to use for the network. Default is None.
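For quick reference, the documented defaults can be collected into a keyword-argument dictionary. The sketch below only restates the list above. The value for num_outputs is a placeholder (it has no documented default), and the bounds check is an assumption about how latent_dim relates to min_latent_dim and max_latent_dim, not a documented validation step.

```python
from typing import Any, Dict

# Keyword arguments mirroring the documented defaults. num_outputs has no
# default, so 4 is a placeholder chosen purely for illustration.
multi_input_kwargs: Dict[str, Any] = {
    "num_outputs": 4,
    "latent_dim": 16,
    "vector_space_mlp": False,
    "cnn_config": None,
    "mlp_config": None,
    "init_dicts": {},
    "output_activation": None,
    "min_latent_dim": 8,
    "max_latent_dim": 128,
    "device": "cpu",
    "name": "multi_input",
    "random_seed": None,
}


def latent_dim_in_bounds(kwargs: Dict[str, Any]) -> bool:
    """Check min_latent_dim <= latent_dim <= max_latent_dim (assumed invariant)."""
    return kwargs["min_latent_dim"] <= kwargs["latent_dim"] <= kwargs["max_latent_dim"]


print(latent_dim_in_bounds(multi_input_kwargs))
```

A dictionary like this would presumably be unpacked alongside the required observation_space argument when constructing the network.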

property activation: str

Get the activation function for the network.

Returns:

Activation function

Return type:

str

add_latent_node(numb_new_nodes: int | None = None) → dict[str, Any]

Add a latent node to the network.

Parameters:

numb_new_nodes (int, optional) – Number of new nodes to add, defaults to None

Returns:

Dictionary specifying the number of nodes added.

Return type:

dict[str, Any]

build_feature_extractor() → dict[str, EvolvableCNN | EvolvableMLP | EvolvableLSTM | SelfMultiInput]

Create the feature extractor and final MLP networks.

Returns:

Dictionary of feature extractors.

Return type:

dict[str, EvolvableMLP | EvolvableCNN | EvolvableLSTM | EvolvableMultiInput]

calc_extracted_features_dim() → int

Calculate the total dimension of the features extracted by the evolvable feature extractors.

Returns:

Total dimension of the extracted features.

Return type:

int
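The arithmetic behind such a calculation can be sketched as follows. This is a hedged illustration rather than the library's code: it assumes that each image extractor emits latent_dim features, that the optional vector MLP also emits latent_dim features, and that rank-3 shapes are images. The function name feature_dims and the example shapes are hypothetical.

```python
from math import prod
from typing import Dict, Tuple


def feature_dims(
    shapes: Dict[str, Tuple[int, ...]],
    latent_dim: int = 16,
    vector_space_mlp: bool = False,
) -> Tuple[int, int]:
    """Return (extracted feature dim, input dim of the final MLP)."""
    # Assumption: rank-3 subspaces are images, each encoded to latent_dim.
    n_images = sum(1 for shape in shapes.values() if len(shape) == 3)
    # All remaining subspaces are flattened and concatenated.
    vector_dim = sum(prod(shape) for shape in shapes.values() if len(shape) != 3)

    extracted = n_images * latent_dim
    if vector_space_mlp and vector_dim > 0:
        # Assumption: the vector MLP encodes all vectors into latent_dim features.
        extracted += latent_dim
        return extracted, extracted
    # Otherwise the raw flattened vectors are appended to the encodings.
    return extracted, extracted + vector_dim


shapes = {"camera": (3, 64, 64), "position": (4,), "velocity": (2,)}
print(feature_dims(shapes))
print(feature_dims(shapes, vector_space_mlp=True))
```

With one image key and six vector entries, the sketch gives a final MLP input of 16 + 6 = 22 without the vector MLP and 16 + 16 = 32 with it.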

change_activation(activation: str, output: bool = False) → None

Set the activation function for the network.

Parameters:
  • activation (str) – Activation function to use.

  • output (bool, optional) – Flag indicating whether to set the output activation function, defaults to False

Returns:

None. The activation function is set on the network in place.

Return type:

None

property cnn_init_dict: dict[str, Any]

Return the initialization dictionary for the CNN.

forward(x: dict[str, ndarray | Tensor] | tuple[ndarray | Tensor]) → Tensor

Forward pass of the composed network. Extracts features from each observation key and concatenates them with the vector observations, where present. The concatenated features are then passed through the final MLP to produce the output tensor.

Parameters:

x (dict[str, ArrayOrTensor] | tuple[ArrayOrTensor]) – Dictionary or tuple of observations.

Returns:

Output tensor.

Return type:

torch.Tensor
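The data flow of the forward pass can be sketched without tensors. In this minimal illustration, nested Python lists stand in for arrays, and the extractor and head are hypothetical stubs rather than the real EvolvableCNN and EvolvableMLP. Only the order of operations (extract, flatten, concatenate, final head) is meant to mirror the description above.

```python
from typing import Any, Callable, Dict, List

Vector = List[float]


def flatten(values: Any) -> Vector:
    """Recursively flatten nested lists or tuples into a flat list of floats."""
    if isinstance(values, (list, tuple)):
        flat: Vector = []
        for item in values:
            flat.extend(flatten(item))
        return flat
    return [float(values)]


def forward_sketch(
    obs: Dict[str, Any],
    extractors: Dict[str, Callable[[Any], Vector]],
    head: Callable[[Vector], Vector],
) -> Vector:
    features: Vector = []
    vectors: Vector = []
    for key, value in obs.items():
        if key in extractors:
            # Keys with an extractor (e.g. images) are encoded first.
            features.extend(extractors[key](value))
        else:
            # All other keys are flattened as-is.
            vectors.extend(flatten(value))
    # Concatenate encodings with the flattened vectors, then apply the head.
    return head(features + vectors)


def pool(image: Any) -> Vector:
    """Stub extractor: summarise an image by its mean and max pixel value."""
    flat = flatten(image)
    return [sum(flat) / len(flat), max(flat)]


obs = {"camera": [[0.0, 1.0], [2.0, 3.0]], "position": [0.5, -0.5]}
# The identity head exposes the concatenated vector for inspection.
output = forward_sketch(obs, {"camera": pool}, head=lambda z: z)
print(output)
```

Swapping the stubs for real modules and the list concatenation for a tensor concatenation along the feature axis would give the same overall structure.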

get_inner_init_dict(key: str, default: ModuleType) → dict[str, dict[str, Any] | Any]

Return the initialization dictionary for the specified key.

Parameters:
  • key (str) – Key of the observation space.

  • default (ModuleType) – Default value to return if the key is not found.

Returns:

Initialization dictionary.

Return type:

ConfigType
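One plausible reading of this lookup is a per-key override with a fallback to a shared configuration for the given module type. The sketch below illustrates that reading only. The helper name inner_init_dict, the shared_defaults mapping, and all configuration values are assumptions for the example, not the library's actual behaviour.

```python
from typing import Any, Dict

ConfigDict = Dict[str, Any]

# Hypothetical shared configurations, keyed by module type.
shared_defaults: Dict[str, ConfigDict] = {
    "cnn": {"channel_size": [32], "kernel_size": [3], "stride_size": [1]},
    "mlp": {"hidden_size": [64]},
}

# Hypothetical per-key overrides, as might be supplied through init_dicts.
init_dicts: Dict[str, ConfigDict] = {
    "camera": {"channel_size": [16], "kernel_size": [5], "stride_size": [2]},
}


def inner_init_dict(
    init_dicts: Dict[str, ConfigDict],
    shared_defaults: Dict[str, ConfigDict],
    key: str,
    default: str,
) -> ConfigDict:
    """Return the override for `key`, else a copy of the shared `default` config."""
    # Copy the result so callers cannot mutate the stored configuration.
    return dict(init_dicts.get(key, shared_defaults[default]))


print(inner_init_dict(init_dicts, shared_defaults, "camera", "cnn"))
print(inner_init_dict(init_dicts, shared_defaults, "depth", "cnn"))
```

Here the camera key uses its own configuration, while the unlisted depth key falls back to the shared CNN settings.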

property init_dicts: dict[str, dict[str, Any]]

Return the initialization dictionaries for the network.

Returns:

Initialization dictionaries

Return type:

dict[str, dict[str, Any]]

init_weights_gaussian(std_coeff: float = 4, output_coeff: float = 4) → None

Initialise weights of linear layers using a Gaussian distribution.

property mlp_init_dict: dict[str, Any]

Return the initialization dictionary for the MLP.

property net_config: dict[str, Any]

Return the configuration of the network.

Returns:

Network configuration

Return type:

dict[str, Any]

recreate_network() → None

Recreates the network with the new latent dimension.

remove_latent_node(numb_new_nodes: int | None = None) → dict[str, Any]

Remove a latent node from the network.

Parameters:

numb_new_nodes (int, optional) – Number of nodes to remove, defaults to None

Returns:

Dictionary specifying the number of nodes removed.

Return type:

dict[str, Any]
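The two latent-node mutations can be sketched together as bounded updates to latent_dim. This is a minimal sketch under stated assumptions, not the library's implementation: the candidate step sizes used when numb_new_nodes is None, the rule that an out-of-bounds change is skipped, and the key name in the returned dictionary are all assumptions. The class LatentDimSketch is hypothetical.

```python
import random
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class LatentDimSketch:
    """Tracks only the latent dimension and its documented default bounds."""

    latent_dim: int = 16
    min_latent_dim: int = 8
    max_latent_dim: int = 128

    def add_latent_node(self, numb_new_nodes: Optional[int] = None) -> Dict[str, int]:
        if numb_new_nodes is None:
            # Assumed candidate step sizes; the real choice may differ.
            numb_new_nodes = random.choice([8, 16, 32])
        # Assumption: the mutation is skipped if it would exceed the maximum.
        if self.latent_dim + numb_new_nodes <= self.max_latent_dim:
            self.latent_dim += numb_new_nodes
        return {"numb_new_nodes": numb_new_nodes}

    def remove_latent_node(self, numb_new_nodes: Optional[int] = None) -> Dict[str, int]:
        if numb_new_nodes is None:
            numb_new_nodes = random.choice([8, 16, 32])
        # Assumption: the mutation is skipped if it would drop below the minimum.
        if self.latent_dim - numb_new_nodes >= self.min_latent_dim:
            self.latent_dim -= numb_new_nodes
        return {"numb_new_nodes": numb_new_nodes}


net = LatentDimSketch()
net.add_latent_node(16)      # 16 + 16 stays within the maximum of 128
net.remove_latent_node(16)   # 32 - 16 stays at or above the minimum of 8
net.remove_latent_node(16)   # 16 - 16 would fall below 8, so it is skipped
print(net.latent_dim)
```

In the real module, a change to latent_dim would be followed by rebuilding the network, which is the role recreate_network() plays in the reference above.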