Evolvable Long Short-Term Memory (LSTM)

Parameters

class agilerl.modules.lstm.EvolvableLSTM(*args: Any, **kwargs: Any)

The Evolvable Long Short-Term Memory (LSTM) class.

Parameters:
  • input_size (int) – Size of input features

  • hidden_state_size (int) – Size of hidden state

  • num_outputs (int) – Output dimension

  • num_layers (int) – Number of LSTM layers stacked together

  • output_activation (str, optional) – Output activation layer, defaults to None

  • min_hidden_state_size (int, optional) – Minimum hidden state size, defaults to 32

  • max_hidden_state_size (int, optional) – Maximum hidden state size, defaults to 512

  • min_layers (int, optional) – Minimum number of LSTM layers, defaults to 1

  • max_layers (int, optional) – Maximum number of LSTM layers, defaults to 3

  • dropout (float, optional) – Dropout probability between LSTM layers, defaults to 0.0

  • device (str, optional) – Device for accelerated computing, ‘cpu’ or ‘cuda’, defaults to ‘cpu’

  • name (str, optional) – Name of the network, defaults to ‘lstm’

  • random_seed (int | None) – Random seed to use for the network, defaults to None
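The size and depth bounds above limit what later mutations can produce, so it can help to sanity-check a configuration before building a network. The following is a minimal sketch under the assumption that the initial values are expected to lie within their bounds; this page does not state whether the constructor enforces that. `check_lstm_config` is a hypothetical helper, not part of AgileRL, and its defaults simply mirror the ones documented above.

```python
def check_lstm_config(config: dict) -> list[str]:
    """Return a list of problems found in a constructor-style config dict.

    Hypothetical helper for illustration only; not part of AgileRL.
    """
    hidden = config["hidden_state_size"]
    layers = config["num_layers"]
    # Fall back to the documented defaults when a bound is not given.
    min_hidden = config.get("min_hidden_state_size", 32)
    max_hidden = config.get("max_hidden_state_size", 512)
    min_layers = config.get("min_layers", 1)
    max_layers = config.get("max_layers", 3)
    dropout = config.get("dropout", 0.0)

    problems = []
    if not min_hidden <= hidden <= max_hidden:
        problems.append(
            f"hidden_state_size {hidden} outside [{min_hidden}, {max_hidden}]"
        )
    if not min_layers <= layers <= max_layers:
        problems.append(f"num_layers {layers} outside [{min_layers}, {max_layers}]")
    if not 0.0 <= dropout <= 1.0:
        problems.append(f"dropout {dropout} is not a probability")
    return problems


config = {
    "input_size": 8,
    "hidden_state_size": 64,
    "num_outputs": 4,
    "num_layers": 2,
}
print(check_lstm_config(config))                        # no problems expected
print(check_lstm_config({**config, "num_layers": 5}))   # depth above max_layers
```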

property activation: str

Return activation function.

add_layer() → None

Add an LSTM layer to the network. Falls back on add_node() if max layers reached.
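The fallback described above can be sketched with a toy stand-in that tracks only the fields the decision needs. This is an illustration of the documented behaviour, not the library's code: the `LayerBounds` class, its default step of 16 nodes, and the unconditional growth inside `add_node` are all assumptions.

```python
from dataclasses import dataclass


@dataclass
class LayerBounds:
    """Toy stand-in (hypothetical) holding only depth and width."""

    num_layers: int = 1
    max_layers: int = 3
    hidden_state_size: int = 64

    def add_node(self, numb_new_nodes: int = 16) -> dict[str, int]:
        # Assumed behaviour: widen the hidden state by a fixed step.
        self.hidden_state_size += numb_new_nodes
        return {"numb_new_nodes": numb_new_nodes}

    def add_layer(self) -> None:
        if self.num_layers < self.max_layers:
            self.num_layers += 1
        else:
            # Depth limit reached: fall back on add_node(), as documented.
            self.add_node()


net = LayerBounds(num_layers=2)
net.add_layer()  # depth grows from 2 to 3
net.add_layer()  # already at max_layers, so the hidden size grows instead
print(net.num_layers, net.hidden_state_size)
```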

add_node(numb_new_nodes: int | None = None) → dict[str, int]

Increases hidden size of the LSTM.

Parameters:

numb_new_nodes (int, optional) – Number of nodes to add to hidden size, defaults to None

Returns:

Dictionary with number of new nodes

Return type:

dict[str, int]
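A minimal sketch of how an optional node count might be resolved and applied. The candidate step sizes, the rule that growth is skipped when it would exceed the maximum, and the `numb_new_nodes` key in the returned dictionary are assumptions made for illustration; this page only documents the parameter and the return type.

```python
import random


def add_node(hidden_state_size, max_hidden_state_size, numb_new_nodes=None, rng=None):
    """Return the (possibly) enlarged hidden size and an info dictionary.

    Free-function sketch; the real method mutates the network in place.
    """
    rng = rng or random.Random()
    if numb_new_nodes is None:
        # Assumed candidate step sizes, chosen at random when none is given.
        numb_new_nodes = rng.choice([16, 32, 64])
    # Assumed rule: only grow if the result stays within the upper bound.
    if hidden_state_size + numb_new_nodes <= max_hidden_state_size:
        hidden_state_size += numb_new_nodes
    return hidden_state_size, {"numb_new_nodes": numb_new_nodes}


size, info = add_node(64, 512, numb_new_nodes=32)
print(size, info)

# Near the upper bound the size is left unchanged in this sketch.
capped_size, _ = add_node(500, 512, numb_new_nodes=32)
print(capped_size)
```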

change_activation(activation: str, output: bool = False) → None

Set the output activation function for the network.

Parameters:
  • activation (str) – Activation function to use.

  • output (bool, optional) – Flag indicating whether to set the output activation function, defaults to False

create_lstm() → ModuleDict

Creates and returns an LSTM network with the current configuration.

Returns:

LSTM network

Return type:

nn.ModuleDict

forward(x: ndarray | Tensor, hidden_state: dict[str, ndarray | Tensor] | None = None) → tuple[Tensor, dict[str, Tensor]]

Forward pass of the network.

Parameters:
  • x (ArrayOrTensor) – Input tensor

  • hidden_state (dict[str, torch.Tensor] | None) – Dict containing hidden and cell states, defaults to None

Returns:

Output tensor and next hidden state

Return type:

tuple[torch.Tensor, dict[str, torch.Tensor]]
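Because forward() returns the next hidden state alongside the output, the caller is responsible for passing that state back in on the following step. The dependency-free sketch below shows the calling pattern with a single-unit LSTM cell whose weights are fixed at 1 and biases at 0. `lstm_step` and the `"h"` and `"c"` keys are hypothetical stand-ins; the actual key names and tensor shapes used by the library are not given on this page.

```python
import math


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def lstm_step(x, hidden_state=None):
    """One step of a scalar LSTM cell (all weights 1, all biases 0).

    Mirrors the call shape of forward(): returns (output, next_hidden_state).
    """
    if hidden_state is None:
        # No state supplied: start from zero hidden and cell states.
        hidden_state = {"h": 0.0, "c": 0.0}
    h, c = hidden_state["h"], hidden_state["c"]

    z = x + h                 # shared pre-activation, since every weight is 1
    gate = sigmoid(z)         # input, forget and output gates coincide here
    candidate = math.tanh(z)  # candidate cell update
    c_next = gate * c + gate * candidate
    h_next = gate * math.tanh(c_next)
    return h_next, {"h": h_next, "c": c_next}


hidden = None  # first call uses the default state
outputs = []
for x in [0.5, -0.2, 0.1]:
    out, hidden = lstm_step(x, hidden)  # feed the returned state back in
    outputs.append(out)
print(outputs, hidden)
```

Resetting `hidden` to None at an episode boundary restarts the sequence from the zero state in this sketch.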

get_output_dense() → Module

Return output layer of neural network.

property hidden_state_architecture: dict[str, tuple[int, ...]]

Return the hidden state architecture.
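Given the return type, the architecture maps state names to shapes, which is enough to build a zero-initialised hidden state. The sketch below does this with plain nested lists so that it has no dependencies; with PyTorch one would typically call torch.zeros(shape) instead. The key names and the (layers, batch, hidden) layout in `architecture` are assumptions, not values taken from the library.

```python
def zeros(shape):
    """Build a nested list of 0.0 values with the given shape."""
    if not shape:
        return 0.0
    return [zeros(shape[1:]) for _ in range(shape[0])]


# Hypothetical architecture: two layers, batch of 1, hidden size 4.
architecture = {"h": (2, 1, 4), "c": (2, 1, 4)}

initial_state = {name: zeros(shape) for name, shape in architecture.items()}
print(initial_state["h"])
```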

property net_config: dict[str, Any]

Return model configuration in dictionary format.

recreate_network() → None

Recreates the LSTM network with current parameters.

remove_layer() → None

Remove an LSTM layer from the network. Falls back on add_node() if min layers reached.

remove_node(numb_new_nodes: int | None = None) → dict[str, int]

Decreases hidden size of the LSTM.

Parameters:

numb_new_nodes (int, optional) – Number of nodes to remove from hidden size, defaults to None

Returns:

Dictionary with the number of nodes removed

Return type:

dict[str, int]