Evolvable Long Short-Term Memory (LSTM)

Parameters

class agilerl.modules.lstm.EvolvableLSTM(*args: Any, **kwargs: Any)

The Evolvable Long Short-Term Memory (LSTM) class.

Parameters:

input_size (int) – Size of input features
hidden_state_size (int) – Size of hidden state
num_outputs (int) – Output dimension
num_layers (int) – Number of LSTM layers stacked together
output_activation (str, optional) – Output activation layer, defaults to None
min_hidden_state_size (int, optional) – Minimum hidden state size, defaults to 32
max_hidden_state_size (int, optional) – Maximum hidden state size, defaults to 512
min_layers (int, optional) – Minimum number of LSTM layers, defaults to 1
max_layers (int, optional) – Maximum number of LSTM layers, defaults to 3
dropout (float, optional) – Dropout probability between LSTM layers, defaults to 0.0
device (str, optional) – Device for accelerated computing, ‘cpu’ or ‘cuda’, defaults to ‘cpu’
name (str, optional) – Name of the network, defaults to ‘lstm’
random_seed (int | None, optional) – Random seed to use for the network, defaults to None

add_layer() → None: Adds an LSTM layer to the network. Falls back on add_node() if the maximum number of layers has been reached.

add_node(numb_new_nodes: int | None = None) → dict[str, int]

Increases the hidden size of the LSTM.

Parameters: numb_new_nodes (int, optional) – Number of nodes to add to hidden size, defaults to None
Returns: Dictionary with number of new nodes
Return type: dict[str, int]

change_activation(activation: str, output: bool = False) → None

Set the output activation function for the network.

Parameters:

activation (str) – Activation function to use.
output (bool, optional) – Flag indicating whether to set the output activation function, defaults to False

create_lstm() → ModuleDict

Creates and returns an LSTM network with the current configuration.

forward(x: ndarray | Tensor, hidden_state: dict[str, ndarray | Tensor] | None = None) → tuple[Tensor, dict[str, Tensor]]

Forward pass of the network.

Parameters:

x (ArrayOrTensor) – Input tensor
hidden_state (dict[str, torch.Tensor] | None) – Dict containing hidden and cell states, defaults to None

Returns:

Output tensor and next hidden state

Return type:

tuple[torch.Tensor, dict[str, torch.Tensor]]
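
The calling pattern described above (an input plus an optional dict of hidden and cell states, returning an output and the next state dict) can be sketched with plain PyTorch. This is a minimal illustration, not AgileRL's implementation: the class `TinyLSTMHead`, the dict keys `"h"` and `"c"`, the batch-first input layout, and the choice to read the last time step are all assumptions made for the example.

```python
import torch
from torch import nn


class TinyLSTMHead(nn.Module):
    """Hypothetical stand-in for an LSTM that carries its state in a dict."""

    def __init__(self, input_size, hidden_state_size, num_outputs, num_layers=1):
        super().__init__()
        self.lstm = nn.LSTM(
            input_size=input_size,
            hidden_size=hidden_state_size,
            num_layers=num_layers,
            batch_first=True,  # assumed layout: (batch, sequence, features)
        )
        self.head = nn.Linear(hidden_state_size, num_outputs)

    def forward(self, x, hidden_state=None):
        # Accept NumPy arrays as well as tensors.
        x = torch.as_tensor(x, dtype=torch.float32)
        # With no state supplied, nn.LSTM starts from zero-valued states.
        state = None
        if hidden_state is not None:
            state = (hidden_state["h"], hidden_state["c"])
        out, (h, c) = self.lstm(x, state)
        # Map the last time step to the output dimension.
        return self.head(out[:, -1, :]), {"h": h, "c": c}


net = TinyLSTMHead(input_size=4, hidden_state_size=8, num_outputs=2, num_layers=2)
x = torch.zeros(3, 5, 4)    # batch of 3 sequences, 5 steps, 4 features
y, state = net(x)           # first call: no hidden state supplied
y2, state2 = net(x, state)  # later calls: feed the returned state back in
```

Passing the returned dict back into the next call is what lets the state persist across consecutive steps of a sequence or episode.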

property hidden_state_architecture: dict[str, tuple[int, ...]]: Return the hidden state architecture.
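
As a rough illustration of what a mapping from state names to shapes might look like, the sketch below follows the `torch.nn.LSTM` convention for a unidirectional LSTM, where both the hidden and cell states have shape (num_layers, batch, hidden size). The function name `hidden_state_shapes`, the keys `"h"` and `"c"`, and the default batch size of 1 are assumptions for the example; the keys and shapes the property actually returns are not stated in this reference.

```python
def hidden_state_shapes(
    num_layers: int, hidden_state_size: int, batch_size: int = 1
) -> dict[str, tuple[int, ...]]:
    """Hypothetical helper: shapes of the hidden and cell states."""
    # torch.nn.LSTM uses (num_layers, batch, hidden_size) for both states
    # when the LSTM is unidirectional.
    shape = (num_layers, batch_size, hidden_state_size)
    return {"h": shape, "c": shape}


shapes = hidden_state_shapes(num_layers=2, hidden_state_size=64)
```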

property net_config: dict[str, Any]: Return model configuration in dictionary format.

recreate_network() → None: Recreates the LSTM network with current parameters.

remove_layer() → None: Removes an LSTM layer from the network. Falls back on add_node() if the minimum number of layers has been reached.

remove_node(numb_new_nodes: int | None = None) → dict[str, int]

Decreases the hidden size of the LSTM.

Parameters: numb_new_nodes (int, optional) – Number of nodes to remove from hidden size, defaults to None
Returns: Dictionary with number of new nodes
Return type: dict[str, int]
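
The mutation methods documented above (add_layer, add_node, remove_layer, remove_node) can be pictured with a toy class that tracks only the sizes. This is a sketch under stated assumptions, not the library's code: the class `ToyLSTMConfig`, the candidate step sizes, the rule that an out-of-bounds change is skipped, and the return key `"numb_new_nodes"` are all hypothetical. Only the bounds and the fallback to add_node() come from the descriptions above.

```python
import random


class ToyLSTMConfig:
    """Hypothetical size tracker mirroring the documented mutation methods."""

    def __init__(self, hidden_state_size=64, num_layers=1,
                 min_hidden_state_size=32, max_hidden_state_size=512,
                 min_layers=1, max_layers=3, random_seed=None):
        self.hidden_state_size = hidden_state_size
        self.num_layers = num_layers
        self.min_hidden_state_size = min_hidden_state_size
        self.max_hidden_state_size = max_hidden_state_size
        self.min_layers = min_layers
        self.max_layers = max_layers
        self.rng = random.Random(random_seed)

    def _step(self, numb_new_nodes):
        # Assumed candidate step sizes when the caller passes None.
        if numb_new_nodes is None:
            return self.rng.choice([16, 32, 64])
        return numb_new_nodes

    def add_node(self, numb_new_nodes=None):
        n = self._step(numb_new_nodes)
        # Assumption: skip the change if it would exceed the upper bound.
        if self.hidden_state_size + n <= self.max_hidden_state_size:
            self.hidden_state_size += n
        return {"numb_new_nodes": n}

    def remove_node(self, numb_new_nodes=None):
        n = self._step(numb_new_nodes)
        # Assumption: skip the change if it would drop below the lower bound.
        if self.hidden_state_size - n >= self.min_hidden_state_size:
            self.hidden_state_size -= n
        return {"numb_new_nodes": n}

    def add_layer(self):
        if self.num_layers < self.max_layers:
            self.num_layers += 1
        else:
            self.add_node()  # documented fallback at max layers

    def remove_layer(self):
        if self.num_layers > self.min_layers:
            self.num_layers -= 1
        else:
            self.add_node()  # documented fallback at min layers


cfg = ToyLSTMConfig(hidden_state_size=480, num_layers=3)
grown = cfg.add_node(32)      # 480 + 32 reaches the 512 upper bound
cfg.add_layer()               # already at max_layers, so add_node() is tried
size_after_add_layer = cfg.hidden_state_size  # unchanged: any step exceeds 512
cfg.remove_node(64)           # 512 - 64 = 448, still above the minimum
cfg.remove_layer()            # 3 layers is above min_layers, so one is removed
```

In this sketch a fallback at the layer limit can be a no-op when the hidden size is also at its bound, which keeps every mutation within the configured limits.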