Evolvable Convolutional Neural Network (CNN)

Parameters

class agilerl.modules.cnn.EvolvableCNN(*args, **kwargs)

The Evolvable Convolutional Neural Network class. It supports the evolution of the CNN architecture by adding or removing convolutional layers, changing the number of channels in each layer, changing the kernel size and stride size of each layer, and changing the number of nodes in the fully connected layer.

Parameters:
  • input_shape (List[int]) – Input shape

  • num_outputs (int) – Action dimension

  • channel_size (List[int]) – CNN channel size

  • kernel_size (List[KernelSizeType]) – Convolution kernel size

  • stride_size (List[int]) – Convolution stride size

  • sample_input (Optional[torch.Tensor], optional) – Sample input tensor, defaults to None

  • block_type (Literal["Conv2d", "Conv3d"], optional) – Type of convolutional block, either ‘Conv2d’ or ‘Conv3d’, defaults to ‘Conv2d’

  • activation (str, optional) – CNN activation layer, defaults to ‘ReLU’

  • output_activation (Optional[str], optional) – MLP output activation layer, defaults to None

  • min_hidden_layers (int, optional) – Minimum number of hidden layers the network can shrink down to, defaults to 1

  • max_hidden_layers (int, optional) – Maximum number of hidden layers the network can expand to, defaults to 6

  • min_channel_size (int, optional) – Minimum number of channels a convolutional layer can have, defaults to 32

  • max_channel_size (int, optional) – Maximum number of channels a convolutional layer can have, defaults to 256

  • layer_norm (bool, optional) – Whether to apply layer normalization between layers, defaults to False

  • init_layers (bool, optional) – Initialise network layers, defaults to True

  • device (str, optional) – Device for accelerated computing, ‘cpu’ or ‘cuda’, defaults to ‘cpu’

  • name (str, optional) – Name of the CNN, defaults to ‘cnn’
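To illustrate how kernel_size and stride_size interact with the spatial part of input_shape, the sketch below computes the feature-map size after each convolutional layer in plain Python. It assumes no padding and no dilation, which may differ from what EvolvableCNN actually configures. conv_output_shape and the KernelSize alias are hypothetical names for illustration and are not part of AgileRL.

```python
from typing import List, Sequence, Tuple, Union

# Stand-in for KernelSizeType (assumption: an int or a tuple of ints).
KernelSize = Union[int, Tuple[int, ...]]


def conv_output_shape(
    spatial_shape: Sequence[int],
    kernel_size: List[KernelSize],
    stride_size: List[int],
) -> Tuple[int, ...]:
    """Spatial shape after a stack of unpadded, undilated convolutions."""
    if len(kernel_size) != len(stride_size):
        raise ValueError("kernel_size and stride_size need one entry per layer")
    shape = tuple(spatial_shape)
    for kernel, stride in zip(kernel_size, stride_size):
        # An int kernel applies to every spatial dimension.
        kernels = (kernel,) * len(shape) if isinstance(kernel, int) else tuple(kernel)
        if len(kernels) != len(shape):
            raise ValueError("kernel tuple must match the number of spatial dims")
        # Standard formula with padding=0 and dilation=1: floor((n - k) / s) + 1
        shape = tuple((n - k) // stride + 1 for n, k in zip(shape, kernels))
        if any(n < 1 for n in shape):
            raise ValueError("kernel is larger than the remaining feature map")
    return shape


# An 84x84 image passed through two layers.
print(conv_output_shape((84, 84), kernel_size=[8, 4], stride_size=[4, 2]))  # prints (9, 9)
```

A check like this shows why a mutation that enlarges a kernel can fail on small inputs: the feature map may shrink below one pixel before the last layer.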

property activation: str

Returns the activation function of the network.

add_channel(hidden_layer: int | None = None, numb_new_channels: int | None = None) → Dict[str, int]

Adds channels to a hidden layer of the convolutional neural network.

Parameters:
  • hidden_layer (int, optional) – Depth of hidden layer to add channel to, defaults to None

  • numb_new_channels (int, optional) – Number of channels to add to hidden layer, defaults to None

Returns:

Dictionary containing the hidden layer and number of new channels added

Return type:

dict[str, int]
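The following sketch shows one way a bounded channel mutation of this kind could work on a plain list of channel sizes. It is not AgileRL's implementation: the function name add_channel_sketch, the candidate increments, the dictionary keys, and the rule of skipping a change that would exceed max_channel_size are all assumptions made for illustration.

```python
import random
from typing import Dict, List, Optional


def add_channel_sketch(
    channel_size: List[int],
    max_channel_size: int = 256,
    hidden_layer: Optional[int] = None,
    numb_new_channels: Optional[int] = None,
    rng: Optional[random.Random] = None,
) -> Dict[str, int]:
    """Grow one entry of channel_size in place, respecting an upper bound."""
    rng = rng or random.Random()
    if hidden_layer is None:
        # No layer given: pick one at random.
        hidden_layer = rng.randrange(len(channel_size))
    else:
        # Clamp to the last existing layer.
        hidden_layer = min(hidden_layer, len(channel_size) - 1)
    if numb_new_channels is None:
        # Illustrative increments only (assumption).
        numb_new_channels = rng.choice([8, 16, 32])
    # Apply the change only if the layer stays within the bound (assumption).
    if channel_size[hidden_layer] + numb_new_channels <= max_channel_size:
        channel_size[hidden_layer] += numb_new_channels
    return {"hidden_layer": hidden_layer, "numb_new_channels": numb_new_channels}


sizes = [32, 64]
info = add_channel_sketch(sizes, hidden_layer=1, numb_new_channels=16)
print(sizes, info)
```

Returning the chosen layer and increment lets a caller replay the same mutation on another network, which is one plausible use for the dictionary described above.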

add_layer() → None

Adds a hidden layer to the convolutional neural network.

change_activation(activation: str, output: bool = False) → None

Sets the activation function for the network.

Parameters:
  • activation (str) – Activation function to use.

  • output (bool, optional) – Flag indicating whether to set the output activation function, defaults to False

change_kernel(kernel_size: int | None = None, hidden_layer: int | None = None) → Dict[str, int | None]

Randomly alters the convolution kernel of a random CNN layer.

Parameters:
  • kernel_size (int, optional) – Kernel size to change to, defaults to None

  • hidden_layer (int, optional) – Depth of hidden layer to change kernel size of, defaults to None

Returns:

Dictionary containing the hidden layer and kernel size

Return type:

Dict[str, Union[int, None]]

create_cnn(in_channels: int, channel_size: List[int], kernel_size: List[int | Tuple[int, ...]], stride_size: List[int], sample_input: Tensor) → Sequential

Creates and returns a convolutional neural network.

Parameters:
  • in_channels (int) – The number of input channels.

  • channel_size (List[int]) – A list of integers representing the number of channels in each convolutional layer.

  • kernel_size (List[int | Tuple[int, ...]]) – A list of integers or tuples of integers representing the kernel size of each convolutional layer.

  • stride_size (List[int]) – A list of integers representing the stride size of each convolutional layer.

  • sample_input (torch.Tensor) – A sample input tensor.

  • name (str) – The name of the CNN.

Returns:

The created convolutional neural network.

Return type:

nn.Sequential

forward(x: _SupportsArray[dtype[Any]] | _NestedSequence[_SupportsArray[dtype[Any]]] | bool | int | float | complex | str | bytes | _NestedSequence[bool | int | float | complex | str | bytes] | Tensor) → Tensor

Returns the output of the neural network.

Parameters:

x (array-like or torch.Tensor) – Neural network input

Returns:

Output of the neural network

Return type:

torch.Tensor

get_output_dense() → Module

Returns the output layer of the neural network.

init_weights_gaussian(std_coeff: float = 4) → None

Initialises the weights of the linear layer using a Gaussian distribution.

property kernel_size: List[int | Tuple[int, ...]]

Returns the kernel size of the network.

recreate_network(shrink_params: bool = False) → None

Recreates the neural network.

Parameters:

shrink_params (bool, optional) – Flag indicating whether to shrink the parameters, defaults to False

remove_channel(hidden_layer: int | None = None, numb_new_channels: int | None = None) → Dict[str, int]

Removes channels from a hidden layer of the convolutional neural network.

Parameters:
  • hidden_layer (int, optional) – Depth of hidden layer to remove channels from, defaults to None

  • numb_new_channels (int, optional) – Number of channels to remove from hidden layer, defaults to None

Returns:

Dictionary containing the hidden layer and number of channels removed

Return type:

Dict[str, int]

remove_layer() → None

Removes a hidden layer from the convolutional neural network.

reset_noise() → None

Resets the noise of the model layers.

static shrink_preserve_parameters(old_net: Module, new_net: Module) → Module

Returns the shrunk new neural network with parameters copied from the old network.

Parameters:
  • old_net (nn.Module) – Old neural network

  • new_net (nn.Module) – New neural network

Returns:

Shrunk new neural network with copied parameters

Return type:

nn.Module
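As a rough illustration of what preserving parameters while shrinking can mean, the sketch below copies the overlapping slice of each same-named parameter from an old PyTorch module into a smaller new one. It is a minimal sketch under stated assumptions, not the library's implementation: copy_overlapping_parameters is a hypothetical helper, and matching parameters by name and copying the leading slice of each dimension is an assumed strategy.

```python
import torch
from torch import nn


def copy_overlapping_parameters(old_net: nn.Module, new_net: nn.Module) -> nn.Module:
    """Copy the overlapping region of same-named parameters into new_net."""
    old_params = dict(old_net.named_parameters())
    with torch.no_grad():
        for name, new_param in new_net.named_parameters():
            old_param = old_params.get(name)
            # Skip parameters with no counterpart or a different rank.
            if old_param is None or old_param.dim() != new_param.dim():
                continue
            # Leading slice along every dimension that both tensors share.
            overlap = tuple(
                slice(0, min(o, n)) for o, n in zip(old_param.shape, new_param.shape)
            )
            new_param[overlap].copy_(old_param[overlap])
    return new_net


# Shrink the hidden width from 8 to 6 while keeping the learned weights that fit.
old = nn.Sequential(nn.Linear(4, 8), nn.Linear(8, 2))
new = nn.Sequential(nn.Linear(4, 6), nn.Linear(6, 2))
new = copy_overlapping_parameters(old, new)
```

Entries of the new network that fall outside the overlap keep their fresh initialisation, so the same helper would also work when the new network is larger in some dimension.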