Evolvable Convolutional Neural Network (CNN)

Parameters

class agilerl.modules.cnn.EvolvableCNN(*args: Any, **kwargs: Any)

The Evolvable Convolutional Neural Network class. Consists of a sequence of convolutional layers with an optional activation function between layers, and supports layer normalization. Allows for the following types of architecture mutations during training:

  • Adding or removing convolutional layers

  • Adding or removing channels from convolutional layers

  • Changing the kernel size and stride size of convolutional layers

  • Changing the activation function between layers (e.g. ReLU to GELU)

  • Changing the activation function for the output layer (e.g. ReLU to GELU)
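The mutation types listed above can be pictured as small edits to a plain description of the architecture. The following is a minimal sketch under stated assumptions: `ArchSpec` and its methods are hypothetical illustrations written for this page, not AgileRL code, and the real class also rebuilds the underlying network after each change.

```python
from dataclasses import dataclass, field


@dataclass
class ArchSpec:
    """Hypothetical stand-in for the architecture hyperparameters (not AgileRL code)."""

    channel_size: list = field(default_factory=lambda: [32, 32])
    kernel_size: list = field(default_factory=lambda: [3, 3])
    stride_size: list = field(default_factory=lambda: [1, 1])
    activation: str = "ReLU"
    min_hidden_layers: int = 1
    max_hidden_layers: int = 6

    def add_layer(self) -> bool:
        """Append a layer copying the last layer's settings; False if at the maximum."""
        if len(self.channel_size) >= self.max_hidden_layers:
            return False
        for values in (self.channel_size, self.kernel_size, self.stride_size):
            values.append(values[-1])
        return True

    def remove_layer(self) -> bool:
        """Drop the last layer; False if already at the minimum."""
        if len(self.channel_size) <= self.min_hidden_layers:
            return False
        for values in (self.channel_size, self.kernel_size, self.stride_size):
            values.pop()
        return True


spec = ArchSpec()
spec.add_layer()            # layer mutation
spec.channel_size[0] += 16  # channel mutation
spec.kernel_size[1] = 5     # kernel mutation
spec.activation = "GELU"    # activation mutation
print(spec)
```

Keeping the three per-layer lists the same length is the invariant every mutation in this sketch has to preserve.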

Parameters:
  • input_shape (list[int]) – Input shape

  • num_outputs (int) – Action dimension

  • channel_size (list[int]) – CNN channel size

  • kernel_size (list[KernelSizeType]) – Convolution kernel size

  • stride_size (list[int]) – Convolution stride size

  • sample_input (torch.Tensor | None, optional) – Sample input tensor, defaults to None

  • block_type (Literal["Conv1d", "Conv2d", "Conv3d"], optional) – Type of convolutional block, either ‘Conv1d’, ‘Conv2d’ or ‘Conv3d’, defaults to ‘Conv2d’.

  • activation (str, optional) – CNN activation layer, defaults to ‘ReLU’

  • output_activation (str | None, optional) – MLP output activation layer, defaults to None

  • min_hidden_layers (int, optional) – Minimum number of hidden layers the network can shrink down to, defaults to 1

  • max_hidden_layers (int, optional) – Maximum number of hidden layers the network can expand to, defaults to 6

  • min_channel_size (int, optional) – Minimum number of channels a convolutional layer can have, defaults to 32

  • max_channel_size (int, optional) – Maximum number of channels a convolutional layer can have, defaults to 256

  • layer_norm (bool, optional) – Normalization between layers, defaults to False

  • init_layers (bool, optional) – Initialise network layers, defaults to True

  • device (str, optional) – Device for accelerated computing, ‘cpu’ or ‘cuda’, defaults to ‘cpu’

  • name (str, optional) – Name of the CNN, defaults to ‘cnn’

  • random_seed (int | None) – Random seed to use for the network. Defaults to None.
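The `kernel_size` and `stride_size` lists determine how quickly the spatial size of the input shrinks from layer to layer. As a rough illustration, the sketch below applies the standard convolution output-size formula. It assumes no padding and a dilation of 1, which this page does not state, and the helper names are hypothetical.

```python
def conv_output_size(size: int, kernel: int, stride: int) -> int:
    """Spatial size after one convolution (assumes no padding, dilation 1)."""
    if kernel > size:
        raise ValueError("kernel is larger than the input")
    return (size - kernel) // stride + 1


def feature_map_sizes(input_size: int, kernel_size: list, stride_size: list) -> list:
    """Spatial size after each layer, for one spatial dimension."""
    sizes = []
    for kernel, stride in zip(kernel_size, stride_size):
        input_size = conv_output_size(input_size, kernel, stride)
        sizes.append(input_size)
    return sizes


# An 84-pixel-wide input passed through three layers.
print(feature_map_sizes(84, [8, 4, 3], [4, 2, 1]))  # → [20, 9, 7]
```

A check like this is useful before adding layers or enlarging kernels, since a kernel larger than the remaining feature map is invalid.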

property activation: str

Return the activation function of the network.

Returns:

Activation function

Return type:

str

add_channel(hidden_layer: int | None = None, numb_new_channels: int | None = None) → dict[str, int]

Add channel to hidden layer of convolutional neural network.

Parameters:
  • hidden_layer (int, optional) – Depth of hidden layer to add channel to, defaults to None

  • numb_new_channels (int, optional) – Number of channels to add to hidden layer, defaults to None

Returns:

Dictionary containing the hidden layer and number of new channels added

Return type:

dict[str, int]
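A minimal sketch of a bounded channel mutation, written as a standalone function rather than the library method. The candidate increments, the dictionary keys, and the behaviour at the upper limit are assumptions made for illustration; only the parameter names come from the signature above.

```python
import random


def add_channel(channel_size, max_channel_size=256, hidden_layer=None,
                numb_new_channels=None, rng=None):
    """Grow one layer's channel count in place, respecting an upper bound."""
    rng = rng or random.Random()
    if hidden_layer is None:
        # No layer given: pick one at random.
        hidden_layer = rng.randrange(len(channel_size))
    else:
        # Clamp to the deepest existing layer.
        hidden_layer = min(hidden_layer, len(channel_size) - 1)
    if numb_new_channels is None:
        numb_new_channels = rng.choice([8, 16, 32])  # assumed candidate steps
    if channel_size[hidden_layer] + numb_new_channels <= max_channel_size:
        channel_size[hidden_layer] += numb_new_channels
    return {"hidden_layer": hidden_layer, "numb_new_channels": numb_new_channels}


channels = [32, 32]
info = add_channel(channels, hidden_layer=1, numb_new_channels=16)
print(channels, info)
```

Returning the chosen layer and increment lets a caller replay the same mutation on another network with the same architecture.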

add_layer() → None

Add a hidden layer to convolutional neural network.

Returns:

If the maximum number of hidden layers is reached, returns a dictionary containing the hidden layer and number of new channels.

Return type:

dict[str, int] | None

change_activation(activation: str, output: bool = False) → None

Set the activation function for the network.

Parameters:
  • activation (str) – Activation function to use.

  • output (bool, optional) – Flag indicating whether to set the output activation function, defaults to False
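Because activations are passed by name, switching one amounts to replacing a string and rebuilding the layers that use it. The sketch below shows that idea with a hypothetical name-to-function registry and a hypothetical `TinyNet` class. Neither is part of AgileRL, and the set of names the library accepts is not listed on this page.

```python
import math

# Hypothetical registry mapping activation names to scalar functions.
ACTIVATIONS = {
    "ReLU": lambda x: max(0.0, x),
    "Tanh": math.tanh,
    "Sigmoid": lambda x: 1.0 / (1.0 + math.exp(-x)),
    "GELU": lambda x: 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0))),
}


class TinyNet:
    """Illustrative holder for the two activation settings."""

    def __init__(self, activation="ReLU", output_activation=None):
        self.activation = activation
        self.output_activation = output_activation

    def change_activation(self, activation, output=False):
        """Set the hidden activation, or the output activation when output=True."""
        if activation not in ACTIVATIONS:
            raise ValueError(f"unknown activation: {activation}")
        if output:
            self.output_activation = activation
        else:
            self.activation = activation


net = TinyNet()
net.change_activation("GELU")               # hidden layers
net.change_activation("Tanh", output=True)  # output layer only
print(net.activation, net.output_activation)
```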

change_kernel(kernel_size: int | None = None, hidden_layer: int | None = None) → dict[str, int | None]

Alter the convolution kernel of a CNN layer, choosing the layer and kernel size at random when they are not given.

Parameters:
  • kernel_size (int, optional) – Kernel size to change to, defaults to None

  • hidden_layer (int, optional) – Depth of hidden layer to change kernel size of, defaults to None

Returns:

Dictionary containing the hidden layer and kernel size

Return type:

dict[str, int | None]

create_cnn(in_channels: int, channel_size: list[int], kernel_size: list[int | tuple[int, ...]], stride_size: list[int], sample_input: Tensor) → Sequential

Create and return a convolutional neural network.

Parameters:
  • in_channels (int) – The number of input channels.

  • channel_size (list[int]) – A list of integers representing the number of channels in each convolutional layer.

  • kernel_size (list[int | tuple[int, ...]]) – A list of kernel sizes, one per convolutional layer, each given as an integer or a tuple of integers.

  • stride_size (list[int]) – A list of integers representing the stride size of each convolutional layer.

  • sample_input (torch.Tensor) – A sample input tensor.

  • name (str) – The name of the CNN.

Returns:

The created convolutional neural network.

Return type:

nn.Sequential

forward(x: ndarray | Tensor) → Tensor

Return output of neural network.

Parameters:

x (torch.Tensor or np.ndarray) – Neural network input

Returns:

Output of the neural network

Return type:

torch.Tensor

get_output_dense() → Module

Return output layer of neural network.

Returns:

Output layer of neural network

Return type:

torch.nn.Module

init_weights_gaussian(std_coeff: float = 4) → None

Initialise weights of linear layer using Gaussian distribution.

Parameters:

std_coeff (float, optional) – Standard deviation coefficient, defaults to 4

property kernel_size: list[int | tuple[int, ...]]

Return the kernel size of the network.

Returns:

Kernel size

Return type:

list[KernelSizeType]

recreate_network(shrink_params: bool = False) → None

Recreates the neural network while preserving the parameters of the old network.

Parameters:

shrink_params (bool, optional) – Flag indicating whether to shrink the parameters, defaults to False

remove_channel(hidden_layer: int | None = None, numb_new_channels: int | None = None) → dict[str, int]

Remove channel from hidden layer of convolutional neural network.

Parameters:
  • hidden_layer (int, optional) – Depth of hidden layer to remove channel from, defaults to None

  • numb_new_channels (int, optional) – Number of channels to remove from hidden layer, defaults to None

Returns:

Dictionary containing the hidden layer and number of new channels

Return type:

dict[str, int | None]

remove_layer() → dict[str, int] | None

Remove a hidden layer from convolutional neural network.

Returns:

If the minimum number of hidden layers is reached, returns a dictionary containing the hidden layer and number of new channels.

Return type:

dict[str, int] | None

reset_noise() → None

Reset noise of the model layers.

static shrink_preserve_parameters(old_net: Module, new_net: Module) → Module

Return shrunk new neural network with copied parameters from old network.

Parameters:
  • old_net (nn.Module) – Old neural network

  • new_net (nn.Module) – New neural network

Returns:

Shrunk new neural network with copied parameters

Return type:

nn.Module
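One way to picture parameter preservation across a resize is to copy the overlapping block of each old weight array into the new one. The sketch below does this for plain nested lists. It is an illustration under that assumption, with a hypothetical helper name, and does not reproduce how the library handles tensors, biases, or normalization layers.

```python
def copy_overlap(old, new):
    """Copy the overlapping top-left block of `old` into `new` (lists of rows)."""
    rows = min(len(old), len(new))
    cols = min(len(old[0]), len(new[0]))
    for r in range(rows):
        new[r][:cols] = old[r][:cols]
    return new


old_weights = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]

# Shrinking: the new 2x2 matrix keeps the top-left block of the old 3x3 one.
print(copy_overlap(old_weights, [[0, 0], [0, 0]]))  # → [[1, 2], [4, 5]]

# Growing: old values are kept and the extra entries stay at their fresh values.
print(copy_overlap(old_weights, [[0] * 4 for _ in range(4)]))
```

The same slicing works in both directions, so what a mutation has learned so far carries over wherever the old and new shapes overlap.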