Evolvable Convolutional Neural Network (CNN)¶

Parameters¶

class agilerl.modules.cnn.EvolvableCNN(*args: Any, **kwargs: Any)¶

The Evolvable Convolutional Neural Network class. Consists of a sequence of convolutional layers with an optional activation function between each layer. Supports using layer normalization. Allows for the following types of architecture mutations during training:

Adding or removing convolutional layers
Adding or removing channels from convolutional layers
Changing the kernel size and stride size of convolutional layers
Changing the activation function between layers (e.g. ReLU to GELU)
Changing the activation function for the output layer (e.g. ReLU to GELU)

Parameters:

input_shape (list[int]) – Input shape
num_outputs (int) – Action dimension
channel_size (list[int]) – CNN channel size
kernel_size (list[KernelSizeType]) – Convolution kernel size
stride_size (list[int]) – Convolution stride size
sample_input (torch.Tensor | None, optional) – Sample input tensor, defaults to None
block_type (Literal["Conv1d", "Conv2d", "Conv3d"], optional) – Type of convolutional block, either ‘Conv1d’, ‘Conv2d’ or ‘Conv3d’, defaults to ‘Conv2d’.
activation (str, optional) – CNN activation layer, defaults to ‘ReLU’
output_activation (str | None, optional) – Output activation layer, defaults to None
min_hidden_layers (int, optional) – Minimum number of hidden layers the network can shrink down to, defaults to 1
max_hidden_layers (int, optional) – Maximum number of hidden layers the network can expand to, defaults to 6
min_channel_size (int, optional) – Minimum number of channels a convolutional layer can have, defaults to 32
max_channel_size (int, optional) – Maximum number of channels a convolutional layer can have, defaults to 256
layer_norm (bool, optional) – Normalization between layers, defaults to False
init_layers (bool, optional) – Initialise network layers, defaults to True
device (str, optional) – Device for accelerated computing, ‘cpu’ or ‘cuda’, defaults to ‘cpu’
name (str, optional) – Name of the CNN, defaults to ‘cnn’
random_seed (int | None) – Random seed to use for the network. Defaults to None.
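
The way `input_shape`, `kernel_size` and `stride_size` interact can be checked by hand before building a network. The sketch below is not part of AgileRL: the helper name `conv_output_sizes` is hypothetical, and it assumes unpadded convolutions with dilation 1 and a single integer kernel size per layer.

```python
def conv_output_sizes(
    input_size: int, kernel_size: list[int], stride_size: list[int]
) -> list[int]:
    """Spatial size after each convolution, assuming no padding and dilation 1."""
    if len(kernel_size) != len(stride_size):
        raise ValueError("kernel_size and stride_size must have the same length")
    sizes = []
    size = input_size
    for k, s in zip(kernel_size, stride_size):
        if k > size:
            raise ValueError(f"kernel {k} is larger than the feature map ({size})")
        # Standard output-size formula for an unpadded convolution
        size = (size - k) // s + 1
        sizes.append(size)
    return sizes


# An 84-pixel-wide input passed through three convolutional layers
sizes = conv_output_sizes(84, kernel_size=[8, 4, 3], stride_size=[4, 2, 1])
print(sizes)  # [20, 9, 7]
```

A configuration that raises here would also leave a later layer with a feature map smaller than its kernel, so it is worth checking before passing the lists to the constructor.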

property activation: str¶

Return the activation function of the network.

Returns: Activation function
Return type: str

add_channel(hidden_layer: int | None = None, numb_new_channels: int | None = None) → dict[str, int]¶

Add channel to hidden layer of convolutional neural network.

Parameters:

hidden_layer (int, optional) – Depth of hidden layer to add channel to, defaults to None
numb_new_channels (int, optional) – Number of channels to add to hidden layer, defaults to None

Returns:

Dictionary containing the hidden layer and number of new channels added

Return type:

dict[str, int]

add_layer() → None¶

Add a hidden layer to convolutional neural network.

Returns:

If the maximum number of hidden layers has been reached, a dictionary containing the hidden layer and number of new channels.

Return type:

dict[str, int] | None

change_activation(activation: str, output: bool = False) → None¶

Set the activation function for the network.

Parameters:

activation (str) – Activation function to use.
output (bool, optional) – Flag indicating whether to set the output activation function, defaults to False

change_kernel(kernel_size: int | None = None, hidden_layer: int | None = None) → dict[str, int | None]¶

Randomly alters convolution kernel of random CNN layer.

Parameters:

kernel_size (int, optional) – Kernel size to change to, defaults to None
hidden_layer (int, optional) – Depth of hidden layer to change kernel size of, defaults to None

Returns:

Dictionary containing the hidden layer and kernel size

Return type:

dict[str, int | None]

create_cnn(in_channels: int, channel_size: list[int], kernel_size: list[int | tuple[int, ...]], stride_size: list[int], sample_input: Tensor) → Sequential¶

Create and return a convolutional neural network.

Parameters:

in_channels (int) – The number of input channels.
channel_size (list[int]) – A list of integers representing the number of channels in each convolutional layer.
kernel_size (list[int]) – A list of integers representing the kernel size of each convolutional layer.
stride_size (list[int]) – A list of integers representing the stride size of each convolutional layer.
sample_input (torch.Tensor) – A sample input tensor.
name (str) – The name of the CNN.

Returns:

The created convolutional neural network.

Return type:

nn.Sequential

forward(x: ndarray | Tensor) → Tensor¶

Return output of neural network.

Parameters: x (torch.Tensor or np.ndarray) – Neural network input
Returns: Output of the neural network
Return type: torch.Tensor

get_output_dense() → Module¶

Return output layer of neural network.

Returns: Output layer of neural network
Return type: torch.nn.Module

init_weights_gaussian(std_coeff: float = 4) → None¶

Initialise weights of linear layer using Gaussian distribution.

Parameters: std_coeff (float, optional) – Standard deviation coefficient, defaults to 4

property kernel_size: list[int | tuple[int, ...]]¶

Return the kernel size of the network.

Returns: Kernel size
Return type: list[KernelSizeType]

recreate_network(shrink_params: bool = False) → None¶

Recreates the neural network while preserving the parameters of the old network.

Parameters: shrink_params (bool, optional) – Flag indicating whether to shrink the parameters, defaults to False

remove_channel(hidden_layer: int | None = None, numb_new_channels: int | None = None) → dict[str, int]¶

Remove channel from hidden layer of convolutional neural network.

Parameters:

hidden_layer (int, optional) – Depth of hidden layer to remove channel from, defaults to None
numb_new_channels (int, optional) – Number of channels to remove from hidden layer, defaults to None

Returns:

Dictionary containing the hidden layer and number of channels removed

Return type:

dict[str, int | None]

remove_layer() → dict[str, int] | None¶

Remove a hidden layer from convolutional neural network.

Returns:

If the minimum number of hidden layers has been reached, a dictionary containing the hidden layer and number of new channels.

Return type:

dict[str, int] | None

reset_noise() → None¶

Reset noise of the model layers.

static shrink_preserve_parameters(old_net: Module, new_net: Module) → Module¶

Return shrunk new neural network with copied parameters from old network.

Parameters:

old_net (nn.Module) – Old neural network
new_net (nn.Module) – New neural network

Returns:

Shrunk new neural network with copied parameters

Return type:

nn.Module
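
Preserving parameters across an architecture change generally comes down to copying the region that the old and new weights have in common. The sketch below illustrates that idea on a single 2-D weight matrix, using nested lists as a stand-in for tensors. The helper `copy_overlap` is hypothetical and is not how AgileRL is confirmed to implement it.

```python
def copy_overlap(old: list[list[float]], new: list[list[float]]) -> list[list[float]]:
    """Copy the overlapping top-left block of `old` into `new`, in place."""
    rows = min(len(old), len(new))
    cols = min(len(old[0]), len(new[0]))
    for r in range(rows):
        for c in range(cols):
            new[r][c] = old[r][c]
    return new


old_weights = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0], [7.0, 8.0, 9.0]]
shrunk = copy_overlap(old_weights, [[0.0, 0.0], [0.0, 0.0]])
print(shrunk)  # [[1.0, 2.0], [4.0, 5.0]]
```

The same slice-and-copy works when the new matrix is larger: the copied block fills the top-left corner and the remaining entries keep their fresh initial values.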