Evolvable Convolutional Neural Network (CNN)
Parameters
- class agilerl.modules.cnn.EvolvableCNN(*args: Any, **kwargs: Any)
The Evolvable Convolutional Neural Network class. Consists of a sequence of convolutional layers with an optional activation function between layers, and supports layer normalization. Allows the following types of architecture mutation during training:
Adding or removing convolutional layers
Adding or removing channels from convolutional layers
Changing the kernel size and stride size of convolutional layers
Changing the activation function between layers (e.g. ReLU to GELU)
Changing the activation function for the output layer (e.g. ReLU to GELU)
- Parameters:
num_outputs (int) – Action dimension
kernel_size (list[KernelSizeType]) – Convolution kernel size
sample_input (torch.Tensor | None, optional) – Sample input tensor, defaults to None
block_type (Literal["Conv1d", "Conv2d", "Conv3d"], optional) – Type of convolutional block, either ‘Conv1d’, ‘Conv2d’ or ‘Conv3d’, defaults to ‘Conv2d’.
activation (str, optional) – CNN activation layer, defaults to ‘ReLU’
output_activation (str | None, optional) – Output activation layer, defaults to None
min_hidden_layers (int, optional) – Minimum number of hidden layers the network can shrink down to, defaults to 1
max_hidden_layers (int, optional) – Maximum number of hidden layers the network can expand to, defaults to 6
min_channel_size (int, optional) – Minimum number of channels a convolutional layer can have, defaults to 32
max_channel_size (int, optional) – Maximum number of channels a convolutional layer can have, defaults to 256
layer_norm (bool, optional) – Normalization between layers, defaults to False
init_layers (bool, optional) – Initialise network layers, defaults to True
device (str, optional) – Device for accelerated computing, ‘cpu’ or ‘cuda’, defaults to ‘cpu’
name (str, optional) – Name of the CNN, defaults to ‘cnn’
random_seed (int | None) – Random seed to use for the network. Defaults to None.
- property activation: str
Return the activation function of the network.
- Returns:
Activation function
- Return type:
str
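- add_channel(hidden_layer: int | None = None, numb_new_channels: int | None = None) dict[str, int]
Add channels to a hidden layer of the convolutional neural network.
- Parameters:
hidden_layer (int | None, optional) – Index of the hidden layer to add channels to, defaults to None
numb_new_channels (int | None, optional) – Number of channels to add, defaults to None
- Returns:
Dictionary containing the hidden layer and number of new channels added
- Return type:
dict[str, int]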
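As a rough illustration of the return contract described above, the following framework-free sketch grows one layer's channel count and reports what changed. The helper `add_channel_sketch`, the dictionary keys, the random step sizes, and the clamping against `max_channel_size` are all assumptions made for illustration; they are not taken from the library's implementation.

```python
import random


def add_channel_sketch(channel_size, max_channel_size=256,
                       hidden_layer=None, numb_new_channels=None, rng=None):
    """Hypothetical stand-in for a channel-adding mutation (not AgileRL code)."""
    rng = rng or random.Random()
    # Assumption: when no layer is given, one is picked at random.
    if hidden_layer is None:
        hidden_layer = rng.randrange(len(channel_size))
    # Assumption: when no count is given, a small step is picked at random.
    if numb_new_channels is None:
        numb_new_channels = rng.choice([8, 16, 32])
    # Only grow the layer if the result stays within the configured maximum.
    if channel_size[hidden_layer] + numb_new_channels <= max_channel_size:
        channel_size[hidden_layer] += numb_new_channels
    # Report which layer was targeted and the requested change.
    return {"hidden_layer": hidden_layer, "numb_new_channels": numb_new_channels}


channels = [32, 64]
info = add_channel_sketch(channels, hidden_layer=1, numb_new_channels=16)
```

Returning the chosen layer and step lets a caller replay the same mutation on another network, which is one plausible reason for the dictionary return type.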
- add_layer() None
Add a hidden layer to the convolutional neural network.
- Returns:
If the maximum number of hidden layers is reached, returns a dictionary containing the hidden layer and number of new channels.
- Return type:
dict[str, int] | None
- change_activation(activation: str, output: bool = False) None
Set the activation function for the network.
- change_kernel(kernel_size: int | None = None, hidden_layer: int | None = None) dict[str, int | None]
Randomly alter the convolution kernel of a random CNN layer.
- create_cnn(in_channels: int, channel_size: list[int], kernel_size: list[int | tuple[int, ...]], stride_size: list[int], sample_input: Tensor) Sequential
Create and return a convolutional neural network.
- Parameters:
in_channels (int) – The number of input channels.
channel_size (list[int]) – A list of integers representing the number of channels in each convolutional layer.
kernel_size (list[int | tuple[int, ...]]) – A list of kernel sizes, one per convolutional layer.
stride_size (list[int]) – A list of integers representing the stride size of each convolutional layer.
sample_input (torch.Tensor) – A sample input tensor.
name (str) – The name of the CNN.
- Returns:
The created convolutional neural network.
- Return type:
nn.Sequential
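The `sample_input` argument hints that the shape of the convolutional output depends on every layer's kernel size and stride. A minimal sketch of that arithmetic for one spatial dimension follows, using the standard convolution output-length formula. The helper names are hypothetical, and the default of no padding and no dilation is an assumption that may not match the layers the library builds.

```python
def conv_output_length(length, kernel_size, stride, padding=0, dilation=1):
    """Output length of one convolution along a single spatial dimension."""
    return (length + 2 * padding - dilation * (kernel_size - 1) - 1) // stride + 1


def stacked_output_length(length, kernel_sizes, strides):
    """Apply the formula layer by layer, failing if a layer no longer fits."""
    for k, s in zip(kernel_sizes, strides):
        length = conv_output_length(length, k, s)
        if length < 1:
            raise ValueError(f"kernel {k} with stride {s} leaves no output")
    return length


# An 84-pixel dimension passed through three layers: 84 -> 20 -> 9 -> 7.
final_length = stacked_output_length(84, kernel_sizes=[8, 4, 3], strides=[4, 2, 1])
```

A check like this shows why kernel and stride mutations cannot be chosen freely: a kernel larger than the remaining feature map leaves no valid output.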
- forward(x: ndarray | Tensor) Tensor
Return output of neural network.
- Parameters:
x (torch.Tensor or np.ndarray) – Neural network input
- Returns:
Output of the neural network
- Return type:
torch.Tensor
- get_output_dense() Module
Return output layer of neural network.
- Returns:
Output layer of neural network
- Return type:
torch.nn.Module
- init_weights_gaussian(std_coeff: float = 4) None
Initialise weights of the linear layer using a Gaussian distribution.
- Parameters:
std_coeff (float, optional) – Standard deviation coefficient, defaults to 4
- property kernel_size: list[int | tuple[int, ...]]
Return the kernel size of the network.
- Returns:
Kernel size
- Return type:
list[KernelSizeType]
- recreate_network(shrink_params: bool = False) None
Recreate the neural network while preserving the parameters of the old network.
- Parameters:
shrink_params (bool, optional) – Flag indicating whether to shrink the parameters, defaults to False
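- remove_channel(hidden_layer: int | None = None, numb_new_channels: int | None = None) dict[str, int]
Remove channels from a hidden layer of the convolutional neural network.
- Parameters:
hidden_layer (int | None, optional) – Index of the hidden layer to remove channels from, defaults to None
numb_new_channels (int | None, optional) – Number of channels to remove, defaults to None
- Returns:
Dictionary containing the hidden layer and number of new channels
- Return type:
dict[str, int]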
- remove_layer() dict[str, int] | None
Remove a hidden layer from the convolutional neural network.
- Returns:
If the minimum number of hidden layers is reached, returns a dictionary containing the hidden layer and number of new channels.
- Return type:
dict[str, int] | None
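The return descriptions of `add_layer` and `remove_layer` suggest that, once a layer limit is hit, a channel change is reported instead of a layer change. The sketch below shows one way such a fallback could work on a plain list of channel sizes. The function names, the dictionary keys, the choice to widen the last layer, and the fixed `fallback_channels` step are hypothetical and are not taken from the library.

```python
def add_layer_sketch(channel_size, max_hidden_layers=6, fallback_channels=16):
    """Hypothetical layer-adding mutation (illustrative only, not AgileRL code)."""
    if len(channel_size) < max_hidden_layers:
        # Assumption: the new layer copies the width of the current last layer.
        channel_size.append(channel_size[-1])
        return None
    # Limit reached. Assumption: widen the last layer instead and report it.
    last = len(channel_size) - 1
    channel_size[last] += fallback_channels
    return {"hidden_layer": last, "numb_new_channels": fallback_channels}


def remove_layer_sketch(channel_size, min_hidden_layers=1, fallback_channels=16):
    """Hypothetical layer-removing mutation with the same fallback idea."""
    if len(channel_size) > min_hidden_layers:
        channel_size.pop()  # drop the last hidden layer
        return None
    # Limit reached. Assumption: widen the only remaining layer instead.
    last = len(channel_size) - 1
    channel_size[last] += fallback_channels
    return {"hidden_layer": last, "numb_new_channels": fallback_channels}


layers = [32, 32]
first = remove_layer_sketch(layers)   # removes a layer, returns None
second = remove_layer_sketch(layers)  # at the minimum, falls back to channels
```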
- static shrink_preserve_parameters(old_net: Module, new_net: Module) Module
Return shrunk new neural network with copied parameters from old network.
- Parameters:
old_net (nn.Module) – Old neural network
new_net (nn.Module) – New neural network
- Returns:
Shrunk new neural network with copied parameters
- Return type:
nn.Module
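To make the idea of copying parameters into a smaller network concrete, here is a framework-free sketch that uses nested lists in place of tensors. It assumes that "shrinking" means each parameter in the new network keeps the leading block of the same-named parameter in the old network. The function `shrink_preserve_sketch` and the parameter names are hypothetical and do not reproduce the library's code.

```python
def shrink_preserve_sketch(old_params, new_params):
    """Copy the overlapping top-left block of each old matrix into the new one.

    Both arguments map parameter names to 2-D lists of floats.
    """
    for name, new_matrix in new_params.items():
        old_matrix = old_params.get(name)
        if old_matrix is None:
            continue  # parameter exists only in the new network; keep its values
        rows = min(len(old_matrix), len(new_matrix))
        for r in range(rows):
            cols = min(len(old_matrix[r]), len(new_matrix[r]))
            # Overwrite only the overlapping columns, preserving row length.
            new_matrix[r][:cols] = old_matrix[r][:cols]
    return new_params


old = {"conv.weight": [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0], [7.0, 8.0, 9.0]]}
new = {"conv.weight": [[0.0, 0.0], [0.0, 0.0]], "extra.bias": [[0.5]]}
shrunk = shrink_preserve_sketch(old, new)
```

With real tensors the same idea would be expressed with slicing on each dimension, but the overlap logic stays the same.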