Evolvable Multi-layer Perceptron (MLP)
Parameters
- class agilerl.modules.mlp.EvolvableMLP(*args: Any, **kwargs: Any)
The Evolvable Multi-layer Perceptron class. It consists of a sequence of fully connected linear layers with an optional activation function between consecutive layers. It supports layer normalization, noisy linear layers, and scaling down ("vanishing") the weights of the output layer. The following architecture mutations are available during training:
Adding or removing hidden layers
Adding or removing nodes from hidden layers
Changing the activation function between layers (e.g. ReLU to GELU)
Changing the activation function for the output layer (e.g. ReLU to GELU)
- Parameters:
num_inputs (int) – Input layer dimension
num_outputs (int) – Output layer dimension
activation (str, optional) – Activation layer, defaults to ‘ReLU’
output_activation (str, optional) – Output activation layer, defaults to None
min_hidden_layers (int, optional) – Minimum number of hidden layers the network will shrink down to, defaults to 1
max_hidden_layers (int, optional) – Maximum number of hidden layers the network will expand to, defaults to 3
min_mlp_nodes (int, optional) – Minimum number of nodes a layer can have within the network, defaults to 64
max_mlp_nodes (int, optional) – Maximum number of nodes a layer can have within the network, defaults to 500
layer_norm (bool, optional) – Normalization between layers, defaults to True
output_layernorm (bool, optional) – Normalization for the output layer, defaults to False
output_vanish (bool, optional) – Vanish the output by multiplying the output-layer weights by 0.1, defaults to True
init_layers (bool, optional) – Initialise network layers, defaults to True
noise_std (float, optional) – Noise standard deviation, defaults to 0.5
noisy (bool, optional) – Use noisy linear layers in the network, defaults to False
new_gelu (bool, optional) – Use new GELU activation function, defaults to False
device (str, optional) – Device for accelerated computing, ‘cpu’ or ‘cuda’, defaults to ‘cpu’
name (str, optional) – Name of the network, defaults to ‘mlp’
random_seed (int | None, optional) – Random seed to use for the network, defaults to None
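As a rough illustration of what output_vanish describes, the sketch below scales a weight matrix by 0.1. It is not the library's implementation: the helper name vanish_output and the list-of-lists weight layout are assumptions made for this example only.

```python
def vanish_output(weights, factor=0.1):
    """Return a copy of `weights` with every entry multiplied by `factor`.

    `weights` is a list of rows (hypothetical layout, not AgileRL's tensors).
    """
    return [[w * factor for w in row] for row in weights]


# Hypothetical output-layer weights with 2 outputs and 3 inputs.
output_weights = [[1.0, -2.0, 0.5], [4.0, 0.0, -1.0]]
scaled = vanish_output(output_weights)
print(scaled)
```

Smaller output-layer weights keep the initial network outputs close to zero, which is the usual motivation for this kind of scaling.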
- add_layer() → dict[str, int] | None
Add a hidden layer to the neural network. Falls back on add_node() if max_hidden_layers is reached.
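The fallback behaviour can be sketched on a plain list of hidden layer sizes. This is a minimal illustration, not AgileRL's code: the standalone functions, the tuple return value, the choice to copy the last layer's width, and the candidate step sizes (16, 32, 64) are all assumptions.

```python
import random


def add_node(hidden_size, max_mlp_nodes, hidden_layer=None, numb_new_nodes=None):
    """Widen one hidden layer without exceeding max_mlp_nodes."""
    if hidden_layer is None:
        hidden_layer = random.randrange(len(hidden_size))
    if numb_new_nodes is None:
        numb_new_nodes = random.choice([16, 32, 64])  # assumed step sizes
    sizes = list(hidden_size)
    sizes[hidden_layer] = min(sizes[hidden_layer] + numb_new_nodes, max_mlp_nodes)
    return sizes, {"hidden_layer": hidden_layer, "numb_new_nodes": numb_new_nodes}


def add_layer(hidden_size, max_hidden_layers, max_mlp_nodes):
    """Append a hidden layer, or widen an existing one if the depth limit is hit."""
    if len(hidden_size) < max_hidden_layers:
        # Assumption: the new layer copies the width of the current last layer.
        return list(hidden_size) + [hidden_size[-1]], None
    return add_node(hidden_size, max_mlp_nodes)


print(add_layer([64, 64], max_hidden_layers=3, max_mlp_nodes=500))
print(add_layer([64, 64, 64], max_hidden_layers=3, max_mlp_nodes=500))
```

The first call grows the depth. The second is already at the depth limit, so it widens a randomly chosen layer instead.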
- add_node(hidden_layer: int | None = None, numb_new_nodes: int | None = None) → dict[str, int]
Add nodes to a hidden layer of the neural network.
- change_activation(activation: str, output: bool = False) → None
Set the activation function for the network.
- forward(x: ndarray | Tensor) → Tensor
Return the output of the neural network.
- Parameters:
x (torch.Tensor or np.ndarray) – Neural network input
- Returns:
Neural network output
- Return type:
torch.Tensor
- get_output_dense() → Module
Return the output layer of the neural network.
- Returns:
Output layer of neural network
- Return type:
torch.nn.Module
- init_weights_gaussian(std_coeff: float = 4, output_coeff: float = 4) → None
Initialise the weights of the neural network using a Gaussian distribution.
- recreate_network() → None
Recreate the neural network while preserving the parameters of the old network.
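One plausible way to preserve parameters when a layer changes shape is to copy the overlapping block of the old weight matrix into the freshly initialised one. The sketch below shows that idea on nested lists. It is an assumption about the general technique, not a description of how recreate_network() is implemented, and preserve_weights is a hypothetical helper.

```python
def preserve_weights(old, new):
    """Copy the overlapping top-left block of `old` into a copy of `new`.

    Both arguments are non-empty lists of equal-length rows (out x in).
    """
    rows = min(len(old), len(new))
    cols = min(len(old[0]), len(new[0]))
    merged = [row[:] for row in new]  # leave the inputs untouched
    for i in range(rows):
        merged[i][:cols] = old[i][:cols]
    return merged


old_weights = [[1.0, 2.0], [3.0, 4.0]]   # trained 2x2 layer
new_weights = [[0.0] * 3 for _ in range(3)]  # freshly created 3x3 layer
print(preserve_weights(old_weights, new_weights))
```

When the layer grows, the new rows and columns keep their fresh initial values. When it shrinks, only the part of the old matrix that still fits is carried over.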
- remove_layer() → dict[str, int] | None
Remove a hidden layer from the neural network. Falls back on add_node() if min_hidden_layers is reached.
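The removal path can be sketched in the same spirit on a plain list of hidden layer sizes. This is only an illustration: the standalone function, the tuple return value, and the fixed fallback_nodes step that stands in for add_node() are assumptions rather than AgileRL's implementation.

```python
def remove_layer(hidden_size, min_hidden_layers, max_mlp_nodes, fallback_nodes=32):
    """Drop the last hidden layer, or widen it if the depth floor is reached."""
    if len(hidden_size) > min_hidden_layers:
        return list(hidden_size[:-1]), None
    # Depth floor reached: stand-in for the add_node() fallback, capped at
    # max_mlp_nodes (fallback_nodes is a hypothetical fixed step).
    sizes = list(hidden_size)
    sizes[-1] = min(sizes[-1] + fallback_nodes, max_mlp_nodes)
    return sizes, {"hidden_layer": len(sizes) - 1, "numb_new_nodes": fallback_nodes}


print(remove_layer([64, 128], min_hidden_layers=1, max_mlp_nodes=500))
print(remove_layer([64], min_hidden_layers=1, max_mlp_nodes=500))
```

Because the fallback still changes the architecture, a mutation request produces a different network even when the depth limit blocks the removal.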