EvolvableNetwork
Parameters
- class agilerl.networks.base.EvolvableNetwork(*args, **kwargs)
Base class for evolvable networks, i.e., evolvable modules that are configured in a specific way for a reinforcement learning algorithm, similar to how CNNs are used as building blocks in ResNet, VGG, etc. An evolvable network automatically inspects the passed observation space to determine the appropriate encoder to build through the AgileRL evolvable modules, inheriting the mutation methods of any underlying evolvable module.
Note
Currently, evolvable networks should expose only two evolvable components: the encoder (which, if not specified by the user, is built automatically from the observation space) and a ‘head_net’ attribute that maps the latent encodings to the desired number of outputs. For example, in RainbowQNetwork, we disable mutations for the advantage net and instead apply to it the same mutations as the ‘value’ net, which is the network head in this case. Users should follow the same philosophy.
- Parameters:
observation_space (spaces.Space) – Observation space of the environment.
encoder_cls (Optional[Union[str, Type[EvolvableModule]]]) – Encoder class to use for the network. Defaults to None, in which case the encoder is built automatically from an AgileRL module according to the observation space.
encoder_config (Optional[ConfigType]) – Configuration of the encoder. Defaults to None.
action_space (Optional[spaces.Space]) – Action space of the environment. Defaults to None.
min_latent_dim (int) – Minimum dimension of the latent space representation. Defaults to 8.
max_latent_dim (int) – Maximum dimension of the latent space representation. Defaults to 128.
n_agents (Optional[int]) – Number of agents in the environment. Defaults to None, which corresponds to single-agent environments.
encoder_mutations (bool) – If True, allow mutations to the encoder. Defaults to False.
latent_dim (int) – Dimension of the latent space representation. Defaults to 32.
simba (bool) – If True, use a SimBa network for the encoder for vector spaces. Defaults to False.
device (DeviceType) – Device to use for the network. Defaults to “cpu”.
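The automatic encoder selection described above can be pictured with a small, library-free sketch. Everything in it is an assumption made for illustration: `ObsSpec`, `pick_encoder_kind`, and the returned labels are hypothetical names, and the shape rules (1-D shapes are vectors, 3-D shapes are images, dictionaries are multi-input) are a simplification rather than AgileRL's actual dispatch logic. Only the `simba` flag applying to vector spaces comes from the parameter list.

```python
from typing import Dict, Tuple, Union

# Hypothetical stand-in for an observation space: a shape tuple for a single
# array-like space, or a dict of named shape tuples for a composite space.
ObsSpec = Union[Tuple[int, ...], Dict[str, Tuple[int, ...]]]


def pick_encoder_kind(obs: ObsSpec, simba: bool = False) -> str:
    """Return a label for the encoder family this sketch would build."""
    if isinstance(obs, dict):
        # Composite observations need an encoder that handles several inputs.
        return "multi_input"
    if len(obs) == 1:
        # Vector observations: the simba flag only matters in this branch.
        return "simba" if simba else "mlp"
    if len(obs) == 3:
        # Assumed convention: three dimensions means an image-like input.
        return "cnn"
    raise ValueError(f"Unsupported observation shape: {obs}")


print(pick_encoder_kind((8,)))
print(pick_encoder_kind((8,), simba=True))
print(pick_encoder_kind((3, 84, 84)))
print(pick_encoder_kind({"image": (3, 84, 84), "state": (8,)}))
```

Passing `encoder_cls` explicitly would bypass a selection step like this one entirely.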
- property activation: str
Activation function of the network.
- Returns:
Activation function.
- Return type:
str
- add_latent_node(numb_new_nodes: int | None = None) → Dict[str, Any]
Add a latent node to the network.
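One way to think about this mutation is as a bounded update to the latent dimension. The sketch below is not the AgileRL implementation: `LatentDimSketch` is a hypothetical class, the candidate step sizes used when `numb_new_nodes` is None are an assumption, and so is the choice to leave the dimension unchanged when the update would exceed `max_latent_dim`. The default values mirror the constructor parameters listed above.

```python
import random
from typing import Any, Dict, Optional


class LatentDimSketch:
    """Hypothetical holder for the latent-dimension bounds (not an AgileRL class)."""

    def __init__(
        self,
        latent_dim: int = 32,
        min_latent_dim: int = 8,
        max_latent_dim: int = 128,
    ) -> None:
        self.latent_dim = latent_dim
        self.min_latent_dim = min_latent_dim
        self.max_latent_dim = max_latent_dim

    def add_latent_node(self, numb_new_nodes: Optional[int] = None) -> Dict[str, Any]:
        if numb_new_nodes is None:
            # Assumed candidate step sizes; the real choices may differ.
            numb_new_nodes = random.choice([8, 16, 32])
        # Assumed policy: skip the update if it would exceed the upper bound.
        if self.latent_dim + numb_new_nodes <= self.max_latent_dim:
            self.latent_dim += numb_new_nodes
        # Returning the step lets a caller replay the same mutation elsewhere.
        return {"numb_new_nodes": numb_new_nodes}


net = LatentDimSketch()
info = net.add_latent_node(16)
print(net.latent_dim, info)
```

Returning the applied step as a dictionary matches the `Dict[str, Any]` return type in the signature and makes it possible to apply an identical mutation to a second network.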
- change_activation(activation: str, output: bool = False) → None
Change the activation function for the network.
- create_mlp(num_inputs: int, num_outputs: int, name: str, net_config: Dict[str, Any]) → EvolvableMLP
Builds the head of the network based on the passed configuration.
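- Parameters:
num_inputs (int) – Number of inputs to the network head.
num_outputs (int) – Number of outputs of the network head.
name (str) – Name of the network head.
net_config (Dict[str, Any]) – Configuration of the network head.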
- Returns:
Network head.
- Return type:
EvolvableMLP
- property encoder_config: Dict[str, Any]
Net configuration for encoder.
- Returns:
Initial dictionary for the network.
- Return type:
Dict[str, Any]
- forward(x: Tensor | Dict[str, Tensor] | Tuple[Tensor, ...]) → Tensor
Forward pass of the network.
- Parameters:
x (TorchObsType) – Input to the network.
- Returns:
Output of the network.
- Return type:
torch.Tensor
- property head_config: Dict[str, Any]
Net configuration for head.
- Returns:
Initial dictionary for the network.
- Return type:
Dict[str, Any]
- init_weights_gaussian(std_coeff: float = 4.0, output_coeff: float = 2.0) → None
Initialize the weights of the network with a Gaussian distribution.
- static modify_multi_agent_config(net_config: Dict[str, Any], observation_space: Space) → Dict[str, Any]
Modify the network configuration for multi-agent settings, where the shape of the encoder input cannot be determined unambiguously from the passed observation space. If kernel sizes are passed as integers, a depth dimension of 1 is added for all layers. Note that for value functions, for example, the first layer should have a depth equal to the number of agents so that the network produces a single output rather than self.n_agents outputs.
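The kernel-size adjustment can be illustrated with a short sketch. This is not the library's code: `add_depth_to_kernels` is a hypothetical helper, and the `"kernel_size"` configuration key and the `(depth, k, k)` tuple layout are assumptions chosen for the example.

```python
from copy import deepcopy
from typing import Any, Dict


def add_depth_to_kernels(net_config: Dict[str, Any], depth: int = 1) -> Dict[str, Any]:
    """Expand integer kernel sizes into (depth, k, k) tuples, leaving tuples as they are."""
    config = deepcopy(net_config)  # avoid mutating the caller's configuration
    kernel_sizes = config.get("kernel_size")  # assumed key name
    if kernel_sizes is not None:
        config["kernel_size"] = [
            (depth, k, k) if isinstance(k, int) else tuple(k)
            for k in kernel_sizes
        ]
    return config


original = {"kernel_size": [3, (1, 5, 5)], "channel_size": [16, 32]}
print(add_depth_to_kernels(original)["kernel_size"])
```

For a value function, the first layer's depth would instead be set to the number of agents, which a sketch like this could support by treating the first entry separately.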
- modules() → Dict[str, EvolvableModule]
Modules of the network.
- Returns:
Modules of the network.
- Return type:
Dict[str, EvolvableModule]