ValueNetwork

class agilerl.networks.value_networks.ValueNetwork(*args, **kwargs)

Value functions are used in reinforcement learning to estimate the expected value of a state. For any given observation, the network predicts a single scalar value representing the expected discounted return from that state. Used, for example, as the critic in PPO.
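The idea can be sketched with plain PyTorch: an encoder maps the observation to a latent representation, and a dense head maps the latent to a single scalar. The class and names below are hypothetical illustrations, not the AgileRL implementation.

```python
import torch
import torch.nn as nn

class TinyValueNetwork(nn.Module):
    """Illustrative sketch: encoder + scalar-output head."""

    def __init__(self, obs_dim: int, latent_dim: int = 32):
        super().__init__()
        # Encoder: observation -> latent representation
        self.encoder = nn.Sequential(nn.Linear(obs_dim, latent_dim), nn.ReLU())
        # Head: latent -> single scalar value estimate
        self.head = nn.Linear(latent_dim, 1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.head(self.encoder(x))

obs = torch.randn(8, 4)               # batch of 8 observations, 4 features each
values = TinyValueNetwork(obs_dim=4)(obs)
print(values.shape)                   # one scalar per observation: (8, 1)
```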

Parameters:
  • observation_space (spaces.Space) – Observation space of the environment.

  • encoder_cls (Optional[Union[str, Type[EvolvableModule]]]) – Encoder class to use for the network. Defaults to None, in which case it is automatically built as an AgileRL module according to the observation space.

  • encoder_config (ConfigType) – Configuration of the encoder.

  • head_config (Optional[ConfigType]) – Configuration of the head.

  • min_latent_dim (int) – Minimum latent dimension.

  • max_latent_dim (int) – Maximum latent dimension.

  • n_agents (Optional[int]) – Number of agents.

  • latent_dim (int) – Latent dimension.

  • device (str) – Device to run the network on.

build_network_head(net_config: IsDataclass | Dict[str, Any] | None = None) None

Builds the head of the network.

Parameters:

net_config (Optional[ConfigType]) – Configuration of the head.

forward(x: Tensor | Dict[str, Tensor] | Tuple[Tensor, ...]) Tensor

Forward pass of the network.

Parameters:

x (torch.Tensor, dict[str, torch.Tensor], or tuple[torch.Tensor, ...]) – Input to the network.

Returns:

Output tensor.

Return type:

torch.Tensor

get_output_dense() Linear

Returns the output dense layer of the network.

Returns:

Output dense layer.

Return type:

torch.nn.Linear

recreate_network() None

Recreates the network.