ValueNetwork

Parameters

class agilerl.networks.value_networks.ValueNetwork(*args: Any, **kwargs: Any)

Value functions are used in reinforcement learning to estimate the expected value of a state. For any given observation, the network predicts a single scalar: the expected discounted return from that state. It is used in algorithms such as PPO.
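To make the prediction target concrete, here is a minimal sketch of the discounted return of a single trajectory. The value network is trained to estimate the expectation of this quantity. The function name `discounted_return` and the `gamma` argument are illustrative assumptions and are not part of the AgileRL API.

```python
def discounted_return(rewards, gamma=0.99):
    """Return sum_t gamma**t * rewards[t] for one trajectory.

    Illustrative helper (not part of AgileRL). The sum is accumulated
    backwards from the final step, so each reward is discounted once
    per step separating it from the start of the trajectory.
    """
    g = 0.0
    for r in reversed(rewards):
        g = r + gamma * g
    return g


# Three steps of reward 1.0 with gamma = 0.5: 1 + 0.5 + 0.25
print(discounted_return([1.0, 1.0, 1.0], gamma=0.5))  # 1.75
```

A value function averages this quantity over the trajectories that can follow a given state, which is why the network outputs one scalar per observation.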

Parameters:

observation_space (spaces.Space) – Observation space of the environment.
encoder_cls (str | type[EvolvableModule] | None) – Encoder class to use for the network. Defaults to None, in which case the encoder is built automatically from an AgileRL module according to the observation space.
encoder_config (NetConfigType) – Configuration of the encoder.
head_config (NetConfigType | None) – Configuration of the head.
min_latent_dim (int) – Minimum latent dimension.
max_latent_dim (int) – Maximum latent dimension.
latent_dim (int) – Latent dimension.
simba (bool) – Whether to use the SimBa architecture for training the network.
recurrent (bool) – Whether to use a recurrent network.
device (str) – Device to run the network on.
random_seed (int | None) – Random seed to use for the network. Defaults to None.

build_network_head(net_config: dict[str, dict[str, Any] | Any]) → None

Build the head of the network.

Parameters: net_config (NetConfigType) – Configuration of the head.

Forward pass of the network.

Parameters: x (torch.Tensor, dict[str, torch.Tensor], or list[torch.Tensor]) – Input tensor.
Returns: Output tensor.
Return type: torch.Tensor

get_output_dense() → Linear

Return the output dense layer of the network.

Returns: Output dense layer.
Return type: torch.nn.Linear

recreate_network() → None

Recreates the network.