ValueNetwork¶
- class agilerl.networks.value_networks.ValueNetwork(*args, **kwargs)¶
Value functions are used in reinforcement learning to estimate the expected return from a state. For any given observation, the network predicts a single scalar value representing the discounted return achievable from that state. Used in algorithms such as PPO.
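The "discounted return" that the value network learns to predict can be illustrated with a small standalone sketch (plain Python, independent of the AgileRL API):

```python
def discounted_return(rewards, gamma=0.99):
    """Compute G_t = r_t + gamma*r_{t+1} + gamma^2*r_{t+2} + ...
    by accumulating backwards over the reward sequence."""
    g = 0.0
    for r in reversed(rewards):
        g = r + gamma * g
    return g

# With rewards [1, 1, 1] and gamma = 0.5: 1 + 0.5 + 0.25 = 1.75
print(discounted_return([1.0, 1.0, 1.0], gamma=0.5))  # 1.75
```

A trained ValueNetwork approximates this quantity directly from a single observation, without needing to roll out the future rewards.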
- Parameters:
observation_space (spaces.Space) – Observation space of the environment.
encoder_cls (Optional[Union[str, Type[EvolvableModule]]]) – Encoder class to use for the network. Defaults to None, in which case the encoder is built automatically as an AgileRL module appropriate to the observation space.
encoder_config (ConfigType) – Configuration of the encoder.
head_config (Optional[ConfigType]) – Configuration of the head.
min_latent_dim (int) – Minimum latent dimension.
max_latent_dim (int) – Maximum latent dimension.
n_agents (Optional[int]) – Number of agents.
latent_dim (int) – Latent dimension.
device (str) – Device to run the network on.
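The encoder/head structure implied by the parameters above (observation → encoder → latent representation → scalar value head) can be sketched in plain NumPy. This is a conceptual illustration only; the real class builds EvolvableModule components, and the weight shapes and activation here are assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)

obs_dim, latent_dim = 8, 16  # latent_dim as in the parameters above

# Hypothetical stand-ins for the encoder and head weights.
W_enc = rng.normal(size=(obs_dim, latent_dim)) * 0.1
W_head = rng.normal(size=(latent_dim, 1)) * 0.1

def value(obs):
    """Encoder maps the observation to a latent vector;
    the head maps the latent vector to one scalar value."""
    latent = np.tanh(obs @ W_enc)
    return float(latent @ W_head)

v = value(rng.normal(size=obs_dim))
print(type(v))  # the network's output is a single float per observation
```

The min_latent_dim and max_latent_dim parameters bound how this latent dimension may change when the network architecture is mutated during evolutionary training.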
- build_network_head(net_config: IsDataclass | Dict[str, Any] | None = None) → None¶
Builds the head of the network.
- Parameters:
net_config (Optional[ConfigType]) – Configuration of the head.
- get_output_dense() → Linear¶
Returns the output dense layer of the network.
- Returns:
Output dense layer.
- Return type:
torch.nn.Linear