QNetwork¶
- class agilerl.networks.q_networks.QNetwork(*args, **kwargs)¶
Q Networks correspond to state-action value functions in deep reinforcement learning. From any given state, they predict the value of each action that can be taken from that state. By default, we build an encoder that extracts features from an input corresponding to the passed observation space using the AgileRL evolvable modules. The QNetwork then uses an EvolvableMLP as its head to predict a value for each possible discrete action for the given state.
- Parameters:
observation_space (spaces.Space) – Observation space of the environment.
action_space (DiscreteSpace) – Action space of the environment.
encoder_cls (Optional[Union[str, Type[EvolvableModule]]]) – Encoder class to use for the network. Defaults to None, whereby it is automatically built using an AgileRL module according to the observation space.
encoder_config (ConfigType) – Configuration of the encoder network.
head_config (Optional[ConfigType]) – Configuration of the network MLP head.
min_latent_dim (int) – Minimum dimension of the latent space representation. Defaults to 8.
max_latent_dim (int) – Maximum dimension of the latent space representation. Defaults to 128.
n_agents (Optional[int]) – Number of agents in the environment. Defaults to None, which corresponds to single-agent environments.
latent_dim (int) – Dimension of the latent space representation.
device (str) – Device to use for the network.
- build_network_head(net_config: Dict[str, Any]) → None¶
Builds the head of the network based on the passed configuration.
- Parameters:
net_config (Dict[str, Any]) – Configuration of the network head.
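The following is a minimal usage sketch, not an example from the AgileRL documentation. It assumes a Gymnasium-style flat Box observation space and a Discrete action space, and that calling the network on a batch of observations returns one Q-value per discrete action; the space shapes and forward-pass semantics are illustrative assumptions.

```python
import torch
from gymnasium import spaces

from agilerl.networks.q_networks import QNetwork

# Hypothetical spaces: a flat 8-dimensional observation and 4 discrete actions.
observation_space = spaces.Box(low=-1.0, high=1.0, shape=(8,))
action_space = spaces.Discrete(4)

q_net = QNetwork(
    observation_space=observation_space,
    action_space=action_space,
    device="cpu",
)

# Assumed forward semantics: a batch of observations in, one Q-value per
# discrete action out, i.e. a tensor of shape (batch_size, action_space.n).
obs = torch.randn(32, 8)
q_values = q_net(obs)
greedy_actions = q_values.argmax(dim=-1)  # greedy action selection
```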
RainbowQNetwork¶
- class agilerl.networks.q_networks.RainbowQNetwork(*args, **kwargs)¶
RainbowQNetwork is an extension of the QNetwork that incorporates the Rainbow DQN improvements from “Rainbow: Combining Improvements in Deep Reinforcement Learning” (Hessel et al., 2017).
Paper: https://arxiv.org/abs/1710.02298
- Parameters:
observation_space (spaces.Space) – Observation space of the environment.
action_space (DiscreteSpace) – Action space of the environment.
encoder_cls (Optional[Union[str, Type[EvolvableModule]]]) – Encoder class to use for the network. Defaults to None, whereby it is automatically built using an AgileRL module according to the observation space.
encoder_config (ConfigType) – Configuration of the encoder network.
support (torch.Tensor) – Support for the distributional value function.
num_atoms (int) – Number of atoms in the distributional value function. Defaults to 51.
head_config (Optional[ConfigType]) – Configuration of the network MLP head.
min_latent_dim (int) – Minimum dimension of the latent space representation. Defaults to 8.
max_latent_dim (int) – Maximum dimension of the latent space representation. Defaults to 128.
n_agents (Optional[int]) – Number of agents in the environment. Defaults to None, which corresponds to single-agent environments.
latent_dim (int) – Dimension of the latent space representation.
device (str) – Device to use for the network.
- build_network_head(net_config: Dict[str, Any]) → None¶
Builds the value and advantage heads of the network based on the passed configuration.
- Parameters:
net_config (Dict[str, Any]) – Configuration of the network head.
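The sketch below shows how the distributional pieces fit together; it is an illustrative assumption, not an example from the AgileRL documentation. The support is built as num_atoms evenly spaced atoms over an assumed value range [v_min, v_max], following the C51 convention that Rainbow builds on.

```python
import torch
from gymnasium import spaces

from agilerl.networks.q_networks import RainbowQNetwork

observation_space = spaces.Box(low=-1.0, high=1.0, shape=(8,))
action_space = spaces.Discrete(4)

# Support of the categorical value distribution: num_atoms evenly spaced
# atoms over an assumed value range [v_min, v_max].
num_atoms = 51
v_min, v_max = -10.0, 10.0
support = torch.linspace(v_min, v_max, num_atoms)

rainbow_net = RainbowQNetwork(
    observation_space=observation_space,
    action_space=action_space,
    support=support,
    num_atoms=num_atoms,
    device="cpu",
)

# Assumed forward semantics: Q-values are the expectation of the predicted
# categorical distribution over the support, shape (batch_size, action_space.n).
obs = torch.randn(32, 8)
q_values = rainbow_net(obs)
```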
ContinuousQNetwork¶
- class agilerl.networks.q_networks.ContinuousQNetwork(*args, **kwargs)¶
ContinuousQNetwork is an extension of the QNetwork for continuous action spaces, used in off-policy algorithms such as DDPG and TD3. The network predicts the Q-value for a given state-action pair.
Paper: https://arxiv.org/abs/1509.02971
- Parameters:
observation_space (spaces.Space) – Observation space of the environment.
action_space (spaces.Box) – Action space of the environment.
encoder_cls (Optional[Union[str, Type[EvolvableModule]]]) – Encoder class to use for the network. Defaults to None, whereby it is automatically built using an AgileRL module according to the observation space.
encoder_config (ConfigType) – Configuration of the encoder network.
head_config (Optional[ConfigType]) – Configuration of the network MLP head.
min_latent_dim (int) – Minimum dimension of the latent space representation. Defaults to 8.
max_latent_dim (int) – Maximum dimension of the latent space representation. Defaults to 128.
n_agents (Optional[int]) – Number of agents in the environment. Defaults to None, which corresponds to single-agent environments.
latent_dim (int) – Dimension of the latent space representation.
simba (bool) – Whether to use the SimBA architecture for the network. Defaults to False.
normalize_actions (bool) – Whether to normalize the actions. Defaults to False, but it is automatically set to True if the encoder contains nn.LayerNorm layers.
device (str) – Device to use for the network.
- build_network_head(net_config: IsDataclass | Dict[str, Any] | None = None) → None¶
Builds the head of the network.
- Parameters:
net_config (Optional[ConfigType]) – Configuration of the network head.
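A minimal sketch for the continuous case, again an illustrative assumption rather than an example from the AgileRL documentation: because the network scores state-action pairs (as in the DDPG/TD3 critics), the forward pass here is assumed to take both the observation batch and the action batch.

```python
import torch
from gymnasium import spaces

from agilerl.networks.q_networks import ContinuousQNetwork

# Hypothetical spaces: a flat 8-dimensional observation and a
# 2-dimensional continuous action in [-1, 1].
observation_space = spaces.Box(low=-1.0, high=1.0, shape=(8,))
action_space = spaces.Box(low=-1.0, high=1.0, shape=(2,))

q_net = ContinuousQNetwork(
    observation_space=observation_space,
    action_space=action_space,
    device="cpu",
)

# Assumed forward semantics: one Q-value per state-action pair.
obs = torch.randn(32, 8)
actions = torch.rand(32, 2) * 2.0 - 1.0  # sample actions in [-1, 1]
q_values = q_net(obs, actions)
```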