PettingZoo Vectorization Parallel Wrapper
The PettingZooVectorizationParallelWrapper class wraps a PettingZoo parallel environment so that multiple copies of it run in parallel, batching observations, rewards, and other step outputs across environments.
from agilerl.wrappers.pettingzoo_wrappers import PettingZooVectorizationParallelWrapper
from pettingzoo.atari import space_invaders_v2

env = space_invaders_v2.parallel_env()
n_envs = 4
vec_env = PettingZooVectorizationParallelWrapper(env, n_envs=n_envs)
observations, infos = vec_env.reset()
for step in range(25):
    # One action per sub-environment, for every agent
    actions = {
        agent: [vec_env.action_space(agent).sample() for n in range(n_envs)]
        for agent in vec_env.agents
    }
    observations, rewards, terminations, truncations, infos = vec_env.step(actions)
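Note the shape of the actions dictionary in the loop above: each agent maps to a list with one action per sub-environment. A minimal stdlib-only sketch of that batched layout (the agent names and action values are placeholders, not output from the real environment):

```python
# Toy illustration of the per-agent batched dict the wrapper expects:
# {agent_name: [action_for_env_0, ..., action_for_env_{n_envs-1}]}
n_envs = 4
agents = ["first_0", "second_0"]  # placeholder agent names

# Build one (dummy) action per sub-environment for every agent
actions = {agent: [0 for _ in range(n_envs)] for agent in agents}

# Every agent is present, and every batch has exactly n_envs entries
assert set(actions) == set(agents)
assert all(len(batch) == n_envs for batch in actions.values())
```

The returned observations, rewards, terminations, truncations, and infos follow the same per-agent batched structure.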
Parameters
- class agilerl.wrappers.pettingzoo_wrappers.PettingZooVectorizationParallelWrapper(env: ParallelEnv[AgentID, ObsType, ActionType], n_envs: int)
- action_space(agent: str) → Space
Takes in an agent name and returns the action space for that agent.
MUST return the same value for the same agent name.
The default implementation looks the agent up in the action_spaces dict.
- observation_space(agent: str) → Space
Takes in an agent name and returns the observation space for that agent.
MUST return the same value for the same agent name.
The default implementation looks the agent up in the observation_spaces dict.
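The default behaviour described above amounts to a plain dictionary lookup, which is what guarantees that the same agent name always yields the same space object. A minimal sketch (not the AgileRL implementation; the class and space values are placeholders):

```python
# Toy environment showing why dict-backed space lookups are deterministic:
# the same key always returns the same stored object.
class ToyEnv:
    def __init__(self):
        # Placeholder space descriptions, not real Gymnasium spaces
        self.action_spaces = {"first_0": "Discrete(6)", "second_0": "Discrete(6)"}
        self.observation_spaces = {"first_0": "Box(210, 160, 3)", "second_0": "Box(210, 160, 3)"}

    def action_space(self, agent):
        return self.action_spaces[agent]

    def observation_space(self, agent):
        return self.observation_spaces[agent]

env = ToyEnv()
# Repeated calls with the same agent name return the identical object
assert env.action_space("first_0") is env.action_space("first_0")
assert env.observation_space("second_0") is env.observation_space("second_0")
```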
- render() → None | np.ndarray | str | list
Displays a rendered frame from the environment, if supported.
Alternate render modes in the default environments are 'rgb_array', which returns a numpy array and is supported by all environments outside of classic, and 'ansi', which returns the printed strings (specific to classic environments).
- reset(seed: int | None = None, options: dict | None = None) → tuple[dict[AgentID, ObsType], dict[AgentID, dict]]
Resets the environment and returns a dictionary of observations keyed by agent name, along with a dictionary of infos.
- property state: ndarray
Returns the state: a global view of the environment, appropriate for centralized-training decentralized-execution (CTDE) methods such as QMIX.
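One common way to form such a global state (a sketch of the general CTDE idea, not the wrapper's actual implementation) is to concatenate every agent's local observation into a single flat vector, in a fixed agent order so the layout is stable across steps:

```python
# Hypothetical per-agent observations (placeholder values)
per_agent_obs = {
    "first_0": [0.1, 0.2],
    "second_0": [0.3, 0.4],
}

# Concatenate in a fixed (sorted) agent order so the state layout
# is identical at every timestep
global_state = [x for agent in sorted(per_agent_obs) for x in per_agent_obs[agent]]
assert global_state == [0.1, 0.2, 0.3, 0.4]
```

A centralized mixer such as QMIX consumes this global vector during training, while each agent acts only on its own local observation at execution time.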
- step(actions: dict[str, ActionType]) → tuple[dict[str, ObsType], dict[str, float], dict[str, bool], dict[str, bool], dict[str, dict]]
Receives a dictionary of actions keyed by agent name.
Returns the observation, reward, terminated, truncated, and info dictionaries, each keyed by agent name.
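Under vectorization, a single step call fans the per-agent action batches out to each sub-environment and gathers the results back into per-agent batches. A stdlib-only sketch of that fan-out/fan-in pattern (the step_one function and its reward values are hypothetical stand-ins, not the AgileRL code):

```python
# Hypothetical single-environment step: returns one reward per agent.
# Here the reward just echoes the sub-environment's index for clarity.
def step_one(env_id, per_env_actions):
    return {agent: float(env_id) for agent in per_env_actions}

def vectorized_step(n_envs, actions):
    # actions: {agent: [action_for_env_0, ..., action_for_env_{n-1}]}
    rewards = {agent: [] for agent in actions}
    for env_id in range(n_envs):
        # Fan out: slice this sub-environment's action from each batch
        per_env = {agent: batch[env_id] for agent, batch in actions.items()}
        # Fan in: append this sub-environment's result to each agent's batch
        for agent, r in step_one(env_id, per_env).items():
            rewards[agent].append(r)
    return rewards

rewards = vectorized_step(3, {"first_0": [0, 1, 2], "second_0": [2, 1, 0]})
assert rewards == {"first_0": [0.0, 1.0, 2.0], "second_0": [0.0, 1.0, 2.0]}
```

The real wrapper applies the same gather step to observations, terminations, truncations, and infos, which is why every returned dictionary holds one entry per sub-environment for each agent.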