PettingZoo Vectorization Parallel Wrapper

The PettingZooVectorizationParallelWrapper class wraps a PettingZoo parallel environment so that multiple copies of it run in parallel, with actions and returns batched per agent across the sub-environments.

from agilerl.wrappers.pettingzoo_wrappers import PettingZooVectorizationParallelWrapper
from pettingzoo.atari import space_invaders_v2

env = space_invaders_v2.parallel_env()
n_envs = 4
vec_env = PettingZooVectorizationParallelWrapper(env, n_envs=n_envs)
observations, infos = vec_env.reset()
for step in range(25):
    # One action per sub-environment, for every agent
    actions = {
        agent: [vec_env.action_space(agent).sample() for _ in range(n_envs)]
        for agent in vec_env.agents
    }
    observations, rewards, terminations, truncations, infos = vec_env.step(actions)

Parameters

class agilerl.wrappers.pettingzoo_wrappers.PettingZooVectorizationParallelWrapper(env: ParallelEnv[AgentID, ObsType, ActionType], n_envs: int)
action_space(agent: str) → Space

Takes in agent and returns the action space for that agent.

MUST return the same value for the same agent name.

The default implementation returns the entry for the agent in the action_spaces dict.
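The per-agent, per-sub-environment action layout used in the example above can be sketched as follows. DiscreteSpace is a hypothetical stand-in for a gymnasium Discrete space, used only so the sketch is self-contained; the real wrapper returns the underlying environment's actual spaces.

```python
import random


# Hypothetical stand-in for a discrete action space with n actions,
# mirroring how vec_env.action_space(agent).sample() is used above.
class DiscreteSpace:
    def __init__(self, n):
        self.n = n  # number of discrete actions

    def sample(self):
        return random.randrange(self.n)


# One space per agent name, as action_space(agent) would return
action_spaces = {"first_0": DiscreteSpace(6), "second_0": DiscreteSpace(6)}

n_envs = 4
# Batched actions: a list with one sampled action per sub-environment
actions = {
    agent: [space.sample() for _ in range(n_envs)]
    for agent, space in action_spaces.items()
}
```

Each agent's entry has length n_envs, so sub-environment i receives actions[agent][i] for every agent.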

close() → None

Closes the rendering window.

observation_space(agent: str) → Space

Takes in agent and returns the observation space for that agent.

MUST return the same value for the same agent name.

The default implementation returns the entry for the agent in the observation_spaces dict.

render() → None | np.ndarray | str | list

Displays a rendered frame from the environment, if supported.

Alternate render modes in the default environments are 'rgb_array', which returns a numpy array and is supported by all environments outside of classic, and 'ansi', which returns the strings printed (specific to classic environments).

reset(seed: int | None = None, options: dict | None = None) → tuple[dict[AgentID, ObsType], dict[AgentID, dict]]

Resets the environment and returns a tuple of an observation dictionary and an info dictionary, each keyed by agent name.

property state: ndarray

Returns the state: a global view of the environment, appropriate for centralized training, decentralized execution methods such as QMIX.
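One common way CTDE methods like QMIX form such a global state is to concatenate every agent's local observation into a single flat vector. This is only an illustrative sketch with made-up observation values; the wrapper's actual state is defined by the underlying environment and may be laid out differently.

```python
# Example local observations, keyed by agent name (made-up values)
obs = {"first_0": [0.1, 0.2], "second_0": [0.3, 0.4]}

# Concatenate per-agent observations in a fixed (sorted) agent order
# so the global state layout is stable across steps
global_state = [x for agent in sorted(obs) for x in obs[agent]]
```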

step(actions: dict[str, ActionType]) → tuple[dict[str, ObsType], dict[str, float], dict[str, bool], dict[str, bool], dict[str, dict]]

Receives a dictionary of actions keyed by agent name.

Returns a tuple of the observation, reward, termination, truncation and info dictionaries, each keyed by agent name.
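A typical post-processing step on these returns is to aggregate per sub-environment. The sketch below uses made-up data in a per-agent, per-env list layout (an assumption; the wrapper may return numpy arrays instead of lists) to compute mean rewards per agent and a done flag per sub-environment.

```python
n_envs = 4

# Made-up vectorized step outputs: one value per sub-environment, per agent
rewards = {"first_0": [0.0, 1.0, 0.0, 5.0], "second_0": [1.0, 0.0, 0.0, 2.0]}
terminations = {"first_0": [False, False, True, False],
                "second_0": [False, False, True, False]}
truncations = {"first_0": [False, True, False, False],
               "second_0": [False, True, False, False]}

# Mean reward per agent across the vectorized sub-environments
mean_rewards = {agent: sum(r) / n_envs for agent, r in rewards.items()}

# A sub-environment is finished when every agent in it is terminated or truncated
dones = [
    all(terminations[a][i] or truncations[a][i] for a in rewards)
    for i in range(n_envs)
]
```

With the data above, mean_rewards is {'first_0': 1.5, 'second_0': 0.75} and dones is [False, True, True, False], i.e. sub-environments 1 and 2 need resetting.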