LLM Utils

class agilerl.utils.llm_utils.HuggingFaceGym(train_dataset: str, test_dataset, tokenizer: AutoTokenizer, reward_fn: Callable[[str, str, str], float], apply_chat_template_fn: Callable[[str, str, AutoTokenizer], BatchEncoding], data_batch_size_per_gpu: int = 8, custom_collate_fn: Callable | None = None, accelerator: Accelerator | None = None)

Class to convert HuggingFace datasets into Gymnasium style environment.

Parameters:
  • dataset_name (str) – Dataset name to be loaded from HuggingFace datasets.

  • tokenizer (AutoTokenizer) – Tokenizer to be used for encoding and decoding the prompts.

  • reward_fn (Callable[..., float]) – Reward function for evaluating completions.

  • data_batch_size_per_gpu (int, optional) – DataLoader batch size, defaults to 8

  • accelerator (Accelerator, optional) – Accelerator to be used for training, defaults to None