LLM Utils
- class agilerl.utils.llm_utils.HuggingFaceGym(train_dataset: str, test_dataset, tokenizer: AutoTokenizer, reward_fn: Callable[[str, str, str], float], apply_chat_template_fn: Callable[[str, str, AutoTokenizer], BatchEncoding], data_batch_size_per_gpu: int = 8, custom_collate_fn: Callable | None = None, accelerator: Accelerator | None = None)
Class that wraps HuggingFace datasets in a Gymnasium-style environment.
- Parameters:
train_dataset (str) – Training dataset to be loaded from HuggingFace datasets.
test_dataset – Test dataset to be loaded from HuggingFace datasets.
tokenizer (AutoTokenizer) – Tokenizer to be used for encoding and decoding the prompts.
reward_fn (Callable[..., float]) – Reward function for evaluating completions.
apply_chat_template_fn (Callable[[str, str, AutoTokenizer], BatchEncoding]) – Function that applies the chat template, taking two strings and the tokenizer and returning a BatchEncoding.
data_batch_size_per_gpu (int, optional) – DataLoader batch size, defaults to 8
custom_collate_fn (Callable, optional) – Custom collate function for the DataLoader, defaults to None
accelerator (Accelerator, optional) – Accelerator to be used for training, defaults to None
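The signature above types reward_fn as Callable[[str, str, str], float]. A minimal sketch of a function with that shape follows. The argument order (completion, answer, question) and the exact-match scoring rule are assumptions made for illustration; the source only states that the function takes three strings and returns a float.

```python
import re
from typing import Callable


def exact_match_reward(completion: str, answer: str, question: str) -> float:
    """Return 1.0 if the last number in the completion equals the answer, else 0.0.

    The argument order (completion, answer, question) is hypothetical; the
    documented signature only specifies three strings in and a float out.
    """
    # Collect every integer or decimal number that appears in the completion.
    numbers = re.findall(r"-?\d+(?:\.\d+)?", completion)
    if not numbers:
        return 0.0
    # Treat the final number as the model's answer and compare it as a string.
    return 1.0 if numbers[-1] == answer.strip() else 0.0


# The function matches the Callable[[str, str, str], float] annotation.
reward_fn: Callable[[str, str, str], float] = exact_match_reward
```

A function of this shape could then be passed as the reward_fn argument. The question argument is unused in this sketch and is kept only so that the function matches the three-string signature.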