LLM Utils¶
- class agilerl.utils.llm_utils.HuggingFaceGym(train_dataset: str, test_dataset, tokenizer: AutoTokenizer, reward_fn: Callable[[str, str, str], float], apply_chat_template_fn: Callable[[str, str, AutoTokenizer], BatchEncoding], max_answer_tokens: int = 512, data_batch_size: int = 8, custom_collate_fn: Callable | None = None)¶
Class to convert HuggingFace datasets into Gymnasium style environment.
- Parameters:
dataset_name (str) – Dataset name to be loaded from HuggingFace datasets.
tokenizer (AutoTokenizer) – Tokenizer to be used for encoding and decoding the prompts.
reward_fn (Callable[..., float]) – Reward function for evaluating completions.
max_answer_tokens (int, optional) – Max number of answer tokens, defaults to 512
data_batch_size (int, optional) – DataLoader batch size, defaults to 8