TRL

Training & Fine-tuning

Transformer Reinforcement Learning for RLHF and alignment.

10k stars · 1.3k forks · Python

About

TRL (Transformer Reinforcement Learning) by Hugging Face provides tools for supervised fine-tuning, reward modeling, and preference optimization (RLHF, PPO, DPO) to align language models with human preferences.

Key Features

  • RLHF (reinforcement learning from human feedback)
  • PPO (Proximal Policy Optimization) training
  • DPO (Direct Preference Optimization) training
  • Reward modeling
  • SFT (supervised fine-tuning)
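To make the DPO feature above concrete, here is a minimal, self-contained sketch of the DPO objective that trainers like TRL's optimize. This is an illustrative implementation of the published DPO loss, not TRL's actual code; the function name and scalar inputs (summed log-probabilities of each response under the policy and a frozen reference model) are assumptions for the example.

```python
import math

def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """Illustrative DPO loss for one preference pair.

    Inputs are summed log-probs of the chosen/rejected responses
    under the trainable policy and the frozen reference model.
    beta controls how far the policy may drift from the reference.
    """
    chosen_ratio = policy_chosen_logp - ref_chosen_logp
    rejected_ratio = policy_rejected_logp - ref_rejected_logp
    logits = beta * (chosen_ratio - rejected_ratio)
    # Loss is -log(sigmoid(logits)), written in a numerically stable form.
    if logits >= 0:
        return math.log1p(math.exp(-logits))
    return -logits + math.log1p(math.exp(logits))

# When the policy matches the reference, the loss sits at log(2);
# it drops as the policy favors the chosen response more than the reference does.
baseline = dpo_loss(-1.0, -2.0, -1.0, -2.0)
improved = dpo_loss(-0.5, -2.0, -1.0, -2.0)
```

The key design point, which is why DPO needs no reward model or RL loop: the preference signal enters only through log-probability ratios against the reference model, so a single classification-style loss replaces the full RLHF pipeline.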

Tags

RLHF · Alignment · DPO · Hugging Face
