TRL
Training & Fine-tuning
Transformer Reinforcement Learning for RLHF and alignment.
10k stars · 1.3k forks · Python
About
TRL (Transformer Reinforcement Learning) by Hugging Face provides tools for aligning language models with human preferences through RLHF, including supervised fine-tuning (SFT), reward modeling, PPO, and DPO.
Key Features
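- RLHF (reinforcement learning from human feedback)
- PPO (Proximal Policy Optimization)
- DPO (Direct Preference Optimization)
- Reward modeling
- SFT (supervised fine-tuning)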
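
To give a rough sense of what the DPO feature optimizes, here is a minimal sketch of the per-pair DPO loss in plain Python. It does not use TRL's API: the function name `dpo_loss`, its parameter names, and the default `beta=0.1` are assumptions chosen for illustration only.

```python
import math


def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """Per-pair DPO loss: -log(sigmoid(beta * margin)).

    Each argument is the summed log-probability of a full response
    (chosen or rejected) under the policy or the frozen reference model.
    All names here are hypothetical, not TRL identifiers.
    """
    # How much more the policy favors each response than the reference does.
    chosen_gain = policy_chosen_logp - ref_chosen_logp
    rejected_gain = policy_rejected_logp - ref_rejected_logp
    z = beta * (chosen_gain - rejected_gain)

    # Numerically stable -log(sigmoid(z)) = log(1 + exp(-z)).
    if z >= 0:
        return math.log1p(math.exp(-z))
    return -z + math.log1p(math.exp(z))


if __name__ == "__main__":
    # Policy identical to the reference: the loss equals log(2).
    print(dpo_loss(-1.0, -1.0, -1.0, -1.0))
    # Policy favors the chosen response more than the reference does:
    # the loss drops below log(2).
    print(dpo_loss(-1.0, -3.0, -2.0, -2.0))
```

The loss falls as the policy raises the chosen response's log-probability relative to the reference and lowers the rejected one's. In practice the log-probabilities would come from model forward passes rather than hand-written numbers.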
Tags
RLHF, Alignment, DPO, Hugging Face