On Extending Direct Preference Optimization to Accommodate Ties
Published in NeurIPS 2025
We extend Direct Preference Optimization (DPO) to handle tied preferences, where annotators judge two responses to be of comparable quality and cannot state a preference. Standard DPO is derived from the Bradley–Terry model, which assumes a strict preference within every pair, so tied pairs must otherwise be discarded or arbitrarily broken. Our method provides a principled framework for incorporating ties into preference-based alignment of language models.
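The paper's exact objective is not reproduced here, but one standard way to extend the Bradley–Terry model underlying DPO to ties is the Rao–Kupper model, which adds a tie parameter `theta > 1`. The sketch below illustrates the general idea under that assumption; `margin` stands for the scaled implicit-reward difference that appears in the DPO loss, and the function names are illustrative, not the paper's API.

```python
import math

def rao_kupper_probs(margin, theta=1.5):
    """Win/tie/loss probabilities under the Rao-Kupper tie model.

    margin: beta * (implicit reward of chosen - implicit reward of rejected),
            i.e. the scaled log-ratio difference from the DPO objective.
    theta:  tie parameter (> 1); larger theta assigns more mass to ties.
    """
    s_w = math.exp(margin)  # relative strength of the preferred response
    s_l = 1.0               # strength of the other response (normalized)
    p_win = s_w / (s_w + theta * s_l)
    p_loss = s_l / (s_l + theta * s_w)
    p_tie = 1.0 - p_win - p_loss  # equals s_w*s_l*(theta**2 - 1)/denominator
    return p_win, p_tie, p_loss

def tie_aware_loss(margin, label, theta=1.5):
    """Negative log-likelihood for one annotated pair.

    label: "win" if the first response was strictly preferred, "tie" otherwise.
    As theta -> 1 the tie probability vanishes and the "win" branch recovers
    the standard DPO logistic loss, -log(sigmoid(margin)).
    """
    p_win, p_tie, _ = rao_kupper_probs(margin, theta)
    return -math.log(p_win if label == "win" else p_tie)
```

A tie-labeled pair thus contributes its own likelihood term rather than being dropped: at `margin = 0` the model is maximally uncertain, and the win and loss probabilities are equal with the remaining mass on a tie.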
Recommended citation: J. Chen, G. Yang, W. Lin, J. Mei, B. Byrne. "On Extending Direct Preference Optimization to Accommodate Ties." NeurIPS 2025.
