November 21, 2025
Paper (Preprint, Research Paper, Review Paper, White Paper, etc.)
Title
Reinforcement Learning for Large Language Model Fine-Tuning: A Systematic Literature Review
Abstract
Large Language Models (LLMs) have been developed for a wide range of language-based tasks, while Reinforcement Learning (RL) has been primarily applied to decision-making problems such as robotics, game theory, and control systems. Nowadays, these two paradigms are integrated through different synergies. In this literature review, we focus on RL4LLM fine-tuning, where RL techniques are systematically leveraged to fine-tune LLMs and align them with various preferences. Our review provides a comprehensive analysis of 230 recent publications, presenting a methodological taxonomy that organizes current research into three primary method domains: Optimization Algorithm, concerning innovation in core RL update rules; Training Framework, regarding innovation in the orchestration of the training process; and Reward Modeling, addressing how LLMs learn and represent preferences and feedback. Within these primary domains, we further analyze methods and innovations through more granular categories to provide an in-depth summary of RL4LLM fine-tuning research. We address three research questions: 1) an overview of recent methods, 2) methodological innovations, and 3) limitations and future work. Our analysis comprehensively demonstrates the breadth and impact of recent RL4LLM fine-tuning research while highlighting valuable directions for future investigation.
Author(s)
Funder
Deutsche Forschungsgemeinschaft -DFG-, Bonn
Open Access
File(s)
Rights
CC BY-SA 4.0: Creative Commons Attribution-ShareAlike
Language
English