Generative Reward Models (GenRM): A Hybrid Approach to Reinforcement Learning from Human and AI Feedback, Solving Task Generalization and Feedback Collection Challenges MarkTechPost
Recent Comments