Reinforcement learning from human feedback
🌐 Entity

Reinforcement learning from human feedback

Machine learning technique


πŸ“Œ Topics

  • AI Technology (1)
  • Machine Learning (1)
  • Generative Models (1)

🏷️ Keywords

Curriculum-DPO (1) Β· Text-to-image generation (1) Β· Direct Preference Optimization (1) Β· Reinforcement learning from human feedback (1) Β· Preference optimization (1) Β· AI alignment (1) Β· Generative AI (1)

πŸ“– Key Information

In machine learning, reinforcement learning from human feedback (RLHF) is a technique for aligning an intelligent agent with human preferences. It involves training a reward model to represent those preferences, which can then be used to train other models through reinforcement learning. In classical reinforcement learning, an agent's goal is to learn a policy, a function that guides its behavior; in RLHF, the learned reward model supplies the reward signal used to optimize that policy.
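The reward-model step described above can be sketched in a toy form. The following is a minimal illustration, not any production RLHF implementation: responses are represented as fixed feature vectors, the reward model is linear, and human feedback arrives as (preferred, rejected) pairs fit with the standard Bradley-Terry pairwise loss. All names and the synthetic data are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

def reward(w, x):
    """Scalar reward for a response's feature vector x (linear model, illustrative)."""
    return w @ x

def preference_loss(w, preferred, rejected):
    """Bradley-Terry negative log-likelihood for one preference pair:
    -log sigmoid(r(preferred) - r(rejected))."""
    margin = reward(w, preferred) - reward(w, rejected)
    return np.log1p(np.exp(-margin))

def train_reward_model(pairs, dim, lr=0.1, steps=200):
    """Fit w by gradient descent on the mean pairwise preference loss."""
    w = np.zeros(dim)
    for _ in range(steps):
        grad = np.zeros(dim)
        for preferred, rejected in pairs:
            margin = reward(w, preferred) - reward(w, rejected)
            # d/dw of -log sigmoid(margin) = -sigmoid(-margin) * (preferred - rejected)
            grad -= (1.0 / (1.0 + np.exp(margin))) * (preferred - rejected)
        w -= lr * grad / len(pairs)
    return w

# Synthetic preferences generated from a hidden "true" reward direction
# (a stand-in for the human labeler).
true_w = np.array([1.0, -0.5, 0.25])
pairs = []
for _ in range(100):
    a, b = rng.normal(size=3), rng.normal(size=3)
    pairs.append((a, b) if true_w @ a >= true_w @ b else (b, a))

w = train_reward_model(pairs, dim=3)
# Fraction of training pairs where the learned model ranks the
# preferred response above the rejected one.
accuracy = np.mean([reward(w, p) > reward(w, r) for p, r in pairs])
```

In a full RLHF pipeline this learned reward function would then serve as the training signal for optimizing the agent's policy, typically with a policy-gradient method such as PPO.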

πŸ“° Related News (1)

πŸ”— Entity Intersection Graph

Entities frequently mentioned alongside Reinforcement learning from human feedback:

  • Generative artificial intelligence (1)
  • AI alignment (1)
