Точка Синхронізації (Synchronization Point)

AI Archive of Human History

🌐 Entity: Reinforcement learning from human feedback

Machine learning technique

📌 Topics

  • Artificial Intelligence (4)
  • Machine Learning (3)
  • Computer Vision (1)
  • Ethics (1)
  • Digital Sovereignty (1)
  • Linguistics (1)
  • Reinforcement Learning (1)

🏷️ Keywords

RLHF (4) · arXiv (3) · Generative AI (2) · Reward Modeling (1) · Chain-of-Thought (1) · Image Editing (1) · Semantic Consistency (1) · Reinforcement Learning (1) · AI Alignment (1) · Sycophancy (1) · Ground Truth (1) · Dogma 4 (1) · compar:IA (1) · Large Language Models (1) · French government (1) · AI alignment (1) · Direct Preference Optimization (1) · Dataset (1) · AEGPO (1) · Diffusion Models (1)

📖 Key Information

In machine learning, reinforcement learning from human feedback (RLHF) is a technique to align an intelligent agent with human preferences. It involves training a reward model to represent preferences, which can then be used to train other models through reinforcement learning. In classical reinforcement learning, an intelligent agent's goal is to learn a function that guides its behavior, called a policy.
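As a rough illustration of the reward-modeling step described above, the sketch below trains a pairwise (Bradley-Terry style) reward model on preference data. The tiny MLP, the synthetic "chosen"/"rejected" feature vectors, and the hyperparameters are illustrative assumptions, not details taken from this page; in practice the reward model is typically a language-model head scoring whole responses.

```python
# Minimal sketch of RLHF reward modeling (illustrative assumptions only).
import torch
import torch.nn as nn

class RewardModel(nn.Module):
    """Scores a fixed-size response feature vector with a scalar reward."""
    def __init__(self, feature_dim: int = 32):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(feature_dim, 64),
            nn.ReLU(),
            nn.Linear(64, 1),  # one scalar reward per response
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.net(x).squeeze(-1)

def preference_loss(r_chosen: torch.Tensor, r_rejected: torch.Tensor) -> torch.Tensor:
    # Bradley-Terry style pairwise loss: push the preferred response's
    # reward above the rejected response's reward.
    return -torch.nn.functional.logsigmoid(r_chosen - r_rejected).mean()

if __name__ == "__main__":
    torch.manual_seed(0)
    model = RewardModel()
    optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

    # Synthetic stand-in for human preference data: each pair records that
    # an annotator preferred `chosen` over `rejected`.
    chosen = torch.randn(256, 32) + 0.5
    rejected = torch.randn(256, 32) - 0.5

    for step in range(200):
        loss = preference_loss(model(chosen), model(rejected))
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()

    print(f"final pairwise loss: {loss.item():.4f}")
```

Once trained, a reward model of this kind supplies the scalar reward signal that a reinforcement-learning algorithm (commonly PPO in RLHF pipelines) maximizes when fine-tuning the agent's policy.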

📰 Related News (4)
