Learning Rewards, Not Labels: Adversarial Inverse Reinforcement Learning for Machinery Fault Detection

#Reinforcement Learning #Machinery Fault Detection #Adversarial Inverse Reinforcement Learning #Anomaly Detection #Industrial Diagnostics #Sequential Decision-Making #Data-Driven Approaches

📌 Key Takeaways

  • Researchers developed a novel approach using Adversarial Inverse Reinforcement Learning for machinery fault detection
  • The method learns reward dynamics directly from healthy operational sequences, bypassing manual reward engineering
  • The framework trains a discriminator to distinguish between normal and abnormal transitions
  • The approach was evaluated on three run-to-failure benchmark datasets (HUMS2023, IMS, and XJTU-SY), where it reliably separated normal from faulty samples
  • This work bridges the gap between RL's sequential decision-making and MFD's temporal structure

📖 Full Retelling

In a paper submitted to arXiv on February 25, 2026, researchers Dhiraj Neupane, Richard Dazeley, Mohamed Reda Bouadjenek, and Sunil Aryal introduced a novel approach to machinery fault detection (MFD) based on Adversarial Inverse Reinforcement Learning (AIRL). The work addresses a limitation of existing RL-based methods, which often treat MFD as a simple guessing game (a contextual-bandit problem) and therefore fail to exploit reinforcement learning's sequential decision-making strengths for industrial diagnostics.

The team formulated MFD as an offline inverse reinforcement learning problem in which an agent learns reward dynamics directly from healthy operational sequences. This bypasses two traditional barriers in machine diagnostics: manual reward engineering and the need for fault labels. The framework trains a discriminator to distinguish normal transitions from policy-generated ones, and the discriminator's learned reward then serves as an anomaly score that flags deviations from normal operating behaviour.

When evaluated on three run-to-failure benchmark datasets (HUMS2023, IMS, and XJTU-SY), the model consistently assigned low anomaly scores to normal samples and high scores to faulty ones, enabling early and robust fault detection.
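The scoring idea above can be sketched with a toy stand-in: a logistic discriminator is trained to separate synthetic "healthy" transitions (small drift between consecutive states) from larger, uncorrelated jumps, and its negated logit acts as the anomaly score, so deviations from normal behaviour score high. The synthetic data, features, and linear discriminator here are illustrative assumptions, not the authors' implementation:

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic "healthy" transitions: next state stays close to the current one.
def healthy_transitions(n, dim=4):
    s = rng.normal(0, 1, (n, dim))
    s_next = s + rng.normal(0, 0.05, (n, dim))
    return np.hstack([s, s_next])

# Stand-in for "policy-generated" transitions: large, uncorrelated jumps.
def abnormal_transitions(n, dim=4):
    s = rng.normal(0, 1, (n, dim))
    s_next = rng.normal(0, 1, (n, dim))
    return np.hstack([s, s_next])

def features(x):
    # State-difference features let a linear model separate the two classes.
    dim = x.shape[1] // 2
    diff = x[:, dim:] - x[:, :dim]
    return np.hstack([diff, diff**2, np.ones((len(x), 1))])

# Train a logistic discriminator D(s, s'): healthy = 1, policy-generated = 0.
X = np.vstack([features(healthy_transitions(500)), features(abnormal_transitions(500))])
y = np.concatenate([np.ones(500), np.zeros(500)])
w = np.zeros(X.shape[1])
for _ in range(2000):
    p = 1 / (1 + np.exp(-np.clip(X @ w, -30, 30)))  # clipped sigmoid
    w += 0.1 * X.T @ (y - p) / len(y)               # gradient ascent step

def anomaly_score(x):
    # The discriminator logit plays the role of the learned reward;
    # negate it so deviations from normal behaviour score HIGH.
    return -(features(x) @ w)

normal = anomaly_score(healthy_transitions(200))
faulty = anomaly_score(abnormal_transitions(200))
```

In this sketch, normal transitions receive low scores and abnormal ones high scores, mirroring how the paper thresholds the learned reward for early fault detection.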

🏷️ Themes

Machine Learning, Industrial Diagnostics, Reinforcement Learning

Original Source

Computer Science > Machine Learning — arXiv:2602.22297 [cs.LG] (submitted 25 Feb 2026)

Title: Learning Rewards, Not Labels: Adversarial Inverse Reinforcement Learning for Machinery Fault Detection
Authors: Dhiraj Neupane, Richard Dazeley, Mohamed Reda Bouadjenek, Sunil Aryal

Abstract: Reinforcement learning offers significant promise for machinery fault detection (MFD). However, most existing RL-based MFD approaches do not fully exploit RL's sequential decision-making strengths, often treating MFD as a simple guessing game (contextual bandits). To bridge this gap, we formulate MFD as an offline inverse reinforcement learning problem, where the agent learns the reward dynamics directly from healthy operational sequences, thereby bypassing the need for manual reward engineering and fault labels. Our framework employs Adversarial Inverse Reinforcement Learning to train a discriminator that distinguishes between normal and policy-generated transitions. The discriminator's learned reward serves as an anomaly score, indicating deviations from normal operating behaviour. When evaluated on three run-to-failure benchmark datasets (HUMS2023, IMS, and XJTU-SY), the model consistently assigns low anomaly scores to normal samples and high scores to faulty ones, enabling early and robust fault detection. By aligning RL's sequential reasoning with MFD's temporal structure, this work opens a path toward RL-based diagnostics in data-driven industrial settings.

Comments: Accepted for publication at AAMAS 2026. The DOI is listed below, but production is still underway as of 26/02/2026.
Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI)
Cite as: arXiv:2602.22297 [cs.LG] (or arXiv:2602.22297v1 [cs.LG] for this version)
DOI: https://doi.org/10.48550/arXiv.2602.22297

Source

arxiv.org
