2/19/2026 | USA | technology | ✓ Verified - arxiv.org

IT-OSE: Exploring Optimal Sample Size for Industrial Data Augmentation

#data augmentation #optimal sample size #information theory #industrial scenarios #model performance

📌 Key Takeaways

Industrial data augmentation lacks a theoretical basis for determining the optimal sample size (OSS).
No established metric exists for evaluating how accurate an estimated OSS is or how far it deviates from the ground truth.
The authors propose an information-theoretic framework for estimating OSS.
The approach aims to improve model performance while addressing practical limitations in industrial scenarios.

📖 Full Retelling

The paper is available on arXiv as arXiv:2602.15878v1, published in February 2026. It is aimed at researchers and practitioners who work with industrial machine learning systems and rely on data augmentation to boost model performance. The authors point out that augmentation does not always help, and that there is neither theoretical research nor an established way to estimate the optimal sample size (OSS) for augmentation, nor a metric for assessing how far a chosen OSS deviates from the ground truth. In response, they propose an information-theoretic approach for estimating OSS and evaluating its accuracy in industrial contexts.

🏷️ Themes

Data Augmentation, Optimal Sample Size, Information Theory, Industrial Machine Learning

Original Source

arXiv:2602.15878v1 Announce Type: cross
Abstract: In industrial scenarios, data augmentation is an effective approach to improve model performance. However, its benefits are not unidirectionally beneficial. There is no theoretical research or established estimation for the optimal sample size (OSS) in augmentation, nor is there an established metric to evaluate the accuracy of OSS or its deviation from the ground truth. To address these issues, we propose an information-theoretic optimal sample…

Source

arxiv.org
