BravenNow
USA | technology | arxiv.org

Conjugate Learning Theory: Uncovering the Mechanisms of Trainability and Generalization in Deep Neural Networks

#practical learnability #conjugate learning theory #convex conjugate duality #empirical risk minimization #mini‑batch SGD #deep neural networks #global optimum #extreme eigenvalues #trainability #generalization

📌 Key Takeaways

  • Definition of practical learnability suited to finite‑sample regimes.
  • Development of a conjugate learning framework based on convex conjugate duality (standard definitions are recalled in the sketch after this list).
  • Analysis of mini‑batch SGD training dynamics for deep neural networks.
  • Proof that SGD can reach global optima of the empirical risk.
  • Joint control of extreme eigenvalues of the empirical loss Hessian as a key mechanism.
  • Implications for understanding trainability and generalization in deep learning.
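
Since the framework rests on convex conjugate duality, a brief refresher may help. These are the standard Legendre–Fenchel definitions, not formulas taken from the paper itself:

```latex
% Convex conjugate (Legendre--Fenchel transform) of a function
% f : \mathbb{R}^n \to \mathbb{R} \cup \{+\infty\}:
f^*(y) \;=\; \sup_{x \in \mathbb{R}^n} \bigl( \langle x, y \rangle - f(x) \bigr)

% Fenchel--Young inequality, valid for all x and y:
f(x) + f^*(y) \;\ge\; \langle x, y \rangle
% with equality exactly when y is a subgradient of f at x.

% Biconjugation: f^{**} = f whenever f is convex, proper, and lower
% semicontinuous, so conjugate (dual) objects fully characterize the
% primal problem -- the property that duality-based frameworks exploit.
```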

📖 Full Retelling

Researchers in machine learning and theoretical computer science have announced a new framework, Conjugate Learning Theory, posted to arXiv in February 2026. The paper introduces a notion of practical learnability tailored to finite-sample settings and builds its theoretical foundation on convex conjugate duality. Within this framework, it shows that training deep neural networks (DNNs) with mini-batch stochastic gradient descent (SGD) can reach global optima of the empirical risk, provided the extreme eigenvalues of the empirical loss Hessian are jointly controlled. The work aims to clarify the mechanisms that make DNNs trainable in practice and able to generalize well.
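
To make the eigenvalue-control claim concrete, here is a minimal sketch of how one could monitor the extreme eigenvalues of the empirical loss Hessian during mini-batch SGD training. This is not the paper's algorithm: the helper names (hessian_vector_product, top_eigenvalue, bottom_eigenvalue) are hypothetical, and the technique shown is standard power iteration on Hessian-vector products in PyTorch.

```python
# Minimal sketch (assumptions noted above, not the paper's method): estimate
# the extreme eigenvalues of the empirical loss Hessian via power iteration
# on Hessian-vector products, without ever materializing the Hessian.
import torch


def hessian_vector_product(loss, params, vec):
    """Return H @ vec, where H is the Hessian of `loss` w.r.t. `params`."""
    grads = torch.autograd.grad(loss, params, create_graph=True)
    flat_grad = torch.cat([g.reshape(-1) for g in grads])
    hv = torch.autograd.grad(flat_grad @ vec, params, retain_graph=True)
    return torch.cat([h.reshape(-1) for h in hv]).detach()


def top_eigenvalue(loss, params, iters=20):
    """Largest-magnitude Hessian eigenvalue, by plain power iteration."""
    n = sum(p.numel() for p in params)
    v = torch.randn(n, device=params[0].device)
    v /= v.norm()
    eig = 0.0
    for _ in range(iters):
        hv = hessian_vector_product(loss, params, v)
        eig = torch.dot(v, hv).item()   # Rayleigh quotient v^T H v
        v = hv / (hv.norm() + 1e-12)
    return eig


def bottom_eigenvalue(loss, params, lam_max, iters=20):
    """Smallest Hessian eigenvalue, via power iteration on lam_max*I - H.

    Assumes lam_max upper-bounds the spectrum (e.g. the output of
    top_eigenvalue when the largest eigenvalue is positive).
    """
    n = sum(p.numel() for p in params)
    v = torch.randn(n, device=params[0].device)
    v /= v.norm()
    shifted = 0.0
    for _ in range(iters):
        hv = hessian_vector_product(loss, params, v)
        sv = lam_max * v - hv           # apply the shifted operator
        shifted = torch.dot(v, sv).item()
        v = sv / (sv.norm() + 1e-12)
    return lam_max - shifted            # map back to an eigenvalue of H
```

In a training loop one would compute the mini-batch loss and call top_eigenvalue(loss, list(model.parameters())) every few hundred steps to track the spectrum; since double backpropagation makes each Hessian-vector product cost roughly two gradient evaluations, this kind of spectral monitoring remains feasible at DNN scale.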

🏷️ Themes

Learning theory, Convex optimization, Deep neural networks, Stochastic gradient descent, Eigenvalue analysis

Original Source
arXiv:2602.16177v1 (Announce Type: cross). Abstract: In this work, we propose a notion of practical learnability grounded in finite sample settings, and develop a conjugate learning theoretical framework based on convex conjugate duality to characterize this learnability property. Building on this foundation, we demonstrate that training deep neural networks (DNNs) with mini-batch stochastic gradient descent (SGD) achieves global optima of empirical risk by jointly controlling the extreme eigenvalues [...]

Source

arxiv.org
