Why Do Neural Networks Forget: A Study of Collapse in Continual Learning
#neural networks #catastrophic forgetting #continual learning #collapse #AI adaptation #sequential tasks #model retention
📌 Key Takeaways
- Neural networks experience 'catastrophic forgetting' when learning new tasks sequentially.
- The study identifies 'collapse' as a key mechanism behind this forgetting in continual learning.
- Researchers propose methods to mitigate collapse and improve model retention over time.
- Findings have implications for AI systems that need to adapt to new information without losing old knowledge.
📖 Full Retelling
arXiv:2603.04580v1
Abstract: Catastrophic forgetting is a major problem in continual learning, and many approaches have arisen to mitigate it. However, most are evaluated through task accuracy alone, which ignores the internal structure of the model. Recent research suggests that structural collapse leads to a loss of plasticity, as evidenced by changes in effective rank (eRank). This points to a link with forgetting: networks lose the ability to expand their feature space to learn new tasks, which forces them to overwrite existing representations.
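The abstract measures collapse via effective rank. The excerpt does not spell out the formula, so the sketch below assumes the common entropy-based definition of eRank (the exponential of the Shannon entropy of the normalized singular value distribution):

```python
import numpy as np

def effective_rank(matrix: np.ndarray, eps: float = 1e-12) -> float:
    """Entropy-based effective rank (eRank) of a matrix.

    Computes singular values, normalizes them into a probability
    distribution, and returns exp(Shannon entropy). A full-rank matrix
    with equal singular values has eRank equal to its dimension; a
    rank-1 matrix has eRank close to 1.
    """
    s = np.linalg.svd(matrix, compute_uv=False)
    p = s / (s.sum() + eps)      # normalize singular values to sum to 1
    p = p[p > eps]               # drop numerically-zero entries
    return float(np.exp(-(p * np.log(p)).sum()))

print(effective_rank(np.eye(8)))                     # → 8.0 (all singular values equal)
rng = np.random.default_rng(0)
rank1 = np.outer(rng.standard_normal(64), rng.standard_normal(64))
print(effective_rank(rank1))                         # ≈ 1.0 (one dominant singular value)
```

In the study's framing, a drop in the eRank of weight or activation matrices during sequential training would signal collapse: the representation concentrates into fewer directions, leaving less room for new tasks.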
🏷️ Themes
AI Learning, Memory Retention
Original Source
Computer Science > Machine Learning, arXiv:2603.04580 [Submitted on 4 Mar 2026]
Title: Why Do Neural Networks Forget: A Study of Collapse in Continual Learning
Authors: Yunqin Zhu, Jun Jin
Abstract: Catastrophic forgetting is a major problem in continual learning, and many approaches have arisen to mitigate it. However, most are evaluated through task accuracy alone, which ignores the internal structure of the model. Recent research suggests that structural collapse leads to a loss of plasticity, as evidenced by changes in effective rank (eRank). This points to a link with forgetting: networks lose the ability to expand their feature space to learn new tasks, which forces them to overwrite existing representations. In this study, we therefore investigate the correlation between forgetting and collapse by measuring both weight and activation eRank. Specifically, we evaluate four architectures (MLP, ConvGRU, ResNet-18, and Bi-ConvGRU) on the Split MNIST and Split CIFAR-100 benchmarks, training each model with SGD, Learning without Forgetting (LwF), and Experience Replay (ER). The results demonstrate that forgetting and collapse are strongly related, and that different continual learning strategies preserve capacity and performance with differing efficiency.
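Of the strategies the abstract names, Experience Replay is the most mechanical: keep a small buffer of past examples and mix them into each new batch. The paper's buffer size and sampling scheme are not given in this excerpt, so the following is a generic reservoir-sampling sketch, not the authors' implementation:

```python
import random

class ReplayBuffer:
    """Minimal reservoir-sampling buffer for Experience Replay.

    Reservoir sampling keeps a uniform random sample over every
    example seen so far, even though the buffer has fixed capacity.
    """
    def __init__(self, capacity: int):
        self.capacity = capacity
        self.data = []   # stored (input, label) examples
        self.seen = 0    # total examples observed across all tasks

    def add(self, example) -> None:
        if len(self.data) < self.capacity:
            self.data.append(example)
        else:
            # Replace a stored example with probability capacity / (seen + 1)
            j = random.randint(0, self.seen)
            if j < self.capacity:
                self.data[j] = example
        self.seen += 1

    def sample(self, k: int):
        """Draw up to k stored examples to mix into the current batch."""
        return random.sample(self.data, min(k, len(self.data)))
```

During training on task t, each gradient step would use the current batch plus `buffer.sample(k)`, so the network keeps rehearsing earlier tasks instead of overwriting their representations.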
Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI)
Cite as: arXiv:2603.04580 [cs.LG] (arXiv:2603.04580v1 [cs.LG] for this version), https://doi.org/10.48550/arXiv.2603.04580 (arXiv-issued DOI via DataCite, pending registration)
Submission history: [v1] Wed, 4 Mar 2026 20:19:00 UTC (2,765 KB), from Yunqin Zhu