Catastrophic forgetting is a major challenge in training neural networks: a model's performance on previously learned tasks deteriorates sharply as it is trained on new ones. The problem is particularly pronounced in deep neural networks, which have the capacity to learn new tasks quickly but also demand substantial computational resources to retrain. In this article, we propose a brain-inspired algorithm that mitigates catastrophic forgetting in both artificial and spiking neural networks at low computational cost.
Algorithm
Our proposed algorithm, called HWC (Hierarchical Weight Consolidation), is inspired by the hierarchical organization of the brain's neural circuits. The key insight behind HWC is to consolidate the network's weights at different levels of a hierarchy, allowing the model to retain knowledge of previous tasks while adapting to new ones. Specifically, HWC organizes the network as a hierarchy in which each level represents a different abstraction of the input data. The weights of the lower levels, realized as a long short-term memory (LSTM) network, are consolidated to preserve previously learned knowledge, while the higher levels use a simpler feedforward network that adapts to new tasks.
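To make the two-level structure concrete, the sketch below shows one possible reading of this design in PyTorch: an LSTM lower level whose weights are anchored by a quadratic consolidation penalty, and a feedforward higher level that adapts freely to each new task. The class and method names (HWCSketch, consolidation_loss, snapshot_lower), the quadratic form of the penalty, and all hyperparameters are illustrative assumptions, not the reference implementation of HWC.

```python
# Minimal sketch of the hierarchical-consolidation idea, assuming an
# LSTM lower level and a quadratic penalty toward a weight snapshot.
import torch
import torch.nn as nn


class HWCSketch(nn.Module):
    def __init__(self, input_dim, hidden_dim, num_classes):
        super().__init__()
        # Lower level: recurrent feature extractor whose weights are consolidated.
        self.lower = nn.LSTM(input_dim, hidden_dim, batch_first=True)
        # Higher level: lightweight feedforward head that adapts to each new task.
        self.higher = nn.Sequential(
            nn.Linear(hidden_dim, hidden_dim), nn.ReLU(),
            nn.Linear(hidden_dim, num_classes),
        )
        # Snapshot of lower-level weights taken after the previous task.
        self.anchor = {n: p.detach().clone() for n, p in self.lower.named_parameters()}

    def forward(self, x):
        out, _ = self.lower(x)           # x: (batch, seq_len, input_dim)
        return self.higher(out[:, -1])   # classify from the last time step

    def consolidation_loss(self, strength=1.0):
        # Quadratic penalty keeping lower-level weights near their anchor,
        # so earlier knowledge is retained while the head adapts.
        return strength * sum(
            ((p - self.anchor[n]) ** 2).sum()
            for n, p in self.lower.named_parameters()
        )

    def snapshot_lower(self):
        # Call after finishing a task to update the anchor for the next one.
        self.anchor = {n: p.detach().clone() for n, p in self.lower.named_parameters()}


# Example training step on a new task (random data for illustration):
model = HWCSketch(input_dim=32, hidden_dim=64, num_classes=10)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
x, y = torch.randn(8, 20, 32), torch.randint(0, 10, (8,))
loss = nn.functional.cross_entropy(model(x), y) + model.consolidation_loss(strength=0.1)
loss.backward()
optimizer.step()
```

In this reading, only the lower-level weights pay a consolidation cost, so the cheap feedforward head can specialize per task while the shared representation drifts slowly; the strength parameter trades plasticity against retention.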
Experiments
We evaluate HWC on several benchmark datasets and compare it with other state-of-the-art methods for mitigating catastrophic forgetting. Our results show that HWC achieves performance comparable to or better than these methods while requiring significantly fewer computational resources. We also analyze the energy consumption of our algorithm and find that its implementation draws far less power than implementations on other neuromorphic chips.
Conclusion
In this article, we proposed HWC, a brain-inspired algorithm for mitigating catastrophic forgetting in neural networks. HWC organizes the network hierarchically and consolidates the weights of its lower, LSTM-based levels to retain knowledge of previous tasks while the higher, feedforward levels adapt to new ones. Experiments on several benchmark datasets showed that HWC matches or exceeds the performance of other state-of-the-art methods while requiring fewer computational resources. These results have important implications for developing efficient and effective neural networks across a wide range of applications, including image recognition, natural language processing, and autonomous driving.