In this article, the authors propose a novel approach to compressing deep convolutional neural networks (CNNs) designed for mobile applications. The key idea is to treat the one-hot shift kernel vector as an on/off switch for its associated ci-co rank-1 term, where ci and co denote the input and output channel dimensions. Pruning such a term reduces both the feature map size and the representation capacity.
The authors first present a novel view of CNNs, called the CPD (canonical polyadic decomposition) view, which allows for a more intuitive understanding of the network’s architecture. In this view, the contraction/convolution process is divided into three stages: 1) a PW (pointwise) convolution along ci that generates as many slices as the CPD rank rcp, 2) a DW (depthwise) convolution applied to each of these slices, and 3) a second PW (pointwise) convolution that maps the slices to the desired co.
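The three stages above can be sketched in NumPy. This is a minimal illustration and not the authors' implementation: it assumes PW is a 1x1 (pointwise) convolution and DW is a per-slice (depthwise) k x k convolution, uses stride 1 with no padding, and the factor names A, D, B and the helper functions are hypothetical. It checks that running PW, DW, PW in sequence gives the same output as a dense convolution whose kernel is the sum of rcp rank-1 terms.

```python
import numpy as np

def conv2d_valid(x, kernel):
    """Dense 'valid' cross-correlation. x: (ci, H, W), kernel: (co, ci, k, k)."""
    co, ci, k, _ = kernel.shape
    H, W = x.shape[1:]
    out = np.zeros((co, H - k + 1, W - k + 1))
    for i in range(k):
        for j in range(k):
            patch = x[:, i:i + H - k + 1, j:j + W - k + 1]
            out += np.einsum("oc,chw->ohw", kernel[:, :, i, j], patch)
    return out

def pdp_forward(x, A, D, B):
    """PW (ci -> rank), DW (k x k per slice), PW (rank -> co)."""
    rank, k, _ = D.shape
    H, W = x.shape[1:]
    z = np.einsum("cr,chw->rhw", A, x)            # stage 1: pointwise along ci
    y = np.zeros((rank, H - k + 1, W - k + 1))
    for i in range(k):                            # stage 2: depthwise per slice
        for j in range(k):
            y += D[:, i, j, None, None] * z[:, i:i + H - k + 1, j:j + W - k + 1]
    return np.einsum("or,rhw->ohw", B, y)         # stage 3: pointwise to co

# Hypothetical sizes chosen only for illustration.
rng = np.random.default_rng(0)
ci, co, k, rank = 4, 6, 3, 5
A = rng.standard_normal((ci, rank))     # ci factor
D = rng.standard_normal((rank, k, k))   # spatial (kernel) factor
B = rng.standard_normal((co, rank))     # co factor
x = rng.standard_normal((ci, 8, 8))

# Dense kernel rebuilt as a sum of `rank` rank-1 terms.
full_kernel = np.einsum("cr,rij,or->ocij", A, D, B)
dense_out = conv2d_valid(x, full_kernel)
pdp_out = pdp_forward(x, A, D, B)
max_abs_diff = float(np.abs(dense_out - pdp_out).max())
```

Under these assumptions, dropping one rank-1 term corresponds to deleting the same index from A, D, and B, which also removes one intermediate slice from the feature map between the two PW stages.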
The authors then propose a new compression technique called PDP (PW+DW+PW), which reduces the number of weight parameters of a convolutional layer from |ci||co|k^2 to (|ci| + |co| + k^2)rcp, where k is the spatial kernel size and rcp is the CPD rank, i.e., the number of rank-1 CPD terms. The rank can be tuned to trade off complexity against representation capacity.
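To make the parameter count concrete, the short sketch below compares a dense k x k convolution with the PW+DW+PW factorization at a few ranks. The layer sizes and function names are hypothetical and chosen only for illustration; biases and normalization parameters are ignored.

```python
def dense_conv_params(ci, co, k):
    """Weights in a standard k x k convolution: ci * co * k^2."""
    return ci * co * k * k

def pdp_params(ci, co, k, rank):
    """Weights in PW + DW + PW: (ci + co + k^2) * rank."""
    return (ci + co + k * k) * rank

# Hypothetical layer: 64 input channels, 128 output channels, 3 x 3 kernel.
ci, co, k = 64, 128, 3
dense = dense_conv_params(ci, co, k)

for rank in (16, 64, 128):
    pdp = pdp_params(ci, co, k, rank)
    print(f"rank={rank:4d}  PDP={pdp:6d}  dense={dense:6d}  ratio={pdp / dense:.3f}")
```

The count grows linearly with the rank, so the rank acts as a single knob between a very small layer and one approaching the dense layer's size.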
To evaluate the effectiveness of their approach, the authors conduct experiments on various datasets using a standard bottleneck layer (SBL) and an inverted bottleneck layer (IBL). They show that PDP significantly reduces the number of weight parameters while maintaining accuracy. Furthermore, they demonstrate that their approach can be used to prune shift groups in the network, which leads to additional compression gains without affecting accuracy.
In conclusion, this article presents a novel and effective approach to compressing deep CNNs for mobile applications. By treating the one-hot shift kernel vector as an on/off switch for its associated ci-co rank-1 term, the authors significantly reduce the number of weight parameters while maintaining accuracy. Their proposed PDP compression technique offers a promising way to reduce the computational cost and memory usage of CNNs on mobile devices.
Computer Science, Computer Vision and Pattern Recognition