Structured Knowledge Distillation for Efficient Image Segmentation

In this paper, we propose a framework for efficient image segmentation using a lightweight student network trained by full-stage knowledge distillation from a teacher network. The goal is a highly efficient SAM (Segment Anything Model) with minimal computational cost and memory usage that still maintains high accuracy.
To achieve this, we introduce an online hard prompt sampling method that mines hard knowledge from the teacher network and transfers it to the student network. This focuses the distillation process on the prompts the student handles worst, so that the student learns the most important features from the teacher.
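The idea of sampling hard prompts can be illustrated with a minimal sketch. It assumes that "hard" means the student's mask disagrees strongly with the teacher's mask for a given prompt, which is one plausible reading and not necessarily the exact procedure used in the paper. The names `mask_disagreement` and `select_hard_prompts`, the flat 0/1 mask representation, and the toy data are all hypothetical.

```python
def mask_disagreement(teacher_mask, student_mask):
    """Fraction of pixels where two binary masks (flat 0/1 lists) differ."""
    if len(teacher_mask) != len(student_mask):
        raise ValueError("masks must have the same number of pixels")
    differing = sum(t != s for t, s in zip(teacher_mask, student_mask))
    return differing / len(teacher_mask)


def select_hard_prompts(prompts, teacher_masks, student_masks, k):
    """Return the k prompts whose student mask disagrees most with the teacher.

    Assumption: disagreement with the teacher is used as the hardness score.
    """
    scored = [
        (mask_disagreement(t, s), index)
        for index, (t, s) in enumerate(zip(teacher_masks, student_masks))
    ]
    # Sort by descending disagreement; ties keep the earlier prompt first.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [prompts[index] for _, index in scored[:k]]


# Toy data: three point prompts and the masks each network predicts for them.
prompts = [(10, 12), (40, 8), (25, 30)]
teacher_masks = [[1, 1, 0, 0], [1, 0, 0, 0], [1, 1, 1, 0]]
student_masks = [[1, 1, 0, 0], [0, 1, 0, 0], [1, 1, 0, 0]]

hard_prompts = select_hard_prompts(prompts, teacher_masks, student_masks, k=2)
print(hard_prompts)
```

In an online setting, a selection step like this would be repeated during training so that each distillation batch contains the prompts on which the student currently lags the teacher most.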
We also adapt a post-training quantization method to the segmentation task, which reduces the precision of the weights and activations in the student network while maintaining accuracy. This further reduces the computational cost and memory usage of the student network.
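To make the precision reduction concrete, here is a minimal sketch of generic symmetric uniform quantization to 8-bit integers with a single per-tensor scale. This is a textbook scheme chosen for illustration, not the adapted method from the paper; the function names and the example weights are assumptions.

```python
def quantize_symmetric(values, num_bits=8):
    """Map floats to signed integers using one per-tensor scale."""
    qmax = 2 ** (num_bits - 1) - 1  # 127 for 8 bits
    max_abs = max(abs(v) for v in values)
    scale = max_abs / qmax if max_abs > 0 else 1.0
    # Round to the nearest integer level and clamp to the representable range.
    quantized = [max(-qmax, min(qmax, round(v / scale))) for v in values]
    return quantized, scale


def dequantize(quantized, scale):
    """Recover approximate float values from the integer levels."""
    return [q * scale for q in quantized]


# Hypothetical weights of one layer, quantized after training.
weights = [0.5, -1.27, 0.003, 1.0]
q_weights, scale = quantize_symmetric(weights)
restored = dequantize(q_weights, scale)

# The rounding error per value is bounded by half a quantization step.
max_error = max(abs(w - r) for w, r in zip(weights, restored))
print(q_weights, max_error)
```

Storing `q_weights` as 8-bit integers instead of 32-bit floats cuts weight memory by roughly a factor of four, at the cost of the bounded rounding error computed above.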
Finally, we propose a hierarchical everything inference mode that lets the lightweight student network avoid redundant computation by segmenting only the objects that are actually present in the image. This significantly speeds up inference without compromising accuracy.
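One way such redundant computation could be avoided is a two-pass scheme: prompt a sparse grid first, accept the confident masks, then prompt the dense grid only at points not yet covered by an accepted mask. The sketch below assumes this scheme; the summary above does not spell out the actual procedure. `predict_mask` is a hypothetical stand-in for the student model, and the grid sizes and confidence threshold are illustrative values.

```python
def make_grid(points_per_side, size):
    """Evenly spaced (x, y) prompt points over a size x size image."""
    step = size / points_per_side
    coords = [int(step * (i + 0.5)) for i in range(points_per_side)]
    return [(x, y) for y in coords for x in coords]


def hierarchical_everything(predict_mask, size, dense_side=8, sparse_side=2,
                            confidence_threshold=0.9):
    """Two-pass everything mode; returns (masks, number_of_model_calls).

    predict_mask(point) is a hypothetical model call that must return
    (set_of_pixels, confidence).
    """
    masks, covered, calls = [], set(), 0
    # Pass 1: sparse grid; only confident, not-yet-covered masks are accepted.
    for point in make_grid(sparse_side, size):
        pixels, confidence = predict_mask(point)
        calls += 1
        if confidence >= confidence_threshold and not pixels <= covered:
            masks.append(pixels)
            covered |= pixels
    # Pass 2: dense grid, skipping points already explained by a mask.
    for point in make_grid(dense_side, size):
        if point in covered:
            continue
        pixels, confidence = predict_mask(point)
        calls += 1
        if not pixels <= covered:
            masks.append(pixels)
            covered |= pixels
    return masks, calls


def toy_predict_mask(point):
    """Toy model: a confident left object and a less confident right object."""
    x, _ = point
    if x < 8:
        return {(cx, cy) for cx in range(0, 8) for cy in range(16)}, 0.95
    return {(cx, cy) for cx in range(8, 16) for cy in range(16)}, 0.6


masks, calls = hierarchical_everything(toy_predict_mask, size=16)
print(len(masks), calls)
```

In this toy run both objects are found with 5 model calls, whereas prompting every point of the 8 x 8 dense grid would take 64 calls. Real images will save less, since the saving depends on how much of the image the confident first-pass masks cover.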
In summary, our proposed framework enables a highly efficient SAM using a lightweight student network trained by full-stage knowledge distillation and compressed by post-training quantization. The hierarchical everything inference mode further accelerates inference without sacrificing accuracy, which makes the framework well suited to real-world applications where computational resources are limited.

arXiv:2312.13789, authored by Han Shu, Wenshuo Li, Yehui Tang, Yiman Zhang, Yihao Chen, Houqiang Li, Yunhe Wang, Xinghao Chen.

LLama 2 7B Chat
