In this article, we present a novel approach that fuses part and semantic information for robust robotic grasping. The proposed scheme, shown in Figure 4, consists of two steps: semantic-wise fusion and part-wise fusion. The former evaluates an agreement function between the part and semantic predictions to produce "enhanced" logits for each semantic class, while the latter combines each part class with the logits of its associated semantic class to produce "enhanced" logits for each part class. The enhanced logits are then passed to the panoptic fusion model to generate the final part labels.
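To make the two steps concrete, the sketch below gives a minimal NumPy realization. The function names semantic_wise_fusion and part_wise_fusion, the tensor shapes, the part_to_sem mapping, the max-based agreement signal, and the additive combination rule are all simplifying assumptions for exposition rather than a definitive implementation; the rescaled sigmoid it relies on is described next.

```python
import numpy as np

def scaled_sigmoid(x):
    # Logistic sigmoid rescaled from (0, 1) to (-1, 1);
    # note that 2*sigmoid(x) - 1 = tanh(x / 2).
    return 2.0 / (1.0 + np.exp(-x)) - 1.0

def semantic_wise_fusion(sem_logits, part_logits, part_to_sem):
    """Enhance each semantic class's logits with the agreement of its parts.

    sem_logits:  (S, H, W) per-pixel semantic logits
    part_logits: (P, H, W) per-pixel part logits
    part_to_sem: length-P array mapping each part class to its semantic class
    """
    enhanced = sem_logits.copy()
    for s in range(sem_logits.shape[0]):
        parts = np.flatnonzero(part_to_sem == s)
        if parts.size == 0:
            continue  # semantic classes without parts keep their raw logits
        # Agreement in (-1, 1): positive wherever some part of class s is likely.
        agreement = scaled_sigmoid(part_logits[parts].max(axis=0))
        enhanced[s] = sem_logits[s] + agreement  # one simple combination rule
    return enhanced

def part_wise_fusion(sem_logits, part_logits, part_to_sem):
    """Enhance each part class's logits with its associated semantic logits."""
    enhanced = part_logits.copy()
    for p in range(part_logits.shape[0]):
        agreement = scaled_sigmoid(sem_logits[part_to_sem[p]])
        enhanced[p] = part_logits[p] + agreement
    return enhanced
```

The enhanced part logits would then be handed to the panoptic fusion stage, for instance by taking a per-pixel argmax over part classes.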
To fuse the two modalities, we combine the part and semantic logits through a sigmoid function rescaled from (0, 1) to the range [-1, 1]. Mapping the agreement onto [-1, 1] lets one modality either amplify (positive agreement) or suppress (negative agreement) the logits of the other, so mutually consistent predictions are reinforced while contradictory ones are attenuated. Combining the modalities in this way makes the approach more resilient to variations in object shape, size, and material properties, leading to more robust grasping performance.
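The effect of the rescaling is easy to verify numerically. The standalone snippet below (an illustration under the same assumptions as the sketch above, not our released code) shows that strongly negative logits map to agreement near -1 and strongly positive logits to agreement near +1, while a logit of zero leaves the other modality untouched under the additive rule sketched earlier.

```python
import numpy as np

def scaled_sigmoid(x):
    # Standard logistic sigmoid mapped from (0, 1) onto (-1, 1).
    return 2.0 / (1.0 + np.exp(-x)) - 1.0

for logit in (-4.0, -1.0, 0.0, 1.0, 4.0):
    print(f"{logit:+.1f} -> {scaled_sigmoid(logit):+.3f}")
# -4.0 -> -0.964, -1.0 -> -0.462, +0.0 -> +0.000, +1.0 -> +0.462, +4.0 -> +0.964
```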
Our proposed scheme improves upon existing methods by capturing both the structure of an object and its semantics. Considering part and semantic information jointly makes it easier to separate the individual parts of an object and to resolve their semantic labels, which in turn improves object recognition and grasp selection and makes the approach suitable for real-world robotic applications.
In conclusion, this article has presented a novel approach that fuses part and semantic information for robust robotic grasping. By combining the two modalities through a sigmoid rescaled to [-1, 1], the method produces enhanced logits that better withstand variations in object shape, size, and material properties, improving both object recognition and grasping performance. These qualities are essential for real-world robotic deployment, where robustness and adaptability determine successful operation.