Category-Level Object Pose Estimation with Statistic Attention.

Changhong Jiang, Xiaoqiao Mu, Bingbing Zhang, Chao Liang, Mujun Xie
Author Information
  1. Changhong Jiang: School of Electrical and Electronic Engineering, Changchun University of Technology, Changchun 130012, China.
  2. Xiaoqiao Mu: School of Mechanical and Electrical Engineering, Changchun University of Technology, Changchun 130012, China.
  3. Bingbing Zhang: School of Computer Science and Engineering, Dalian Minzu University, Dalian 116602, China.
  4. Chao Liang: College of Computer Science and Engineering, Changchun University of Technology, Changchun 130012, China.
  5. Mujun Xie: School of Electrical and Electronic Engineering, Changchun University of Technology, Changchun 130012, China.

Abstract

Six-dimensional object pose estimation is a fundamental problem in computer vision. Recently, category-level object pose estimation methods have made significant breakthroughs thanks to advances in 3D graph convolution (3D-GC). However, current methods often fail to capture long-range dependencies, which are crucial for modeling complex and occluded object shapes; discerning detailed differences between objects is equally essential. Some existing methods use self-attention mechanisms or Transformer encoder-decoder structures to compensate for the lack of long-range dependencies, but they attend only to first-order feature information, failing to exploit richer statistics and neglecting the detailed differences between objects. In this paper, we propose SAPENet, which follows the 3D-GC architecture but replaces the 3D-GC layers in the encoder with HS-layers for feature extraction and incorporates statistical attention to compute higher-order statistical information. In addition, three sub-modules are designed for pose regression, point cloud reconstruction, and bounding box voting. The pose regression module also integrates statistical attention so that higher-order statistics help model geometric relationships and aid regression. Experiments demonstrate that our method achieves outstanding performance, attaining an mAP of 49.5 on the 5°2 cm metric, 3.4 points higher than the baseline model, and reaching state-of-the-art (SOTA) performance on the REAL275 dataset.
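As a concrete illustration of the statistical attention described above, the following Python (PyTorch) sketch shows one way such a module could reweight point features using first- and second-order channel statistics. The specific design (per-channel mean and variance fed to a small MLP) and all names are assumptions made for illustration, not the authors' implementation.

import torch
import torch.nn as nn


class StatisticalAttention(nn.Module):
    """Hypothetical channel attention driven by feature statistics.

    Input:  point features of shape (B, C, N) -- B point clouds,
            C channels, N points per cloud.
    Output: reweighted features of the same shape.
    """

    def __init__(self, channels: int, reduction: int = 4):
        super().__init__()
        # Small MLP maps the concatenated [mean, variance] statistics
        # (2*C values) to one attention weight per channel.
        self.mlp = nn.Sequential(
            nn.Linear(2 * channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
            nn.Sigmoid(),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        mean = x.mean(dim=-1)                    # (B, C) first-order statistic
        var = x.var(dim=-1, unbiased=False)      # (B, C) second-order statistic
        stats = torch.cat([mean, var], dim=-1)   # (B, 2C)
        weights = self.mlp(stats).unsqueeze(-1)  # (B, C, 1) channel weights
        return x * weights                       # reweight point features


if __name__ == "__main__":
    feats = torch.randn(2, 128, 1024)            # toy batch of point features
    attn = StatisticalAttention(channels=128)
    print(attn(feats).shape)                     # torch.Size([2, 128, 1024])

Because the attention weights depend on statistics pooled over all points, every output feature is modulated by global information, which is one simple way to inject the long-range dependencies and higher-order statistics the abstract refers to.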


Grants

  1. 20230201111GX/Science and Technology Development Program Project of Jilin Province
  2. 20230201039GX/Science and Technology Development Program Project of Jilin Province
