Learning pairwise neural network encoder for depth image-based 3D model retrieval

Jing Zhu, Fan Zhu, Edward K. Wong, Yi Fang

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

With the emergence of RGB-D cameras (e.g., Kinect), the sensing capability of artificial intelligence systems has increased dramatically, and as a consequence, a wide range of depth image-based human-machine interaction applications have been proposed. In the design industry, a 3D model contains abundant information, which is required for manufacture. Since depth images can be conveniently acquired, a retrieval system that returns 3D models based on depth image inputs can assist or improve the traditional product design process. In this work, we address the depth image-based 3D model retrieval problem. By extending the neural network to a neural network pair with identical output layers for objects of the same category, unified domain-invariant representations can be learned from the mismatched low-level depth image features and 3D model features. A unique advantage of the framework is that correspondence information between depth images and 3D models is not required, so it can easily be generalized to large-scale databases. To evaluate the effectiveness of our approach, depth images (with Kinect-type noise) in the NYU Depth V2 dataset are used as queries to retrieve 3D models of the same categories in the SHREC 2014 dataset. Experimental results suggest that our approach can outperform both state-of-the-art methods and the paradigm that directly uses the original representations of depth images and 3D models for retrieval.
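The idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the single-layer sigmoid encoders, the one-hot category targets, the squared-error loss, the toy feature vectors, and every name below (`Encoder`, `train_pair`, `retrieve`) are assumptions made for illustration. Two encoders with different input sizes are trained toward the same target code for each category, so no depth-image-to-model correspondence is needed, and retrieval then ranks 3D models by distance to the query in the shared code space.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


class Encoder:
    """Single-layer network (an assumption): code = sigmoid(W x + b)."""

    def __init__(self, in_dim, code_dim, rng):
        self.w = [[rng.uniform(-0.1, 0.1) for _ in range(in_dim)]
                  for _ in range(code_dim)]
        self.b = [0.0] * code_dim

    def encode(self, x):
        return [sigmoid(sum(wi * xi for wi, xi in zip(row, x)) + b)
                for row, b in zip(self.w, self.b)]

    def train_step(self, x, target, lr):
        # One gradient step on the loss 0.5 * ||encode(x) - target||^2.
        out = self.encode(x)
        for k, (o, t) in enumerate(zip(out, target)):
            grad = (o - t) * o * (1.0 - o)
            for j, xj in enumerate(x):
                self.w[k][j] -= lr * grad * xj
            self.b[k] -= lr * grad


def train_pair(depth_data, model_data, num_classes, epochs=500, lr=0.5, seed=0):
    """Train both encoders toward the same per-category target code."""
    rng = random.Random(seed)
    # Hypothetical choice of shared targets: one-hot code per category.
    targets = [[1.0 if i == c else 0.0 for i in range(num_classes)]
               for c in range(num_classes)]
    depth_enc = Encoder(len(depth_data[0][0]), num_classes, rng)
    model_enc = Encoder(len(model_data[0][0]), num_classes, rng)
    for _ in range(epochs):
        # Each sample is paired only with its category target,
        # never with a specific sample from the other domain.
        for x, c in depth_data:
            depth_enc.train_step(x, targets[c], lr)
        for x, c in model_data:
            model_enc.train_step(x, targets[c], lr)
    return depth_enc, model_enc


def retrieve(query, depth_enc, model_enc, model_data):
    """Return model indices sorted by squared distance in code space."""
    q = depth_enc.encode(query)

    def dist(i):
        code = model_enc.encode(model_data[i][0])
        return sum((a - b) ** 2 for a, b in zip(q, code))

    return sorted(range(len(model_data)), key=dist)


# Toy (features, category) pairs; the two domains have different dimensions.
depth_data = [([1.0, 0.0, 0.2], 0), ([0.9, 0.1, 0.0], 0),
              ([0.0, 1.0, 0.8], 1), ([0.1, 0.9, 1.0], 1)]
model_data = [([0.0, 1.0], 0), ([0.1, 0.8], 0),
              ([1.0, 0.0], 1), ([0.9, 0.2], 1)]

depth_enc, model_enc = train_pair(depth_data, model_data, num_classes=2)
ranking = retrieve([0.95, 0.05, 0.1], depth_enc, model_enc, model_data)
print(ranking)
```

In this toy setup the query resembles the category-0 depth features, so the category-0 models should rank first. Real depth image and 3D model descriptors, deeper networks, and the datasets named in the abstract are outside the scope of the sketch.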

Original language: English (US)
Title of host publication: MM 2015 - Proceedings of the 2015 ACM Multimedia Conference
Publisher: Association for Computing Machinery, Inc
Pages: 1227-1230
Number of pages: 4
ISBN (Electronic): 9781450334594
DOIs: https://doi.org/10.1145/2733373.2806323
State: Published - Oct 13 2015
Event: 23rd ACM International Conference on Multimedia, MM 2015 - Brisbane, Australia
Duration: Oct 26 2015 to Oct 30 2015

Publication series

Name: MM 2015 - Proceedings of the 2015 ACM Multimedia Conference

Other

Other: 23rd ACM International Conference on Multimedia, MM 2015
Country: Australia
City: Brisbane
Period: 10/26/15 to 10/30/15

Keywords

  • Cross-Domain
  • Depth Image
  • Neural Network
  • Retrieval

ASJC Scopus subject areas

  • Media Technology
  • Computer Graphics and Computer-Aided Design
  • Computer Vision and Pattern Recognition
  • Software

Cite this

Zhu, J., Zhu, F., Wong, E. K., & Fang, Y. (2015). Learning pairwise neural network encoder for depth image-based 3D model retrieval. In MM 2015 - Proceedings of the 2015 ACM Multimedia Conference (pp. 1227-1230). (MM 2015 - Proceedings of the 2015 ACM Multimedia Conference). Association for Computing Machinery, Inc. https://doi.org/10.1145/2733373.2806323