Direct multichannel tracking

Carlos Jaramillo, Yuichi Taguchi, Chen Feng

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

We present direct multichannel tracking, an algorithm for tracking the pose of a monocular camera (visual odometry) using high-dimensional features in a direct image alignment framework. Instead of using a single grayscale channel and assuming intensity constancy as in existing approaches, we extract multichannel features at each pixel from each image and assume feature constancy among consecutive images. High-dimensional features are more discriminative and robust to noise and image variations than intensities, enabling more accurate camera tracking. We demonstrate our claim using conventional hand-crafted features such as SIFT as well as more recent features extracted from convolutional neural networks (CNNs) such as Siamese and AlexNet networks. We evaluate the performance of our algorithm against the baseline case (single-channel tracking) using several public datasets, where the AlexNet feature provides the best pose estimation results.
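To make the feature-constancy idea concrete, the toy sketch below aligns two multichannel feature maps by minimizing a sum-of-squared-differences cost over a 1-D pixel shift. This is only an illustrative stand-in for the paper's actual method (which warps pixels under a full camera pose in a direct image alignment framework); the names `alignment_cost`, `f_ref`, and `f_cur` and the 1-DoF translation model are assumptions made for this example, not from the paper.

```python
import numpy as np

def alignment_cost(f_ref, f_cur, dx):
    """Mean squared feature-constancy error over the overlapping region
    when the current frame is shifted by dx pixels along x.
    f_ref, f_cur: (H, W, C) feature maps; C=1 reduces to the
    grayscale intensity-constancy baseline."""
    W = f_ref.shape[1]
    if dx >= 0:
        a, b = f_ref[:, :W - dx], f_cur[:, dx:]
    else:
        a, b = f_ref[:, -dx:], f_cur[:, :W + dx]
    return float(np.mean((a - b) ** 2))

rng = np.random.default_rng(0)
# Toy "feature maps": H x W pixels, C channels per pixel.
H, W, C, true_dx = 16, 64, 8, 3
f_ref = rng.standard_normal((H, W, C))
f_cur = np.zeros_like(f_ref)
f_cur[:, true_dx:] = f_ref[:, :-true_dx]          # camera moved 3 px
f_cur += 0.05 * rng.standard_normal(f_cur.shape)  # sensor noise

# Exhaustive search over shifts; the paper instead uses iterative
# optimization over the 6-DoF camera pose.
costs = {dx: alignment_cost(f_ref, f_cur, dx) for dx in range(-8, 9)}
best = min(costs, key=costs.get)
print(best)
```

In this setup the cost has a sharp minimum at the true shift because each pixel carries C feature values that must all agree, which is the sense in which high-dimensional features are more discriminative than a single intensity channel.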

Original language: English (US)
Title of host publication: Proceedings - 2017 International Conference on 3D Vision, 3DV 2017
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 347-355
Number of pages: 9
ISBN (Electronic): 9781538626108
DOI: 10.1109/3DV.2017.00047
State: Published - May 25 2018
Event: 7th IEEE International Conference on 3D Vision, 3DV 2017 - Qingdao, China
Duration: Oct 10 2017 - Oct 12 2017

Other

Other: 7th IEEE International Conference on 3D Vision, 3DV 2017
Country: China
City: Qingdao
Period: 10/10/17 - 10/12/17


Keywords

  • 3D reconstruction
  • Camera pose estimation
  • Camera tracking
  • CNN features
  • Computer vision
  • Dense SIFT
  • Direct method
  • Monocular vision
  • Multichannel
  • Visual odometry

ASJC Scopus subject areas

  • Media Technology
  • Computer Vision and Pattern Recognition
  • Signal Processing

Cite this

Jaramillo, C., Taguchi, Y., & Feng, C. (2018). Direct multichannel tracking. In Proceedings - 2017 International Conference on 3D Vision, 3DV 2017 (pp. 347-355). Institute of Electrical and Electronics Engineers Inc. https://doi.org/10.1109/3DV.2017.00047

