Dual many-to-one-encoder-based transfer learning for cross-dataset human action recognition

Tiantian Xu, Fan Zhu, Edward Wong, Yi Fang

Research output: Contribution to journal › Article

Abstract

The emergence of large-scale human action datasets poses a challenge to efficient action labeling. Hand labeling large-scale datasets is tedious and time-consuming; thus, a more efficient labeling method would be beneficial. One possible solution is to make use of the knowledge of a known dataset to aid the labeling of a new dataset. To this end, we propose a new transfer learning method for cross-dataset human action recognition. Our method aims at learning a generalized feature representation for effective cross-dataset classification. We propose a novel dual many-to-one encoder architecture that extracts generalized features by mapping raw features from source and target datasets to the same feature space. Benefiting from the favorable property of the proposed many-to-one encoder, cross-dataset action data are encouraged to possess identical encoded features if the actions share the same class labels. Experiments on pairs of benchmark human action datasets achieve state-of-the-art accuracy, demonstrating the efficacy of the proposed method.
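The core idea in the abstract — two dataset-specific encoders mapping into one shared code space, with same-class cross-dataset pairs pushed toward identical codes — can be sketched as follows. This is a minimal illustration only: the dimensions, the linear-plus-tanh encoders, and the squared-distance alignment objective are assumptions for the sketch, not the architecture or loss actually used in the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative dimensions (assumptions, not from the paper): raw source and
# target descriptors differ in size; both encoders map into one 16-d code space.
d_src, d_tgt, d_code = 100, 80, 16

W_src = rng.normal(scale=0.1, size=(d_code, d_src))  # source-dataset encoder
W_tgt = rng.normal(scale=0.1, size=(d_code, d_tgt))  # target-dataset encoder

def encode(W, x):
    """One linear-plus-tanh encoder mapping raw features to the shared code space."""
    return np.tanh(W @ x)

# A same-class pair: one source clip and one target clip of the same action.
x_src = rng.normal(size=d_src)
x_tgt = rng.normal(size=d_tgt)

def alignment_loss():
    """Squared distance between the two encodings; zero when the codes coincide."""
    diff = encode(W_src, x_src) - encode(W_tgt, x_tgt)
    return float(diff @ diff)

loss_before = alignment_loss()

# Gradient descent on the source encoder only, to show the objective shrinking
# as the two datasets' encodings are pulled together.
lr = 1e-3
for _ in range(200):
    z_s, z_t = encode(W_src, x_src), encode(W_tgt, x_tgt)
    # Analytic gradient of ||z_s - z_t||^2 w.r.t. W_src, with z_s = tanh(W_src @ x_src)
    grad = np.outer(2.0 * (z_s - z_t) * (1.0 - z_s**2), x_src)
    W_src -= lr * grad

loss_after = alignment_loss()
print(f"alignment loss: {loss_before:.4f} -> {loss_after:.4f}")
```

In practice such an objective would be combined with a reconstruction or classification loss over many labeled pairs, and the trained shared codes would feed a single classifier applied to both datasets; the sketch above isolates only the cross-dataset alignment step.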

Original language: English (US)
Journal: Image and Vision Computing
DOI: 10.1016/j.imavis.2016.01.001
State: Accepted/In press - Sep 16 2015

Keywords

  • Action recognition
  • Cross-dataset
  • Domain adaptation
  • Neural network
  • Transfer learning

ASJC Scopus subject areas

  • Electrical and Electronic Engineering

Cite this

Dual many-to-one-encoder-based transfer learning for cross-dataset human action recognition. / Xu, Tiantian; Zhu, Fan; Wong, Edward; Fang, Yi.

In: Image and Vision Computing, 16.09.2015.

@article{592b7dd0b3984a6aba30ecd64df1bd45,
title = "Dual many-to-one-encoder-based transfer learning for cross-dataset human action recognition",
abstract = "The emergence of large-scale human action datasets poses a challenge to efficient action labeling. Hand labeling large-scale datasets is tedious and time-consuming; thus, a more efficient labeling method would be beneficial. One possible solution is to make use of the knowledge of a known dataset to aid the labeling of a new dataset. To this end, we propose a new transfer learning method for cross-dataset human action recognition. Our method aims at learning a generalized feature representation for effective cross-dataset classification. We propose a novel dual many-to-one encoder architecture that extracts generalized features by mapping raw features from source and target datasets to the same feature space. Benefiting from the favorable property of the proposed many-to-one encoder, cross-dataset action data are encouraged to possess identical encoded features if the actions share the same class labels. Experiments on pairs of benchmark human action datasets achieve state-of-the-art accuracy, demonstrating the efficacy of the proposed method.",
keywords = "Action recognition, Cross-dataset, Domain adaptation, Neural network, Transfer learning",
author = "Tiantian Xu and Fan Zhu and Edward Wong and Yi Fang",
year = "2015",
month = "9",
day = "16",
doi = "10.1016/j.imavis.2016.01.001",
language = "English (US)",
journal = "Image and Vision Computing",
issn = "0262-8856",
publisher = "Elsevier Limited",
}

TY - JOUR

T1 - Dual many-to-one-encoder-based transfer learning for cross-dataset human action recognition

AU - Xu, Tiantian

AU - Zhu, Fan

AU - Wong, Edward

AU - Fang, Yi

PY - 2015/9/16

Y1 - 2015/9/16

N2 - The emergence of large-scale human action datasets poses a challenge to efficient action labeling. Hand labeling large-scale datasets is tedious and time-consuming; thus, a more efficient labeling method would be beneficial. One possible solution is to make use of the knowledge of a known dataset to aid the labeling of a new dataset. To this end, we propose a new transfer learning method for cross-dataset human action recognition. Our method aims at learning a generalized feature representation for effective cross-dataset classification. We propose a novel dual many-to-one encoder architecture that extracts generalized features by mapping raw features from source and target datasets to the same feature space. Benefiting from the favorable property of the proposed many-to-one encoder, cross-dataset action data are encouraged to possess identical encoded features if the actions share the same class labels. Experiments on pairs of benchmark human action datasets achieve state-of-the-art accuracy, demonstrating the efficacy of the proposed method.

AB - The emergence of large-scale human action datasets poses a challenge to efficient action labeling. Hand labeling large-scale datasets is tedious and time-consuming; thus, a more efficient labeling method would be beneficial. One possible solution is to make use of the knowledge of a known dataset to aid the labeling of a new dataset. To this end, we propose a new transfer learning method for cross-dataset human action recognition. Our method aims at learning a generalized feature representation for effective cross-dataset classification. We propose a novel dual many-to-one encoder architecture that extracts generalized features by mapping raw features from source and target datasets to the same feature space. Benefiting from the favorable property of the proposed many-to-one encoder, cross-dataset action data are encouraged to possess identical encoded features if the actions share the same class labels. Experiments on pairs of benchmark human action datasets achieve state-of-the-art accuracy, demonstrating the efficacy of the proposed method.

KW - Action recognition

KW - Cross-dataset

KW - Domain adaptation

KW - Neural network

KW - Transfer learning

UR - http://www.scopus.com/inward/record.url?scp=84960192296&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84960192296&partnerID=8YFLogxK

U2 - 10.1016/j.imavis.2016.01.001

DO - 10.1016/j.imavis.2016.01.001

M3 - Article

AN - SCOPUS:84960192296

JO - Image and Vision Computing

JF - Image and Vision Computing

SN - 0262-8856

ER -