No Cover Image

Journal article 899 views 207 downloads

A bag of words approach to subject specific 3D human pose interaction classification with random decision forests

Jingjing Deng, Xianghua Xie Orcid Logo, Ben Daubney

Graphical Models, Volume: 76, Issue: 3, Pages: 162 - 171

Swansea University Authors: Jingjing Deng, Xianghua Xie Orcid Logo

Abstract

In this work, we investigate whether it is possible to distinguish conversational interactions from observing human motion alone, in particular subject specific gestures in 3D. We adopt Kinect sensors to obtain 3D displacement and velocity measurements, followed by wavelet decomposition to extract l...

Full description

Published in: Graphical Models
ISSN: 15240703
Published: Elsevier 2014
Online Access: Check full text

URI: https://cronfa.swan.ac.uk/Record/cronfa49635
Tags: Add Tag
No Tags, Be the first to tag this record!
first_indexed 2019-03-20T13:59:09Z
last_indexed 2020-12-08T04:03:06Z
id cronfa49635
recordtype SURis
fullrecord <?xml version="1.0"?><rfc1807><datestamp>2020-12-07T13:21:59.4785185</datestamp><bib-version>v2</bib-version><id>49635</id><entry>2019-03-20</entry><title>A bag of words approach to subject specific 3D human pose interaction classification with random decision forests</title><swanseaauthors><author><sid>6f6d01d585363d6dc1622640bb4fcb3f</sid><firstname>Jingjing</firstname><surname>Deng</surname><name>Jingjing Deng</name><active>true</active><ethesisStudent>false</ethesisStudent></author><author><sid>b334d40963c7a2f435f06d2c26c74e11</sid><ORCID>0000-0002-2701-8660</ORCID><firstname>Xianghua</firstname><surname>Xie</surname><name>Xianghua Xie</name><active>true</active><ethesisStudent>false</ethesisStudent></author></swanseaauthors><date>2019-03-20</date><abstract>In this work, we investigate whether it is possible to distinguish conversational interactions from observing human motion alone, in particular subject specific gestures in 3D. We adopt Kinect sensors to obtain 3D displacement and velocity measurements, followed by wavelet decomposition to extract low level temporal features. These features are thengeneralized to form a visual vocabulary that can be further generalized to a set of topics from temporal distributions of visual vocabulary. A subject specific supervised learning approach based on Random Forests is used to classify the testing sequences to seven different conversational scenarios. These conversational scenarios concerned in this workhave rather subtle differences among them. Unlike typical action or event recognition, each interaction in our case contain many instances of primitive motions and actions, many of which are shared among different conversation scenarios. That is the interactions we are concerned with are not micro or instant events, such as hugging and high-five, but rather interactions over a period of time that consists rather similar individual motions, micro actions and interactions. We believe this is among one of the first work that is devoted to subject specific conversational interaction classification using 3D pose features and to show this task is indeed possible.</abstract><type>Journal Article</type><journal>Graphical Models</journal><volume>76</volume><journalNumber>3</journalNumber><paginationStart>162</paginationStart><paginationEnd>171</paginationEnd><publisher>Elsevier</publisher><placeOfPublication/><isbnPrint/><isbnElectronic/><issnPrint>15240703</issnPrint><issnElectronic/><keywords>Human interaction, Action recognition, Human pose, Random forests, Bag of words</keywords><publishedDay>31</publishedDay><publishedMonth>5</publishedMonth><publishedYear>2014</publishedYear><publishedDate>2014-05-31</publishedDate><doi>10.1016/j.gmod.2013.10.006</doi><url>http://www.sciencedirect.com/science/article/pii/S1524070313000337</url><notes/><college>COLLEGE NANME</college><CollegeCode>COLLEGE CODE</CollegeCode><institution>Swansea University</institution><apcterm/><lastEdited>2020-12-07T13:21:59.4785185</lastEdited><Created>2019-03-20T10:10:34.4837235</Created><path><level id="1">Faculty of Science and Engineering</level><level id="2">School of Mathematics and Computer Science - Computer Science</level></path><authors><author><firstname>Jingjing</firstname><surname>Deng</surname><order>1</order></author><author><firstname>Xianghua</firstname><surname>Xie</surname><orcid>0000-0002-2701-8660</orcid><order>2</order></author><author><firstname>Ben</firstname><surname>Daubney</surname><order>3</order></author></authors><documents><document><filename>0049635-01042019171033.pdf</filename><originalFilename>gmod.pdf</originalFilename><uploaded>2019-04-01T17:10:33.3000000</uploaded><type>Output</type><contentLength>3079687</contentLength><contentType>application/pdf</contentType><version>Accepted Manuscript</version><cronfaStatus>true</cronfaStatus><embargoDate>2019-04-01T00:00:00.0000000</embargoDate><copyrightCorrect>true</copyrightCorrect><language>eng</language></document></documents><OutputDurs/></rfc1807>
spelling 2020-12-07T13:21:59.4785185 v2 49635 2019-03-20 A bag of words approach to subject specific 3D human pose interaction classification with random decision forests 6f6d01d585363d6dc1622640bb4fcb3f Jingjing Deng Jingjing Deng true false b334d40963c7a2f435f06d2c26c74e11 0000-0002-2701-8660 Xianghua Xie Xianghua Xie true false 2019-03-20 In this work, we investigate whether it is possible to distinguish conversational interactions from observing human motion alone, in particular subject specific gestures in 3D. We adopt Kinect sensors to obtain 3D displacement and velocity measurements, followed by wavelet decomposition to extract low level temporal features. These features are thengeneralized to form a visual vocabulary that can be further generalized to a set of topics from temporal distributions of visual vocabulary. A subject specific supervised learning approach based on Random Forests is used to classify the testing sequences to seven different conversational scenarios. These conversational scenarios concerned in this workhave rather subtle differences among them. Unlike typical action or event recognition, each interaction in our case contain many instances of primitive motions and actions, many of which are shared among different conversation scenarios. That is the interactions we are concerned with are not micro or instant events, such as hugging and high-five, but rather interactions over a period of time that consists rather similar individual motions, micro actions and interactions. We believe this is among one of the first work that is devoted to subject specific conversational interaction classification using 3D pose features and to show this task is indeed possible. Journal Article Graphical Models 76 3 162 171 Elsevier 15240703 Human interaction, Action recognition, Human pose, Random forests, Bag of words 31 5 2014 2014-05-31 10.1016/j.gmod.2013.10.006 http://www.sciencedirect.com/science/article/pii/S1524070313000337 COLLEGE NANME COLLEGE CODE Swansea University 2020-12-07T13:21:59.4785185 2019-03-20T10:10:34.4837235 Faculty of Science and Engineering School of Mathematics and Computer Science - Computer Science Jingjing Deng 1 Xianghua Xie 0000-0002-2701-8660 2 Ben Daubney 3 0049635-01042019171033.pdf gmod.pdf 2019-04-01T17:10:33.3000000 Output 3079687 application/pdf Accepted Manuscript true 2019-04-01T00:00:00.0000000 true eng
title A bag of words approach to subject specific 3D human pose interaction classification with random decision forests
spellingShingle A bag of words approach to subject specific 3D human pose interaction classification with random decision forests
Jingjing Deng
Xianghua Xie
title_short A bag of words approach to subject specific 3D human pose interaction classification with random decision forests
title_full A bag of words approach to subject specific 3D human pose interaction classification with random decision forests
title_fullStr A bag of words approach to subject specific 3D human pose interaction classification with random decision forests
title_full_unstemmed A bag of words approach to subject specific 3D human pose interaction classification with random decision forests
title_sort A bag of words approach to subject specific 3D human pose interaction classification with random decision forests
author_id_str_mv 6f6d01d585363d6dc1622640bb4fcb3f
b334d40963c7a2f435f06d2c26c74e11
author_id_fullname_str_mv 6f6d01d585363d6dc1622640bb4fcb3f_***_Jingjing Deng
b334d40963c7a2f435f06d2c26c74e11_***_Xianghua Xie
author Jingjing Deng
Xianghua Xie
author2 Jingjing Deng
Xianghua Xie
Ben Daubney
format Journal article
container_title Graphical Models
container_volume 76
container_issue 3
container_start_page 162
publishDate 2014
institution Swansea University
issn 15240703
doi_str_mv 10.1016/j.gmod.2013.10.006
publisher Elsevier
college_str Faculty of Science and Engineering
hierarchytype
hierarchy_top_id facultyofscienceandengineering
hierarchy_top_title Faculty of Science and Engineering
hierarchy_parent_id facultyofscienceandengineering
hierarchy_parent_title Faculty of Science and Engineering
department_str School of Mathematics and Computer Science - Computer Science{{{_:::_}}}Faculty of Science and Engineering{{{_:::_}}}School of Mathematics and Computer Science - Computer Science
url http://www.sciencedirect.com/science/article/pii/S1524070313000337
document_store_str 1
active_str 0
description In this work, we investigate whether it is possible to distinguish conversational interactions from observing human motion alone, in particular subject specific gestures in 3D. We adopt Kinect sensors to obtain 3D displacement and velocity measurements, followed by wavelet decomposition to extract low level temporal features. These features are thengeneralized to form a visual vocabulary that can be further generalized to a set of topics from temporal distributions of visual vocabulary. A subject specific supervised learning approach based on Random Forests is used to classify the testing sequences to seven different conversational scenarios. These conversational scenarios concerned in this workhave rather subtle differences among them. Unlike typical action or event recognition, each interaction in our case contain many instances of primitive motions and actions, many of which are shared among different conversation scenarios. That is the interactions we are concerned with are not micro or instant events, such as hugging and high-five, but rather interactions over a period of time that consists rather similar individual motions, micro actions and interactions. We believe this is among one of the first work that is devoted to subject specific conversational interaction classification using 3D pose features and to show this task is indeed possible.
published_date 2014-05-31T04:00:48Z
_version_ 1763753116963962880
score 11.016235