
Journal article

Dynamic Facial Expression Recognition of Learners via Adaptive Global Attention and Differential Temporal Transformer

Wei Liu, Lujia Li, Chun Yan, Yulin Zhang, Cheng Cheng, Xinyan Zhao, Mingshi Liu

CAAI Transactions on Intelligence Technology

Swansea University Author: Cheng Cheng

  • 71469.VoR.pdf

    PDF | Version of Record

    © 2026 The Author(s). This is an open access article under the terms of the Creative Commons Attribution License.

    Download (1.38MB)


DOI (Published version): 10.1049/cit2.70115

Abstract

Analysing learners' facial expressions during learning, and exploring their learning processes and emotional changes, are of great significance for assisting teachers' teaching and promoting smart education. In complex learning environments, static facial expression recognition fails to capture the dynamic changes of learners' expressions, losing the continuous features of the learning process, and its recognition accuracy is easily degraded by factors such as occlusion and lighting variations. To address these issues, a network model based on adaptive global attention and temporal differencing is proposed to recognise learners' dynamic expression sequences. First, we design an Adaptive Global Attention (AGA) block, which adaptively models inter-channel relationships to dynamically enhance key channels that are highly correlated with learners' states while suppressing redundant information, thereby improving the model's feature representation in noisy environments. Second, we design a Differential Temporal Transformer (DTFormer) to extract differential information between consecutive frames, increasing the model's sensitivity to the dynamics of learners' facial expressions and improving recognition performance. The two components complement each other in spatial feature enhancement and temporal dynamic modelling, effectively improving the model's overall capability to represent learners' dynamic facial expressions. Experiments were conducted on the public datasets DFEW and FERV39k and on the learner e-learning emotional state dataset DAiSEE, with comparisons against classical methods using objective indicators. The results show that the proposed method outperforms the comparison methods on multiple performance indicators, verifying its effectiveness.
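The record describes the AGA block only at a high level — adaptively reweighting feature channels that correlate with learners' states. As a minimal sketch of that channel-reweighting idea (a squeeze-and-excitation-style gate in NumPy; the weight matrices `w1`/`w2` and all shapes are hypothetical, not the paper's exact AGA formulation):

```python
import numpy as np

def adaptive_channel_attention(x, w1, w2):
    """Sketch of adaptive channel reweighting: squeeze spatial dims,
    model inter-channel relationships with a small bottleneck MLP,
    then gate each channel with a sigmoid weight in (0, 1)."""
    # x: (C, H, W) feature map
    squeeze = x.mean(axis=(1, 2))                 # global average pool -> (C,)
    hidden = np.maximum(w1 @ squeeze, 0.0)        # bottleneck + ReLU -> (C//r,)
    gate = 1.0 / (1.0 + np.exp(-(w2 @ hidden)))   # sigmoid gate -> (C,)
    return x * gate[:, None, None]                # rescale each channel

# Hypothetical shapes for illustration only
rng = np.random.default_rng(0)
C, H, W, r = 8, 4, 4, 2
x = rng.standard_normal((C, H, W))
w1 = rng.standard_normal((C // r, C)) * 0.1      # squeeze weights (assumed)
w2 = rng.standard_normal((C, C // r)) * 0.1      # excite weights (assumed)
y = adaptive_channel_attention(x, w1, w2)
print(y.shape)  # (8, 4, 4)
```

Because the gate lies strictly in (0, 1), the block can only attenuate channels relative to the input; learned weights decide which channels are suppressed as redundant.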

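The DTFormer's core idea, per the abstract, is to work on differential information between consecutive frames rather than raw per-frame features. A minimal sketch of attention over inter-frame differences (hypothetical shapes and identity projection weights; not the paper's architecture):

```python
import numpy as np

def softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)  # numerically stable softmax
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def differential_temporal_attention(frames, wq, wk, wv):
    """Attend over frame-to-frame differences, making the output
    sensitive to expression *changes* rather than static appearance."""
    # frames: (T, D) per-frame feature vectors
    diffs = frames[1:] - frames[:-1]              # (T-1, D) temporal differences
    q, k, v = diffs @ wq, diffs @ wk, diffs @ wv  # project differences
    attn = softmax(q @ k.T / np.sqrt(k.shape[1])) # (T-1, T-1) attention weights
    return attn @ v                               # (T-1, D) dynamics features

# Hypothetical setup for illustration only
rng = np.random.default_rng(1)
T, D = 6, 16
frames = rng.standard_normal((T, D))
wq = wk = wv = np.eye(D)                          # identity projections (assumed)
out = differential_temporal_attention(frames, wq, wk, wv)
print(out.shape)  # (5, 16)
```

A useful property of differencing: adding a constant offset to every frame (e.g. a uniform lighting shift) leaves the differences, and hence the output, unchanged — consistent with the abstract's claim of robustness to static nuisance factors.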

Published in: CAAI Transactions on Intelligence Technology
ISSN: 2468-6557 (print); 2468-2322 (electronic)
Published: Institution of Engineering and Technology (IET) 2026
Online Access: Check full text

URI: https://cronfa.swan.ac.uk/Record/cronfa71469
Keywords: face analysis; facial expression recognition; spatial-temporal feature; transformer
Published: 3 March 2026
Department: Mathematics and Computer Science School (MACS), Faculty of Science and Engineering, Swansea University
Funders: UKRI Grant EP/W020408/1 at Swansea University; Humanities and Social Science Fund of the Ministry of Education of China (Grant 23YJAZH084)
Open access: SU Library paid the OA fee (TA Institutional Deal)
Licence: Creative Commons Attribution 4.0 (http://creativecommons.org/licenses/by/4.0/)