
Journal article

Dynamic Facial Expression Recognition of Learners via Adaptive Global Attention and Differential Temporal Transformer

Wei Liu, Lujia Li, Chun Yan, Yulin Zhang, Cheng Cheng, Xinyan Zhao, Mingshi Liu

CAAI Transactions on Intelligence Technology

Swansea University Author: Cheng Cheng

  • 71469.VoR.pdf

    PDF | Version of Record

    © 2026 The Author(s). This is an open access article under the terms of the Creative Commons Attribution License.

    Download (1.38MB)


DOI (Published version): 10.1049/cit2.70115

Abstract

Analysing learners' facial expressions during learning, and exploring their learning processes and emotional changes, are of great significance for assisting teachers' teaching and promoting smart education. In complex learning environments, static facial expression recognition fails to capture the dynamic changes of learners' expressions, losing the continuous features of the learning process, and its recognition accuracy is easily degraded by factors such as occlusion and lighting variations. To address these issues, a network model based on adaptive global attention and temporal differencing is proposed to recognise learners' dynamic expression sequences. First, we design an Adaptive Global Attention (AGA) block, which adaptively models inter-channel relationships to dynamically enhance key channels that are highly correlated with learners' states while suppressing redundant information, thereby improving the model's feature representation in noisy environments. Second, we design a Differential Temporal Transformer (DTFormer) to extract differential information between consecutive frames, increasing the model's sensitivity to the dynamics of learners' facial expressions and improving recognition performance. The two components complement each other in spatial feature enhancement and temporal dynamic modelling, effectively improving the model's overall capability to represent learners' dynamic facial expressions. Experiments were conducted on the public datasets DFEW and FERV39k and on the learner e-learning emotional state dataset DAiSEE, with comparisons against classical methods using objective indicators. The results show that the proposed method outperforms the comparison methods on multiple performance indicators, verifying its effectiveness.
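The record describes the AGA block only at a high level — adaptively reweighting feature channels that correlate with learners' states. As a minimal sketch of that channel-reweighting idea (a squeeze-and-excitation-style gate in NumPy; the weight matrices `w1`/`w2` and all shapes are hypothetical, not the paper's exact AGA formulation):

```python
import numpy as np

def adaptive_channel_attention(x, w1, w2):
    """Sketch of adaptive channel reweighting: squeeze spatial dims,
    model inter-channel relationships with a small bottleneck MLP,
    then gate each channel with a sigmoid weight in (0, 1)."""
    # x: (C, H, W) feature map
    squeeze = x.mean(axis=(1, 2))                 # global average pool -> (C,)
    hidden = np.maximum(w1 @ squeeze, 0.0)        # bottleneck + ReLU -> (C//r,)
    gate = 1.0 / (1.0 + np.exp(-(w2 @ hidden)))   # sigmoid gate -> (C,)
    return x * gate[:, None, None]                # rescale each channel

# Hypothetical shapes for illustration only
rng = np.random.default_rng(0)
C, H, W, r = 8, 4, 4, 2
x = rng.standard_normal((C, H, W))
w1 = rng.standard_normal((C // r, C)) * 0.1      # squeeze weights (assumed)
w2 = rng.standard_normal((C, C // r)) * 0.1      # excite weights (assumed)
y = adaptive_channel_attention(x, w1, w2)
print(y.shape)  # (8, 4, 4)
```

Because the gate lies strictly in (0, 1), the block can only attenuate channels relative to the input; learned weights decide which channels are suppressed as redundant.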

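The DTFormer's core idea, per the abstract, is to work on differential information between consecutive frames rather than raw per-frame features. A minimal sketch of attention over inter-frame differences (hypothetical shapes and identity projection weights; not the paper's architecture):

```python
import numpy as np

def softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)  # numerically stable softmax
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def differential_temporal_attention(frames, wq, wk, wv):
    """Attend over frame-to-frame differences, making the output
    sensitive to expression *changes* rather than static appearance."""
    # frames: (T, D) per-frame feature vectors
    diffs = frames[1:] - frames[:-1]              # (T-1, D) temporal differences
    q, k, v = diffs @ wq, diffs @ wk, diffs @ wv  # project differences
    attn = softmax(q @ k.T / np.sqrt(k.shape[1])) # (T-1, T-1) attention weights
    return attn @ v                               # (T-1, D) dynamics features

# Hypothetical setup for illustration only
rng = np.random.default_rng(1)
T, D = 6, 16
frames = rng.standard_normal((T, D))
wq = wk = wv = np.eye(D)                          # identity projections (assumed)
out = differential_temporal_attention(frames, wq, wk, wv)
print(out.shape)  # (5, 16)
```

A useful property of differencing: adding a constant offset to every frame (e.g. a uniform lighting shift) leaves the differences, and hence the output, unchanged — consistent with the abstract's claim of robustness to static nuisance factors.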

Published in: CAAI Transactions on Intelligence Technology
ISSN: 2468-6557 (print); 2468-2322 (electronic)
Published: Institution of Engineering and Technology (IET) 2026
Online Access: Check full text

URI: https://cronfa.swan.ac.uk/Record/cronfa71469
Keywords: face analysis; facial expression recognition; spatial-temporal feature; transformer
Published: 3 March 2026
Department: Mathematics and Computer Science School (MACS), Faculty of Science and Engineering, Swansea University
Funders: UKRI Grant EP/W020408/1 at Swansea University; Humanities and Social Science Fund of the Ministry of Education of China (Grant 23YJAZH084)
Open access: SU Library paid the OA fee (TA Institutional Deal)
Licence: Creative Commons Attribution 4.0 (http://creativecommons.org/licenses/by/4.0/)