Conference Paper/Proceeding/Abstract

Alzheimer's Dementia Recognition Using Acoustic, Lexical, Disfluency and Speech Pause Features Robust to Noisy Inputs

Morteza Rohanian, Julian Hough, Matthew Purver

Interspeech 2021

Swansea University Author: Julian Hough

Full text not available from this repository: check for access using links below.

DOI (Published version): 10.21437/interspeech.2021-1633


Published in: Interspeech 2021
Published: ISCA 2021
URI: https://cronfa.swan.ac.uk/Record/cronfa64932
Abstract: We present two multimodal fusion-based deep learning models that consume ASR-transcribed speech and acoustic data simultaneously to classify whether a speaker in a structured diagnostic task has Alzheimer's Disease and to what degree, evaluated on the ADReSSo Challenge 2021 data. Our best model, a BiLSTM with highway layers using words, word probabilities, disfluency features, pause information, and a variety of acoustic features, achieves a classification accuracy of 84% and an RMSE of 4.26 in predicting MMSE cognitive scores. Although predicting cognitive decline is more challenging, our models improve over word-only models when using the multimodal approach together with word probabilities, disfluency, and pause information. We show considerable gains for AD classification using multimodal fusion and gating, which can effectively deal with noisy inputs from acoustic features and ASR hypotheses.
College: Faculty of Science and Engineering