Modeling Large Sparse Data for Feature Selection: Hospital Admission Predictions of the Dementia Patients Using Primary Care Electronic Health Records

Tsang, Gavin; Zhou, Shang-ming; Xie, Xianghua

doi:10.1109/jtehm.2020.3040236

Journal article

IEEE Journal of Translational Engineering in Health and Medicine, Volume: 9, Pages: 1 - 13

Swansea University Authors: Gavin Tsang, Shang-ming Zhou, Xianghua Xie

PDF | Version of Record

This work is licensed under a Creative Commons Attribution 4.0 License.

DOI (Published version): 10.1109/jtehm.2020.3040236

Published in:	IEEE Journal of Translational Engineering in Health and Medicine
ISSN:	2168-2372
Published:	Institute of Electrical and Electronics Engineers (IEEE) 2021
URI:	https://cronfa.swan.ac.uk/Record/cronfa55654

Abstract:	A growing elderly population suffering from incurable, chronic conditions such as dementia presents a continual strain on medical services due to mental impairment paired with high comorbidity, resulting in increased hospitalization risk. The identification of at-risk individuals allows for preventative measures to alleviate said strain. Electronic health records provide an opportunity for big data analysis to address such applications. Such data, however, present a challenging problem space for traditional statistics and machine learning due to high dimensionality and sparse data elements. This paper proposes a novel machine learning methodology: entropy regularization with ensemble deep neural networks (ECNN), which simultaneously provides high predictive performance for hospitalization of patients with dementia whilst enabling an interpretable heuristic analysis of the model architecture, able to identify individual features of importance within a large feature domain space. Experimental results on health records containing 54,647 features identified 10 event indicators within a patient timeline: a collection of diagnostic events, medication prescriptions and procedural events, the highest ranked being essential hypertension. The resulting subset was still able to provide a highly competitive hospitalization prediction (Accuracy: 0.759) as compared to the full feature domain (Accuracy: 0.755) or traditional feature selection techniques (Accuracy: 0.737), with a significant reduction in feature size. The discovery and heuristic evidence of correlation provide evidence for further clinical study of said medical events as potential novel indicators. There also remains great potential for adaptation of ECNN within other medical big data domains as a data mining tool for novel risk factor identification.
Keywords:	Deep learning; dementia; electronic health records; feature selection; hospitalization; machine learning; risk factors; weight regularization.
College:	Faculty of Science and Engineering
Start Page:	1
End Page:	13
