E-Thesis
A platform methodology for optimised cancer detection from Raman spectroscopy of human blood serum / FREYA WOODS
Swansea University Author: FREYA WOODS
E-Thesis – open access under embargo until: 22nd June 2027
DOI (Published version): 10.23889/SUthesis.60374
Supervisor: Dunstan, Peter R.; Harris, Dean A.
Cancer remains the most lethal condition in the world, accounting for 10 million deaths worldwide, i.e. one in six of all deaths. Currently, cancer detection is primarily through symptomatic routes, wherein patients present with serious symptoms and undergo imaging and subsequently biopsy of suspected cancer growths for histopathological confirmation of cancer. A diagnostic tool based on serum analysis could revolutionise current cancer pathways. Raman spectroscopy offers the ability to measure a complex biochemical fingerprint of a sample through vibrational energy shifts. Numerous studies applying Raman spectroscopy to biological samples (serum, plasma, tissue, cellular) exist, showing promising results for the detection of cancer. However, these studies are typically limited to proof-of-concept and stop short of larger-scale studies or of looking forward to how these methods might interrupt current clinical pathways. Presented in this thesis is a methodology for the optimisation of cancer detection from Raman spectroscopy of human blood serum. The first results chapters present tools in R for the pre-processing of Raman spectra using a variety of techniques. This is followed by an application for quality control of Raman spectra, with a view to the safety nets required for tools to integrate into a clinical setting. These tools are then used to optimise pre-processing specifically for colorectal cancer detection with human blood serum spectra using high-performance computing (HPC). In total, 2.4 million different pre-processing permutations are trialled. This methodology improved diagnostic performance, with sensitivity increasing by 14.6%, specificity by 6.9%, positive predictive value (PPV) by 3.4%, and negative predictive value (NPV) by 2.4% when compared to a standard pre-processing optimisation.
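For readers unfamiliar with the four diagnostic measures quoted above, the following minimal Python sketch shows how sensitivity, specificity, PPV and negative predictive value (NPV) are conventionally derived from confusion-matrix counts. The function name and the counts are hypothetical and chosen purely for illustration; they are not taken from the thesis, and the thesis tools themselves are described as being written in R.

```python
def diagnostic_metrics(tp, fp, tn, fn):
    """Return the four standard diagnostic measures from confusion-matrix counts.

    tp/fn: cancer samples classified as cancer / as control
    tn/fp: control samples classified as control / as cancer
    """
    return {
        "sensitivity": tp / (tp + fn),  # share of cancer samples detected
        "specificity": tn / (tn + fp),  # share of controls correctly cleared
        "ppv": tp / (tp + fp),          # share of positive calls that are cancer
        "npv": tn / (tn + fn),          # share of negative calls that are controls
    }


# Hypothetical counts for illustration only (not thesis data).
metrics = diagnostic_metrics(tp=45, fp=20, tn=80, fn=5)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Note that PPV and NPV depend on the proportion of cancer cases in the cohort, so they are only comparable between studies with similar case-to-control ratios.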
A similar methodology using HPC is then applied in Chapter 7 to optimise machine learning algorithm selection and feature reduction for colorectal cancer detection from serum spectra. The feature reduction methods principal component analysis (PCA), factor analysis (FA), ElasticNet (EN), and random forest feature selection (RFFS) are trialled in combination with the model types k-nearest neighbours (KNN), logistic regression (LR), support vector machines (SVM), and random forest (RF). Traditional feature reduction methods such as PCA and FA were found to perform poorly compared to EN and RFFS. In addition, the model types SVM and RF outperform LR and KNN. This chapter also shows results from applying artificial neural network architectures to colorectal cancer detection and finds that linear machine learning methods (SVM/RF) outperform neural networks. Finally, the last results chapter presents the culmination of these optimised methods, applied to building machine learning models for the detection of several cancer types: breast, pancreatic, lung, and colorectal cancer. The models achieved 90.9% sensitivity and 77.3% specificity for pancreatic cancer vs. controls, 90.3% sensitivity and 64.5% specificity for lung cancer vs. controls, 92.1% sensitivity and 65.8% specificity for breast cancer vs. controls, and finally 91.3% sensitivity and 44.0% specificity for colorectal cancer vs. controls, where models are thresholded to achieve a minimum of 90% sensitivity. This chapter also focuses on differences in metabolite profile from the Raman spectra between different cancer types and aims to elucidate biochemical linkages.
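The phrase "thresholded to achieve a minimum of 90% sensitivity" can be illustrated with a short Python sketch. It assumes a classifier that outputs a cancer probability per sample and picks the highest decision threshold that still meets the sensitivity floor, which in turn gives the best specificity available under that constraint. The function name, the scores and the labels are hypothetical; the abstract does not state how the thesis implements this step.

```python
def threshold_for_min_sensitivity(scores, labels, min_sensitivity=0.90):
    """Pick the highest threshold whose sensitivity is >= min_sensitivity.

    scores: predicted probability of cancer for each sample
    labels: 1 for cancer, 0 for control
    A sample is called positive when its score is >= the threshold.
    Returns (threshold, sensitivity, specificity).
    """
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one cancer and one control sample")

    # Try candidate thresholds from strictest to most lenient.
    for t in sorted(set(scores), reverse=True):
        tp = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= t)
        sensitivity = tp / n_pos
        if sensitivity >= min_sensitivity:
            tn = sum(1 for s, y in zip(scores, labels) if y == 0 and s < t)
            return t, sensitivity, tn / n_neg
    raise ValueError("no threshold reaches the requested sensitivity")


# Hypothetical model outputs for illustration only (not thesis data).
cancer_scores = [0.95, 0.92, 0.88, 0.85, 0.80, 0.75, 0.70, 0.62, 0.55, 0.30]
control_scores = [0.78, 0.66, 0.58, 0.50, 0.45, 0.40, 0.35, 0.25, 0.20, 0.10]
scores = cancer_scores + control_scores
labels = [1] * len(cancer_scores) + [0] * len(control_scores)

threshold, sens, spec = threshold_for_min_sensitivity(scores, labels)
print(f"threshold={threshold}, sensitivity={sens:.2f}, specificity={spec:.2f}")
```

Fixing sensitivity first and reporting the resulting specificity is why the specificity figures above vary so widely between cancer types while the sensitivities all sit just above 90%.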
ORCiD identifier: https://orcid.org/0000-0001-9412-0967
Raman spectroscopy, cancer detection, machine learning
Faculty of Science and Engineering