E-Thesis

Data mining patterns of receptor-drug interactions across vast biological and chemical space / James Witts

Swansea University Author: James Witts

DOI (Published version): 10.23889/SUthesis.65941

Abstract

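The aim of the project was to determine how machine learning tools can assist in the process of drug discovery and in silico screening. As the development costs and attrition rates for candidate compounds can be high, a method of predicting likelihood of approval or therapeutic promise with machine learning and high performance computing tools could be of great benefit to medical researchers and to the biotech, chemical and pharmaceutical industries. The first phase of the project was to determine whether knowledge of the recorded in vitro protein interactions of particular compounds (listed in DrugBank and ToxCast) would be sufficient to determine whether a candidate compound could be designated as having a good (i.e. approved drug) or a bad (i.e. toxic) profile. The learning models assessed showed promise in correctly designating candidate compounds based on a small number of proteins used for pharmacological profiling, with over 90% overall profiling prediction accuracy in the best case. However, the vast majority of interactions between compounds and proteins are unknown, so a predictive approach is needed. The second phase of the project was to provide a method for predicting these hitherto unknown interactions, by predicting protein-compound interaction pairs through clustering techniques. Several clustering methods based on protein and compound similarity measurement techniques were investigated; when tested on blind in vitro interactions, approximately half on average were detected successfully, highlighting the potential to strengthen the predictive profiling models. The third and final phase of the project focused on the development of a compound and protein target prediction interface, TargetPredict (http://proteins.swan.ac.uk/cheminf/), which incorporates the data and methodologies presented throughout the project into a single centralised source, to provide the sector for the first time with a unified tool, avoiding the burden of current approaches that require either the use of multiple, often incompatible, websites or extensive coding experience.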

Published: Swansea, Wales, UK 2020
Institution: Swansea University
Degree level: Doctoral
Degree name: Ph.D
Supervisor: Mullins, Jonathan G. ; Iago, Heledd F.
URI: https://cronfa.swan.ac.uk/Record/cronfa65941
first_indexed 2024-04-03T16:11:04Z
last_indexed 2024-04-03T16:11:04Z
id cronfa65941
recordtype RisThesis
fullrecord <?xml version="1.0" encoding="utf-8"?><rfc1807 xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema"><bib-version>v2</bib-version><id>65941</id><entry>2024-04-03</entry><title>Data mining patterns of receptor-drug interactions across vast biological and chemical space</title><swanseaauthors><author><sid>c8d1e374a823863aae5d0dfaec19c7b5</sid><ORCID>0009-0008-3386-2965</ORCID><firstname>James</firstname><surname>Witts</surname><name>James Witts</name><active>true</active><ethesisStudent>false</ethesisStudent></author></swanseaauthors><date>2024-04-03</date><deptcode>MEDS</deptcode><abstract>The aim of the project was to determine how machine learning tools can assist in the process of drug discovery and in silico screening. As the development costs and attrition rates for candidate compounds can be high, a method of predicting likelihood for approval or for therapeutic promise with machine learning and high performance computing tools could be of great benefit to medical researchers, biotech, chemical and pharmaceutical industries. The first phase of the project was to determine if knowledge of the recorded in vitro protein interactions of particular compounds (listed in DrugBank and ToxCast) would be sufficient to determine whether a candidate compound could be designated as having a good (i.e approved drug) or a bad (i.e toxic) profile. The learning models assessed showed promise in correctly designating candidate compounds based on a small number of proteins used for pharmacological profiling, with over 90% overall profiling prediction accuracy in the best case, however the vast majority of interactions between compounds and proteins are unknown and so a predictive approach is needed. The second phase of the project was to provide a method for predicting these hitherto unknown interactions, by predicting protein-compound interaction pairs through clustering techniques. 
Several clustering methods based on protein and compound similarity measurement techniques were investigated and found that when tested on blind in vitro interactions, approximately half on average were detected successfully, highlighting the promising potential to strengthen the predictive profiling models. The third and final phase of the project focused on the development of a compound and protein target prediction interface, TargetPredict (http://proteins.swan.ac.uk/cheminf/), which incorporates the data and methodologies presented throughout the whole project into a single centralised source, to provide the sector for the first time with a unified tool, avoiding the onerousness of current approaches that either require the use of multiple often incompatible websites or require extensive coding experience.</abstract><type>E-Thesis</type><journal/><volume/><journalNumber/><paginationStart/><paginationEnd/><publisher/><placeOfPublication>Swansea, Wales, UK</placeOfPublication><isbnPrint/><isbnElectronic/><issnPrint/><issnElectronic/><keywords>Data mining, clustering, classification, similarity</keywords><publishedDay>10</publishedDay><publishedMonth>7</publishedMonth><publishedYear>2020</publishedYear><publishedDate>2020-07-10</publishedDate><doi>10.23889/SUthesis.65941</doi><url/><notes/><college>COLLEGE NANME</college><department>Medical School</department><CollegeCode>COLLEGE CODE</CollegeCode><DepartmentCode>MEDS</DepartmentCode><institution>Swansea University</institution><supervisor>Mullins, Jonathan G. 
; Iago, Heledd F.</supervisor><degreelevel>Doctoral</degreelevel><degreename>Ph.D</degreename><degreesponsorsfunders>Moleculomics Ltd.</degreesponsorsfunders><apcterm/><funders/><projectreference/><lastEdited>2024-06-05T10:43:34.3050963</lastEdited><Created>2024-04-03T17:07:23.5245481</Created><path><level id="1">Faculty of Medicine, Health and Life Sciences</level><level id="2">Swansea University Medical School - Biomedical Science</level></path><authors><author><firstname>James</firstname><surname>Witts</surname><orcid>0009-0008-3386-2965</orcid><order>1</order></author></authors><documents><document><filename>65941__29905__c3f67d223085450e87d0d2e2a18cee26.pdf</filename><originalFilename>Witts_James_A_PhD_Thesis_Final_Cronfa.pdf</originalFilename><uploaded>2024-04-03T17:35:18.3798499</uploaded><type>Output</type><contentLength>14132772</contentLength><contentType>application/pdf</contentType><version>E-Thesis – open access</version><cronfaStatus>true</cronfaStatus><documentNotes>Copyright: The author, James A. Witts, 2020.</documentNotes><copyrightCorrect>true</copyrightCorrect><language>eng</language></document></documents><OutputDurs/></rfc1807>
title Data mining patterns of receptor-drug interactions across vast biological and chemical space
author_id_str_mv c8d1e374a823863aae5d0dfaec19c7b5
author_id_fullname_str_mv c8d1e374a823863aae5d0dfaec19c7b5_***_James Witts
author James Witts
author2 James Witts
format E-Thesis
publishDate 2020
institution Swansea University
doi_str_mv 10.23889/SUthesis.65941
college_str Faculty of Medicine, Health and Life Sciences
hierarchytype
hierarchy_top_id facultyofmedicinehealthandlifesciences
hierarchy_top_title Faculty of Medicine, Health and Life Sciences
hierarchy_parent_id facultyofmedicinehealthandlifesciences
hierarchy_parent_title Faculty of Medicine, Health and Life Sciences
department_str Swansea University Medical School - Biomedical Science{{{_:::_}}}Faculty of Medicine, Health and Life Sciences{{{_:::_}}}Swansea University Medical School - Biomedical Science
document_store_str 1
active_str 0
published_date 2020-07-10T10:43:34Z