Data mining patterns of receptor-drug interactions across vast biological and chemical space

Witts, James

doi:10.23889/SUthesis.65941


Swansea University Author: James Witts

PDF | E-Thesis – open access

Copyright: The author, James A. Witts, 2020.

DOI (Published version): 10.23889/SUthesis.65941

Abstract

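The aim of the project was to determine how machine learning tools can assist in the process of drug discovery and in silico screening. As the development costs and attrition rates for candidate compounds can be high, a method of predicting the likelihood of approval or of therapeutic promise with machine learning and high performance computing tools could be of great benefit to medical researchers and to the biotech, chemical and pharmaceutical industries. The first phase of the project was to determine whether knowledge of the recorded in vitro protein interactions of particular compounds (listed in DrugBank and ToxCast) would be sufficient to designate a candidate compound as having a good (i.e. approved drug) or a bad (i.e. toxic) profile. The learning models assessed showed promise in correctly designating candidate compounds based on a small number of proteins used for pharmacological profiling, with over 90% overall profiling prediction accuracy in the best case. However, the vast majority of interactions between compounds and proteins are unknown, so a predictive approach is needed. The second phase of the project was to provide a method for predicting these hitherto unknown interactions, by predicting protein-compound interaction pairs through clustering techniques. Several clustering methods based on protein and compound similarity measurement techniques were investigated; when tested on blind in vitro interactions, approximately half on average were detected successfully, highlighting the potential to strengthen the predictive profiling models. The third and final phase of the project focused on the development of a compound and protein target prediction interface, TargetPredict (http://proteins.swan.ac.uk/cheminf/), which incorporates the data and methodologies presented throughout the project into a single centralised source. It provides the sector for the first time with a unified tool, avoiding the burden of current approaches that require either the use of multiple, often incompatible websites or extensive coding experience.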

Published:	Swansea, Wales, UK 2020
Institution:	Swansea University
Degree level:	Doctoral
Degree name:	Ph.D
Supervisor:	Mullins, Jonathan G. ; Iago, Heledd F.
URI:	https://cronfa.swan.ac.uk/Record/cronfa65941

first_indexed	2024-04-03T16:11:04Z
last_indexed	2024-11-25T14:17:09Z
id	cronfa65941
recordtype	RisThesis
fullrecord	<?xml version="1.0"?><rfc1807><datestamp>2024-06-05T10:43:34.3050963</datestamp><bib-version>v2</bib-version><id>65941</id><entry>2024-04-03</entry><title>Data mining patterns of receptor-drug interactions across vast biological and chemical space</title><swanseaauthors><author><sid>c8d1e374a823863aae5d0dfaec19c7b5</sid><ORCID>0009-0008-3386-2965</ORCID><firstname>James</firstname><surname>Witts</surname><name>James Witts</name><active>true</active><ethesisStudent>false</ethesisStudent></author></swanseaauthors><date>2024-04-03</date><deptcode>MEDS</deptcode><abstract>The aim of the project was to determine how machine learning tools can assist in the process of drug discovery and in silico screening. As the development costs and attrition rates for candidate compounds can be high, a method of predicting likelihood for approval or for therapeutic promise with machine learning and high performance computing tools could be of great benefit to medical researchers, biotech, chemical and pharmaceutical industries. The first phase of the project was to determine if knowledge of the recorded in vitro protein interactions of particular compounds (listed in DrugBank and ToxCast) would be sufficient to determine whether a candidate compound could be designated as having a good (i.e approved drug) or a bad (i.e toxic) profile. The learning models assessed showed promise in correctly designating candidate compounds based on a small number of proteins used for pharmacological profiling, with over 90% overall profiling prediction accuracy in the best case, however the vast majority of interactions between compounds and proteins are unknown and so a predictive approach is needed. The second phase of the project was to provide a method for predicting these hitherto unknown interactions, by predicting protein-compound interaction pairs through clustering techniques. Several clustering methods based on protein and compound similarity measurement techniques were investigated and found that when tested on blind in vitro interactions, approximately half on average were detected successfully, highlighting the promising potential to strengthen the predictive profiling models. The third and final phase of the project focused on the development of a compound and protein target prediction interface, TargetPredict (http://proteins.swan.ac.uk/cheminf/), which incorporates the data and methodologies presented throughout the whole project into a single centralised source, to provide the sector for the first time with a unified tool, avoiding the onerousness of current approaches that either require the use of multiple often incompatible websites or require extensive coding experience.</abstract><type>E-Thesis</type><journal/><volume/><journalNumber/><paginationStart/><paginationEnd/><publisher/><placeOfPublication>Swansea, Wales, UK</placeOfPublication><isbnPrint/><isbnElectronic/><issnPrint/><issnElectronic/><keywords>Data mining, clustering, classification, similarity</keywords><publishedDay>10</publishedDay><publishedMonth>7</publishedMonth><publishedYear>2020</publishedYear><publishedDate>2020-07-10</publishedDate><doi>10.23889/SUthesis.65941</doi><url/><notes/><college>COLLEGE NANME</college><department>Medical School</department><CollegeCode>COLLEGE CODE</CollegeCode><DepartmentCode>MEDS</DepartmentCode><institution>Swansea University</institution><supervisor>Mullins, Jonathan G. ; Iago, Heledd F.</supervisor><degreelevel>Doctoral</degreelevel><degreename>Ph.D</degreename><degreesponsorsfunders>Moleculomics Ltd.</degreesponsorsfunders><apcterm/><funders/><projectreference/><lastEdited>2024-06-05T10:43:34.3050963</lastEdited><Created>2024-04-03T17:07:23.5245481</Created><path><level id="1">Faculty of Medicine, Health and Life Sciences</level><level id="2">Swansea University Medical School - Biomedical Science</level></path><authors><author><firstname>James</firstname><surname>Witts</surname><orcid>0009-0008-3386-2965</orcid><order>1</order></author></authors><documents><document><filename>65941__29905__c3f67d223085450e87d0d2e2a18cee26.pdf</filename><originalFilename>Witts_James_A_PhD_Thesis_Final_Cronfa.pdf</originalFilename><uploaded>2024-04-03T17:35:18.3798499</uploaded><type>Output</type><contentLength>14132772</contentLength><contentType>application/pdf</contentType><version>E-Thesis – open access</version><cronfaStatus>true</cronfaStatus><documentNotes>Copyright: The author, James A. Witts, 2020.</documentNotes><copyrightCorrect>true</copyrightCorrect><language>eng</language></document></documents><OutputDurs/></rfc1807>
title	Data mining patterns of receptor-drug interactions across vast biological and chemical space
spellingShingle	Data mining patterns of receptor-drug interactions across vast biological and chemical space James Witts
title_short	Data mining patterns of receptor-drug interactions across vast biological and chemical space
title_full	Data mining patterns of receptor-drug interactions across vast biological and chemical space
title_fullStr	Data mining patterns of receptor-drug interactions across vast biological and chemical space
title_full_unstemmed	Data mining patterns of receptor-drug interactions across vast biological and chemical space
title_sort	Data mining patterns of receptor-drug interactions across vast biological and chemical space
author_id_str_mv	c8d1e374a823863aae5d0dfaec19c7b5
author_id_fullname_str_mv	c8d1e374a823863aae5d0dfaec19c7b5_***_James Witts
author	James Witts
author2	James Witts
format	E-Thesis
publishDate	2020
institution	Swansea University
doi_str_mv	10.23889/SUthesis.65941
college_str	Faculty of Medicine, Health and Life Sciences
hierarchytype
hierarchy_top_id	facultyofmedicinehealthandlifesciences
hierarchy_top_title	Faculty of Medicine, Health and Life Sciences
hierarchy_parent_id	facultyofmedicinehealthandlifesciences
hierarchy_parent_title	Faculty of Medicine, Health and Life Sciences
department_str	Swansea University Medical School - Biomedical Science; Faculty of Medicine, Health and Life Sciences
document_store_str	1
active_str	0
description	The aim of the project was to determine how machine learning tools can assist in the process of drug discovery and in silico screening. As the development costs and attrition rates for candidate compounds can be high, a method of predicting the likelihood of approval or of therapeutic promise with machine learning and high performance computing tools could be of great benefit to medical researchers and to the biotech, chemical and pharmaceutical industries. The first phase of the project was to determine whether knowledge of the recorded in vitro protein interactions of particular compounds (listed in DrugBank and ToxCast) would be sufficient to designate a candidate compound as having a good (i.e. approved drug) or a bad (i.e. toxic) profile. The learning models assessed showed promise in correctly designating candidate compounds based on a small number of proteins used for pharmacological profiling, with over 90% overall profiling prediction accuracy in the best case. However, the vast majority of interactions between compounds and proteins are unknown, so a predictive approach is needed. The second phase of the project was to provide a method for predicting these hitherto unknown interactions, by predicting protein-compound interaction pairs through clustering techniques. Several clustering methods based on protein and compound similarity measurement techniques were investigated; when tested on blind in vitro interactions, approximately half on average were detected successfully, highlighting the potential to strengthen the predictive profiling models. The third and final phase of the project focused on the development of a compound and protein target prediction interface, TargetPredict (http://proteins.swan.ac.uk/cheminf/), which incorporates the data and methodologies presented throughout the project into a single centralised source. It provides the sector for the first time with a unified tool, avoiding the burden of current approaches that require either the use of multiple, often incompatible websites or extensive coding experience.
published_date	2020-07-10T06:44:23Z

Similar Items