E-Thesis
Data mining patterns of receptor-drug interactions across vast biological and chemical space / James Witts
Swansea University Author: James Witts
DOI (Published version): 10.23889/SUthesis.65941
Abstract
The aim of the project was to determine how machine learning tools can assist in the process of drug discovery and in silico screening. As the development costs and attrition rates for candidate compounds can be high, a method of predicting likelihood for approval or for therapeutic promise with mac...
Published: | Swansea, Wales, UK, 2020 |
---|---|
Institution: | Swansea University |
Degree level: | Doctoral |
Degree name: | Ph.D |
Supervisor: | Mullins, Jonathan G. ; Iago, Heledd F. |
URI: | https://cronfa.swan.ac.uk/Record/cronfa65941 |
first_indexed | 2024-04-03T16:11:04Z |
---|---|
last_indexed | 2024-04-03T16:11:04Z |
id | cronfa65941 |
recordtype | RisThesis |
fullrecord |
<?xml version="1.0" encoding="utf-8"?><rfc1807 xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema"><bib-version>v2</bib-version><id>65941</id><entry>2024-04-03</entry><title>Data mining patterns of receptor-drug interactions across vast biological and chemical space</title><swanseaauthors><author><sid>c8d1e374a823863aae5d0dfaec19c7b5</sid><ORCID>0009-0008-3386-2965</ORCID><firstname>James</firstname><surname>Witts</surname><name>James Witts</name><active>true</active><ethesisStudent>false</ethesisStudent></author></swanseaauthors><date>2024-04-03</date><deptcode>MEDS</deptcode><abstract>The aim of the project was to determine how machine learning tools can assist in the process of drug discovery and in silico screening. As the development costs and attrition rates for candidate compounds can be high, a method of predicting likelihood for approval or for therapeutic promise with machine learning and high performance computing tools could be of great benefit to medical researchers, biotech, chemical and pharmaceutical industries. The first phase of the project was to determine if knowledge of the recorded in vitro protein interactions of particular compounds (listed in DrugBank and ToxCast) would be sufficient to determine whether a candidate compound could be designated as having a good (i.e approved drug) or a bad (i.e toxic) profile. The learning models assessed showed promise in correctly designating candidate compounds based on a small number of proteins used for pharmacological profiling, with over 90% overall profiling prediction accuracy in the best case, however the vast majority of interactions between compounds and proteins are unknown and so a predictive approach is needed. The second phase of the project was to provide a method for predicting these hitherto unknown interactions, by predicting protein-compound interaction pairs through clustering techniques. 
Several clustering methods based on protein and compound similarity measurement techniques were investigated and found that when tested on blind in vitro interactions, approximately half on average were detected successfully, highlighting the promising potential to strengthen the predictive profiling models. The third and final phase of the project focused on the development of a compound and protein target prediction interface, TargetPredict (http://proteins.swan.ac.uk/cheminf/), which incorporates the data and methodologies presented throughout the whole project into a single centralised source, to provide the sector for the first time with a unified tool, avoiding the onerousness of current approaches that either require the use of multiple often incompatible websites or require extensive coding experience.</abstract><type>E-Thesis</type><journal/><volume/><journalNumber/><paginationStart/><paginationEnd/><publisher/><placeOfPublication>Swansea, Wales, UK</placeOfPublication><isbnPrint/><isbnElectronic/><issnPrint/><issnElectronic/><keywords>Data mining, clustering, classification, similarity</keywords><publishedDay>10</publishedDay><publishedMonth>7</publishedMonth><publishedYear>2020</publishedYear><publishedDate>2020-07-10</publishedDate><doi>10.23889/SUthesis.65941</doi><url/><notes/><college>COLLEGE NANME</college><department>Medical School</department><CollegeCode>COLLEGE CODE</CollegeCode><DepartmentCode>MEDS</DepartmentCode><institution>Swansea University</institution><supervisor>Mullins, Jonathan G. 
; Iago, Heledd F.</supervisor><degreelevel>Doctoral</degreelevel><degreename>Ph.D</degreename><degreesponsorsfunders>Moleculomics Ltd.</degreesponsorsfunders><apcterm/><funders/><projectreference/><lastEdited>2024-06-05T10:43:34.3050963</lastEdited><Created>2024-04-03T17:07:23.5245481</Created><path><level id="1">Faculty of Medicine, Health and Life Sciences</level><level id="2">Swansea University Medical School - Biomedical Science</level></path><authors><author><firstname>James</firstname><surname>Witts</surname><orcid>0009-0008-3386-2965</orcid><order>1</order></author></authors><documents><document><filename>65941__29905__c3f67d223085450e87d0d2e2a18cee26.pdf</filename><originalFilename>Witts_James_A_PhD_Thesis_Final_Cronfa.pdf</originalFilename><uploaded>2024-04-03T17:35:18.3798499</uploaded><type>Output</type><contentLength>14132772</contentLength><contentType>application/pdf</contentType><version>E-Thesis – open access</version><cronfaStatus>true</cronfaStatus><documentNotes>Copyright: The author, James A. Witts, 2020.</documentNotes><copyrightCorrect>true</copyrightCorrect><language>eng</language></document></documents><OutputDurs/></rfc1807> |
title | Data mining patterns of receptor-drug interactions across vast biological and chemical space |
spellingShingle | Data mining patterns of receptor-drug interactions across vast biological and chemical space James Witts |
title_short | Data mining patterns of receptor-drug interactions across vast biological and chemical space |
title_full | Data mining patterns of receptor-drug interactions across vast biological and chemical space |
title_fullStr | Data mining patterns of receptor-drug interactions across vast biological and chemical space |
title_full_unstemmed | Data mining patterns of receptor-drug interactions across vast biological and chemical space |
title_sort | Data mining patterns of receptor-drug interactions across vast biological and chemical space |
author_id_str_mv | c8d1e374a823863aae5d0dfaec19c7b5 |
author_id_fullname_str_mv | c8d1e374a823863aae5d0dfaec19c7b5_***_James Witts |
author | James Witts |
author2 | James Witts |
format | E-Thesis |
publishDate | 2020 |
institution | Swansea University |
doi_str_mv | 10.23889/SUthesis.65941 |
college_str | Faculty of Medicine, Health and Life Sciences |
hierarchytype | |
hierarchy_top_id | facultyofmedicinehealthandlifesciences |
hierarchy_top_title | Faculty of Medicine, Health and Life Sciences |
hierarchy_parent_id | facultyofmedicinehealthandlifesciences |
hierarchy_parent_title | Faculty of Medicine, Health and Life Sciences |
department_str | Swansea University Medical School - Biomedical Science{{{_:::_}}}Faculty of Medicine, Health and Life Sciences{{{_:::_}}}Swansea University Medical School - Biomedical Science |
document_store_str | 1 |
active_str | 0 |
description | The aim of the project was to determine how machine learning tools can assist in the process of drug discovery and in silico screening. As the development costs and attrition rates for candidate compounds can be high, a method of predicting likelihood of approval or therapeutic promise with machine learning and high performance computing tools could be of great benefit to medical researchers and to the biotech, chemical and pharmaceutical industries. The first phase of the project was to determine whether knowledge of the recorded in vitro protein interactions of particular compounds (listed in DrugBank and ToxCast) would be sufficient to designate a candidate compound as having a good (i.e. approved drug) or a bad (i.e. toxic) profile. The learning models assessed showed promise in correctly designating candidate compounds based on a small number of proteins used for pharmacological profiling, with over 90% overall profiling prediction accuracy in the best case. However, the vast majority of interactions between compounds and proteins are unknown, so a predictive approach is needed. The second phase of the project was to provide a method for predicting these hitherto unknown interactions by predicting protein-compound interaction pairs through clustering techniques. Several clustering methods based on protein and compound similarity measurement techniques were investigated; when tested on blind in vitro interactions, approximately half on average were detected successfully, highlighting the potential to strengthen the predictive profiling models. The third and final phase of the project focused on the development of a compound and protein target prediction interface, TargetPredict (http://proteins.swan.ac.uk/cheminf/), which incorporates the data and methodologies presented throughout the project into a single centralised source. This provides the sector for the first time with a unified tool, avoiding the onerousness of current approaches, which require either the use of multiple, often incompatible websites or extensive coding experience. |
published_date | 2020-07-10T10:43:34Z |
_version_ | 1801013810099650560 |
score | 11.03559 |