Development of a combined database / in silico pipeline for the investigation of novel approaches for precision medicine

ASOMUGHA, HENRY

Swansea University Author: HENRY ASOMUGHA

PDF | E-Thesis – open access

Copyright: The author, Henry Asomugha, 2023.

Published:	Swansea 2023
Institution:	Swansea University
Degree level:	Master of Research
Degree name:	MSc by Research
Supervisor:	Mullins, Jonathan ; Ferla, Salvatore
URI:	https://cronfa.swan.ac.uk/Record/cronfa62661

Abstract:	This project investigates how integrated in silico approaches can be developed to advance precision medicine. A novel bioinformatics database was created and implemented. The database outputs gene and variant information for all known human protein drug targets and also retrieves information on the compounds, disorders and drugs associated with each protein. It was built by writing code, in Linux, that web-scrapes multiple databases and stores the results in the pipeline database. The databases scraped were UniProt, ClinVar, PubChem, ChEMBL, the Guide to Pharmacology, MedGen and the Therapeutic Target Database. These were selected because, for the type of information each was scraped for, they provided the largest and most reliable data that could be web-scraped. Once the code was written, the full pharmacology set (720 pharmacologically relevant proteins whose pharmacological mechanism is known) was added to the pipeline, meaning that all of their information was downloaded and stored. The pipeline proved very effective as an investigative tool for identifying new avenues for personalised medicine, as it retrieved and integrated all the requested information on proteins, variants, diseases and compounds when called upon. In the proof-of-concept study, the database was used to gather key information for an investigation into the effect of pathogenic variants on drug binding in proteins. This was done by simulating the binding of a protein’s wild type to two of its known drug ligands. Ten benign and ten pathogenic variants of the protein were also docked to two drug ligands associated with the protein, and their relative binding energies were collected. This allowed the effects of pathogenic and benign variants on a protein’s binding ability to be compared.
Analysis of the docking simulations showed that in 3 of the 5 proteins studied (60%), more pathogenic than benign mutations returned a binding energy deviating by at least 15% from the wild-type binding energy. These results suggest that a protein’s binding interactions could be affected by polymorphic variation, especially pathogenic variation, although in this case study the difference between the two groups was not statistically significant.
Keywords:	Precision Medicine, Bioinformatics
College:	Faculty of Medicine, Health and Life Sciences
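
The 15% deviation criterion described in the abstract can be illustrated with a minimal Python sketch. This is not the thesis code: the function names and the binding energies are hypothetical, and the sketch assumes the deviation is measured as the absolute difference from the wild-type binding energy, relative to the magnitude of the wild-type value.

```python
from typing import Iterable


def relative_deviation(variant_energy: float, wild_type_energy: float) -> float:
    """Absolute deviation from the wild-type binding energy, as a fraction.

    Assumption: deviation is unsigned, so stronger and weaker binding
    both count as a change.
    """
    if wild_type_energy == 0:
        raise ValueError("wild-type binding energy must be non-zero")
    return abs(variant_energy - wild_type_energy) / abs(wild_type_energy)


def count_deviating(
    variant_energies: Iterable[float],
    wild_type_energy: float,
    threshold: float = 0.15,
) -> int:
    """Count variants deviating from wild type by at least `threshold`."""
    return sum(
        relative_deviation(energy, wild_type_energy) >= threshold
        for energy in variant_energies
    )


# Hypothetical docking scores (kcal/mol) for one protein and one ligand.
wild_type = -8.0
benign = [-7.9, -8.1, -7.5, -8.3, -6.7]
pathogenic = [-6.5, -9.4, -7.8, -5.9, -8.2]

benign_hits = count_deviating(benign, wild_type)
pathogenic_hits = count_deviating(pathogenic, wild_type)
print(f"benign: {benign_hits}, pathogenic: {pathogenic_hits}")
```

With these made-up values, more pathogenic than benign variants cross the threshold, which is the per-protein comparison the abstract reports for 3 of the 5 proteins. The example does not include a significance test because the abstract does not name the one that was used.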

Similar Items