TLGP: a flexible transfer learning algorithm for gene prioritization based on heterogeneous source domain

Deng, Jingjing; Wang, Yan; Xia, Zuheng; Xie, Xianghua; Gong, Maoguo

doi:10.1186/s12859-021-04190-9

Journal article

Jingjing Deng, Yan Wang, Zuheng Xia, Xianghua Xie, Maoguo Gong

BMC Bioinformatics, Volume: 22, Issue: S9

Swansea University Authors: Jingjing Deng, Xianghua Xie

PDF | Version of Record

© The Author(s) 2021. This article is licensed under a Creative Commons Attribution 4.0 International License

DOI (Published version): 10.1186/s12859-021-04190-9

Abstract

Background: Gene prioritization (gene ranking) aims to obtain the centrality of genes, which is critical for cancer diagnosis and therapy since key genes correspond to biomarkers or drug targets. Great efforts have been devoted to the gene ranking problem by exploring the similarity between c...

Published in:	BMC Bioinformatics
ISSN:	1471-2105
Published:	Springer Science and Business Media LLC 2021
Online Access:	Check full text
URI:	https://cronfa.swan.ac.uk/Record/cronfa57704

first_indexed	2021-08-29T17:44:00Z
last_indexed	2023-01-11T14:37:47Z
id	cronfa57704
recordtype	SURis
fullrecord	<?xml version="1.0"?><rfc1807><datestamp>2022-10-28T15:51:57.5462143</datestamp><bib-version>v2</bib-version><id>57704</id><entry>2021-08-29</entry><title>TLGP: a flexible transfer learning algorithm for gene prioritization based on heterogeneous source domain</title><swanseaauthors><author><sid>6f6d01d585363d6dc1622640bb4fcb3f</sid><firstname>Jingjing</firstname><surname>Deng</surname><name>Jingjing Deng</name><active>true</active><ethesisStudent>false</ethesisStudent></author><author><sid>b334d40963c7a2f435f06d2c26c74e11</sid><ORCID>0000-0002-2701-8660</ORCID><firstname>Xianghua</firstname><surname>Xie</surname><name>Xianghua Xie</name><active>true</active><ethesisStudent>false</ethesisStudent></author></swanseaauthors><date>2021-08-29</date><deptcode>MACS</deptcode><abstract>BackgroundGene prioritization (gene ranking) aims to obtain the centrality of genes, which is critical for cancer diagnosis and therapy since keys genes correspond to the biomarkers or targets of drugs. Great efforts have been devoted to the gene ranking problem by exploring the similarity between candidate and known disease-causing genes. However, when the number of disease-causing genes is limited, they are not applicable largely due to the low accuracy. Actually, the number of disease-causing genes for cancers, particularly for these rare cancers, are really limited. Therefore, there is a critical needed to design effective and efficient algorithms for gene ranking with limited prior disease-causing genes.ResultsIn this study, we propose a transfer learning based algorithm for gene prioritization (called TLGP) in the cancer (target domain) without disease-causing genes by transferring knowledge from other cancers (source domain). The underlying assumption is that knowledge shared by similar cancers improves the accuracy of gene prioritization. Specifically, TLGP first quantifies the similarity between the target and source domain by calculating the affinity matrix for genes. 
Then, TLGP automatically learns a fusion network for the target cancer by fusing affinity matrix, pathogenic genes and genomic data of source cancers. Finally, genes in the target cancer are prioritized. The experimental results indicate that the learnt fusion network is more reliable than gene co-expression network, implying that transferring knowledge from other cancers improves the accuracy of network construction. Moreover, TLGP outperforms state-of-the-art approaches in terms of accuracy, improving at least 5%.ConclusionThe proposed model and method provide an effective and efficient strategy for gene ranking by integrating genomic data from various cancers.</abstract><type>Journal Article</type><journal>BMC Bioinformatics</journal><volume>22</volume><journalNumber>S9</journalNumber><paginationStart/><paginationEnd/><publisher>Springer Science and Business Media LLC</publisher><placeOfPublication/><isbnPrint/><isbnElectronic/><issnPrint/><issnElectronic>1471-2105</issnElectronic><keywords>Gene prioritizatio, Transfer learning, Gene co-expression network, Integrative analysis</keywords><publishedDay>25</publishedDay><publishedMonth>8</publishedMonth><publishedYear>2021</publishedYear><publishedDate>2021-08-25</publishedDate><doi>10.1186/s12859-021-04190-9</doi><url/><notes/><college>COLLEGE NANME</college><department>Mathematics and Computer Science School</department><CollegeCode>COLLEGE CODE</CollegeCode><DepartmentCode>MACS</DepartmentCode><institution>Swansea University</institution><apcterm>Another institution paid the OA fee</apcterm><funders>This work was supported by the National Natural Science Foundation of China with No. 61772394 (XM) and Scientifc Research Foundation for the Returned Overseas Chinese Scholars of Shaanxi Province with No. 2018003 (XM). Publication costs are founded by National Natural Science Foundation of China (No. 
61772394)</funders><projectreference/><lastEdited>2022-10-28T15:51:57.5462143</lastEdited><Created>2021-08-29T18:41:49.9271161</Created><path><level id="1">Faculty of Science and Engineering</level><level id="2">School of Mathematics and Computer Science - Computer Science</level></path><authors><author><firstname>Jingjing</firstname><surname>Deng</surname><order>1</order></author><author><firstname>Yan</firstname><surname>Wang</surname><order>2</order></author><author><firstname>Zuheng</firstname><surname>Xia</surname><order>3</order></author><author><firstname>Xianghua</firstname><surname>Xie</surname><orcid>0000-0002-2701-8660</orcid><order>4</order></author><author><firstname>Maoguo</firstname><surname>Gong</surname><order>5</order></author></authors><documents><document><filename>57704__20712__3ca1e183fba1499c934711668de7814f.pdf</filename><originalFilename>s12859-021-04190-9.pdf</originalFilename><uploaded>2021-08-29T18:43:34.4302066</uploaded><type>Output</type><contentLength>1319713</contentLength><contentType>application/pdf</contentType><version>Version of Record</version><cronfaStatus>true</cronfaStatus><documentNotes>© The Author(s) 2021. This article is licensed under a Creative Commons Attribution 4.0 International License</documentNotes><copyrightCorrect>true</copyrightCorrect><language>eng</language><licence>http://creativecommons.org/licenses/by/4.0/</licence></document></documents><OutputDurs/></rfc1807>
spelling	2022-10-28T15:51:57.5462143 v2 57704 2021-08-29 TLGP: a flexible transfer learning algorithm for gene prioritization based on heterogeneous source domain 6f6d01d585363d6dc1622640bb4fcb3f Jingjing Deng Jingjing Deng true false b334d40963c7a2f435f06d2c26c74e11 0000-0002-2701-8660 Xianghua Xie Xianghua Xie true false 2021-08-29 MACS BackgroundGene prioritization (gene ranking) aims to obtain the centrality of genes, which is critical for cancer diagnosis and therapy since keys genes correspond to the biomarkers or targets of drugs. Great efforts have been devoted to the gene ranking problem by exploring the similarity between candidate and known disease-causing genes. However, when the number of disease-causing genes is limited, they are not applicable largely due to the low accuracy. Actually, the number of disease-causing genes for cancers, particularly for these rare cancers, are really limited. Therefore, there is a critical needed to design effective and efficient algorithms for gene ranking with limited prior disease-causing genes.ResultsIn this study, we propose a transfer learning based algorithm for gene prioritization (called TLGP) in the cancer (target domain) without disease-causing genes by transferring knowledge from other cancers (source domain). The underlying assumption is that knowledge shared by similar cancers improves the accuracy of gene prioritization. Specifically, TLGP first quantifies the similarity between the target and source domain by calculating the affinity matrix for genes. Then, TLGP automatically learns a fusion network for the target cancer by fusing affinity matrix, pathogenic genes and genomic data of source cancers. Finally, genes in the target cancer are prioritized. The experimental results indicate that the learnt fusion network is more reliable than gene co-expression network, implying that transferring knowledge from other cancers improves the accuracy of network construction. 
Moreover, TLGP outperforms state-of-the-art approaches in terms of accuracy, improving at least 5%.ConclusionThe proposed model and method provide an effective and efficient strategy for gene ranking by integrating genomic data from various cancers. Journal Article BMC Bioinformatics 22 S9 Springer Science and Business Media LLC 1471-2105 Gene prioritizatio, Transfer learning, Gene co-expression network, Integrative analysis 25 8 2021 2021-08-25 10.1186/s12859-021-04190-9 COLLEGE NANME Mathematics and Computer Science School COLLEGE CODE MACS Swansea University Another institution paid the OA fee This work was supported by the National Natural Science Foundation of China with No. 61772394 (XM) and Scientifc Research Foundation for the Returned Overseas Chinese Scholars of Shaanxi Province with No. 2018003 (XM). Publication costs are founded by National Natural Science Foundation of China (No. 61772394) 2022-10-28T15:51:57.5462143 2021-08-29T18:41:49.9271161 Faculty of Science and Engineering School of Mathematics and Computer Science - Computer Science Jingjing Deng 1 Yan Wang 2 Zuheng Xia 3 Xianghua Xie 0000-0002-2701-8660 4 Maoguo Gong 5 57704__20712__3ca1e183fba1499c934711668de7814f.pdf s12859-021-04190-9.pdf 2021-08-29T18:43:34.4302066 Output 1319713 application/pdf Version of Record true © The Author(s) 2021. This article is licensed under a Creative Commons Attribution 4.0 International License true eng http://creativecommons.org/licenses/by/4.0/
title	TLGP: a flexible transfer learning algorithm for gene prioritization based on heterogeneous source domain
spellingShingle	TLGP: a flexible transfer learning algorithm for gene prioritization based on heterogeneous source domain Jingjing Deng Xianghua Xie
title_short	TLGP: a flexible transfer learning algorithm for gene prioritization based on heterogeneous source domain
title_full	TLGP: a flexible transfer learning algorithm for gene prioritization based on heterogeneous source domain
title_fullStr	TLGP: a flexible transfer learning algorithm for gene prioritization based on heterogeneous source domain
title_full_unstemmed	TLGP: a flexible transfer learning algorithm for gene prioritization based on heterogeneous source domain
title_sort	TLGP: a flexible transfer learning algorithm for gene prioritization based on heterogeneous source domain
author_id_str_mv	6f6d01d585363d6dc1622640bb4fcb3f b334d40963c7a2f435f06d2c26c74e11
author_id_fullname_str_mv	6f6d01d585363d6dc1622640bb4fcb3f_*_Jingjing Deng b334d40963c7a2f435f06d2c26c74e11_*_Xianghua Xie
author	Jingjing Deng Xianghua Xie
author2	Jingjing Deng Yan Wang Zuheng Xia Xianghua Xie Maoguo Gong
format	Journal article
container_title	BMC Bioinformatics
container_volume	22
container_issue	S9
publishDate	2021
institution	Swansea University
issn	1471-2105
doi_str_mv	10.1186/s12859-021-04190-9
publisher	Springer Science and Business Media LLC
college_str	Faculty of Science and Engineering
hierarchytype
hierarchy_top_id	facultyofscienceandengineering
hierarchy_top_title	Faculty of Science and Engineering
hierarchy_parent_id	facultyofscienceandengineering
hierarchy_parent_title	Faculty of Science and Engineering
department_str	School of Mathematics and Computer Science - Computer Science{{{_:::_}}}Faculty of Science and Engineering{{{_:::_}}}School of Mathematics and Computer Science - Computer Science
document_store_str	1
active_str	0
description	Background: Gene prioritization (gene ranking) aims to obtain the centrality of genes, which is critical for cancer diagnosis and therapy since key genes correspond to biomarkers or drug targets. Great efforts have been devoted to the gene ranking problem by exploring the similarity between candidate and known disease-causing genes. However, when the number of disease-causing genes is limited, these methods are largely inapplicable because of their low accuracy. In practice, the number of disease-causing genes for cancers, particularly rare cancers, is very limited. Therefore, there is a critical need to design effective and efficient algorithms for gene ranking with limited prior disease-causing genes. Results: In this study, we propose a transfer learning based algorithm for gene prioritization (called TLGP) for a cancer (target domain) without known disease-causing genes, by transferring knowledge from other cancers (source domain). The underlying assumption is that knowledge shared by similar cancers improves the accuracy of gene prioritization. Specifically, TLGP first quantifies the similarity between the target and source domains by calculating the affinity matrix for genes. Then, TLGP automatically learns a fusion network for the target cancer by fusing the affinity matrix, pathogenic genes and genomic data of the source cancers. Finally, genes in the target cancer are prioritized. The experimental results indicate that the learnt fusion network is more reliable than the gene co-expression network, implying that transferring knowledge from other cancers improves the accuracy of network construction. Moreover, TLGP outperforms state-of-the-art approaches in terms of accuracy, with an improvement of at least 5%. Conclusion: The proposed model and method provide an effective and efficient strategy for gene ranking by integrating genomic data from various cancers.
published_date	2021-08-25T04:55:59Z
_version_	1858705984553222144
score	11.09817

Similar Items