Journal article
Sentence Graph Attention for Content-Aware Summarization
Applied Sciences, Volume: 12, Issue: 20, Start page: 10382
Swansea University Author: Livio Robaldo
PDF | Version of Record
© 2022 by the authors. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license
DOI (Published version): 10.3390/app122010382
Abstract
Neural network-based encoder–decoder (ED) models are widely used for abstractive text summarization. While the encoder first reads the source document and embeds salient information, the decoder starts from such encoding to generate the summary word-by-word. However, the drawback of the ED model is that it treats words and sentences equally, without discerning the most relevant ones from the others. Many researchers have investigated this problem and provided different solutions. In this paper, we define a sentence-level attention mechanism based on the well-known PageRank algorithm to find the relevant sentences, then propagate the resulting scores into a second word-level attention layer. We tested the proposed model on the well-known CNN/Dailymail dataset, and found that it was able to generate summaries with a much higher abstractive power than state-of-the-art models, in spite of an unavoidable (but slight) decrease in terms of the Rouge scores.
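The mechanism described in the abstract can be illustrated with a minimal sketch: score sentences with PageRank over a sentence-similarity graph, then rescale a word-level attention distribution by each word's sentence score. This is not the authors' implementation — the similarity function (a simple bag-of-words cosine), the damping factor, and the renormalization step are illustrative assumptions.

```python
# Sketch (assumptions, not the paper's code): PageRank over a
# sentence-similarity graph, with the resulting sentence scores
# propagated into a word-level attention layer.
import math
from collections import Counter

def cosine(a, b):
    """Bag-of-words cosine similarity between two token lists."""
    ca, cb = Counter(a), Counter(b)
    dot = sum(ca[w] * cb[w] for w in ca)
    na = math.sqrt(sum(v * v for v in ca.values()))
    nb = math.sqrt(sum(v * v for v in cb.values()))
    return dot / (na * nb) if na and nb else 0.0

def pagerank_scores(sentences, d=0.85, iters=50):
    """Power-iteration PageRank on the sentence-similarity graph.

    Edge weight between sentences i and j is their cosine similarity;
    rows are normalised into transition probabilities.
    """
    n = len(sentences)
    sim = [[cosine(si, sj) if i != j else 0.0
            for j, sj in enumerate(sentences)]
           for i, si in enumerate(sentences)]
    out = [sum(row) for row in sim]  # outgoing weight per sentence
    scores = [1.0 / n] * n
    for _ in range(iters):
        scores = [(1 - d) / n + d * sum(
            scores[j] * (sim[j][i] / out[j])
            for j in range(n) if out[j] > 0)
            for i in range(n)]
    return scores

def content_aware_word_attention(word_attn, word_sent_scores):
    """Rescale each word's attention by its sentence's PageRank score.

    word_sent_scores[k] is the score of the sentence containing word k;
    the product is renormalised to a distribution.
    """
    raw = [a * s for a, s in zip(word_attn, word_sent_scores)]
    z = sum(raw)
    return [r / z for r in raw]
```

In a full ED model these scores would modulate learned attention weights inside the decoder; here the rescaling is shown on a plain probability vector to keep the sketch self-contained.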
| Published in: | Applied Sciences |
|---|---|
| ISSN: | 2076-3417 |
| Published: | MDPI AG, 2022 |
| Online Access: | Check full text |
| URI: | https://cronfa.swan.ac.uk/Record/cronfa61559 |
Authors: Giovanni Siragusa (ORCID: 0000-0002-1797-7956), Livio Robaldo (ORCID: 0000-0003-4713-8990)
Keywords: summarization; knowledge graph; neural networks; PageRank; natural language processing
Published: 14 October 2022
Affiliation: Hilary Rodham Clinton School of Law, Faculty of Humanities and Social Sciences, Swansea University
Funding: This research received no external funding; another institution paid the open-access fee.
Licence: CC BY 4.0 (https://creativecommons.org/licenses/by/4.0/)