Journal article

Sentence Graph Attention for Content-Aware Summarization

Giovanni Siragusa, Livio Robaldo

Applied Sciences, Volume: 12, Issue: 20, Start page: 10382

Swansea University Author: Livio Robaldo

  • applsci-12-10382(1).pdf

    PDF | Version of Record

    © 2022 by the authors. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license

    Download (858.45KB)

DOI (Published version): 10.3390/app122010382

Abstract

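Neural network-based encoder–decoder (ED) models are widely used for abstractive text summarization. While the encoder first reads the source document and embeds salient information, the decoder starts from such encoding to generate the summary word-by-word. However, the drawback of the ED model is that it treats words and sentences equally, without discerning the most relevant ones from the others. Many researchers have investigated this problem and provided different solutions. In this paper, we define a sentence-level attention mechanism based on the well-known PageRank algorithm to find the relevant sentences, then propagate the resulting scores into a second word-level attention layer. We tested the proposed model on the well-known CNN/Dailymail dataset, and found that it was able to generate summaries with a much higher abstractive power than state-of-the-art models, in spite of an unavoidable (but slight) decrease in terms of the Rouge scores.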

Published in: Applied Sciences
ISSN: 2076-3417
Published: MDPI AG 2022

URI: https://cronfa.swan.ac.uk/Record/cronfa61559
first_indexed 2022-10-15T10:43:47Z
last_indexed 2023-01-13T19:22:23Z
id cronfa61559
recordtype SURis
fullrecord <?xml version="1.0"?><rfc1807><datestamp>2022-10-27T15:03:55.0934551</datestamp><bib-version>v2</bib-version><id>61559</id><entry>2022-10-15</entry><title>Sentence Graph Attention for Content-Aware Summarization</title><swanseaauthors><author><sid>b711cf9f3a7821ec52bd1e53b4f6cf9e</sid><ORCID>0000-0003-4713-8990</ORCID><firstname>Livio</firstname><surname>Robaldo</surname><name>Livio Robaldo</name><active>true</active><ethesisStudent>false</ethesisStudent></author></swanseaauthors><date>2022-10-15</date><deptcode>HRCL</deptcode><abstract>Neural network-based encoder&#x2013;decoder (ED) models are widely used for abstractive text summarization. While the encoder first reads the source document and embeds salient information, the decoder starts from such encoding to generate the summary word-by-word. However, the drawback of the ED model is that it treats words and sentences equally, without discerning the most relevant ones from the others. Many researchers have investigated this problem and provided different solutions. In this paper, we define a sentence-level attention mechanism based on the well-known PageRank algorithm to find the relevant sentences, then propagate the resulting scores into a second word-level attention layer. 
We tested the proposed model on the well-known CNN/Dailymail dataset, and found that it was able to generate summaries with a much higher abstractive power than state-of-the-art models, in spite of an unavoidable (but slight) decrease in terms of the Rouge scores.</abstract><type>Journal Article</type><journal>Applied Sciences</journal><volume>12</volume><journalNumber>20</journalNumber><paginationStart>10382</paginationStart><paginationEnd/><publisher>MDPI AG</publisher><placeOfPublication/><isbnPrint/><isbnElectronic/><issnPrint/><issnElectronic>2076-3417</issnElectronic><keywords>summarization; knowledge graph; neural networks; pagerank; natural language processing</keywords><publishedDay>14</publishedDay><publishedMonth>10</publishedMonth><publishedYear>2022</publishedYear><publishedDate>2022-10-14</publishedDate><doi>10.3390/app122010382</doi><url/><notes/><college>COLLEGE NANME</college><department>Hillary Rodham Clinton Law School</department><CollegeCode>COLLEGE CODE</CollegeCode><DepartmentCode>HRCL</DepartmentCode><institution>Swansea University</institution><apcterm>Another institution paid the OA fee</apcterm><funders>This research received no external funding.</funders><projectreference/><lastEdited>2022-10-27T15:03:55.0934551</lastEdited><Created>2022-10-15T11:38:34.2131134</Created><path><level id="1">Faculty of Humanities and Social Sciences</level><level id="2">Hilary Rodham Clinton School of 
Law</level></path><authors><author><firstname>Giovanni</firstname><surname>Siragusa</surname><orcid>0000-0002-1797-7956</orcid><order>1</order></author><author><firstname>Livio</firstname><surname>Robaldo</surname><orcid>0000-0003-4713-8990</orcid><order>2</order></author></authors><documents><document><filename>61559__25467__da3628c247fb40e7b04e6b1c79ae4945.pdf</filename><originalFilename>applsci-12-10382(1).pdf</originalFilename><uploaded>2022-10-15T11:41:47.1454486</uploaded><type>Output</type><contentLength>879054</contentLength><contentType>application/pdf</contentType><version>Version of Record</version><cronfaStatus>true</cronfaStatus><documentNotes>&#xA9; 2022 by the authors. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license</documentNotes><copyrightCorrect>true</copyrightCorrect><language>eng</language><licence>https://creativecommons.org/licenses/by/4.0/</licence></document></documents><OutputDurs/></rfc1807>
title Sentence Graph Attention for Content-Aware Summarization
author_id_str_mv b711cf9f3a7821ec52bd1e53b4f6cf9e
author_id_fullname_str_mv b711cf9f3a7821ec52bd1e53b4f6cf9e_***_Livio Robaldo
author Livio Robaldo
author2 Giovanni Siragusa
Livio Robaldo
format Journal article
container_title Applied Sciences
container_volume 12
container_issue 20
container_start_page 10382
publishDate 2022
institution Swansea University
issn 2076-3417
doi_str_mv 10.3390/app122010382
publisher MDPI AG
college_str Faculty of Humanities and Social Sciences
hierarchytype
hierarchy_top_id facultyofhumanitiesandsocialsciences
hierarchy_top_title Faculty of Humanities and Social Sciences
hierarchy_parent_id facultyofhumanitiesandsocialsciences
hierarchy_parent_title Faculty of Humanities and Social Sciences
department_str Hillary Rodham Clinton School of Law; Faculty of Humanities and Social Sciences
document_store_str 1
active_str 0
description Neural network-based encoder–decoder (ED) models are widely used for abstractive text summarization. While the encoder first reads the source document and embeds salient information, the decoder starts from such encoding to generate the summary word-by-word. However, the drawback of the ED model is that it treats words and sentences equally, without discerning the most relevant ones from the others. Many researchers have investigated this problem and provided different solutions. In this paper, we define a sentence-level attention mechanism based on the well-known PageRank algorithm to find the relevant sentences, then propagate the resulting scores into a second word-level attention layer. We tested the proposed model on the well-known CNN/Dailymail dataset, and found that it was able to generate summaries with a much higher abstractive power than state-of-the-art models, in spite of an unavoidable (but slight) decrease in terms of the Rouge scores.
published_date 2022-10-14T05:29:18Z