Journal article
Re-Engineered Word Embeddings for Improved Document-Level Sentiment Analysis
Applied Sciences, Volume: 12, Issue: 18, Start page: 9287
Swansea University Author: Scott Yang
PDF | Version of Record
Copyright: © 2022 by the authors. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license.
DOI (Published version): 10.3390/app12189287
Abstract
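In this paper, a novel re-engineering mechanism for the generation of word embeddings is proposed for document-level sentiment analysis. Current approaches to sentiment analysis often integrate feature engineering with classification, without optimizing the feature vectors explicitly. Engineering feature vectors to match the data between the training set and query sample as proposed in this paper could be a promising way for boosting the classification performance in machine learning applications. The proposed mechanism is designed to re-engineer the feature components from a set of embedding vectors for greatly increased between-class separation, hence better leveraging the informative content of the documents. The proposed mechanism was evaluated using four public benchmarking datasets for both two-way and five-way semantic classifications. The resulting embeddings have demonstrated substantially improved performance for a range of sentiment analysis tasks. Tests using all the four datasets achieved by far the best classification results compared with the state-of-the-art.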
Published in: | Applied Sciences |
---|---|
ISSN: | 2076-3417 |
Published: | MDPI AG 2022 |
URI: | https://cronfa.swan.ac.uk/Record/cronfa61289 |
first_indexed | 2022-09-20T15:34:18Z |
---|---|
last_indexed | 2023-01-13T19:21:58Z |
id | cronfa61289 |
recordtype | SURis |
fullrecord |
<?xml version="1.0"?><rfc1807><datestamp>2022-10-12T14:26:59.5698930</datestamp><bib-version>v2</bib-version><id>61289</id><entry>2022-09-20</entry><title>Re-Engineered Word Embeddings for Improved Document-Level Sentiment Analysis</title><swanseaauthors><author><sid>81dc663ca0e68c60908d35b1d2ec3a9b</sid><ORCID>0000-0002-6618-7483</ORCID><firstname>Scott</firstname><surname>Yang</surname><name>Scott Yang</name><active>true</active><ethesisStudent>false</ethesisStudent></author></swanseaauthors><date>2022-09-20</date><deptcode>SCS</deptcode><abstract>In this paper, a novel re-engineering mechanism for the generation of word embeddings is proposed for document-level sentiment analysis. Current approaches to sentiment analysis often integrate feature engineering with classification, without optimizing the feature vectors explicitly. Engineering feature vectors to match the data between the training set and query sample as proposed in this paper could be a promising way for boosting the classification performance in machine learning applications. The proposed mechanism is designed to re-engineer the feature components from a set of embedding vectors for greatly increased between-class separation, hence better leveraging the informative content of the documents. The proposed mechanism was evaluated using four public benchmarking datasets for both two-way and five-way semantic classifications. The resulting embeddings have demonstrated substantially improved performance for a range of sentiment analysis tasks. 
Tests using all the four datasets achieved by far the best classification results compared with the state-of-the-art.</abstract><type>Journal Article</type><journal>Applied Sciences</journal><volume>12</volume><journalNumber>18</journalNumber><paginationStart>9287</paginationStart><paginationEnd/><publisher>MDPI AG</publisher><placeOfPublication/><isbnPrint/><isbnElectronic/><issnPrint/><issnElectronic>2076-3417</issnElectronic><keywords>sentiment analysis; semantic classification; feature re-engineering; NLP</keywords><publishedDay>16</publishedDay><publishedMonth>9</publishedMonth><publishedYear>2022</publishedYear><publishedDate>2022-09-16</publishedDate><doi>10.3390/app12189287</doi><url/><notes/><college>COLLEGE NANME</college><department>Computer Science</department><CollegeCode>COLLEGE CODE</CollegeCode><DepartmentCode>SCS</DepartmentCode><institution>Swansea University</institution><apcterm>SU College/Department paid the OA fee</apcterm><funders>Swansea University</funders><projectreference/><lastEdited>2022-10-12T14:26:59.5698930</lastEdited><Created>2022-09-20T16:28:03.8981891</Created><path><level id="1">Faculty of Science and Engineering</level><level id="2">School of Mathematics and Computer Science - Computer Science</level></path><authors><author><firstname>Scott</firstname><surname>Yang</surname><orcid>0000-0002-6618-7483</orcid><order>1</order></author><author><firstname>Farzin</firstname><surname>Deravi</surname><order>2</order></author></authors><documents><document><filename>61289__25165__0dbe035591024cd1b167a0c610d0441d.pdf</filename><originalFilename>61289.VOR.pdf</originalFilename><uploaded>2022-09-20T16:32:20.4680138</uploaded><type>Output</type><contentLength>1861110</contentLength><contentType>application/pdf</contentType><version>Version of Record</version><cronfaStatus>true</cronfaStatus><documentNotes>Copyright: © 2022 by the authors. 
This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license.</documentNotes><copyrightCorrect>true</copyrightCorrect><language>eng</language><licence>https://creativecommons.org/licenses/by/4.0/</licence></document></documents><OutputDurs/></rfc1807> |
spelling |
2022-10-12T14:26:59.5698930 v2 61289 2022-09-20 Re-Engineered Word Embeddings for Improved Document-Level Sentiment Analysis 81dc663ca0e68c60908d35b1d2ec3a9b 0000-0002-6618-7483 Scott Yang Scott Yang true false 2022-09-20 SCS In this paper, a novel re-engineering mechanism for the generation of word embeddings is proposed for document-level sentiment analysis. Current approaches to sentiment analysis often integrate feature engineering with classification, without optimizing the feature vectors explicitly. Engineering feature vectors to match the data between the training set and query sample as proposed in this paper could be a promising way for boosting the classification performance in machine learning applications. The proposed mechanism is designed to re-engineer the feature components from a set of embedding vectors for greatly increased between-class separation, hence better leveraging the informative content of the documents. The proposed mechanism was evaluated using four public benchmarking datasets for both two-way and five-way semantic classifications. The resulting embeddings have demonstrated substantially improved performance for a range of sentiment analysis tasks. Tests using all the four datasets achieved by far the best classification results compared with the state-of-the-art. 
Journal Article Applied Sciences 12 18 9287 MDPI AG 2076-3417 sentiment analysis; semantic classification; feature re-engineering; NLP 16 9 2022 2022-09-16 10.3390/app12189287 COLLEGE NANME Computer Science COLLEGE CODE SCS Swansea University SU College/Department paid the OA fee Swansea University 2022-10-12T14:26:59.5698930 2022-09-20T16:28:03.8981891 Faculty of Science and Engineering School of Mathematics and Computer Science - Computer Science Scott Yang 0000-0002-6618-7483 1 Farzin Deravi 2 61289__25165__0dbe035591024cd1b167a0c610d0441d.pdf 61289.VOR.pdf 2022-09-20T16:32:20.4680138 Output 1861110 application/pdf Version of Record true Copyright: © 2022 by the authors. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license. true eng https://creativecommons.org/licenses/by/4.0/ |
title | Re-Engineered Word Embeddings for Improved Document-Level Sentiment Analysis |
spellingShingle | Re-Engineered Word Embeddings for Improved Document-Level Sentiment Analysis Scott Yang |
title_short | Re-Engineered Word Embeddings for Improved Document-Level Sentiment Analysis |
title_full | Re-Engineered Word Embeddings for Improved Document-Level Sentiment Analysis |
title_fullStr | Re-Engineered Word Embeddings for Improved Document-Level Sentiment Analysis |
title_full_unstemmed | Re-Engineered Word Embeddings for Improved Document-Level Sentiment Analysis |
title_sort | Re-Engineered Word Embeddings for Improved Document-Level Sentiment Analysis |
author_id_str_mv | 81dc663ca0e68c60908d35b1d2ec3a9b |
author_id_fullname_str_mv | 81dc663ca0e68c60908d35b1d2ec3a9b_***_Scott Yang |
author | Scott Yang |
author2 | Scott Yang Farzin Deravi |
format | Journal article |
container_title | Applied Sciences |
container_volume | 12 |
container_issue | 18 |
container_start_page | 9287 |
publishDate | 2022 |
institution | Swansea University |
issn | 2076-3417 |
doi_str_mv | 10.3390/app12189287 |
publisher | MDPI AG |
college_str | Faculty of Science and Engineering |
hierarchytype | |
hierarchy_top_id | facultyofscienceandengineering |
hierarchy_top_title | Faculty of Science and Engineering |
hierarchy_parent_id | facultyofscienceandengineering |
hierarchy_parent_title | Faculty of Science and Engineering |
department_str | School of Mathematics and Computer Science - Computer Science{{{_:::_}}}Faculty of Science and Engineering{{{_:::_}}}School of Mathematics and Computer Science - Computer Science |
document_store_str | 1 |
active_str | 0 |
description | In this paper, a novel re-engineering mechanism for the generation of word embeddings is proposed for document-level sentiment analysis. Current approaches to sentiment analysis often integrate feature engineering with classification, without optimizing the feature vectors explicitly. Engineering feature vectors to match the data between the training set and query sample as proposed in this paper could be a promising way for boosting the classification performance in machine learning applications. The proposed mechanism is designed to re-engineer the feature components from a set of embedding vectors for greatly increased between-class separation, hence better leveraging the informative content of the documents. The proposed mechanism was evaluated using four public benchmarking datasets for both two-way and five-way semantic classifications. The resulting embeddings have demonstrated substantially improved performance for a range of sentiment analysis tasks. Tests using all the four datasets achieved by far the best classification results compared with the state-of-the-art. |
published_date | 2022-09-16T04:20:01Z |
_version_ | 1763754326834020352 |
score | 11.035634 |