SwanNLP at SemEval-2026 Task 5: An LLM-based Framework for Plausibility Scoring in Narrative Word Sense Disambiguation

Sumanathilaka, Deshan; Micallef, Nicholas; Hough, Julian; Galgodage Don, Saman

Conference Paper/Proceeding/Abstract

Deshan Sumanathilaka, Nicholas Micallef, Julian Hough, Saman Galgodage Don

Proceedings of the 20th International Workshop on Semantic Evaluation (SemEval-2026)

Swansea University Authors: Deshan Sumanathilaka, Nicholas Micallef, Julian Hough, Saman Galgodage Don

Abstract

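Recent advances in language models have substantially improved Natural Language Understanding (NLU). Although widely used benchmarks suggest that Large Language Models (LLMs) can effectively disambiguate, their practical applicability in real-world narrative contexts remains underexplored. SemEval-2026 Task 5 addresses this gap by introducing a task that predicts the human-perceived plausibility of a word sense within a short story. In this work, we propose an LLM-based framework for plausibility scoring of homonymous word senses in narrative texts using a structured reasoning mechanism. We examine the impact of fine-tuning low-parameter LLMs with diverse reasoning strategies, alongside dynamic few-shot prompting for large-parameter models, on accurate sense identification and plausibility estimation. Our results show that commercial large-parameter LLMs with dynamic few-shot prompting closely replicate human-like plausibility judgments. Furthermore, model ensembling slightly improves performance, better simulating the agreement patterns of five human annotators compared to single-model predictions.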

Published in:	Proceedings of the 20th International Workshop on Semantic Evaluation (SemEval-2026)
Published:	San Diego, California, United States: ACL Anthology
URI:	https://cronfa.swan.ac.uk/Record/cronfa71784

first_indexed	2026-04-22T16:08:42Z
last_indexed	2026-05-16T05:22:44Z
id	cronfa71784
recordtype	SURis
fullrecord	<?xml version="1.0"?><rfc1807><datestamp>2026-05-15T13:14:54.3983083</datestamp><bib-version>v2</bib-version><id>71784</id><entry>2026-04-22</entry><title>SwanNLP at SemEval-2026 Task 5: An LLM-based Framework for Plausibility Scoring in Narrative Word Sense Disambiguation</title><swanseaauthors><author><sid>2fe44f0c1e7d845dc21bb6b00d5b2085</sid><ORCID>0009-0005-8933-6559</ORCID><firstname>Deshan</firstname><surname>Sumanathilaka</surname><name>Deshan Sumanathilaka</name><active>true</active><ethesisStudent>false</ethesisStudent></author><author><sid>1cc4c84582d665b7ee08fb16f5454671</sid><ORCID>0000-0002-2683-8042</ORCID><firstname>Nicholas</firstname><surname>Micallef</surname><name>Nicholas Micallef</name><active>true</active><ethesisStudent>false</ethesisStudent></author><author><sid>082d773ae261d2bbf49434dd2608ab40</sid><ORCID>0000-0002-4345-6759</ORCID><firstname>Julian</firstname><surname>Hough</surname><name>Julian Hough</name><active>true</active><ethesisStudent>false</ethesisStudent></author><author><sid>116b635ad8617f7bb8bb56ac9d3b72b6</sid><ORCID/><firstname>Saman</firstname><surname>Galgodage Don</surname><name>Saman Galgodage Don</name><active>true</active><ethesisStudent>false</ethesisStudent></author></swanseaauthors><date>2026-04-22</date><deptcode>MACS</deptcode><abstract>Recent advances in language models have substantially improved Natural Language Understanding (NLU). Although widely used benchmarks suggest that Large Language Models (LLMs) can effectively disambiguate, their practical applicability in real-world narrative contexts remains underexplored. SemEval-2026 Task 5 addresses this gap by introducing a task that predicts the human-perceived plausibility of a word sense within a short story. In this work, we propose an LLM-based framework for plausibility scoring of homonymous word senses in narrative texts using a structured reasoning mechanism. 
We examine the impact of fine-tuning low-parameter LLMs with diverse reasoning strategies, alongside dynamic few-shot prompting for large-parameter models, on accurate sense identification and plausibility estimation. Our results show that commercial large-parameter LLMs with dynamic few-shot prompting closely replicate human-like plausibility judgments. Furthermore, model ensembling slightly improves performance, better simulating the agreement patterns of five human annotators compared to single-model predictions.</abstract><type>Conference Paper/Proceeding/Abstract</type><journal>Proceedings of the 20th International Workshop on Semantic Evaluation (SemEval-2026)</journal><volume/><journalNumber/><paginationStart/><paginationEnd/><publisher>ACL Anthology</publisher><placeOfPublication>San Diego, California, United States</placeOfPublication><isbnPrint/><isbnElectronic/><issnPrint/><issnElectronic/><keywords/><publishedDay>0</publishedDay><publishedMonth>0</publishedMonth><publishedYear>0</publishedYear><publishedDate>0001-01-01</publishedDate><doi/><url/><notes/><college>COLLEGE NANME</college><department>Mathematics and Computer Science School</department><CollegeCode>COLLEGE CODE</CollegeCode><DepartmentCode>MACS</DepartmentCode><institution>Swansea University</institution><apcterm/><funders/><projectreference/><lastEdited>2026-05-15T13:14:54.3983083</lastEdited><Created>2026-04-22T17:03:40.7602233</Created><path><level id="1">Faculty of Science and Engineering</level><level id="2">School of Mathematics and Computer Science - Computer 
Science</level></path><authors><author><firstname>Deshan</firstname><surname>Sumanathilaka</surname><orcid>0009-0005-8933-6559</orcid><order>1</order></author><author><firstname>Nicholas</firstname><surname>Micallef</surname><orcid>0000-0002-2683-8042</orcid><order>2</order></author><author><firstname>Julian</firstname><surname>Hough</surname><orcid>0000-0002-4345-6759</orcid><order>3</order></author><author><firstname>Saman</firstname><surname>Galgodage Don</surname><orcid/><order>4</order></author></authors><documents/><OutputDurs/></rfc1807>
title	SwanNLP at SemEval-2026 Task 5: An LLM-based Framework for Plausibility Scoring in Narrative Word Sense Disambiguation
author_id_str_mv	2fe44f0c1e7d845dc21bb6b00d5b2085 1cc4c84582d665b7ee08fb16f5454671 082d773ae261d2bbf49434dd2608ab40 116b635ad8617f7bb8bb56ac9d3b72b6
author_id_fullname_str_mv	2fe44f0c1e7d845dc21bb6b00d5b2085_*_Deshan Sumanathilaka 1cc4c84582d665b7ee08fb16f5454671_*_Nicholas Micallef 082d773ae261d2bbf49434dd2608ab40_*_Julian Hough 116b635ad8617f7bb8bb56ac9d3b72b6_*_Saman Galgodage Don
author	Deshan Sumanathilaka Nicholas Micallef Julian Hough Saman Galgodage Don
format	Conference Paper/Proceeding/Abstract
container_title	Proceedings of the 20th International Workshop on Semantic Evaluation (SemEval-2026)
institution	Swansea University
publisher	ACL Anthology
college_str	Faculty of Science and Engineering
hierarchytype
hierarchy_top_id	facultyofscienceandengineering
hierarchy_top_title	Faculty of Science and Engineering
hierarchy_parent_id	facultyofscienceandengineering
hierarchy_parent_title	Faculty of Science and Engineering
department_str	School of Mathematics and Computer Science - Computer Science{{{_:::_}}}Faculty of Science and Engineering{{{_:::_}}}School of Mathematics and Computer Science - Computer Science
document_store_str	0
active_str	0
description	Recent advances in language models have substantially improved Natural Language Understanding (NLU). Although widely used benchmarks suggest that Large Language Models (LLMs) can effectively disambiguate, their practical applicability in real-world narrative contexts remains underexplored. SemEval-2026 Task 5 addresses this gap by introducing a task that predicts the human-perceived plausibility of a word sense within a short story. In this work, we propose an LLM-based framework for plausibility scoring of homonymous word senses in narrative texts using a structured reasoning mechanism. We examine the impact of fine-tuning low-parameter LLMs with diverse reasoning strategies, alongside dynamic few-shot prompting for large-parameter models, on accurate sense identification and plausibility estimation. Our results show that commercial large-parameter LLMs with dynamic few-shot prompting closely replicate human-like plausibility judgments. Furthermore, model ensembling slightly improves performance, better simulating the agreement patterns of five human annotators compared to single-model predictions.
published_date	0001-01-01T07:25:26Z
_version_	1866955936486129664
score	11.106836

Similar Items