Conference Paper/Proceeding/Abstract

Opportunities and Challenges of Automatic Speech Recognition Systems for Low-Resource Language Speakers

Thomas Reitmaier, Electra Wallington, Dani Kalarikalayil Raju, Ondrej Klejch, Jennifer Pearson, Matt Jones, Peter Bell, Simon Robinson, Jen Pearson

CHI Conference on Human Factors in Computing Systems (CHI '22), April 29–May 5, 2022, New Orleans, LA, USA. ACM, New York, NY, USA, Pages: 1 - 17

Swansea University Authors: Thomas Reitmaier, Matt Jones, Simon Robinson, Jen Pearson

  • chi22-533.pdf

    PDF | Version of Record

    Distributed under the terms of a Creative Commons Attribution 4.0 (CC-BY) Licence.

DOI (Published version): 10.1145/3491102.3517639

Abstract

Automatic Speech Recognition (ASR) researchers are turning their attention towards supporting low-resource languages, such as isiXhosa or Marathi, with only limited training resources. We report and reflect on collaborative research across ASR & HCI to situate ASR-enabled technologies to suit th...

Published in: CHI Conference on Human Factors in Computing Systems (CHI '22), April 29–May 5, 2022, New Orleans, LA, USA. ACM, New York, NY, USA
ISBN: 978-1-4503-9157-3
Published: New York, NY, USA ACM Digital Library 2022
URI: https://cronfa.swan.ac.uk/Record/cronfa59573
first_indexed 2022-03-10T14:58:46Z
last_indexed 2023-01-11T14:40:57Z
id cronfa59573
recordtype SURis
fullrecord <?xml version="1.0"?><rfc1807><datestamp>2022-10-31T14:24:29.6257772</datestamp><bib-version>v2</bib-version><id>59573</id><entry>2022-03-10</entry><title>Opportunities and Challenges of Automatic Speech Recognition Systems for Low-Resource Language Speakers</title><swanseaauthors><author><sid>ccd66b64d11d76b9cd8b28e9d42a0ff0</sid><firstname>Thomas</firstname><surname>Reitmaier</surname><name>Thomas Reitmaier</name><active>true</active><ethesisStudent>false</ethesisStudent></author><author><sid>10b46d7843c2ba53d116ca2ed9abb56e</sid><ORCID>0000-0001-7657-7373</ORCID><firstname>Matt</firstname><surname>Jones</surname><name>Matt Jones</name><active>true</active><ethesisStudent>false</ethesisStudent></author><author><sid>cb3b57a21fa4e48ec633d6ba46455e91</sid><ORCID>0000-0001-9228-006X</ORCID><firstname>Simon</firstname><surname>Robinson</surname><name>Simon Robinson</name><active>true</active><ethesisStudent>false</ethesisStudent></author><author><sid>6d662d9e2151b302ed384b243e2a802f</sid><ORCID>0000-0002-1960-1012</ORCID><firstname>Jen</firstname><surname>Pearson</surname><name>Jen Pearson</name><active>true</active><ethesisStudent>false</ethesisStudent></author></swanseaauthors><date>2022-03-10</date><deptcode>SCS</deptcode><abstract>Automatic Speech Recognition (ASR) researchers are turning their attention towards supporting low-resource languages, such as isiXhosa or Marathi, with only limited training resources. We report and reflect on collaborative research across ASR &amp; HCI to situate ASR-enabled technologies to suit the needs and functions of two communities of low-resource language speakers, on the outskirts of Cape Town, South Africa and in Mumbai, India. We build on longstanding community partnerships and draw on linguistics, media studies and HCI scholarship to guide our research. We demonstrate diverse design methods to: remotely engage participants; collect speech data to test ASR models; and ultimately field-test models with users. 
Reflecting on the research, we identify opportunities, challenges, and use-cases of ASR, in particular to support pervasive use of WhatsApp voice messaging. Finally, we uncover implications for collaborations across ASR &amp; HCI that advance important discussions at CHI surrounding data, ethics, and AI.</abstract><type>Conference Paper/Proceeding/Abstract</type><journal>CHI Conference on Human Factors in Computing Systems (CHI '22), April 29&#x2013;May 5, 2022, New Orleans, LA, USA. ACM, New York, NY, USA</journal><volume/><journalNumber/><paginationStart>1</paginationStart><paginationEnd>17</paginationEnd><publisher>ACM Digital Library</publisher><placeOfPublication>New York, NY, USA</placeOfPublication><isbnPrint>978-1-4503-9157-3</isbnPrint><isbnElectronic/><issnPrint/><issnElectronic/><keywords>Speech/language, automatic speech recognition, mobile devices</keywords><publishedDay>29</publishedDay><publishedMonth>4</publishedMonth><publishedYear>2022</publishedYear><publishedDate>2022-04-29</publishedDate><doi>10.1145/3491102.3517639</doi><url/><notes/><college>COLLEGE NANME</college><department>Computer Science</department><CollegeCode>COLLEGE CODE</CollegeCode><DepartmentCode>SCS</DepartmentCode><institution>Swansea University</institution><apcterm>External research funder(s) paid the OA fee (includes OA grants disbursed by the Library)</apcterm><funders>UKRI</funders><projectreference>EP/T024976/1</projectreference><lastEdited>2022-10-31T14:24:29.6257772</lastEdited><Created>2022-03-10T14:57:32.7140042</Created><path><level id="1">Faculty of Science and Engineering</level><level id="2">School of Mathematics and Computer Science - Computer Science</level></path><authors><author><firstname>Thomas</firstname><surname>Reitmaier</surname><order>1</order></author><author><firstname>Electra</firstname><surname>Wallington</surname><order>2</order></author><author><firstname>Dani 
Kalarikalayil</firstname><surname>Raju</surname><order>3</order></author><author><firstname>Ondrej</firstname><surname>Klejch</surname><order>4</order></author><author><firstname>Jennifer</firstname><surname>Pearson</surname><order>5</order></author><author><firstname>Matt</firstname><surname>Jones</surname><orcid>0000-0001-7657-7373</orcid><order>6</order></author><author><firstname>Peter</firstname><surname>Bell</surname><order>7</order></author><author><firstname>Simon</firstname><surname>Robinson</surname><orcid>0000-0001-9228-006X</orcid><order>8</order></author><author><firstname>Jen</firstname><surname>Pearson</surname><orcid>0000-0002-1960-1012</orcid><order>9</order></author></authors><documents><document><filename>59573__22666__12859841e0c34bc9949e1c476fc39f76.pdf</filename><originalFilename>chi22-533.pdf</originalFilename><uploaded>2022-03-24T14:23:03.5419140</uploaded><type>Output</type><contentLength>1854554</contentLength><contentType>application/pdf</contentType><version>Version of Record</version><cronfaStatus>true</cronfaStatus><documentNotes>Distributed under the terms of a Creative Commons Attribution 4.0 (CC-BY) Licence.</documentNotes><copyrightCorrect>true</copyrightCorrect><language>eng</language><licence>https://creativecommons.org/licenses/by/4.0/</licence></document></documents><OutputDurs/></rfc1807>
title Opportunities and Challenges of Automatic Speech Recognition Systems for Low-Resource Language Speakers
spellingShingle Opportunities and Challenges of Automatic Speech Recognition Systems for Low-Resource Language Speakers
Thomas Reitmaier
Matt Jones
Simon Robinson
Jen Pearson
title_short Opportunities and Challenges of Automatic Speech Recognition Systems for Low-Resource Language Speakers
title_full Opportunities and Challenges of Automatic Speech Recognition Systems for Low-Resource Language Speakers
title_fullStr Opportunities and Challenges of Automatic Speech Recognition Systems for Low-Resource Language Speakers
title_full_unstemmed Opportunities and Challenges of Automatic Speech Recognition Systems for Low-Resource Language Speakers
title_sort Opportunities and Challenges of Automatic Speech Recognition Systems for Low-Resource Language Speakers
author_id_str_mv ccd66b64d11d76b9cd8b28e9d42a0ff0
10b46d7843c2ba53d116ca2ed9abb56e
cb3b57a21fa4e48ec633d6ba46455e91
6d662d9e2151b302ed384b243e2a802f
author_id_fullname_str_mv ccd66b64d11d76b9cd8b28e9d42a0ff0_***_Thomas Reitmaier
10b46d7843c2ba53d116ca2ed9abb56e_***_Matt Jones
cb3b57a21fa4e48ec633d6ba46455e91_***_Simon Robinson
6d662d9e2151b302ed384b243e2a802f_***_Jen Pearson
author Thomas Reitmaier
Matt Jones
Simon Robinson
Jen Pearson
author2 Thomas Reitmaier
Electra Wallington
Dani Kalarikalayil Raju
Ondrej Klejch
Jennifer Pearson
Matt Jones
Peter Bell
Simon Robinson
Jen Pearson
format Conference Paper/Proceeding/Abstract
container_title CHI Conference on Human Factors in Computing Systems (CHI '22), April 29–May 5, 2022, New Orleans, LA, USA. ACM, New York, NY, USA
container_start_page 1
publishDate 2022
institution Swansea University
isbn 978-1-4503-9157-3
doi_str_mv 10.1145/3491102.3517639
publisher ACM Digital Library
college_str Faculty of Science and Engineering
hierarchytype
hierarchy_top_id facultyofscienceandengineering
hierarchy_top_title Faculty of Science and Engineering
hierarchy_parent_id facultyofscienceandengineering
hierarchy_parent_title Faculty of Science and Engineering
department_str School of Mathematics and Computer Science - Computer Science{{{_:::_}}}Faculty of Science and Engineering{{{_:::_}}}School of Mathematics and Computer Science - Computer Science
document_store_str 1
active_str 0
description Automatic Speech Recognition (ASR) researchers are turning their attention towards supporting low-resource languages, such as isiXhosa or Marathi, with only limited training resources. We report and reflect on collaborative research across ASR & HCI to situate ASR-enabled technologies to suit the needs and functions of two communities of low-resource language speakers, on the outskirts of Cape Town, South Africa and in Mumbai, India. We build on longstanding community partnerships and draw on linguistics, media studies and HCI scholarship to guide our research. We demonstrate diverse design methods to: remotely engage participants; collect speech data to test ASR models; and ultimately field-test models with users. Reflecting on the research, we identify opportunities, challenges, and use-cases of ASR, in particular to support pervasive use of WhatsApp voice messaging. Finally, we uncover implications for collaborations across ASR & HCI that advance important discussions at CHI surrounding data, ethics, and AI.
published_date 2022-04-29T04:17:00Z
_version_ 1763754136166203392
score 10.990307