Conference Paper/Proceeding/Abstract
Continuous speech recognition using syllables
Eurospeech '97, Pages: 1171 - 1174
Swansea University Authors:
Rhys Jones, John Mason
Full text not available from this repository.
Abstract
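The vast majority of work in continuous speech recognition uses phoneme-like units as the basic recognition component. The work presented here investigates the practicability of syllable-like units as the building blocks for recognition. A phonetically annotated telephony database is analysed at the syllable level, and a set of syllable-based Hidden Markov Models (HMMs) are built. Refinements including the introduction of syllable-level bigram probabilities, word- and syllable-level insertion penalties, and the investigation of different model topologies are found to improve recogniser performance. It is found that the syllable-based recogniser gives recognition accuracies of over 60%, which compares with 35% as the baseline accuracy for monophone recognition. It is envisaged that practical applications of syllable recognition could be in a hybrid system, where the most common syllable HMMs would be used in conjunction with whole-word and phoneme models.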
| Published in: | Eurospeech '97 |
|---|---|
| ISSN: | 1018-4074 |
| Published: | Grenoble, France; European Speech Communication Association: ESCA; 1997 |
| URI: | https://cronfa.swan.ac.uk/Record/cronfa63336 |
| first_indexed | 2023-05-02T17:07:43Z |
|---|---|
| last_indexed | 2024-11-15T18:01:24Z |
| id | cronfa63336 |
| recordtype | SURis |
| fullrecord |
<?xml version="1.0"?><rfc1807><datestamp>2023-06-13T13:22:48.4681384</datestamp><bib-version>v2</bib-version><id>63336</id><entry>2023-05-02</entry><title>Continuous speech recognition using syllables</title><swanseaauthors><author><sid>896a6aacfd217fb099481697a43bfe80</sid><ORCID>0000-0003-3928-4701</ORCID><firstname>Rhys</firstname><surname>Jones</surname><name>Rhys Jones</name><active>true</active><ethesisStudent>false</ethesisStudent></author><author><sid>284b34c63a5cbc71055047daf2ee1392</sid><firstname>John</firstname><surname>Mason</surname><name>John Mason</name><active>true</active><ethesisStudent>false</ethesisStudent></author></swanseaauthors><date>2023-05-02</date><deptcode>CACS</deptcode><abstract>The vast majority of work in continuous speech recognition uses phoneme-like units as the basic recognition component. The work presented here investigates the practicability of syllable-like units as the building blocks for recognition. A phonetically annotated telephony database is analysed at the syllable level, and a set of syllable-based Hidden Markov Models (HMMs) are built. Refinements including the introduction of syllable-level bigram probabilities, word- and syllable-level insertion penalties, and the investigation of different model topologies are found to improve recogniser performance. It is found that the syllable-based recogniser gives recognition accuracies of over 60%, which compares with 35% as the baseline accuracy for monophone recognition. 
It is envisaged that practical applications of syllable recognition could be in a hybrid system, where the most common syllable HMMs would be used in conjunction with whole-word and phoneme models.</abstract><type>Conference Paper/Proceeding/Abstract</type><journal>Eurospeech '97</journal><volume/><journalNumber/><paginationStart>1171</paginationStart><paginationEnd>1174</paginationEnd><publisher>European Speech Communication Association: ESCA</publisher><placeOfPublication>Grenoble, France</placeOfPublication><isbnPrint/><isbnElectronic/><issnPrint>1018-4074</issnPrint><issnElectronic/><keywords/><publishedDay>25</publishedDay><publishedMonth>9</publishedMonth><publishedYear>1997</publishedYear><publishedDate>1997-09-25</publishedDate><doi/><url/><notes/><college>COLLEGE NANME</college><department>Culture and Communications School</department><CollegeCode>COLLEGE CODE</CollegeCode><DepartmentCode>CACS</DepartmentCode><institution>Swansea University</institution><apcterm>Not Required</apcterm><funders/><projectreference/><lastEdited>2023-06-13T13:22:48.4681384</lastEdited><Created>2023-05-02T17:59:35.9756576</Created><path><level id="1">Faculty of Humanities and Social Sciences</level><level id="2">School of Culture and Communication - Media, Communications, Journalism and PR</level></path><authors><author><firstname>Rhys</firstname><surname>Jones</surname><orcid>0000-0003-3928-4701</orcid><order>1</order></author><author><firstname>John</firstname><surname>Mason</surname><order>2</order></author><author><firstname>Simon</firstname><surname>Downey</surname><order>3</order></author></authors><documents/><OutputDurs/></rfc1807> |

