Text-to-speech and speech recognition are becoming increasingly
important in our digital world. Major languages such as English are well
catered for, but smaller languages such as Welsh and the other Celtic
languages are often left behind. Wikipedia is both a huge resource for
the creation of Celtic automatic speech capabilities and a platform for
deploying the technology. A new project to make text-to-speech possible
for Wikipedia has been announced for English and Swedish (see
https://www.mediawiki.org/wiki/Wikispeech),
and it may be extended in time to other languages. However, as far as we
know, there are no plans yet to develop speech recognition in the
Wikipedia environment, and speech recognition for the Celtic languages
in general remains underdeveloped. In our Welsh National Language
Technologies Portal we have published the work we have done so far in
this field (see
http://techiaith.cymru/speech/?lang=en),
with the aim of disseminating our resources under free and generous licences. We
now wish to engage with our Celtic colleagues to explore how we can
create speech recognition for our languages with Wikipedia, starting
with training in named entities and with question-answering modules
(e.g. who was, where is, where/when was someone born).
Delyth Prys,
Head of the Language Technologies Unit, Canolfan Bedwyr.
Dewi Jones - Bangor University.