The aim of the GALLU project was to further develop the speech recognition resources available for the Welsh language. The project was funded by a grant from the Welsh Government and S4C, and built on the foundations laid by the Basic Speech Recognition Project of 2008-9. The following aims were achieved during the project:
- design and develop a collection of prompts that contain all of the phonemes of the Welsh language.
- collect, through crowdsourcing, recordings of these prompts spoken by a large and varied group of people in order to create a new Welsh speech corpus.
- use elements of the corpus with open-source speech recognition software (Julius) and HTK to train a system that controls the movement of a toy robot on a Raspberry Pi.
- prepare the corpus for future development of Welsh dictation systems, including creating a typology of language registers, with appropriate metadata, on a trained corpus that has been tagged with the register characteristics.
- create a plug-in which detects and confirms the default language of the browser in order to Welshify the crowdsourcing pages and other webpages.
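The first aim in the list above, choosing prompts that together contain every phoneme, can be pictured as a set-cover problem. The sketch below is only an illustration of that idea using a greedy heuristic; it is not the method the project used, and the function name `select_prompts`, the prompt identifiers and the single-letter phoneme labels are all hypothetical placeholders rather than real Welsh data.

```python
def select_prompts(prompts, inventory):
    """Greedy set cover over a phoneme inventory (illustrative sketch only).

    prompts   -- dict mapping a prompt id to the phoneme labels it contains
    inventory -- iterable of every phoneme label that must be covered
    Returns (chosen prompt ids in pick order, phonemes left uncovered).
    """
    remaining = set(inventory)
    chosen = []
    while remaining:
        # Pick the prompt that covers the most still-missing phonemes.
        best = max(prompts, key=lambda p: len(remaining & set(prompts[p])))
        gained = remaining & set(prompts[best])
        if not gained:
            break  # no prompt covers what is left
        chosen.append(best)
        remaining -= gained
    return chosen, remaining


# Hypothetical toy data: labels "a".."e" stand in for phonemes.
toy_inventory = ["a", "b", "c", "d", "e"]
toy_prompts = {
    "prompt_1": ["a", "b"],
    "prompt_2": ["b", "c", "d"],
    "prompt_3": ["d", "e"],
    "prompt_4": ["a"],
}

chosen, missing = select_prompts(toy_prompts, toy_inventory)
print(chosen, missing)
```

A greedy pass like this does not guarantee the smallest possible prompt set, but it makes any gap visible: whatever is returned as uncovered shows which phonemes still need a new prompt written for them.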
Participation
Although the project has formally ended, we continue to collect voices through the Paldaruo app for future use. Welsh speakers of any background or proficiency are invited to participate by downloading the app and reading aloud the displayed prompts so that speech recognition software can be trained to understand Welsh.