Celtic Language Technology Workshop

 

Research community and workshops dedicated to language and speech processing technologies for the Celtic languages


 

** Call for papers now open for the Fourth Celtic Language Technology Workshop (CLTW), co-located with LREC 2022 **

Introduction

The CLTW community and workshop – inaugurated at COLING (Dublin) in 2014 – has become a critical focus and forum for researchers working in natural language processing (NLP) and language technologies for Celtic languages. In particular, it has galvanised and catalysed research by facilitating communication and collaboration internationally. Our community is interested in language technology for both contemporary and historical stages of the Celtic languages.

In Classical times, Celtic languages were found across a wide swathe of modern Eurasia. Today, they are spoken in regions of the UK and Ireland, as well as in Brittany, France. The modern languages are: Irish, Breton, Manx, Welsh, Cornish and Scottish Gaelic. Although their hereditary communities are small compared to those of most other European languages, they continue to have a vibrant presence in their traditional areas as well as in urban centres. While Irish is the only Celtic language that has official EU language status (since 2007), Welsh, Scottish Gaelic and Manx have co-official status. Breton and Cornish also have some limited status in their home regions. That said, all Celtic languages face the same issue: they lack the NLP resources needed to ensure continued technology support in the digital era.

While the Celtic languages share certain aspects of their sociolinguistic situation with other minority languages, their common linguistic features (e.g. VSO word order, initial mutations and reasonably complex morphology) also present unique challenges for the development of robust NLP tools. By gathering researchers from all of the Celtic languages, CLTW aims to share best practice in overcoming these difficulties.

Background and collaboration

Collaboration between UK and Irish researchers in the field of Celtic language technologies goes back to 2004, when Bangor University in the UK and Trinity College Dublin, UCD and DCU in Dublin initiated research on Welsh and Irish Speech Processing Resources (WISPR) under the Interreg III Wales/Ireland programme. This collaboration was later extended to NUIG through the establishment of the Celtic Language Technologies Group in 2014. The group has since extended its interests in language technologies to all modern Celtic languages. We have run three successful workshops, each resulting in published proceedings: one allied to COLING in 2014, one to TALN in 2016 and one to the European Association for Machine Translation in 2019 (see Workshops).

A Roundtable discussion was held at the Celtic Congress, at Bangor University, also in 2019, to discuss further collaboration between the modern Celtic languages in the UK and Ireland in the field of Celtic Language Technologies. The conclusions of the Roundtable expressed the desire for closer collaboration: “Individually, our language communities are very small, and joining together helps us achieve a critical mass for developing research projects and providing mutual support.” See here for a full record of the discussion.

Workshops

The fourth instalment in the Celtic Language Technology Workshop series will be co-located with LREC 2022 in Marseille, France. The CLTW series has seen three successful previous instalments:

  • Dublin, Ireland 2019, co-located with MT Summit XVII (website, proceedings)
  • Paris, France 2016, as part of the 6th joint JEP-TALN-RECITAL 2016 conference (proceedings)
  • Dublin, Ireland 2014, co-located with the 25th International Conference on Computational Linguistics (COLING) (website, proceedings)

Join the community

We always welcome new members. Please join the Celtic Language Technology community by signing up to our Google group and mailing list at celtic-language-technology@googlegroups.com.