Uppsala University * Dept. of Linguistics and Philology * Computational Linguistics


Basic Language Resource Kit for Swedish Language Technology

Part of the project
"An Infrastructure for Swedish language technology",
financed by the Swedish Research Council
2007-2008




Project description

Research and development in language technology need an infrastructure of publicly available, standardized basic resources. Such resources can be data, or programs that process and use the data. A collection of such basic resources is called a "Basic Language Resource Kit", or BLARK. This project is part of a national venture to develop an infrastructure for Swedish language technology, a venture strongly supported by the language technology community in Sweden.

A BLARK has to be created for each language separately. For Swedish, several resources exist, but it is unclear what type they are and whether they are available. We therefore need to make an inventory of the existing language resources and describe them. It is also necessary to survey the need for such resources in future development. The goal of our ongoing work is to prepare for the creation of an infrastructure for Swedish language technology. To make the Swedish BLARK as useful as possible, it is very important that everybody working with Swedish language technology participates in the inventory process. In the future, we plan to survey Finnish, Yiddish, Meänkieli, Romani chib, and Sami, which are official minority languages in Sweden.

The project is carried out in three phases. In the first phase, we collected information about language resources from institutions, industry, and individuals who use language technology in their work, and investigated their needs for Swedish language resources. In the second phase, we defined which resources should be part of a Swedish BLARK, and listed the specific resources as well as the needs. In the third phase, we apply for support from funding agencies to build up these language resources.

This project is part of the larger project "An Infrastructure for Swedish language technology". It is carried out in cooperation with KTH, Göteborg University, and Linköping University, and is coordinated by Lars Borin at Göteborg University.

Participants in BLARK

Rolf Carlson, KTH
Kjell Elenius, KTH
Eva Forsbom, Uppsala University
Beáta Megyesi, Uppsala University

Publications/Presentations

Elenius, K., Forsbom, E., and Megyesi, B. 2008. Language Resources and Tools for Swedish: A Survey. In Proceedings of LREC 2008, Marrakesh, Morocco.

Elenius, K., Forsbom, E., and Megyesi, B. 2008. Survey on Swedish Language Resources. Report, March 2008. Dept. of Speech, Music and Hearing, KTH, and Dept. of Linguistics and Philology, Uppsala University.

Forsbom, E. and Megyesi, B. 2007. Draft Questionnaire for the Swedish BLARK, presentation at BLARK/SNK workshop, January 28, 2007, GSLT retreat, Gullmarsstrand, Sweden.

Sågvall Hein, A. and Forsbom, E. 2006. A Swedish BLARK, presentation at BLARK workshop, January 29, 2006, GSLT retreat, Gullmarsstrand, Sweden.

BLARK/SNK workshop, January 28, 2007
BLARK workshop, January 29, 2006: Call, Meeting notes (in Swedish)




Department of Linguistics and Philology, Uppsala University, Sweden
Visiting address: Engelska parken, Humanistiskt centrum, Thunbergsvägen 3
Postal address: Department of Linguistics and Philology, Box 635, SE-751 26 Uppsala, Sweden.
Tel: +46 (0)18 471 22 52
Fax: +46 (0)18 471 10 94