PLUG - Parallel Corpora in Linköping, Uppsala, and Göteborg

This is a cooperative project aimed at developing, evaluating and applying programs for alignment and data generation from parallel corpora with Swedish as either source or target language. Applications include machine translation, computer-aided translation, translation databases, multilingual web dictionaries and translator training. The participating departments are Swedish Language (Göteborg University), Computer and Information Science (Linköping University), and Linguistics (Uppsala University).

Project in Progress: The Work Packages


Ahrenberg, Lars, Merkel, Magnus, Sågvall Hein, A., Tiedemann, J., 2000.
Evaluation of Word Alignment Systems. In Proceedings of LREC 2000, Athens/Greece.

[pdf, 406kB] [ps, 757kB] [gzipped ps, 236kB]
Ahrenberg, Lars, Andersson, Mikael & Merkel, Magnus, 1998.
A Simple Hybrid Aligner for Generating Lexical Correspondences in Parallel Texts. Proceedings of COLING '98/ACL '98.

[postscript, 513kB] [gzipped postscript, 91kB]
Ahrenberg, L., Andersson, M. & Merkel, M., forthcoming.
A knowledge-lite approach to word alignment. J. Veronis (ed.), Parallel Text Processing, Kluwer Academic Press.
Danielsson, Pernilla & Mühlenbock, Katarina, 1998.
When Stålhandske becomes Steelglove. A Corpus Based Study of Names in Parallel Texts. Proceedings of AMTA'98, Langhorne, Pennsylvania, USA: Lecture Notes in Computer Science, Springer-Verlag, Heidelberg.
Danielsson, P. & Mühlenbock, K., 1998,
Retrieval of Name Translations in Parallel Corpora. In: Proceedings of TALC98, Seacourt Press, Oxford, pp. 58-64.
Lindvall, Lars & Ridings, Daniel, 1998.
Länkade texter och kontrastiv lingvistik, in Kungl. Vitterhets Historie och Antikvitets Akademiens Årsbok 1998, pp. 154-173.
Merkel, Magnus & Andersson, Mikael, 2000.
Knowledge-lite extraction of multi-word units with language filters and entropy thresholds. In Proceedings of RIAO'2000, Collège de France, Paris, France, April 12-14, 2000, Volume 1, pp. 737-746.
[pdf, 39kB] [ps, 276kB] [gzipped ps, 79kB]
Merkel, M., Andersson, M. & Ahrenberg, L., forthcoming,
The PLUG Link Annotator - Interactive Construction of Data from Parallel Corpora. In L. Borin (ed.) Parallel Corpora, Parallel Worlds, Proceedings of Parallel Corpus Symposium, Uppsala, April 22-23, 1999, Uppsala University.
Merkel, M., 1999b,
Understanding and enhancing translation by parallel text processing. Linköping Studies in Science and Technology. Dissertation No. 607. Linköping University. Dept. of Computer and Information Science.
Mühlenbock, Katarina, forthcoming.
Kan ett namn bäras över språkgränsen? Något om fynden i en svensk-italiensk parallellkorpus.
Ridings, D., 1998.
PEDANT. Parallel texts in Göteborg. LEXIKOS 8 (Afrilex-reeks/series 8: 1998), pp. 1-26.
Sågvall Hein, A., forthcoming.
The PLUG-project: Parallel Corpora in Linköping, Uppsala, Göteborg. Aims and achievements. In L. Borin (ed.) Parallel Corpora, Parallel Worlds, Proceedings of Parallel Corpus Symposium, Uppsala, April 22-23, 1999, Uppsala University.
Tiedemann, J., 2000,
Extracting Phrasal Terms using Bitext. In Proceedings of the Workshop on Terminology Resources and Computation, held in conjunction with LREC 2000, Athens/Greece, May 2000.
[pdf, 167 kB] [ps, 146 kB] [gzipped ps, 60 kB]
Tiedemann, J., 1999,
Word Alignment Step by Step. In Proceedings of the 12th Nordic Conference on Computational Linguistics, 1999, Technical University of Trondheim. Department of Linguistics.
[pdf, 442 kB] [ps, 683 kB] [gzipped ps, 208 kB] [slides - html] [slides - ps]
Tiedemann, J., forthcoming,
Uplug - a modular corpus tool for parallel corpora. In L. Borin (ed.) Parallel Corpora, Parallel Worlds. Proceedings of Parallel Corpus Symposium, Uppsala, April 22-23, 1999, Uppsala University. Department of Linguistics.
[abstract] [ps, 765 kB] [gzipped ps, 226 kB] [pdf, 320 kB]
Tiedemann, Jörg, 1999.
Automatic Construction of Weighted String Similarity Measures. In Proceedings of the Joint SIGDAT Conference on Empirical Methods in Natural Language Processing and Very Large Corpora (EMNLP/VLC-99), University of Maryland, MD, USA, 1999.
[postscript] [compressed postscript] [pdf]
Tiedemann, Jörg, 1998a.
Extraction of Translation Equivalents from Parallel Corpora. In Proceedings of the 11th Nordic Conference on Computational Linguistics, Center for Sprogteknologi, Copenhagen, 1998.

[postscript, 326kB] [compressed postscript, 43kB] [html]
Tiedemann, Jörg, 1997.
Automatical Lexicon Extraction from Aligned Bilingual Corpora. Diploma thesis, University of Magdeburg, 1997.

[abstract] [postscript, 2.8Mb] [compressed postscript, 269kB] [html]





last update: 05/24/2000
