Sara Stymne

I am a Docent in Computational Linguistics and have worked since 2017 as assistant professor (biträdande lektor) in the Computational Linguistics and Language Technology group, Department of Linguistics and Philology, Uppsala University. I have been in this group since 2012, first as a post-doc (2012-2015), then as a researcher (2015-2017). My main research interest is multilingual dependency parsing. I'm also interested in digital philology and in how computational linguistics can be used to solve research questions in other fields, mainly focusing on literature studies. My previous work was on machine translation, where I mainly focused on discourse-aware translation, compound processing, and error analysis.

I was previously a researcher at the Department of Computer and Information Science at Linköping University until 2012. I received a PhD in Computational Linguistics from Linköping University in 2012, with the thesis Text Harmonization Strategies for Phrase-Based Statistical Machine Translation. I received a Licentiate degree in Computational Linguistics in 2009 and a Master's degree in Cognitive Science in 2006, both from Linköping University.

I spent spring 2009 and autumn 2010 at Xerox Research Centre Europe in Grenoble, France.

Projects

Current

  • Fictional prose and language change. The role of colloquialization in the history of Swedish 1830–1930. PI: David Håkansson. This project is funded by VR and runs 2021-2023. My main role is to develop language technology tools for the analysis of dialogue, narrative and stylistic features of literature.
  • Domain-sensitive cross-lingual dependency parsing. This project is funded by eSSENCE at Uppsala University, with funding for a postdoc for two years, 2020-2022.

Previous

Software and resources

uuPronPred is a BiLSTM-based system for cross-lingual pronoun prediction.

uuparser is a dependency parser based on BiLSTM feature extractors (main developer: Miryam de Lhoneux).

Docent is a document-level machine translation decoder (main developer: Christian Hardmeier).

Blast is a tool for error analysis of machine translation output.

Annotated compounds in German and Swedish. Small sets of running text from Europarl annotated with compounds in two ways.