Note: this web page will shortly be shut down. New location: https://www2.lingfil.uu.se/cl/sara/
I am Docent in Computational Linguistics, working as a senior lecturer in the Computational Linguistics and Language Technology group, Department of Linguistics and Philology, Uppsala University since 2017. I have been working in this group since 2012 as a post-doc (2012-2015), as a researcher (2015-2017) and as an assistant professor (2017-2023). My main research interest is multilingual dependency parsing. I'm also interested in digital philology, and in how computational linguistics can be used to solve research questions in other fields, mainly focusing on literature studies. My previous work was on machine translation, where I mainly focused on discourse-aware translation, compound processing, and error analysis.
I was previously a researcher at the Department of Computer and Information Science at Linköping University until 2012. I received a PhD in Computational Linguistics from Linköping University in 2012, with the thesis Text Harmonization Strategies for Phrase-Based Statistical Machine Translation. I received a Licentiate degree in Computational Linguistics in 2009, and a Master's degree in Cognitive Science in 2006, both from Linköping University.
I spent the autumn of 2010 and the spring of 2009 at Xerox Research Centre Europe in Grenoble, France.
- Fictional prose and language change. The role of colloquialization in the history of Swedish 1830–1930. PI: David Håkansson. This is a project funded by VR, running 2021-2023. My main role is to develop language technology tools for the analysis of dialogue, narrative and stylistic features of literature.
- Domain-sensitive cross-lingual dependency parsing. This project has funding for a postdoc for two years, 2020-2022, from eSSENCE at Uppsala University.
- Datalab for results in the public sector. Project funded by Vinnova. The Uppsala University focus is on a subproject with the goal of identifying causality in government reports, in collaboration with the Swedish National Financial Management Authority and RISE.
- Från närläsning till fjärrläsning: digital humaniora och nya former för textanalys. (From close to distant reading: digital humanities and new forms for textual analysis). PI: Johan Svedjedal. Collaboration project 2017-2019, funded by Circus at Uppsala University.
- Efficient Algorithms for Natural Language Processing Beyond Sentence Boundaries. Postdoc project funded by eSSENCE - The e-Science Collaboration, 2012-2015.
Software and resources
uuPronPred is a BiLSTM-based system for cross-lingual pronoun prediction.
Docent is a document-level machine translation decoder. (main developer: Christian Hardmeier)
Blast is a tool for error analysis of machine translation output.
Annotated compounds in German and Swedish. Small sets of running text from Europarl annotated with compounds in two ways.