research

This page provides a high-level description of my research goals and interests. Where relevant, I link papers I have collaborated on to illustrate the kind of research I like to do.

At a general level, the two core questions underlying my research are:

  • How can we reduce inequality in language technology across the world's languages?
  • How can linguistic knowledge inform NLP?

The second question relates more broadly to a core question in AI: how can we combine neural and symbolic reasoning? I explore these questions in the context of three overarching themes: multilingual NLP, interpretability, and syntax/structure. I describe each theme in turn. (Links in italics indicate papers I co-authored.)

Multilingual NLP

The field of multilingual NLP is booming. This is due in no small part to large multilingual pretrained language models (PLMs) such as mBERT and XLM-R, which have shown surprising cross-lingual transfer capabilities despite receiving no cross-lingual supervision. There is, however, a sharp divide between languages that benefit from this transfer and languages that do not, and there is ample evidence that transfer works best between typologically similar languages. This means that the majority of the world's languages are still left behind and that inequalities in access to language technology are growing. I find it important to work towards reducing these inequalities.

Working on multilingual NLP is also interesting from a scientific perspective: it helps uncover which aspects of language can be modelled as universals and which require language-specific adaptations. I think we can make progress in this area by looking at the field of typology (example) as well as the area of algorithmic fairness (example). I am also very excited to find out to what extent pixel-based representations of language can be leveraged for cross-lingual transfer.
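
To make the zero-shot transfer setup concrete, here is a minimal sketch (toy data and hyperparameters, not taken from any of the linked papers): fine-tune a multilingual encoder on labelled examples in one language and evaluate it directly on another, so that any transfer has to come from the shared pretrained representations. Only the checkpoint name (XLM-R base on the Hugging Face hub) is real.

    # Minimal sketch of zero-shot cross-lingual transfer (toy data, illustrative only).
    import torch
    from transformers import AutoTokenizer, AutoModelForSequenceClassification

    tokenizer = AutoTokenizer.from_pretrained("xlm-roberta-base")
    model = AutoModelForSequenceClassification.from_pretrained("xlm-roberta-base", num_labels=2)
    optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)

    # Toy labelled data: train in English only, test in Spanish.
    en_data = [("great movie", 1), ("terrible plot", 0)]
    es_data = [("película fantástica", 1), ("trama horrible", 0)]

    # Fine-tune on English; the model never sees target-language labels.
    model.train()
    for text, label in en_data:
        batch = tokenizer(text, return_tensors="pt", truncation=True)
        loss = model(**batch, labels=torch.tensor([label])).loss
        loss.backward()
        optimizer.step()
        optimizer.zero_grad()

    # Evaluate zero-shot on the target language: any transfer comes from the
    # shared multilingual representations learned during pretraining.
    model.eval()
    correct = 0
    with torch.no_grad():
        for text, label in es_data:
            batch = tokenizer(text, return_tensors="pt", truncation=True)
            pred = model(**batch).logits.argmax(dim=-1).item()
            correct += int(pred == label)
    print(f"zero-shot accuracy: {correct / len(es_data):.2%}")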

Interpretability

With the move from rule-based to statistical to, now, neural methods, NLP models have steadily gained accuracy, but at the cost of interpretability. Neural models are largely used as black boxes that transform an input representation (e.g. the words in a sentence) into some output prediction (e.g. a parse tree or a translation of the sentence). An emerging field in AI tries to understand what neural models learn in order to regain some of that interpretability. Linguistic knowledge is highly relevant here: linguistics can tell us which phenomena are difficult to process, and this knowledge can be used to create useful probes (example), challenge datasets (example) and targeted evaluations.
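
To illustrate what a probe is, here is a minimal sketch under simplifying assumptions (toy sentences and tags of my own, a frozen public mBERT checkpoint, not the setup of any linked paper): fit a lightweight classifier on frozen encoder representations and treat its accuracy as indirect evidence of what those representations encode.

    # Minimal sketch of a diagnostic probe for part-of-speech (toy data, illustrative only).
    import torch
    from sklearn.linear_model import LogisticRegression
    from transformers import AutoTokenizer, AutoModel

    tokenizer = AutoTokenizer.from_pretrained("bert-base-multilingual-cased")
    encoder = AutoModel.from_pretrained("bert-base-multilingual-cased").eval()  # frozen

    # Toy probing data: (words, word-level POS tags).
    sentences = [(["the", "dog", "barks"], ["DET", "NOUN", "VERB"]),
                 (["a", "cat", "sleeps"], ["DET", "NOUN", "VERB"])]

    features, labels = [], []
    with torch.no_grad():
        for words, tags in sentences:
            enc = tokenizer(words, is_split_into_words=True, return_tensors="pt")
            hidden = encoder(**enc).last_hidden_state[0]
            # Represent each word by the vector of its first subword.
            seen = set()
            for i, wid in enumerate(enc.word_ids()):
                if wid is not None and wid not in seen:
                    seen.add(wid)
                    features.append(hidden[i].numpy())
                    labels.append(tags[wid])

    # Only the lightweight probe is trained; high accuracy suggests (but does not
    # prove) that the frozen representations encode the property.
    probe = LogisticRegression(max_iter=1000).fit(features, labels)
    print("probe accuracy on its training data:", probe.score(features, labels))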

Syntax and Structure

Linguistics can inform inductive biases, and the question of whether a structural bias in the architecture is a useful one has received a lot of attention in the literature (example 1 and 2). Relatedly, the question of whether modelling syntax or structure is necessary for natural language processing remains unresolved; it is a long-standing and still active research area. Recent research has shown that dependency trees can be read off contextualized representations trained for language modelling, without any supervision for syntactic parsing. This finding raises the question of whether dependency parsing is a useful component of an NLP system. It is a recurring question in NLP, with work frequently showing the benefits of using dependency parsers as part of larger systems (recent example). A natural (and more than a decade-old) question is whether we need rigid tree structures at all. With the prevalence of neural models, where everything is represented as a continuous vector, it might make sense to start thinking about modelling trees as continuous representations instead.
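
As a rough illustration of this line of work, here is a much-simplified sketch (a toy sentence and gold tree of my own; actual structural probes additionally learn a transformation of the representation space): compute pairwise distances between contextual word vectors, extract a minimum spanning tree, and compare it to a gold dependency tree.

    # Simplified sketch: recover an undirected, unlabelled tree over a sentence
    # from contextual representations via a minimum spanning tree (toy example only).
    import numpy as np
    import torch
    from scipy.sparse.csgraph import minimum_spanning_tree
    from transformers import AutoTokenizer, AutoModel

    tokenizer = AutoTokenizer.from_pretrained("bert-base-cased")
    encoder = AutoModel.from_pretrained("bert-base-cased").eval()

    words = ["the", "dog", "chased", "the", "cat"]
    gold_edges = {(0, 1), (1, 2), (2, 4), (3, 4)}  # toy unlabelled dependency edges

    enc = tokenizer(words, is_split_into_words=True, return_tensors="pt")
    with torch.no_grad():
        hidden = encoder(**enc).last_hidden_state[0]

    # One vector per word (its first subword), then pairwise Euclidean distances.
    first = {}
    for i, wid in enumerate(enc.word_ids()):
        if wid is not None and wid not in first:
            first[wid] = i
    vecs = torch.stack([hidden[first[w]] for w in range(len(words))]).numpy()
    dists = np.linalg.norm(vecs[:, None, :] - vecs[None, :, :], axis=-1)

    # Read the minimum spanning tree off the distance matrix and compare to gold.
    mst = minimum_spanning_tree(dists).toarray()
    pred_edges = {tuple(sorted((i, j))) for i in range(len(words))
                  for j in range(len(words)) if mst[i, j] > 0}
    overlap = pred_edges & {tuple(sorted(e)) for e in gold_edges}
    print(f"gold edges recovered: {len(overlap)} of {len(words) - 1}")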