My research centers on the computational modeling of meaning and reasoning in natural language processing. I am very interested in interaction, e.g. in dialogue systems and interactive natural language generation.
I worked a lot on grammar formalisms and their expressive capacity before neural models made such topics less relevant. This line of research still informs my interest in neurosymbolic models; I am the speaker of a large PhD program on the topic.
You can find my publications on my group homepage or on Google Scholar. Here are some highlights over the years:
I am perhaps best known for the “octopus paper”, for which Emily Bender and I won the ACL 2020 Best Theme Paper award. In this paper, we argue that meaning cannot be learned from form alone; that is, even the most powerful neural language model will eventually fail the Turing test if it is trained only on text.
With my students Jonas Groschwitz and Matthias Lindemann, I developed the neurosymbolic AM parser for semantic parsing into graphs (online demo). The AM parser achieved state-of-the-art parsing accuracy across multiple graphbanks, is extremely efficient, and is one of the very few semantic parsers that performs well both in broad-coverage parsing and on compositional generalization.
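To give a flavor of the compositional machinery: the AM parser is built on the AM algebra, in which graph fragments carry named argument slots ("sources") that an Apply operation fills with other graphs. The following is a minimal Python sketch of Apply under simplifying assumptions; the data structures, node names, and the example sentence are invented for illustration and are not taken from the actual AM parser implementation.

```python
# Toy illustration of the Apply operation from the AM algebra that
# underlies the AM parser: graph fragments carry named "sources"
# (open argument slots), and APP_s plugs the root of an argument
# graph into the s-source of a head graph. Simplified for
# illustration: source unification (needed for control and
# coordination) is omitted.

from dataclasses import dataclass, field

@dataclass
class SGraph:
    nodes: dict                  # node id -> label (None = unlabeled slot)
    edges: list                  # (source node, edge label, target node)
    root: str                    # node id of the graph's root
    sources: dict = field(default_factory=dict)  # source name -> node id

def apply_source(head: SGraph, source: str, arg: SGraph) -> SGraph:
    """APP_source: merge arg's root into head's `source` node.
    Assumes head and arg otherwise use disjoint node ids."""
    slot = head.sources[source]
    rename = {arg.root: slot}
    nodes = dict(head.nodes)
    for n, label in arg.nodes.items():
        n2 = rename.get(n, n)
        if n2 == slot:
            nodes[n2] = label    # the argument's root label fills the slot
        else:
            nodes.setdefault(n2, label)
    edges = head.edges + [(rename.get(s, s), l, rename.get(t, t))
                          for (s, l, t) in arg.edges]
    remaining = {k: v for k, v in head.sources.items() if k != source}
    return SGraph(nodes, edges, head.root, remaining)

# "wants" as a head with subject (s) and object (o) sources;
# node labels loosely follow AMR conventions.
want = SGraph(nodes={"w": "want-01", "ws": None, "wo": None},
              edges=[("w", "ARG0", "ws"), ("w", "ARG1", "wo")],
              root="w", sources={"s": "ws", "o": "wo"})
cat = SGraph(nodes={"c": "cat"}, edges=[], root="c")
sleep = SGraph(nodes={"z": "sleep-01"}, edges=[], root="z")

# "the cat wants to sleep" (ignoring the control relation):
g = apply_source(apply_source(want, "s", cat), "o", sleep)
print(g.nodes)  # {'w': 'want-01', 'ws': 'cat', 'wo': 'sleep-01'}
print(g.edges)  # [('w', 'ARG0', 'ws'), ('w', 'ARG1', 'wo')]
```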
I am interested in how semantic parsers perform out of distribution: do they have the right inductive bias to learn correct generalizations from limited training data? My student Yuekun Yao showed that neurosymbolic models outperform purely neural models on structural generalization, and we developed the SLOG dataset, which specifically exercises structural generalization. Yuekun also showed how to automatically check the output of a semantic parser for correctness and to derive accuracy bounds for unlabeled test data. My thoughts on compositionality in general are in this survey article.
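To illustrate the intuition behind such accuracy bounds: if an output checker is sound, i.e. it only flags predictions that are definitely wrong (for example, ill-formed meaning representations), then the checker's pass rate on unlabeled data is an upper bound on accuracy. The Python sketch below captures only this general reasoning, with a toy stand-in checker; it is not the method from the paper.

```python
# Sketch of how a *sound* error detector yields an accuracy bound on
# unlabeled data: every flagged prediction is guaranteed wrong, so
# true accuracy can be at most the fraction of unflagged predictions.
# `is_well_formed` is a toy stand-in (balanced parentheses) for a
# real structural check on predicted meaning representations.

def is_well_formed(prediction: str) -> bool:
    """Toy check: a prediction with unbalanced parentheses cannot
    be a correct meaning representation."""
    depth = 0
    for ch in prediction:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth < 0:
                return False
    return depth == 0

def accuracy_upper_bound(predictions: list[str]) -> float:
    """Pass rate of the checker; an upper bound on true accuracy,
    since the checker never flags a correct output."""
    passed = sum(is_well_formed(p) for p in predictions)
    return passed / len(predictions)

preds = ["(want (cat) (sleep))", "(want (cat) (sleep)", "(sleep (cat))"]
print(f"accuracy <= {accuracy_upper_bound(preds):.2f}")  # accuracy <= 0.67
```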
I developed Interpreted Regular Tree Grammars (IRTGs) with Marco Kuhlmann; they are a powerful generalization of a large class of grammar formalisms (including TAG, FTAG, minimalist grammars, and text-to-graph grammars). Alto is our fast and extensible Java implementation of IRTGs and related algorithms.
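The core idea is compact enough to sketch: a regular tree grammar generates derivation trees, and each interpretation evaluates them in an algebra via a homomorphism; pairing a string interpretation with a graph interpretation yields a text-to-graph grammar. Below is a toy Python sketch with a single string interpretation; the grammar, rule names, and encoding are invented, and Alto implements all of this far more generally.

```python
# Toy IRTG: a regular tree grammar (RTG) generates derivation trees,
# and a homomorphism `hom` interprets them in a string algebra whose
# only operation is concatenation ("*"). Grammar and rule names are
# invented for illustration.

from itertools import product

# RTG rules: nonterminal -> list of (rule name, child nonterminals)
rtg = {
    "S":  [("r1", ("NP", "VP"))],
    "NP": [("r2", ()), ("r3", ())],
    "VP": [("r4", ())],
}

# String interpretation: rule name -> term of the string algebra.
# Integers refer to the values of the derivation tree's subtrees.
hom = {
    "r1": ("*", 0, 1),
    "r2": "the cat",
    "r3": "the dog",
    "r4": "sleeps",
}

def derivations(nt):
    """Enumerate all derivation trees below nonterminal nt."""
    for rule, children in rtg[nt]:
        for kids in product(*(derivations(c) for c in children)):
            yield (rule, *kids)

def interpret(tree):
    """Evaluate a derivation tree in the string algebra via hom."""
    rule, *kids = tree
    term = hom[rule]
    if isinstance(term, str):
        return term
    _op, *args = term        # only "*" (concatenation) in this toy algebra
    return " ".join(interpret(kids[i]) for i in args)

for t in derivations("S"):
    print(t, "->", interpret(t))
# ('r1', ('r2',), ('r4',)) -> the cat sleeps
# ('r1', ('r3',), ('r4',)) -> the dog sleeps
```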
I organized the GIVE Challenge, the first major task-based evaluation of NLG systems through crowdsourcing (GIVE-2, GIVE-2.5). My students and I also developed a number of planning-based systems for generating instructions in GIVE, which led to several collaborations with colleagues in planning and to follow-up work on instruction planning in Minecraft. I am also very interested in the use of LLMs for planning, and in their limitations.
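As a rough illustration of the pipeline shape shared by these systems: a symbolic planner produces an action sequence, and an NLG component realizes it as instructions, aggregating where possible. The Python sketch below is deliberately minimal; the action names and templates are invented, and the actual GIVE systems additionally handled referring expressions and real-time interaction in a 3D world.

```python
# Minimal sketch of planning-based instruction generation: a planner
# outputs an action sequence, and a realizer turns it into English
# instructions, merging consecutive move actions into one sentence.
# Action vocabulary and templates are invented for illustration.

PLAN = [
    ("turn", "left"),
    ("move", 3),
    ("move", 2),
    ("press", "the blue button"),
]

def realize(plan):
    """Turn an action sequence into instructions, aggregating
    consecutive move actions."""
    instructions = []
    i = 0
    while i < len(plan):
        act, arg = plan[i]
        if act == "move":
            steps = 0
            while i < len(plan) and plan[i][0] == "move":
                steps += plan[i][1]
                i += 1
            instructions.append(f"Walk forward {steps} steps.")
        elif act == "turn":
            instructions.append(f"Turn {arg}.")
            i += 1
        elif act == "press":
            instructions.append(f"Press {arg}.")
            i += 1
    return instructions

for sentence in realize(PLAN):
    print(sentence)
# Turn left.
# Walk forward 5 steps.
# Press the blue button.
```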