SSP Forum: Chris Potts on Evaluating Natural Language Understanding
Symbolic Systems Forum
Towards More Meaningful Evaluations for Natural Language Understanding
Monday, November 18, 2019
Building 460, Room 126 (Margaret Jacks Hall)
It is common to hear that certain natural language understanding (NLU) tasks have been "solved". These claims are often misconstrued as being about general human capabilities (e.g., the ability to answer questions), but they are actually about how systems perform on narrowly defined evaluations. Recently, adversarial testing methods have begun to expose just how narrow many of these successes are. In this talk, I'll discuss what these results tell us about progress in the field, and I'll argue that they should prompt us to move beyond standard accuracy-based evaluations and ask deeper questions of our NLU models.