Interpretability of Transfer in Multi-Lingual Models


This project is supported by the VENI funding scheme of the Dutch Research Council (NWO).

Summary


When we learn a new language, we try to map existing conceptual representations to new words and structures. We benefit from our prior linguistic knowledge and rely on cross-lingual transfer to overcome limitations in the foreign language. If we could model this process computationally, we could predict human behavior and anticipate learning obstacles such as interference effects caused by false friends. Multilingual models that are sensitive to typological differences could facilitate individualized support for learners with diverse language backgrounds.

Recent multilingual models are implemented as neural networks that perform complex matrix transformations to jointly represent multiple languages in a high-dimensional vector space. They are developed from an engineering perspective and are optimized for tasks such as cross-lingual information retrieval or machine translation. Surprisingly, the best-performing models apply a language-agnostic training procedure to mixed training data from multiple languages. It remains an open question whether the resulting multilingual representations can capture and predict cross-lingual transfer effects in humans or merely exploit shallow lexical parallels in the training data.
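
To make the idea of a shared, language-agnostic vector space concrete, the following minimal Python sketch embeds mutual translations with an off-the-shelf multilingual encoder and compares them by cosine similarity. The model name and the sentence-transformers library are illustrative assumptions, not the models studied in this project.

```python
# Illustrative sketch: translations of the same sentence land close together
# in the shared embedding space of a multilingual encoder, even though they
# share almost no surface tokens.
import numpy as np
from sentence_transformers import SentenceTransformer

# Assumed, publicly available multilingual model (not the project's setup).
model = SentenceTransformer("paraphrase-multilingual-MiniLM-L12-v2")

sentences = [
    "The cat sleeps on the sofa.",   # English
    "De kat slaapt op de bank.",     # Dutch
    "Le chat dort sur le canapé.",   # French
]
vectors = model.encode(sentences)

def cosine(a, b):
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

print("en vs. nl:", cosine(vectors[0], vectors[1]))
print("en vs. fr:", cosine(vectors[0], vectors[2]))
```

High similarity scores between the translation pairs show that the encoder maps different languages into one joint space, which is precisely what makes the question of genuine transfer versus shallow lexical overlap worth probing.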
In this interdisciplinary project, I want to combine knowledge about human cognition and language typology to examine how cross-lingual transfer is reflected in computational multilingual models. The information flow and the intermediate linguistic representations in neural models usually remain opaque to human users, but newly developed interpretability methods such as gradient-based saliency [36] or influence functions [30] open up new analysis perspectives. The experiments will be driven by a cross-lingual analysis of eye-tracking data and interpreted with respect to typological features. Based on the findings, a new diagnostic dataset will be developed to analyze how computational variables facilitate or impede cross-lingual transfer in multilingual models. The project adds a vital human-centered perspective to the field of multilingual modeling and envisions the development of better computational models for language learning.
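
As a pointer to how such interpretability methods work, here is a minimal sketch of gradient-based saliency for a multilingual masked language model: the gradient of the model's top prediction is taken with respect to the input embeddings, and its norm per token indicates how much each input token influenced the prediction. The choice of model (bert-base-multilingual-cased) and the example sentence are assumptions for illustration only; the project's actual experimental setup may differ.

```python
# Illustrative sketch of gradient-based saliency (simple input-gradient norm).
import torch
from transformers import AutoModelForMaskedLM, AutoTokenizer

model_name = "bert-base-multilingual-cased"  # assumed, illustrative model
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForMaskedLM.from_pretrained(model_name)
model.eval()

sentence = "Die Katze schläft auf dem [MASK]."  # "The cat sleeps on the [MASK]."
inputs = tokenizer(sentence, return_tensors="pt")

# Embed the tokens manually so gradients can flow back to the input embeddings.
embeds = model.get_input_embeddings()(inputs["input_ids"]).detach()
embeds.requires_grad_(True)

outputs = model(inputs_embeds=embeds, attention_mask=inputs["attention_mask"])

# Backpropagate from the score of the model's top prediction at the mask.
mask_pos = (inputs["input_ids"][0] == tokenizer.mask_token_id).nonzero().item()
logits = outputs.logits[0, mask_pos]
logits[logits.argmax()].backward()

# Saliency of each input token = L2 norm of the gradient of its embedding.
saliency = embeds.grad[0].norm(dim=-1)
for token, score in zip(tokenizer.convert_ids_to_tokens(inputs["input_ids"][0]),
                        saliency.tolist()):
    print(f"{token:>12}  {score:.4f}")
```

Tokens with high saliency scores are the ones the model relied on most; comparing these patterns across languages is one way such methods can be related to human transfer effects.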

Collaborators


Yuval Pinter, Charlotte Pouw, Wondimagegnhue Tufa