Statistical and computational models of the visual world paradigm: Growth curves and individual differences

https://doi.org/10.1016/j.jml.2007.11.006

Abstract

Time course estimates from eye tracking during spoken language processing (the “visual world paradigm”, or VWP) have enabled progress on debates regarding fine-grained details of activation and competition over time. There are, however, three gaps in current analyses of VWP data: consideration of time in a statistically rigorous manner, quantification of individual differences, and distinguishing linguistic effects from non-linguistic effects. To address these gaps, we have developed an approach combining statistical and computational modeling. The statistical approach (growth curve analysis, a technique explicitly designed to assess change over time at group and individual levels) provides a rigorous means of analyzing time course data. We introduce the method and its application to VWP data. We also demonstrate the potential for assessing whether differences in group or individual data are best explained by linguistic processing or decisional aspects of VWP tasks through comparison of growth curve analyses and computational modeling, and discuss the potential benefits for studying typical and atypical language processing.

Introduction

In the decade since the (re)discovery that eye movements provide an exquisitely sensitive on-line measure of spoken language processing (Tanenhaus, Spivey-Knowlton, Eberhard, & Sedivy, 1995; cf. Cooper, 1974), the “visual world paradigm” (VWP) has been applied to time course questions at the level of sentences (Altmann & Kamide, 1999; Tanenhaus et al., 1995), phonologically based lexical competition (Allopenna, Magnuson, & Tanenhaus, 1998), semantically based lexical competition (Huettig & Altmann, 2005; Yee & Sedivy, 2006), and even subphonemic details of word recognition (Dahan et al., 2001b; McMurray et al., 2002; Salverda et al., 2003). Typically, participants are presented with a set of objects (on a tabletop or computer display) and they follow spoken instructions to interact with the display (touching, clicking, or moving objects) or answer spoken questions about the display. In contrast to conventional psycholinguistic techniques like lexical decision or naming, one typically obtains multiple data points during processing for each trial.

For example, in a display containing beaker, beetle, speaker, and carriage, as a participant hears an instruction like click on the beaker, he might generate an eye movement to beetle when only the first two segments have been heard, and then look to the beaker 100 ms later. This leads to trial-level data schematized in the upper row of Fig. 1. At any moment on a single trial, a participant can either fixate an object or not, so trial level proportions are 0 or 1 for each item of interest at any point in time. Trial data are averaged over items and participants in order to arrive at a time course estimate like that shown in the bottom of Fig. 1. From the data of Allopenna et al. (1998), for example, we learn that fixation proportions map onto phonetic similarity over time; by the time listeners are hearing the /i/ in a word like beaker, they are equally likely to be fixating the target or a cohort (like beetle), while rhymes (like speaker) are fixated less and later (but more than unrelated items, like carriage). VWP data stand in stark contrast to data from tasks like lexical decision, where the data points represent single, post-perceptual measures. As a result, the VWP provides fine-grained data in the context of a natural task. However, there are three important gaps in current analyses of this paradigm. We describe each briefly, and then describe our approach to filling these gaps.
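The aggregation step described above, from binary trial-level fixations to a time course of proportions, can be sketched roughly as follows. This is a minimal illustration, not the authors' analysis code: the function name `fixation_proportions`, the four toy trials, and the five time bins are all hypothetical, and the sketch pools trials directly rather than averaging first over items and then over participants.

```python
def fixation_proportions(trials, objects):
    """Proportion of trials fixating each object in each time bin.

    trials: list of trials; each trial is a list with one entry per time bin,
            holding the fixated object's label or None (no object fixated).
    """
    n_bins = len(trials[0])
    props = {obj: [0.0] * n_bins for obj in objects}
    for t in range(n_bins):
        for obj in objects:
            # On a single trial the value is 0 or 1; the mean over trials
            # is the fixation proportion for this object at this time bin.
            hits = sum(1 for trial in trials if trial[t] == obj)
            props[obj][t] = hits / len(trials)
    return props


# Hypothetical trial-level data (illustrative values only).
trials = [
    [None, "cohort", "cohort", "target", "target"],
    [None, None, "target", "target", "target"],
    [None, "cohort", "target", "target", "target"],
    [None, None, "rhyme", "target", "target"],
]
props = fixation_proportions(trials, ["target", "cohort", "rhyme", "unrelated"])
for obj, curve in props.items():
    print(obj, curve)
```

In this toy data set the cohort curve peaks early and falls to zero once the target curve rises, the kind of pattern that can only be seen when the full time course is retained.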

Although the paradigm’s most powerful contribution is the ability to estimate the fine-grained time course of activation and competition among linguistic representations, when the data are analyzed statistically, time is usually ignored or treated inappropriately. Figs. 2, 3, and 4 illustrate typical approaches. In the simplest strategy, standard general linear model (GLM) analyses, such as analysis of variance or t-tests, are applied to a greatly compressed representation of the time course data. Fig. 2 and the left columns of Figs. 3 and 4 schematize this approach; mean fixation proportion to each item is computed for a single window of analysis on the time course data (top in each figure), resulting in data like those schematized in the lower panels of each figure. While this approach minimizes the number of GLM assumptions violated (cf. Chambers et al., 2004; Magnuson et al., 2003), it expressly discards the precious fine-grained detail the VWP provides. The data are presented graphically in continuous time course form, but statistical analyses are applied to the radically reduced mean proportions. Aside from the loss of grain, this approach works well for data where relations among items of interest are stable across the analysis window (as in Fig. 3). However, where the rank order of fixation proportions changes over time (e.g., where one competitor type dominates early in the time course but another dominates later, or, as in Fig. 4, where there is an interaction with time in a comparison of targets from different conditions), this approach is inappropriate, as it does not retain any detail about time course.

Another common approach is to calculate mean fixation proportions in successive windows of analysis (right columns of Figs. 3 and 4), and to perform a repeated measures analysis on mean fixation proportions across windows (Allopenna et al., 1998). This preserves more of the time course, but there are typically no independent principles for determining the “correct” time windows, and windows of different sizes can produce very different results. More importantly, this approach treats time as a factor with levels corresponding to individual time windows. This analysis naturally focuses on whether the patterns of data in some time windows differ from patterns in other windows, not on the trajectory of change over time (i.e., the estimate of the time course of cognitive processing), which is the unique insight provided by the VWP.
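The contrast between the single-window and successive-window strategies can be illustrated with a small sketch. The two curves below are hypothetical target fixation percentages chosen so that the conditions cross over time; the helper names `window_mean` and `binned_means` are likewise illustrative and are not taken from the paper.

```python
def window_mean(curve, start, end):
    """Mean of a curve over time bins start .. end-1."""
    segment = curve[start:end]
    return sum(segment) / len(segment)


def binned_means(curve, width):
    """Means over successive, non-overlapping windows of the given width."""
    return [window_mean(curve, i, i + width) for i in range(0, len(curve), width)]


# Hypothetical target fixation percentages in 8 time bins for two conditions.
# Condition A rises early and then plateaus; condition B rises late but ends higher.
cond_a = [10, 30, 50, 60, 60, 60, 60, 70]
cond_b = [10, 10, 20, 40, 70, 80, 80, 90]

# Single analysis window: the two conditions look identical.
whole_a = window_mean(cond_a, 0, 8)
whole_b = window_mean(cond_b, 0, 8)

# Two successive windows: the crossover becomes visible.
halves_a = binned_means(cond_a, 4)
halves_b = binned_means(cond_b, 4)

print("single window:", whole_a, whole_b)
print("two windows:  ", halves_a, halves_b)
```

With these illustrative values the single-window means are equal, while splitting the same data into two windows shows condition A ahead early and condition B ahead late. The conclusion therefore depends on how the windows are chosen, and in neither case is the shape of the trajectory itself modeled.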

We describe a statistical approach, growth curve analysis (GCA), which builds on techniques explicitly designed to assess change over time (Singer & Willett, 2003). These techniques have been applied primarily to longitudinal behavioral data in the developmental literature. To apply them to VWP data, eye tracking data are treated as longitudinal data collected on a fast time scale. The approach provides a formal model of the impact of differences between conditions and/or individuals on parameters (such as intercept and slope) of individual × condition curves of fixation proportions over time. We will introduce the method in detail and then by example after discussing the other gaps in current approaches to VWP data.
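To make the idea of curve parameters concrete, the sketch below fits an intercept and a slope to each individual × condition curve by ordinary least squares and then compares slopes across conditions. This is a deliberately simplified two-stage stand-in, not the multilevel model that growth curve analysis actually estimates (which fits all curves jointly); the participants, condition labels, and fixation percentages are hypothetical.

```python
def fit_line(times, values):
    """Ordinary least-squares intercept and slope for a single curve."""
    n = len(times)
    mean_t = sum(times) / n
    mean_v = sum(values) / n
    sxx = sum((t - mean_t) ** 2 for t in times)
    sxy = sum((t - mean_t) * (v - mean_v) for t, v in zip(times, values))
    slope = sxy / sxx
    intercept = mean_v - slope * mean_t
    return intercept, slope


# Hypothetical target fixation percentages at 5 time bins,
# one curve per (participant, condition) pair.
times = [0, 1, 2, 3, 4]
curves = {
    ("p1", "high_freq"): [0, 20, 40, 60, 80],
    ("p1", "low_freq"): [0, 10, 20, 30, 40],
    ("p2", "high_freq"): [10, 25, 40, 55, 70],
    ("p2", "low_freq"): [10, 15, 20, 25, 30],
}

# Stage 1: one (intercept, slope) pair per individual x condition curve.
params = {key: fit_line(times, values) for key, values in curves.items()}

# Stage 2: the condition effect on slope, computed separately per participant.
slope_effect = {
    p: params[(p, "high_freq")][1] - params[(p, "low_freq")][1]
    for p in ("p1", "p2")
}

for key, (intercept, slope) in params.items():
    print(key, "intercept =", intercept, "slope =", slope)
print("slope difference (high - low):", slope_effect)
```

In this toy example the two participants differ in intercept and in overall slope, yet show the same condition effect on slope. Separating these sources of variation is the kind of question the multilevel formulation is designed to address in a single model.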

Researchers who have examined trial-by-trial data from the VWP know that there is substantial between-participant variability; to the best of our knowledge there have been no attempts to assess this variability, and so its implications are unknown. Simply describing these differences is an important step—how well do measures of central tendency describe the range of performance observed, and how is performance distributed over that range? Under growth curve analysis, parameters are estimated that characterize individual differences. The mere characterization of variability across individuals provides a starting point for analyzing individual differences. Our approach goes further, with the aim of unpacking whether individual differences stem from differences in language processing or other processes, such as motor-decision processes controlling eye movements.

Going beyond description and unpacking individual differences requires that we grapple with some vexing methodological questions about the VWP. There are compelling arguments that fixation probabilities over time provide an exquisitely sensitive estimate of linguistic processing (given, for example, that fixations map onto phonetic similarity over time down to a subphonemic level; Dahan et al., 2001b). However, eye movement behavior in the VWP is influenced by the contents of the display (Dahan, Magnuson, & Tanenhaus, 2001a), and we expect individual differences in motor-decisional thresholds for saccades. We will present a strategy for grappling with these issues by comparing growth curve analysis with simulations of the TRACE model of speech perception (McClelland & Elman, 1986) coupled with a simple decision model that converts TRACE activations to predicted fixation proportions over time (Allopenna et al., 1998; Dahan et al., 2001a). Specifically, we test combinations of TRACE and decision parameters for simulating individual and individual × condition data. It may be possible to fit the same data by changing TRACE parameters or decision model parameters. To the degree that individual data can be fit by TRACE but cannot be fit by the decision model, we can provisionally attribute individual differences to variation in linguistic processing. With a model like TRACE, we can further explore the range of model parameters that provide good individual × condition fits to generate causal hypotheses regarding differences in linguistic processing. When applied at the individual or group level, this approach has promise for illuminating characteristics of language impairments.
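As a rough illustration of how a decision model can convert activations into predicted fixation proportions, the sketch below assumes a Luce-style choice rule in which each item's response strength is an exponential function of its activation. The exact form of the decision model, the scaling parameter `k`, and the activation values are assumptions made here for illustration; they are not taken from this text and are not TRACE output.

```python
import math


def luce_choice(activations, k=7.0):
    """Convert activations to choice probabilities with a Luce-style rule.

    Response strength is exp(k * activation); each probability is that
    strength divided by the summed strength of all displayed items.
    The scaling parameter k is a hypothetical decision-level parameter.
    """
    strengths = {item: math.exp(k * a) for item, a in activations.items()}
    total = sum(strengths.values())
    return {item: s / total for item, s in strengths.items()}


# Hypothetical activations for the four displayed items at one time step.
activations = {"target": 0.6, "cohort": 0.3, "rhyme": 0.1, "unrelated": 0.0}

sharp = luce_choice(activations, k=7.0)   # decision stage strongly favors the leader
soft = luce_choice(activations, k=2.0)    # same activations, weaker separation
flat = luce_choice(activations, k=0.0)    # activations ignored: uniform guessing

for label, probs in (("k=7", sharp), ("k=2", soft), ("k=0", flat)):
    print(label, {item: round(p, 3) for item, p in probs.items()})
```

The same activations yield different predicted fixation proportions depending on the decision parameter alone, which is why a difference in observed curves cannot be attributed to linguistic processing without also checking whether a change at the decision stage could produce it.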

In the next section, we provide a brief, fairly informal introduction to growth curve analysis. Then we turn to practical examples, analyzing some recent VWP spoken word recognition results. We begin with examples at the group level, and then demonstrate assessment of individual differences using growth curve and TRACE models. We close with a brief discussion of implications and alternative approaches. Readers interested in applying growth curve modeling should see Singer and Willett (2003). The book is refreshingly accessible, and there is an accompanying web site with sample code for SAS, SPSS, and S+/R. In addition, we provide general step-by-step instructions for GCA in Appendix A, and SAS and R code and raw data for the analyses presented here are available at http://magnuson.psy.uconn.edu/GCA (we have run the analyses in SAS and R, though many statistical packages have multi-level modeling capabilities and any of them should be able to conduct growth curve analyses).

Section snippets

Growth curve analysis

The growth curve modeling approach to analyzing data from the VWP rests on the assumption that the properties of the task (the characteristics of the selected words, the visual display, etc.) create an underlying probability distribution of fixation locations (i.e., targets, competitors, distractors, etc.) over time. The observed fixation proportions reflect this underlying distribution. The goal of the analytic approach is to describe the functional form of the probability distribution. It is

Analysis of target fixations

As an example of the growth curve approach applied to the visual world paradigm, we present results from a VWP investigation of effects of frequency, cohort density (sum of frequencies of words overlapping with a target at onset), and lexical neighborhood density (sum of frequencies of words differing from the target by no more than one phoneme) on spoken word recognition (see Magnuson, Dixon, Tanenhaus, & Aslin, 2007, for details). As in typical VWP experiments, on each trial, four simple

Individual differences in VWP data

The previous sections demonstrate that growth curve analysis provides a robust and powerful statistical tool for understanding the time course of effects of experimenter-manipulated variables such as word frequency and phonological similarity. In addition, this analytic method can quantify individual participant effects for a variety of within-participant designs. For within-participant designs, the level-2 models carry information about individual differences averaged over conditions (i.e.,

General discussion

The visual world paradigm has proven to be a powerful technique for investigating spoken language processing from subphonemic details to sentence processing, though the paradigm lacks standard and appropriate statistical tools for analyzing time course data. In this report we have described a statistical technique based on multilevel polynomial regression that is specifically developed for analyzing change over time. We have shown how this technique can be used to analyze typical visual world

Acknowledgments

We thank Ted Strauss for his help with running the simulations and Len Katz, Bob McMurray, Christoph Scheepers, and two anonymous reviewers for suggestions that improved this paper substantially. This research was supported by NIDCD Grant R01DC005765 to J.S.M., NICHD NRSA F32HD052364 to D.M., and NICHD Grants HD01994 and HD40353 to Haskins Laboratories.
