Publicly Available Published by De Gruyter Mouton March 9, 2018

The Yale Grammatical Diversity Project: Morphosyntactic variation in North American English

  • Raffaella Zanuttini, Jim Wood, Jason Zentz and Laurence Horn
From the journal Linguistics Vanguard


The Yale Grammatical Diversity Project approaches the empirical domain of North American English from the perspective of generative microcomparative syntax. In addition to eliciting judgments from speakers of particular varieties, we also conduct large-scale surveys, map the results of those surveys geographically, conduct statistical tests taking geography and other social variables into account, and look for theoretically significant linguistic correlations. In all cases, we do this with the primary goal of understanding variation between speakers at the individual level. While our goals and methodologies are informed by our theoretical perspective, we expect that our work and results will be of interest to linguists working in other frameworks and even to the public more generally. This article outlines the goals and methodologies of the project and describes in broad strokes some of the results obtained so far, as well as some of the ways we have shared our findings with others, inside and outside academia.

1 Project overview

The Yale Grammatical Diversity Project (YGDP) approaches the empirical domain of North American English from the perspective of generative microcomparative syntax.

1.1 Our approach to the study of morphosyntactic variation

Generative linguists are primarily interested in the mental grammar of individual speakers, modeled as a system of rules that can form some linguistic units (syllables, words, sentences, etc.) but not others. While it is obvious that there are systematic grammatical differences across speakers of different languages or dialects, differences also exist among people within their speech communities (see Trousdale and Adger [2007] and Cornips [2015] for some interesting perspectives on this issue). If every individual has a mental grammar, we might expect that what it means to “speak the same language” is to have mental grammars that are similar but not necessarily identical (and, of course, to share a significant proportion of the lexicon).

Generative microcomparative syntax, then, is the study of the differences between similar mental grammars, with the goal of furthering our broader understanding of the human language faculty. As Kayne (2005: 283) points out, “microcomparative syntax work provides us with a new kind of microscope with which to look into the workings of syntax.” Several projects have been conducted within this paradigm, including ASIS (Poletto and Benincà 2007), SAND (Barbiers et al. 2005, 2008), and ScanDiaSyn, in Italy, the Netherlands/Belgium, and Scandinavia, respectively.

For our project, we recruit (and in some cases develop) methodologies especially suited to the kinds of questions we are interested in asking. We collect data in the form of acceptability judgments in order to determine which kinds of sentences can be generated by individual speakers’ mental grammars and which cannot. In addition to eliciting judgments from speakers of particular varieties, we also conduct large-scale surveys, map the results of those surveys geographically, conduct statistical tests taking geography and other social variables into account, and look for theoretically significant linguistic correlations. In all cases, we do this with the primary goal of understanding variation between speakers at the individual level.[1] While our goals and methodologies are informed by our theoretical perspective, we expect that our work and results will be of interest to linguists working in other frameworks and even to the public more generally. Therefore, in addition to our technical theoretical work, we are committed to highlighting our descriptive findings and providing freely accessible resources intended for a broader audience.

1.2 Situating the YGDP within the study of variation in English

From an empirical perspective, we are interested in morphosyntactic variation across the varieties of English spoken across North America. Interspeaker variation within North American English has long been an object of linguistic investigation (see Schneider [2008] and Wolfram and Schilling [2016] for useful overviews). However, most studies of morphosyntactic variation, as noted by Kortmann (2003), have focused on particular phenomena such as negative concord, positive anymore, subject–verb agreement, and multiple modals (Reed and Montgomery 2016), or on particular varieties such as Alabama English (Feagin 1979), Appalachian English (Wolfram and Christian 1976; Montgomery and Hall 2004b; Hazen et al. 2013), African American English (Labov et al. 1968; Baugh 1983; Rickford 1999; Green 2002; Lanehart 2015), and Canadian English (Tagliamonte 2006).

When it comes to large-scale investigations (considering multiple phenomena across multiple varieties), researchers of North American English have conducted surveys, compiled corpora, or collected data from a variety of resources and expert contributors. The most prominent nationwide surveys of individual speakers of American English have focused on phonology (Labov et al. 2006), the lexicon (Carver 1987; Hall 2013), or both (Kurath et al. 1939–1943 and subsequent work on the Linguistic Atlas of the United States and Canada;[2] Vaux and Golder 2003). Grieve (2009, 2016) has created a large corpus of letters to the editor from regional newspapers all over the country, which he uses to conduct statistical and geographical analyses of lexical variation.[3]

Perhaps most comparable to our project in empirical focus and scope is a large body of work (Kortmann and Schneider 2004; Kortmann 2005; Kortmann et al. 2005; Szmrecsanyi and Kortmann 2009; Hernández et al. 2011; Hickey 2012; Kortmann and Lunkenheimer 2012, 2013; Siemund 2013; Gerwin 2014) that considers questions of morphosyntactic variation in English within a research paradigm that integrates functional typology with dialectology. This approach is sometimes called variationist typology or sociolinguistic typology (Siemund 2013, 283). These researchers provide systematic overviews of the features attested (along with the degree of attestation) in particular varieties of English and useful summaries of existing literature on these topics. In many cases, they utilize quantitative techniques to reveal generalizations that illuminate how different varieties of English reflect crosslinguistic tendencies. Moreover, Kortmann and Lunkenheimer (2013) (eWAVE) provides an interactive interface to explore, compare, and geographically visualize the presence of certain features across varieties of English worldwide.

Our project differs from this work in several respects. First, our empirical scope is distinct: we focus exclusively on North American English, whereas most of the work in this line of research is on English as spoken in the British Isles and around the world.[4] Second, variationist typologists approach dialect variation from a top-down perspective: they begin with a list of varieties of English and a corresponding list of experts on these varieties, who are responsible for determining the prevalence of a given syntactic property or phenomenon in each variety.[5] Our approach, on the other hand, does not take specific varieties as a given; we are trying to discover what they are bottom-up. In order to do so, we quantify over individual speakers’ acceptability judgments on specific sentences rather than over varieties or over categorical ratings of the prevalence of a grammatical property as assigned by researchers at the level of the variety. Third, while many of our goals overlap with those of the variationist typologists (e.g., mapping the geographic distribution of morphosyntactic phenomena that vary across English speakers/varieties), one key goal of the typological investigation is to evaluate whether varieties with distinct historical trajectories or contact scenarios (e.g., first-language varieties versus second-language varieties versus English-based pidgins and creoles) have fundamentally different structural properties. By contrast, we are interested in discovering how observations gleaned from the study of English morphosyntactic variation can answer formal questions framed within generative syntactic theory. In sum, both lines of research seek to explore morphosyntactic variation in English, but our scope, methods, and goals are complementary.

1.3 Roadmap

Our next section outlines the overall goals of our project in more detail, and then we move on (Section 3) to discuss methodological issues, including the use of geographical maps of our findings. Finally, we describe in broad strokes some of the results we have obtained so far (Section 4) and some of the ways we have been able to share our findings with others, inside and outside academia (Section 5).

2 Project goals

The project has a number of aims, which can be grouped under two overarching goals:

  1. to collect and make available information about morphosyntactic variation found across speakers of English in North America;

  2. to conduct and foster new research on morphosyntactic variation that can broaden our knowledge at both the empirical and theoretical levels.

To reach our first goal, we have been gathering information about syntactic variation in North American English from a number of different sources. We make it publicly available on a website, which is organized in a series of pages, each devoted to a particular aspect of the syntax of English that varies across native speakers. For example, the phenomena illustrated in (1) all have dedicated pages:

(1) a. He just kept a-beggin’ and a-cryin’ and a-wantin’ to go out. [a-prefixing]
       (Wolfram 1976: 45; McQuaid 2012: 32 (11c))
    c. That’s so Eighties. [drama SO]
       (Adams 2003: 77; Irwin 2014)
    d. The cat wants fed. [needs washed]
       (Murray and Simon 1999: 162 (9); Edelstein 2014: 243 (4a))
    e. Bill can touch the ceiling, and so can’t I. [so don’t I]
       (Lawler 1974: 359 (10); Wood 2014)
    f. They didn’t nobody like him. [split subject]
       (Feagin 1979: 238; Zanuttini and Bernstein 2014)
    g. I BÍN had this. [stressed BIN]
       (Rickford 1975: 106 (12); Harris 2013)

These pages are intended to be useful to scholars and at the same time accessible to anyone who is not an academic but is interested in language for professional or personal reasons. For the linguist, the website is a repository of information concerning minimal differences in the syntax of North American English. For non-linguists (teachers, journalists, people who are curious about the way we speak, etc.), it is a place to find a detailed yet accessible description of aspects of the syntax of English that might have attracted interest because they differ across speakers (often raising the questions of who’s right and who’s wrong), or because they are associated (often pejoratively) with a certain group of people.

All the pages (found under the Phenomena tab) have a similar structure: they start with an example sentence that exemplifies the syntactic property that varies across speakers, followed by a very accessible description (see, for example, the page on negative inversion). They contain a paragraph or two with information about any extralinguistic factor that has been identified as restricting its distribution, such as geographic region, age, gender, or ethnicity. Next come one or more sections with a slightly more technical description of the syntactic and semantic properties of the element under investigation, followed by a list of bibliographic references (constantly being updated).

We strive to reach our second goal by conducting research on some of the microvariation we find, and by encouraging other linguists (regardless of theoretical framework or level of expertise) to do the same. Recently, we have focused on pronouns, specifically examining two phenomena that vary across American English speakers. The phenomena of interest are exemplified by sentences such as (2) and (3):

(2) Here’s you a dog. (Here’s a dog for you.)
    (Horn 2014: 334n7)
(3) We don’t any of us need anything.
    (Montgomery and Hall 2004a: 413; Zanuttini and Bernstein 2014: 152 (25a))

From the empirical point of view, much needs to be discovered about these two types of sentences. Wood et al. (2015a) and Section 3.4 below detail the geographic coverage of the “dative presentative” construction shown in (2). Wood and Zanuttini (2016) propose a syntactic analysis of this phenomenon, revolving around the features of a functional head Appl (Pylkkänen 2008), but this analysis is still under development and revision (Wood and Zanuttini forthcoming). (See also the open questions in Wood et al. 2015a, 313.) As for (3), our initial research has shown that for such sentences, both the nominative subject (e.g., we) and the partitive (e.g., of us) generally must be pronominal. Building on Zanuttini and Bernstein (2014), Wood et al. (2015b) proposed that such structures are derived by movement. We are currently testing predictions this proposal makes about more complex structures such as we linguists.

3 Project methodology

While one primary goal of our current research is to gain a better understanding of the syntax of pronouns, we are also committed to developing methodologies that can be used by dialect syntacticians working in other empirical domains. This section provides some information about these methodologies, and further details are provided in the supplementary materials.

3.1 Examining existing sources

In our efforts to document grammatical variation within North American English, we have examined a broad swath of relevant literature on English morphosyntactic variation in order to compile a list of phenomena, build a bibliography, and collect examples. We have also explored less formal forums such as blogs and social media sites in search of the same kinds of information – we have found that these can be a particularly rich source of data and commentary on understudied and/or new phenomena.

We store bibliographic metadata and PDFs of our sources in a group Zotero database, which allows us to search across all our sources, tag and filter sources by the phenomena they discuss, and export references in whatever format is required for a particular publication venue. The bibliography is publicly accessible.

3.2 Plotting attested examples

We use Google Fusion Tables to catalog the attested examples we have found in the literature. Each example is associated with metadata including the source where we found it, the phenomenon (or phenomena) it illustrates, the nature and date of attestation, the acceptability of the sentence for the speaker, and the speaker’s speech variety, ethnicity, age, socioeconomic class, locale, and region. We assign geographic coordinates to each example based on the descriptions given in the original source, which allows us to display these examples in interactive Google Maps embedded on our website. Each example is displayed as a pin on the map, and hovering over a pin will reveal the metadata available for that example.
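The kind of record described above can be pictured with a minimal Python sketch. The class name `AttestedExample`, its field names, and the coordinates are hypothetical and chosen only for illustration; the actual table schema is not given in the text.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class AttestedExample:
    """One attested sentence plus part of the metadata listed in the text.

    All field names are assumptions made for this sketch.
    """
    sentence: str
    source: str
    phenomena: List[str] = field(default_factory=list)
    region: Optional[str] = None
    # (latitude, longitude), assigned from the description in the original source
    coordinates: Optional[Tuple[float, float]] = None


def mappable_examples(phenomenon: str,
                      examples: List[AttestedExample]) -> List[AttestedExample]:
    """Return the examples tagged with a phenomenon that have coordinates for a map pin."""
    return [e for e in examples
            if phenomenon in e.phenomena and e.coordinates is not None]


catalog = [
    AttestedExample(
        sentence="They didn't nobody like him.",
        source="Feagin 1979: 238",
        phenomena=["split subject"],
        region="Alabama",
        coordinates=(32.8, -86.8),  # placeholder point in Alabama, for illustration only
    ),
    AttestedExample(
        sentence="The cat wants fed.",
        source="Murray and Simon 1999: 162 (9)",
        phenomena=["needs washed"],
    ),
]
```

A list field for `phenomena` reflects the statement that one example may illustrate more than one phenomenon; examples without coordinates stay in the catalog but are simply not pinned.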

3.3 Administering surveys online

One key methodology we have used to gather new data is to administer acceptability judgment surveys online.

3.3.1 Design

In our online surveys, we collect both demographic information about each participant and their acceptability judgments on a set of sentences. All sentences are provided in written form, accompanied by a 5-point Likert scale, bounded by 1 (labeled “totally unacceptable, even in informal settings”) and 5 (labeled “totally acceptable”). Our supplementary materials include a survey manual with details about our design decisions, as well as a complete sample survey in Qualtrics (QSF) format. That file contains our survey instructions and questions, as well as all flow logic, formatting, and answer choices. We also include an annotated version of the same survey in PDF format.

3.3.2 Administration

Our surveys are designed and hosted using Qualtrics online software. So far, we have administered these surveys using Amazon Mechanical Turk (MTurk), an online crowdsourcing platform that allows “requesters” to pay freelance workers to complete “Human Intelligence Tasks” (HITs). Workers select which HITs they want to complete; examples include tagging photographs based on their content, transcribing audio recordings, and increasingly, academic research experiments and surveys. A number of studies have validated the use of MTurk for social science research (Behrend et al. 2011; Sprouse 2011; Johnson and Borden 2012), and it is quickly becoming a popular tool for experimental syntax and semantics specifically (Gibson et al. 2011; Kotek et al. 2011; Sprouse 2011; Karttunen 2014; Erlewine and Kotek 2016).

We find online crowdsourcing to be an ideal way to distribute our surveys because it allows us to very quickly and inexpensively receive responses from hundreds of participants distributed widely throughout the United States. As we show in Wood et al. (2015a), our participant pool is quite diverse in terms of age, gender, education, and income, but not in terms of race/ethnicity, mirroring Ipeirotis’s (2010) findings for the MTurk user population as a whole.[6] This means that we can use our MTurk surveys to test the influence of some but not all social variables that may play a role in language variation; in particular, other methods are better suited for studying variation that correlates with race/ethnicity.

3.3.3 Data processing

After downloading our survey results from Qualtrics in CSV format, we process the dataset so that it can be used for geospatial and statistical analysis. One primary task is to geocode the data; that is, to add geographic coordinates for each location provided by the participant. A second major task is to remove responses that we cannot use because the participant:

  • did not complete the survey,

  • completed the survey more than once,

  • grew up outside the United States,

  • had a transient childhood,

  • or failed the controls.

We typically end up keeping only about 50% of responses. For further details about our geocoding workflow and response exclusion criteria, see the survey manual in the supplementary materials.
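The exclusion step can be sketched as a simple filter. This is only an illustration under stated assumptions: the field names (`finished`, `worker_id`, `grew_up_in_us`, `transient_childhood`, `passed_controls`) are hypothetical, and the handling of repeat participants (keep the first completed submission, drop later ones) is one possible policy, not necessarily the one documented in the survey manual.

```python
from typing import Dict, List


def usable_responses(rows: List[Dict]) -> List[Dict]:
    """Apply the five exclusion criteria from the text to a list of response records."""
    kept: List[Dict] = []
    seen_workers = set()
    for row in rows:
        if not row["finished"]:
            continue  # did not complete the survey
        if row["worker_id"] in seen_workers:
            continue  # completed the survey more than once (assumed policy: keep the first)
        seen_workers.add(row["worker_id"])
        if not row["grew_up_in_us"]:
            continue  # grew up outside the United States
        if row["transient_childhood"]:
            continue  # had a transient childhood
        if not row["passed_controls"]:
            continue  # failed the controls
        kept.append(row)
    return kept


def retention_rate(rows: List[Dict]) -> float:
    """Share of downloaded responses that survive the exclusions."""
    return len(usable_responses(rows)) / len(rows) if rows else 0.0
```

Computing `retention_rate` on each survey makes it easy to check whether a given batch is in line with the roughly 50% figure mentioned above.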

3.4 Conducting geospatial analysis

Our survey methodology lends itself nicely to studying geographical variation in acceptability judgments, since we get participants from all across the country. This benefit does come with an analytical problem: acceptability judgments are gradient, and in most cases, their spatial distribution is gradient too. So we face two kinds of questions. First, how do we represent the spatial distribution of judgments graphically, so that one can examine a map and glean patterns from it? Second, when are the patterns we see statistically reliable?

For the first question, there are many possibilities. After piloting some of them, we determined that a twofold approach is the most useful. First, we plot the individual participants’ primary childhood residence as points on a map. Often, we present judgments of 1–2 in one color, and 4–5 as a second color, omitting 3s. This is only for visual presentation; 3s are included in all calculations, including the interpolation and hot spot analyses discussed below.

Second, we use interpolation to visualize patterns of values. Interpolation fills in, for each part of the map, what its expected value is, based on a computation over the values of the points closest to it. We use the inverse-distance weighted algorithm (see Wood [2016]). We visualize the interpolation with different shades for 1–2, 2–3, 3–4, and 4–5. The result smooths over the points to reveal broader patterns in a visually clear way.
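To make the idea concrete, here is a minimal sketch of inverse-distance weighting in Python. It assumes planar (already projected) coordinates, and the exponent `power` and neighbor count `k` are assumed values rather than settings reported in the text; the treatment of values that fall exactly on a band boundary in `shade_band` is likewise an assumption.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float, float]  # (x, y, judgment on the 1-5 scale)


def idw(x: float, y: float, points: List[Point],
        power: float = 2.0, k: int = 12) -> float:
    """Inverse-distance weighted estimate at (x, y) from the k nearest data points."""
    nearest = sorted(points, key=lambda p: math.hypot(p[0] - x, p[1] - y))[:k]
    weighted_sum = 0.0
    weight_total = 0.0
    for px, py, value in nearest:
        distance = math.hypot(px - x, py - y)
        if distance == 0.0:
            return value  # the location coincides with a data point
        weight = 1.0 / distance ** power  # closer points count for more
        weighted_sum += weight * value
        weight_total += weight
    return weighted_sum / weight_total


def shade_band(value: float) -> str:
    """Map an interpolated value to one of the four shading bands named in the text."""
    if value <= 2:
        return "1-2"
    if value <= 3:
        return "2-3"
    if value <= 4:
        return "3-4"
    return "4-5"
```

For instance, a location exactly halfway between a participant who gave a 1 and a participant who gave a 5 receives the estimate 3, because both neighbors get the same weight.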

Together, collapsing the values of the point data (1 and 2 as the same color and 4 and 5 as the same color) and projecting interpolation under them generally provide a good visual overview of a dataset. But it is not always clear whether a geographic pattern is statistically reliable. One useful tool for this is the Gi* statistic (Grieve et al. 2011; Tamminga 2013), referred to as a “hot spot” analysis in ArcGIS, the software we use for mapping and geospatial analysis. The hot spots test is conducted for each data point. If there are any hot spots, we then draw borders around them to indicate an overall “region” of contiguous hot spots.[7]
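The following sketch shows how a Gi* score can be computed for each data point. It is not the ArcGIS implementation: it assumes planar coordinates and binary weights within a fixed distance `radius` (a hypothetical parameter), and it applies no correction for multiple testing.

```python
import math
from typing import List, Tuple


def gi_star(points: List[Tuple[float, float, float]], radius: float) -> List[float]:
    """Getis-Ord Gi* z-score for each (x, y, value) point, with binary distance-band weights."""
    n = len(points)
    if n < 2:
        return [0.0] * n
    values = [v for _, _, v in points]
    mean = sum(values) / n
    variance = sum(v * v for v in values) / n - mean ** 2
    s = math.sqrt(max(variance, 0.0))  # guard against tiny negative rounding errors
    scores = []
    for xi, yi, _ in points:
        # w_ij = 1 if point j lies within the radius of point i (i itself included), else 0
        w = [1.0 if math.hypot(xj - xi, yj - yi) <= radius else 0.0
             for xj, yj, _ in points]
        sum_w = sum(w)
        sum_w2 = sum(wj * wj for wj in w)
        numerator = sum(wj * v for wj, v in zip(w, values)) - mean * sum_w
        denominator = s * math.sqrt((n * sum_w2 - sum_w ** 2) / (n - 1))
        scores.append(numerator / denominator if denominator > 0 else 0.0)
    return scores
```

Each score is a z-score: large positive values mark points surrounded by high judgments (candidate hot spots), large negative values mark points surrounded by low judgments (candidate cold spots), with |z| > 1.96 corresponding to the conventional 5% two-tailed threshold.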

An example is shown in Figure 1, which shows the results of the sentence Here’s you some money. The shaded interpolation reveals that the sentence is judged better in the South than in other areas. The preponderance of green dots there (representing judgments of 4 or 5), versus black dots (representing judgments of 1 or 2), reinforces this, and tells us what the interpolation is based on. Some areas, for example, have more data than others, and this is valuable information. The red border indicates a hot spot region, and blue borders indicate cold spot regions. This tells us that the pattern we see is statistically significant.

Figure 1: Here’s you some money.

Another technique we find useful is to overlay previously postulated dialect boundaries on our data. We can tag each data point with its dialect region and include that information in a regression analysis along with age, sex, race, etc. We also average over such regions and map them out. For example, the map in Figure 2 shows the average judgments for do-support with the have yet to construction (see Section 4). The dialect regions come from the Atlas of North American English (Labov et al. 2006). The darker the shade of blue, the higher the judgment.[8] This map shows that such sentences are degraded for Southern speakers, but we find many acceptances in several areas in the North. Figure 3 shows the values and 95% confidence intervals for these regions; a one-way ANOVA reveals that the differences among means are statistically significant (F[12, 494] = 3.054, p = 0.0004); Tukey-corrected multiple comparisons reveal significant pairwise differences between the South and both Inland North and Western Pennsylvania, and between Western Pennsylvania and New York.
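For readers less familiar with the test, the F statistic of a one-way ANOVA over regions can be sketched as follows. The function name and the toy ratings are invented for illustration and are not survey data; the sketch also assumes that judgments vary within at least one region, so that the within-group sum of squares is nonzero.

```python
from typing import Dict, List, Tuple


def one_way_anova(groups: Dict[str, List[float]]) -> Tuple[float, int, int]:
    """Return (F, df_between, df_within) for judgments grouped by dialect region."""
    all_values = [v for vals in groups.values() for v in vals]
    grand_mean = sum(all_values) / len(all_values)
    group_means = {name: sum(vals) / len(vals) for name, vals in groups.items()}
    # Variation of the region means around the grand mean, weighted by region size
    ss_between = sum(len(vals) * (group_means[name] - grand_mean) ** 2
                     for name, vals in groups.items())
    # Variation of individual judgments around their own region mean
    ss_within = sum((v - group_means[name]) ** 2
                    for name, vals in groups.items() for v in vals)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Invented toy ratings, three regions with three participants each
toy = {"South": [1, 2, 3], "Inland North": [3, 4, 5], "New York": [2, 3, 4]}
f_stat, df_between, df_within = one_way_anova(toy)
```

On this formula, the degrees of freedom reported above (12 and 494) correspond to 13 regions and 507 responses, since df_between is the number of groups minus one and df_within is the number of observations minus the number of groups.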

Figure 2: Do-support with the have yet to construction.

Figure 3: Do-support with the have yet to construction.

4 Overview of project results

In this section, we briefly discuss some of what we have been finding in our ongoing survey work.

4.1 Interspeaker variation “in every room”

In many cases we find that some sentence or construction has a particular geographical distribution (see Figure 1). We will discuss examples of this kind of result below. In many other cases, however, we find rampant interspeaker variation without any geographic or demographic correlate. Some examples of this type are shown in (4).

(4) a. Shouldn’t have Pam remembered her name?
       (Johnson 1988: 160 (13a))
    b. John threatened me to come to my house.
       (Hartman 2011: 127 (33b); see also Zubizarreta 1982)
    c. John seems like Mary defeated him.
       (Asudeh and Toivonen 2012: 329 (20b); see also Rogers 1973)

In (4a), two auxiliaries appear to the left of the subject of a yes–no question (rather than the standard single auxiliary). Sentence (4b) exemplifies subject control in the presence of an indirect object, an important construction type in the control literature (see Landau [2013: 149 and references therein]). Copy raising, recently discussed by Landau (2011) and Asudeh and Toivonen (2012), is exemplified in (4c) – in this case involving an embedded object pronoun (him) that matches the matrix subject (John). All three of these sentence types vary across speakers, but none of the variation, as far as we can tell, has geographic correlates. See Wood et al. (2015a, 307) for a map of (4a) and Wood (2016) for a map of (4b). Figure 4 presents a map of a variant of (4c).

Figure 4: John seems like Mary offended him.

These kinds of results are certainly open to interpretation; the point here is just that a variety of sentences that are reported in the syntax literature to vary across speakers exhibit “variation in every room” – that is, interspeaker variation that might be expected in any given room of native speakers. Knowing what kinds of syntactic phenomena vary across individuals in the same speech community raises a host of interesting questions about language acquisition and online sentence processing. Moreover, generative syntacticians aim to understand interspeaker variation in general, whether it is tied to geography or not (Kayne 2013: 133).

4.2 Implicational relationships

One kind of result that has been theoretically illuminating, independent of geographic concerns, involves implicational relationships of the kind familiar from linguistic typology (Greenberg [1963, 1966] and later work; see Szmrecsanyi and Kortmann [2009] and Siemund [2013] for further discussion of these relationships across English dialects). For example, Tyler and Wood (forthcoming) study survey results focusing on the have yet to (HYT) construction, illustrated with sentences like (5). One of the questions they pursue is whether have is an auxiliary, leading us to expect a yes–no question as in (6a), or a main verb, leading us to expect a yes–no question as in (6b).

(5) I have yet to visit my grandmother.
(6) a. Have you yet to visit your grandmother?
    b. Do you have yet to visit your grandmother?

There is quite a bit of variation in the judgments of sentences like those in (6). The quantitative results lead Tyler and Wood (forthcoming) to two conclusions. First, there are enough speakers who accept (6b) to take it to be a genuine syntactic option for many speakers. Second, speakers who accept (6b) are overwhelmingly likely to accept (6a) as well, but not the other way around: many speakers accept (6a) but reject (6b). Tyler and Wood (forthcoming) develop a syntactic analysis of the HYT construction intended to derive this asymmetry. Results like this strike us as one key area where quantitative studies can help raise (and answer) interesting theoretical questions.
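An asymmetry of this kind can be checked with a simple cross-tabulation of each participant's two ratings. The sketch below is illustrative only: the cutoff `ACCEPT` (ratings of 4 or 5 count as acceptance) and the function names are assumptions, not the analysis carried out by Tyler and Wood.

```python
from typing import Dict, List, Tuple

ACCEPT = 4  # assumed cutoff: ratings of 4 or 5 on the 5-point scale count as acceptance


def implication_table(ratings: List[Tuple[int, int]]) -> Dict[str, int]:
    """Cross-tabulate acceptance of (6a) and (6b); each pair is (rating_6a, rating_6b)."""
    table = {"both": 0, "only_6a": 0, "only_6b": 0, "neither": 0}
    for rating_6a, rating_6b in ratings:
        accepts_a = rating_6a >= ACCEPT
        accepts_b = rating_6b >= ACCEPT
        if accepts_a and accepts_b:
            table["both"] += 1
        elif accepts_a:
            table["only_6a"] += 1
        elif accepts_b:
            table["only_6b"] += 1
        else:
            table["neither"] += 1
    return table


def conditional_acceptance(table: Dict[str, int]) -> Tuple[float, float]:
    """Return (share of 6b-accepters who accept 6a, share of 6a-accepters who accept 6b)."""
    accept_a = table["both"] + table["only_6a"]
    accept_b = table["both"] + table["only_6b"]
    a_given_b = table["both"] / accept_b if accept_b else 0.0
    b_given_a = table["both"] / accept_a if accept_a else 0.0
    return a_given_b, b_given_a
```

A one-way implication of the kind described above shows up as a first share close to 1 together with a clearly lower second share, that is, as a nearly empty `only_6b` cell next to a well-populated `only_6a` cell.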

4.3 Geographic diffusion

Because we ask for the ages of our survey participants, we are able to analyze changes in a grammatical phenomenon’s geographic distribution over time. An example of this comes from the “personal dative” construction (Christian 1991; Webelhuth and Dannenberg 2006; Horn 2008, 2013; Gerwin 2014: Ch. 7; Hutchinson and Armstrong 2014). As illustrated in (7), the dative (her) is obligatorily pronominal and coreferent with the subject (she):

(7) She has her a new boyfriend.

Personal datives were thought to be characteristic of the South (Webelhuth and Dannenberg 2006), though some (e.g., Christian 1991: 18) have conjectured that they may be found in other vernacular varieties as well. Our research has shown that the construction may be spreading geographically in “apparent time” (Bailey et al. 1991). Among speakers over 40, the construction is primarily accepted in the South. Among speakers between 18 and 30, however, acceptance is much more widespread. (Speakers between 31 and 40 fall somewhere in the middle.) This result, which may reflect what Horn (2008: 176) calls the “Braxton effect,”[9] illustrates the kind of results we might expect when we intersect geographic region with other linguistically relevant social categories.
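An apparent-time comparison of this sort amounts to computing acceptance rates per age band and region. The sketch below uses the three age bands named in the text; the acceptance cutoff (ratings of 4 or 5), the function names, and the two-way region split used in the usage note are assumptions made for illustration.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def age_band(age: int) -> str:
    """Assign an adult participant to one of the three age bands named in the text."""
    if age <= 30:
        return "18-30"
    if age <= 40:
        return "31-40"
    return "over 40"


def acceptance_rates(
    responses: Iterable[Tuple[int, str, int]]
) -> Dict[Tuple[str, str], float]:
    """Share of ratings of 4 or 5 for each (age band, region) cell.

    Each response is an (age, region, rating) triple.
    """
    accepted = defaultdict(int)
    total = defaultdict(int)
    for age, region, rating in responses:
        cell = (age_band(age), region)
        total[cell] += 1
        if rating >= 4:  # assumed acceptance cutoff
            accepted[cell] += 1
    return {cell: accepted[cell] / total[cell] for cell in total}
```

With regions coded, say, as "South" versus "Elsewhere", geographic spread in apparent time would appear as a rising acceptance rate in the "Elsewhere" cells as one moves from the oldest band to the youngest.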

4.4 Known geographic distributions

A final kind of result involves confirming or elaborating on previously hypothesized geographic distinctions. We will discuss one example of each, both taken from Wood (2016), to which the reader is referred for more details.

For an example of confirmation, the so don’t I construction previously mentioned in (1e) has long been thought to be restricted to eastern New England (Labov 1972: 815; Hall 2013). This is borne out in our survey data. While nearly half of the participants in eastern New England accept sentences like (1e), exceedingly few outside of eastern New England do.

For an example of elaboration, the be done my homework construction previously mentioned in (1e) has been thought to be characteristic of Canadian, Vermont, and Philadelphia English (Labov 2001; Yerastov 2008, 2010, 2015; Fruehwald and Myler 2015).[10] This is borne out in our survey data, but we also find a more complex picture: it is favored not only in Philadelphia, but also in surrounding areas such as Delaware, southern New Jersey, and Maryland. It is also highly favored in much of New England, including not just Vermont, but New Hampshire, eastern Massachusetts, and Maine.

4.5 Summary

In sum, our primary research objectives stem from theoretical questions, and our survey methodology is designed to collect data that will bear on syntactic theory. But since our surveys also collect geographic and demographic information, we can analyze this information to determine which demographic factors, if any, constrain the phenomena we study. Knowing this can certainly help us to advance knowledge relevant to syntacticians, at the very least to help us find speakers of a construction we are interested in. Beyond that, we also aim, with this aspect of our project, to build on a rich tradition of dialect research and contribute to a broader understanding of how syntactic variation distributes across speakers of North American English.

5 Project outreach

As should be clear by now, the Yale Grammatical Diversity Project consists of many components. This has allowed us to collaborate with linguists at different levels of seniority and experience, from undergraduates with limited background, to graduate students, postdoctoral fellows, and faculty members. Each person has the opportunity to learn new content and skills, and to be involved in the mentoring of less experienced team members.

We have integrated our research into our teaching. For example, we have offered a course called “Grammatical Diversity in US English” as a freshman seminar, as a seminar for linguistics majors, and as an advanced syntax seminar open to both advanced undergraduate and graduate students. Our students tell us they have enjoyed sharing their new understanding of linguistic variation with interested friends and family members. They also valued the opportunity to meet professional linguists in the classroom, when we were able to invite the author(s) of their readings or public intellectuals who address interspeaker variation in American English in the media. Beyond teaching, we have advised PhD dissertations (Harris In progress; Matyiku 2017) and senior essays on topics related to the project.

We share our work with linguists outside our institution by publishing in a broad variety of venues; by seeking opportunities to give talks at conferences, workshops, and departmental colloquia; and by inviting other linguists with relevant interests to speak at our group meetings. We have also organized two workshops at annual meetings of the Linguistic Society of America. Both represented valuable opportunities to present our work to other members of our professional organization and to give exposure to the work of younger colleagues working on these topics, whether at Yale or at other institutions.

To share our findings and insights with non-linguists, we have been using our website (described in Section 2), Zotero bibliographic database (see Section 3.1), public lectures, and various types of media outlets. These include a Facebook page, a blog, two op-ed pieces (Zanuttini 2014, 2015), and press interviews. We value these opportunities to share what we learn from this project with a wider audience, not only to pass on what we know, but also to correct some misconceptions about language in general and dialects in particular. For example, some people think that one may find different lexical items and different “accents” in American English, i.e., differences in the phonological system, but not differences in the grammar (the morphosyntactic system); when grammatical differences are noted, they are attributed to simple ignorance of the “correct grammar” of English. As linguists, we can address such misconceptions, to raise awareness of what it means for an individual to master a language and, perhaps even more important, to discredit prejudice masquerading as the preservation of “correct” or “proper” grammar.

6 Conclusion

We have presented an overview of the goals and methods of the Yale Grammatical Diversity Project and sketched the types of results we have been finding. We see this work as informing and being informed by theoretical syntax, as well as illuminating the factors that correlate with the distribution of grammatical phenomena. We also see this line of work as a great opportunity to capitalize on existing public interest in language variation to illustrate the different ways in which linguists approach the study of language, to dispel myths, and to make scientific results accessible outside of academia.

Acknowledgments


Our work has benefited from interactions with more colleagues and students than we could possibly mention here, but to name just a few, we are especially grateful for ongoing discussions and collaborations with Judy Bernstein, Bob Frank, Lisa Green, Bill Haddican, Tricia Irwin, Greg Johnson, Goldie Ann McQuaid, Neil Myler, Teresa O’Neill, and Christina Tortora. Thanks also to the audiences at a variety of conferences where we have presented parts of the work described here. We are indebted to Bernd Kortmann and Eric Potsdam for their incisive comments on our manuscript. Finally, thanks to former and current members of the project for their contributions: Matt Barros, Phoebe Gaston, Alysia Harris, Nick Huang, Aidan Kaplan, Luke Lindemann, Zach Maher, Sabina Matyiku, Tom McCoy, Rachel Regan, Katie Ruffing, Peter Staub, Dennis Storoshenko, and Matt Tyler.

Funding: The work described in this paper is supported in part by NSF grant BCS-1423872.

References


Adams, Michael. 2003. Slayer slang: A Buffy the Vampire Slayer lexicon. Oxford: Oxford University Press.

Asudeh, Ash & Ida Toivonen. 2012. Copy raising and perception. Natural Language & Linguistic Theory 30(2). 321–380. doi:10.1007/s11049-012-9168-2

Bailey, Guy, Tom Wikle, Jan Tillery & Lori Sand. 1991. The apparent time construct. Language Variation and Change 3(3). 241–264.

Barbiers, Sjef, Johan van der Auwera, Hans Bennis, Eefje Boef, Gunther De Vogelaer & Margreet van der Ham. 2008. Syntactic atlas of the Dutch dialects. Vol. 2. Amsterdam: Amsterdam University Press.

Barbiers, Sjef, Hans Bennis, Gunther De Vogelaer, Magda Devos & Margreet van der Ham. 2005. Syntactic atlas of the Dutch dialects. Vol. 1. Amsterdam: Amsterdam University Press. doi:10.5117/9789053567005

Baugh, John. 1983. Black street speech: Its history, structure, and survival (Texas Linguistics Series). Austin: University of Texas Press.

Behrend, Tara S., David J. Sharek, Adam W. Meade & Eric N. Wiebe. 2011. The viability of crowdsourcing for survey research. Behavior Research Methods 43(3). 800–813.

Carver, Craig M. 1987. American regional dialects: A word geography. Ann Arbor: University of Michigan Press. doi:10.3998/mpub.12484

Christian, Donna. 1991. The personal dative in Appalachian speech. In Peter Trudgill & J. K. Chambers (eds.), Dialects of English: Studies in grammatical variation (Longman Linguistics Library), 13–19. London: Longman.

Cornips, Leonie. 2015. An interview on linguistic variation with Leonie Cornips. Isogloss 1(2). 313–317.

Edelstein, Elspeth. 2014. This syntax needs studied. In Raffaella Zanuttini & Laurence R. Horn (eds.), Micro-syntactic variation in North American English (Oxford Studies in Comparative Syntax), 242–268. Oxford: Oxford University Press. doi:10.1093/acprof:oso/9780199367221.003.0008

Erlewine, Michael Yoshitaka & Hadas Kotek. 2016. A streamlined approach to online linguistic surveys. Natural Language & Linguistic Theory 34(2). 481–495.

Feagin, Crawford. 1979. Variation and change in Alabama English: A sociolinguistic study of the white community. Washington, DC: Georgetown University Press.

Fruehwald, Josef & Neil Myler. 2015. I’m done my homework—Case assignment in a stative passive. Linguistic Variation 15(2). 141–168.

Gerwin, Johanna. 2014. Ditransitives in British English dialects (Topics in English Linguistics 50.3). Berlin: De Gruyter Mouton. doi:10.1515/9783110352320

Gibson, Edward, Steve Piantadosi & Kristina Fedorenko. 2011. Using Mechanical Turk to obtain and analyze English acceptability judgments. Language & Linguistics Compass 5(8). 509–524.

Green, Lisa J. 2002. African American English: A linguistic introduction. Cambridge: Cambridge University Press. doi:10.1017/CBO9780511800306

Greenberg, Joseph H. 1963. Some universals of grammar with particular references to the order of meaningful elements. In Joseph H. Greenberg (ed.), Universals of language: Report of a conference held at Dobbs Ferry, New York, April 13–15, 1961, 73–113. Cambridge, MA: MIT Press.

Greenberg, Joseph H. 1966. Language universals, with special reference to feature hierarchies (Janua linguarum. Series minor 59). The Hague: Mouton.

Grieve, Jack. 2009. A corpus-based regional dialect survey of grammatical variation in written Standard American English. Flagstaff, AZ: Northern Arizona University dissertation.

Grieve, Jack. 2016. Regional variation in written American English (Studies in English language). Cambridge: Cambridge University Press. doi:10.1017/CBO9781139506137

Grieve, Jack, Dirk Speelman & Dirk Geeraerts. 2011. A statistical method for the identification and aggregation of regional linguistic variation. Language Variation and Change 23(2). 193–221.

Hall, Joan Houston (ed.). 2013. Dictionary of American regional English (DARE). Digital edition. (accessed 12 July, 2016). Cambridge, MA: Harvard University Press.

Harris, Alysia Nicole. 2013. Stressed BIN BIN causing stress: A formal semantic and pragmatic account of the focused remote perfect marker in AAE. Qualifying paper. New Haven, CT: Yale University.

Harris, Alysia Nicole. In progress. The non-aspectual meaning of African American English ‘aspect’ markers: A grammar for emotion and expectation. New Haven, CT: Yale University dissertation.

Hartman, Jeremy. 2011. (Non-)intervention in A-movement: Some cross-constructional and cross-linguistic considerations. Linguistic Variation 11(2). 121–148. doi:10.1075/lv.11.2.01har

Hazen, Kirk, Jaime Flesher & Erin Simmons. 2013. The Appalachian range: The limits of language variation in West Virginia. In Amy D. Clark & Nancy M. Hayward (eds.), Talking Appalachian: Voice, identity, and community, 54–69. Lexington, KY: University Press of Kentucky.

Hernández, Nuria, Daniela Kolbe & Monika Edith Schulz. 2011. A comparative grammar of British English dialects: Modals, pronouns and complement clauses (Topics in English Linguistics 50.2). Berlin: De Gruyter Mouton. doi:10.1515/9783110240290

Hickey, Raymond (ed.). 2012. Areal features of the anglophone world. (Topics in English Linguistics 80). Berlin: De Gruyter Mouton. doi:10.1515/9783110279429

Horn, Laurence R. 2008. “I love me some him”: The landscape of non-argument datives. Empirical Issues in Syntax and Semantics 7. 169–192.

Horn, Laurence R. 2013. I love me some datives: Expressive meaning, free datives, and F-implicature. In Daniel Gutzmann & Hans-Martin Gärtner (eds.), Beyond expressives: Explorations in use-conditional meaning (Current Research in the Semantics/Pragmatics Interface 28), 151–199. Leiden: Brill. doi:10.1163/9789004183988_006

Horn, Laurence R. 2014. Afterword: Microvariation in syntax and beyond. In Raffaella Zanuttini & Laurence R. Horn (eds.), Micro-syntactic variation in North American English (Oxford Studies in Comparative Syntax), 324–348. Oxford: Oxford University Press. doi:10.1093/acprof:oso/9780199367221.003.0011

Huang, Yuan, Diansheng Guo, Alice Kasakoff & Jack Grieve. 2016. Understanding U.S. regional linguistic variation with Twitter data analysis. Computers, Environment and Urban Systems 59. 244–255.

Hutchinson, Corinne & Grant Armstrong. 2014. The syntax and semantics of personal datives in Appalachian English. In Raffaella Zanuttini & Laurence R. Horn (eds.), Micro-syntactic variation in North American English (Oxford Studies in Comparative Syntax), 178–214. Oxford: Oxford University Press. doi:10.1093/acprof:oso/9780199367221.003.0006

Ipeirotis, Panos. 2010. Demographics of Mechanical Turk. NYU Center for Digital Economy Research Working Paper CeDER-10-01. New York: New York University.

Ipeirotis, Panos. 2015. Demographics of Mechanical Turk: Now live! (April 2015 edition). A Computer Scientist in a Business School. (accessed 22 July, 2016).

Irwin, Patricia. 2014. SO [totally] speaker-oriented: An analysis of “Drama SO”. In Raffaella Zanuttini & Laurence R. Horn (eds.), Micro-syntactic variation in North American English (Oxford Studies in Comparative Syntax), 29–70. Oxford: Oxford University Press. doi:10.1093/acprof:oso/9780199367221.003.0002

Johnson, Kyle. 1988. Verb raising and ‘have’. McGill Working Papers in Linguistics 6(1). 156–167.

Johnson, Dan R. & Lauren A. Borden. 2012. Participants at your fingertips: Using Amazon’s Mechanical Turk to increase student–faculty collaborative research. Teaching of Psychology 39(4). 245–251.

Karttunen, Lauri. 2014. Three ways of not being lucky. Presentation at Semantics and Linguistic Theory (SALT) 24, Austin, May 30–June 1.

Kayne, Richard S. 2005. Some notes on comparative syntax: With special reference to English and French. In Movement and silence (Oxford Studies in Comparative Syntax), 277–333. Oxford: Oxford University Press. doi:10.1093/acprof:oso/9780195179163.003.0012

Kayne, Richard S. 2013. Comparative syntax. Lingua 130. 132–151.

Kortmann, Bernd. 2003. Comparative English dialect grammar: A typological approach. In Ignacio M. Palacios Martínez, María José López Couso, Patricia Fra López & Elena Seoane Posse (eds.), Fifty years of English studies in Spain (1952–2002): A commemorative volume (Cursos e congresos da Universidade de Santiago de Compostela 140), 63–81. Santiago de Compostela: Universidade de Santiago de Compostela.

Kortmann, Bernd. 2005. Freiburg English dialect corpus (FRED). Freiburg, Germany: University of Freiburg. (accessed 21 October, 2016). doi:10.1515/9783110197518.1

Kortmann, Bernd, Tanja Herrmann, Lukas Pietsch & Susanne Wagner (eds.). 2005. A comparative grammar of British English dialects: Agreement, gender, relative clauses. (Topics in English Linguistics 50.1). Berlin: Mouton de Gruyter. doi:10.1515/9783110197518.1

Kortmann, Bernd & Kerstin Lunkenheimer (eds.). 2012. The Mouton world atlas of variation in English. Berlin: De Gruyter Mouton. doi:10.1515/9783110280128

Kortmann, Bernd & Kerstin Lunkenheimer (eds.). 2013. The electronic world atlas of varieties of English (eWAVE). Leipzig: Max Planck Institute for Evolutionary Anthropology. (accessed 29 September, 2016).

Kortmann, Bernd & Edgar W. Schneider (eds.). 2004. A handbook of varieties of English: A multimedia reference tool. Berlin: De Gruyter.

Kotek, Hadas, Yasutada Sudo, Edwin Howard & Martin Hackl. 2011. Most meanings are superlative. In Jeffrey T. Runner (ed.), Experiments at the interfaces (Syntax and Semantics 37), 101–145. Leiden: Brill. doi:10.1108/S0092-4563(2011)0000037008

Kurath, Hans, Miles L. Hanley, Bernard Bloch, Marcus Lee Hansen & Guy S. Lowman Jr. (eds.). 1939–1943. Linguistic atlas of New England. (Linguistic Atlas of the United States and Canada). Providence, RI: Brown University.

Labov, William. 1972. Negative attraction and negative concord in English grammar. Language 48(4). 773–818.

Labov, William. 2001. Principles of linguistic change. Vol. 2, Social factors (Language in Society 29). Oxford: Blackwell.

Labov, William, Sharon Ash & Charles Boberg. 2006. The atlas of North American English: Phonetics, phonology, and sound change: A multimedia reference tool. Berlin: Mouton de Gruyter. doi:10.1515/9783110167467

Labov, William, Paul Cohen, Clarence Robins & John Lewis. 1968. A study of the non-standard English of Negro and Puerto Rican speakers in New York City. Vol. 1: Phonological and grammatical analysis. US Department of Health, Education & Welfare, Office of Education CRP-3288. New York: Columbia University.

Landau, Idan. 2011. Predication vs. aboutness in copy raising. Natural Language & Linguistic Theory 29(3). 779–813.

Landau, Idan. 2013. Control in generative grammar: A research companion. Cambridge: Cambridge University Press. doi:10.1017/CBO9781139061858

Lanehart, Sonja L. (ed.). 2015. The Oxford handbook of African American Language. (Oxford Handbooks in Linguistics). Oxford: Oxford University Press.

Lawler, John M. 1974. Ample negatives. Chicago Linguistic Society (CLS) 10. 357–377.

Matyiku, Sabina. 2017. Semantic effects of head movement: Evidence from negative auxiliary inversion constructions. New Haven, CT: Yale University dissertation.

McQuaid, Goldie Ann. 2012. Variation at the morphology-phonology interface in Appalachian English. Washington, DC: Georgetown University dissertation.

Montgomery, Michael & Joseph S. Hall. 2004a. Dictionary of Smoky Mountain English. Knoxville: University of Tennessee Press.

Montgomery, Michael & Joseph S. Hall. 2004b. Grammar and syntax of Smoky Mountain English. In Dictionary of Smoky Mountain English, xxxv–lxv. Knoxville: University of Tennessee Press.

Murray, Thomas E. & Beth Lee Simon. 1999. Want + past participle in American English. American Speech 74(2). 140–164.

Poletto, Cecilia & Paola Benincà. 2007. The ASIS enterprise: A view on the construction of a syntactic atlas for the Northern Italian dialects. Nordlyd 34(1). 35–52.

Pylkkänen, Liina. 2008. Introducing arguments (Linguistic Inquiry Monographs 49). Cambridge, MA: MIT Press. doi:10.7551/mitpress/9780262162548.001.0001

Reed, Paul & Michael Montgomery (eds.). 2016. MultiMo: The database of multiple modals. (accessed 13 October, 2016).

Rickford, John R. 1975. Carrying the new wave into syntax: The case of Black English BÍN. In Ralph W. Fasold & Roger W. Shuy (eds.), Analyzing variation in language: Papers from the second colloquium on New Ways of Analyzing Variation (NWAV), 162–183. Washington, DC: Georgetown University Press.

Rickford, John R. 1999. African American Vernacular English: Features, evolution, educational implications (Language in Society 26). Malden, MA: Blackwell Publishers.

Rogers, Andrew Daylon. 1973. Physical perception verbs in English: A study in lexical relatedness. Los Angeles: University of California dissertation.

Schneider, Edgar W. (ed.). 2008. Varieties of English. Vol. 2, The Americas and the Caribbean. Berlin: Mouton de Gruyter. doi:10.1515/9783110208405.0.23

Schneider, Edgar W. 2012. Regional profile: North America. In Bernd Kortmann & Kerstin Lunkenheimer (eds.), The Mouton world atlas of variation in English, 734–762. Berlin: De Gruyter Mouton. doi:10.1515/9783110280128.734

Siemund, Peter. 2013. Varieties of English: A typological approach. Cambridge: Cambridge University Press. doi:10.1017/CBO9781139028240

Sprouse, Jon. 2011. A validation of Amazon Mechanical Turk for the collection of acceptability judgments in linguistic theory. Behavior Research Methods 43(1). 155–167.

Szmrecsanyi, Benedikt & Bernd Kortmann. 2009. The morphosyntax of varieties of English worldwide: A quantitative perspective. Lingua 119. 1643–1663.

Tagliamonte, Sali. 2006. “So cool, right?”: Canadian English entering the 21st century. Canadian Journal of Linguistics 51(2). 309–331.

Tamminga, Meredith. 2013. Phonology and morphology in Dutch indefinite determiner syncretism: Spatial and quantitative perspectives. Journal of Linguistic Geography 1(2). 115–124.

Trousdale, Graeme & David Adger (eds.). 2007. Special issue on English dialect syntax: Theoretical perspectives. English Language and Linguistics 11(2). 257–435. doi:10.1017/S1360674307002249

Tyler, Matthew & Jim Wood. Forthcoming. Microvariation in the have yet to construction. Linguistic Variation.

Vaux, Bert & Scott Golder. 2003. The Harvard dialect survey. Cambridge, MA: Harvard University. (accessed 29 September, 2016).

Webelhuth, Gert & Clare J. Dannenberg. 2006. Southern American English personal datives: The theoretical significance of dialectal variation. American Speech 81(1). 31–55.

Wolfram, Walt. 1976. Toward a description of a-prefixing in Appalachian English. American Speech 51(1). 45–56.

Wolfram, Walt & Donna Christian. 1976. Appalachian speech. Arlington, VA: Center for Applied Linguistics.

Wolfram, Walt & Natalie Schilling. 2016. American English: Dialects and variation. 3rd edn. (Language in Society 25). Chichester, UK: Wiley Blackwell.

Wood, Jim. 2014. Affirmative semantics with negative morphosyntax: Negative exclamatives and the New England So AUXn’t NP/DP construction. In Raffaella Zanuttini & Laurence R. Horn (eds.), Micro-syntactic variation in North American English (Oxford Studies in Comparative Syntax), 71–114. Oxford: Oxford University Press. doi:10.1093/acprof:oso/9780199367221.003.0003

Wood, Jim. 2016. Quantifying acceptability judgments in regional American English dialect syntax. Unpublished manuscript. New Haven, CT: Yale University.

Wood, Jim, Laurence Horn, Raffaella Zanuttini & Luke Lindemann. 2015a. The Southern dative presentative meets Mechanical Turk. American Speech 90(3). 291–320.

Wood, Jim, Einar Freyr Sigurðsson & Raffaella Zanuttini. 2015b. Partitive doubling in Icelandic and Appalachian English. North East Linguistic Society (NELS) 45(3). 217–226.

Wood, Jim & Raffaella Zanuttini. 2016. Microvariation in American English applicative structures. Paper presented at Formal Ways of Analyzing Variation (FWAV) 3, CUNY Graduate Center, New York.

Wood, Jim & Raffaella Zanuttini. Forthcoming. Datives, data, and dialect syntax in American English. Glossa.

Yerastov, Yuri. 2008. I am done dinner: A case of lexicalization. Canadian Linguistic Association (CLA) 2008. 1–15.

Yerastov, Yuri. 2010. I’m done dinner: When synchrony meets diachrony. Calgary, AB: University of Calgary dissertation.

Yerastov, Yuri. 2015. A construction grammar analysis of the transitive be perfect in present-day Canadian English. English Language and Linguistics 19(1). 157–178.

Zanuttini, Raffaella. 2014. Our language prejudices don’t make no sense. Pacific Standard. (accessed 15 April, 2015).

Zanuttini, Raffaella. 2015. Don’t fear our changing language. Pacific Standard. (accessed 15 April, 2015).

Zanuttini, Raffaella & Judy B. Bernstein. 2014. Transitive expletives in Appalachian English. In Raffaella Zanuttini & Laurence R. Horn (eds.), Micro-syntactic variation in North American English (Oxford Studies in Comparative Syntax), 143–177. Oxford: Oxford University Press. doi:10.1093/acprof:oso/9780199367221.003.0005

Zubizarreta, María Luisa. 1982. On the relation of the lexicon to syntax. Cambridge, MA: MIT dissertation.

Supplemental Material

The online version of this article offers supplementary material.

Received: 2016-08-01
Accepted: 2016-12-16
Published Online: 2018-03-09

©2018 Walter de Gruyter GmbH, Berlin/Boston
