Category: Language

  • Urbanization, Ethnic Diversity, and the Rise of Indonesian

    As part of a multi-year project on language shift in contemporary Indonesia, Abby Cohn, Maya Ravindranath, and I have been using the incredible census data provided by IPUMS to study what factors determine whether Indonesians speak Indonesian at home. The data are remarkable in that they comprise a 1% sample of the 2010 Indonesian census, which means that our sample size is 2,358,774 individuals. And better yet, anyone can access these data.

    One thing that sociolinguists know is that urbanization leads to language shift in multilingual societies. In the Indonesian context, this means that speakers of ethnic languages like Javanese, Sundanese, Batak, and so forth will shift to speaking Bahasa Indonesia, the country’s national language.

    But what’s going on here? Is this a consequence of urbanization itself, and the accompanying “modernization” of everyday life that (1) exposes you to media in the national language and (2) shifts identities away from the regional/ethnic and toward the national? Or is it a consequence of the ethnic diversity found in urban areas, which leads speakers of different languages to encounter one another more regularly and thus increases the benefits of speaking a common national language? In principle these two processes are distinct: you could have urban areas without ethnic diversity, or rural areas that are highly diverse. The neat thing about Indonesia is that it is so big and heterogeneous that we have instances of all four combinations: urban and rural districts, each of which may be homogenous or diverse. This allows us to distinguish the two effects from one another.

    Because we know the district (kabupaten or kota) in which every individual lives, and we know his/her ethnic group, we can calculate a district-level measure of ethnic diversity (using a so-called Ethnic Fractionalization index [PDF]). We also know whether each individual is classified as living in an urban residence, so we can use that to calculate the fraction of each district that is urban. Both of these measures range from 0 to 1. Comparing each of the 494 districts recorded in the 2010 census, here is what we find.

    The good news is just how varied Indonesian districts are. There are ethnically homogenous, wholly urban districts (Kota Blitar, on Java) as well as ethnically homogenous, entirely rural districts (Nias Barat, off the coast of Sumatra). And looking to the right side of this scatterplot, we see a range of incredibly diverse districts, all of which are in Papua, that range from highly urban to highly rural.
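
    For readers who want to see how the two district-level measures work mechanically, here is a minimal sketch. It assumes the standard definition of ethnic fractionalization (one minus the sum of squared group shares within a district) and ignores census sampling weights. The function name, the record layout, and the toy records are all hypothetical; this is not the code used in the project.

```python
from collections import Counter, defaultdict


def district_measures(records):
    """Compute (ethnic fractionalization, urban share) for each district.

    `records` is an iterable of (district, ethnic_group, is_urban) tuples,
    one per individual. This layout is an assumption made for illustration.
    """
    group_counts = defaultdict(Counter)  # district -> ethnic group -> count
    urban_counts = Counter()             # district -> number of urban residents

    for district, group, is_urban in records:
        group_counts[district][group] += 1
        urban_counts[district] += int(bool(is_urban))

    measures = {}
    for district, counts in group_counts.items():
        n = sum(counts.values())
        # Fractionalization: 1 minus the sum of squared group shares.
        frac = 1.0 - sum((c / n) ** 2 for c in counts.values())
        measures[district] = (frac, urban_counts[district] / n)
    return measures


# Toy records invented purely for illustration (not census data).
toy = [
    ("A", "Javanese", True), ("A", "Javanese", True),
    ("B", "Javanese", False), ("B", "Sundanese", False),
]
print(district_measures(toy))
# {'A': (0.0, 1.0), 'B': (0.5, 0.0)}
```

    In the toy data, district A is homogenous and wholly urban, while district B is split evenly between two groups and entirely rural. The index can be read as the chance that two randomly drawn residents of a district belong to different ethnic groups.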

    From there, we fit a hierarchical/multilevel logistic regression model in which we predict whether or not an individual speaks Indonesian at home as a function of a range of individual-level characteristics (age and its square, gender, religion, education, etc.) as well as district-level urbanization, ethnic diversity, and their interaction. We then predict, based on the results of that model, the probability that an individual speaks Indonesian as a function of their district’s ethnic diversity and at the 10th, 50th, and 90th percentiles of district urbanization. Here is what we find.

    If you live in an ethnically homogenous district, the likelihood that you speak Indonesian at home is very low, no matter how urban that district is. But as ethnic diversity increases, so does the likelihood of speaking Indonesian, and especially so in urban districts. This shows very clearly that the relationship between urbanization and language shift in a diverse country like Indonesia really does depend on whether or not urbanization comes with increasing ethnic diversity. And although the relationship between ethnic diversity and language shift is strongest in urban districts, it is substantively quite large in rural districts too.

    Note, though, that to reach such a conclusion, you need a really diverse country like Indonesia that allows you to separate urbanization from ethnic diversity empirically. Thanks, Indonesia.
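
    A footnote on the mechanics: the predicted probabilities discussed above come from pushing a linear predictor through the inverse-logit function. The sketch below shows only the district-level part of such a model, with individual-level covariates and district random effects held at their baseline. The function and parameter names are hypothetical, and no coefficient values from our model are reproduced here; you would supply estimates from your own fitted model.

```python
import math


def predicted_probability(intercept, b_urban, b_diversity, b_interaction,
                          urban, diversity):
    """Inverse-logit of a linear predictor with an urban x diversity interaction.

    All coefficient arguments are placeholders for estimates from a fitted
    logistic model; individual-level terms and random effects are omitted.
    """
    eta = (intercept
           + b_urban * urban
           + b_diversity * diversity
           + b_interaction * urban * diversity)
    return 1.0 / (1.0 + math.exp(-eta))


def diversity_slope(b_diversity, b_interaction, urban):
    """Effect of diversity on the log-odds at a given level of urbanization."""
    return b_diversity + b_interaction * urban
```

    The second function is the reason the interaction matters: with a positive interaction coefficient, the diversity slope on the log-odds scale grows as a district becomes more urban, which is the pattern described in the results.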

  • Passive Unfortunate

    The LA Review of Books’ China Channel recently featured an essay on the passive voice in Mandarin (HT LanguageLog). Entitled “Passive Aggressive,” it explains a particular Mandarin passive construction that signals that what happened carries a negative connotation. Example:

    Gōngkè bèi gǒu chī diào le
    功課 被 狗 吃掉 了
    Homework bèi dog eat up le

    The homework was eaten up by the dog.

    The essay also notes that this “adversative passive” can be found in other Asian languages as well, including several (Japanese, Vietnamese, and Indonesian) that are 100% unrelated to Chinese (or to one another, for that matter). The adversative passive is an example of what linguists call an areal feature: a linguistic feature shared across multiple languages in a region regardless of their genealogical relationship with one another.

    But is Mandarin’s adversative passive bèi construction actually a parallel to the others? I am skeptical that the parallel is so clear.

    In Indonesian and Vietnamese, there are clear grammatical distinctions between a non-adversative passive and the adversative passive. These are things which a student learns in the first year of study. In Vietnamese:

    The active voice can be changed to passive voice by adding the following words: “được” if the verb describing the action implies beneficial effects for the agent and “bị” if the verb describing the action implies negative effects. The words “được” and “bị” must stand in front of the main verb.

    Trà được trồng ở Nhật Bản.
    Tea is grown in Japan.

    Anh ta bị chóng mặt.
    He is feeling dizzy.

    And in Indonesian:

    Transitive sentences can be transformed into passive sentences by:

    1. making the object of the active sentence become the subject of the passive sentence;
    2. replacing the prefix me- with di-
    3. making the subject of the active sentence become the agent…

    The prefix ter- is also used to express the passive voice but the prefix ter- implies that the action is accidentally done.

    As the above link notes, it’s entirely possible in Indonesian to have parallel passive constructions, one of which implies just passive voice, the other what I like to call the “passive unfortunate.”*

    Rumahnya dibakar tadi malam [= the house was burned down last night]
    Rumahnya terbakar tadi malam [= the house was unfortunately/accidentally burned down last night]

    For Mandarin to be a real parallel, we would need there to be a construction of the passive voice that does not imply adversativeness or unfortunateness. Does such a construction exist? This online resource provides examples of passive voice constructions that do not use bèi, but none is presented in the same way as the clear được/bị** or di-/ter- distinctions. If no such contrast exists, then the adversativeness of bèi is a pragmatic feature of the passive voice in Mandarin rather than a grammatical one, as it is for Indonesian ter- and Vietnamese bị.

    NOTES

    * Many an Indonesian poem and song lyric features the phrase terjatuh cinta [= fell in love involuntarily]. Indonesian also has an oddly rich grammar for expressing unfortunate things: terjatuh cinta, kejatuhan cinta, kena jatuh cinta.
    ** I wonder if it’s a coincidence that the Chinese adversative passive particle bèi is so similar to the Vietnamese adversative particle bị.