Category: Research

  • OMFG Exogenous Variation! Or, Can You Find Good Nails When You Find an Indonesian Politics Hammer?

    I am teaching the Government department’s Comparative Methods course in the upcoming semester, and that has gotten me thinking quite a bit about the newest trends (or, maybe, fads) in empirical political science. One stands out: experimental or quasi-experimental research designs that promise clean identification of the causal relationships that matter for politics. This post is about the pernicious things that can happen when we find a quasi-experiment, and their consequences for contemporary Indonesian political studies.

    For non-political scientists, let me explain some terms. (Political scientists, I will be playing fast and loose with terminology here, so hold back on your criticisms.)

    • Experiment: a treatment (T) is randomized across the subjects of a study to assess its effects on some outcome (Y). For example, during campaign season we might randomly assign some people to get face-to-face contact, others to get direct mailings, others to get phone calls, and others to get no contact at all, to see if contact increases turnout. T0 = no contact, T1 = contact (face-to-face, mail, or phone call), Y0 = not vote, Y1 = vote. Does being exposed to T1 increase the probability of Y1?
    • Quasi-experimental: “like an experiment,” meaning that for some reason we can believe that a “treatment” has been assigned to subjects/units “as if” it were random. For example, what if a colonial border accidentally cuts some ethnic groups into two parts, leaving them in different countries, which have different political contexts? The treatment here is national political context, and the outcome is ethnic politics. This is important because it’s hard to actually randomize the most important things. You can’t randomize national political context, so you need some clever way to find the ways that it varies as if it were random.
    • Identification: we know that an independent variable X causes an outcome Y, not the reverse (Y causes X) or something degenerate (W causes X and Y, or W causes Y and just happens to be correlated with X too). If you’ve ever heard that correlation does not imply causation, then identification refers to the statement that in a specific instance, yes, correlation does imply causation because we can rule out all reasons why it might not. This is best with a treatment (T) from an experiment rather than your garden variety independent variable (X).
    • Clean identification: shorthand for “we are pretty sure that we have identified the effects of X on Y.”
    • Exogenous variation: the mechanism that gives you the quasi-experiment. Exogenous is the key part: it means that the assignment of treatment versus control is known to be external to the processes that generate the outcomes that you want to study. Sometimes we can find one or more other variables (Z) that govern assignment to the treatment status (Z causes X but not Y, conditional on the other W that cause Y), and we call these instrumental variables. Sometimes we just know that X varies for reasons other than being caused by Y or W.
    • Cause: hard to explain the precise definition of what it means to cause something in this framework, but let’s follow Mahoney and say that under this framework, a cause is “a value on a variable that makes an outcome more likely; a cause increases the probability that an outcome will take place.” When we make a statement like, say, economic development leads to democracy, what we mean is that higher values on the economic development variable correspond to higher probabilities that a country is a democracy (NB: this is still pretty imprecise, but the point should be clear).
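    To make the definitions above concrete, here is a minimal simulation sketch. Everything in it is invented for illustration: the true effect of X on Y is set to 0.5, W is an unobserved confounder, and Z is an instrument that moves X but touches Y only through X. The function names `cov` and `simulate` are hypothetical helpers, not anything from a real study.

```python
import random


def cov(a, b):
    """Sample covariance of two equal-length lists."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    return sum((x - ma) * (y - mb) for x, y in zip(a, b)) / (len(a) - 1)


def simulate(n=20000, true_effect=0.5, seed=0):
    """Return (naive slope, instrumental-variable estimate) on simulated data."""
    rng = random.Random(seed)
    w = [rng.gauss(0, 1) for _ in range(n)]  # unobserved confounder W
    z = [rng.gauss(0, 1) for _ in range(n)]  # instrument Z, independent of W
    # X is driven by both the instrument and the confounder.
    x = [zi + wi + rng.gauss(0, 1) for zi, wi in zip(z, w)]
    # Y depends on X (the effect we want) and on W (the problem).
    y = [true_effect * xi + 2.0 * wi + rng.gauss(0, 1) for xi, wi in zip(x, w)]
    naive = cov(x, y) / cov(x, x)  # regression slope of Y on X, ignoring W
    iv = cov(z, y) / cov(z, x)     # uses only the variation in X that comes from Z
    return naive, iv


naive, iv = simulate()
print(f"true effect 0.5, naive slope {naive:.2f}, IV estimate {iv:.2f}")
```

    In this setup the naive slope lands well above the true effect because W pushes X and Y in the same direction, while the instrument-based estimate stays close to 0.5. The catch is the one the rest of this post worries about: the simulation knows that Z is exogenous because I wrote it that way.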

    Today, the mainstream core of empirical political science research in the United States is careful about estimating causal relationships. Many of the tricks of the trade involve finding research designs that give us confidence that statistical correlations or structured comparisons are identifying causal relations. The class I’m teaching will devote more than half of the semester to considering various strategies for finding research designs that can do this. If an experiment gives us the best ways to identify causal effects, then a quasi-experiment is second-best.

    It all depends, though, on the ability to justify the exogeneity of your exogenous variation. The problem is, it is actually really, really tough to do this. The universe does not provide a lot of natural experiments for us to work with. One common instrument is rainfall (yes, really). In the social world, lots of things are interrelated–they jointly cause each other, they are jointly caused by something else, they are correlated but not causally related–and it’s therefore difficult to find associations that “only go one way.” Moreover, when we find what we think is exogenous variation, the assumption of exogeneity cannot be properly tested, even using super-duper statistics.

    The implication of all this is that when we do find an independent variable that is clearly exogenous to the outcome we want to study, it’s very exciting. OMFG exogenous variation! But the fact that variation is exogenous has nothing to do with whether that variation is useful or interesting for the study of the outcomes that political scientists care about. Even less so for those of us who have a reason to care about a particular country.

    It turns out that recently, economists and political scientists have identified a source of exogenous variation in Indonesian politics: whether or not your district-level executive (bupati or walikota) was directly elected, indirectly elected, or appointed. Why is this exogenous? Here’s why: Under the New Order, these leaders were all (essentially) appointed–the details are more interesting but that’s basically the story. I believe their terms were five years. From time to time, one would die, or one would get removed or replaced, and the term clock would start over. This meant that by 1999, various bupati and walikota were being appointed at various different times. After democracy arrived in 1999, district heads were permitted to serve out their terms. The new local election laws had the new district heads being indirectly elected when their predecessors’ terms ended. Then in 2005, the law changed again, and now district heads were directly elected. That meant that in, say, 2006, you have some district heads who were indirectly elected, and some who were directly elected. This variation is exogenous to most things that political scientists think could be the consequences of direct versus indirect elections.
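    The staggering described above can be sketched in a few lines of Python. This is a stylized illustration, not the actual Indonesian electoral calendar: it assumes five-year terms served in full, no mid-term replacements after 1999, and direct election for any term that starts in 2005 or later. The function name `selection_mode` and the constants are hypothetical.

```python
TERM_YEARS = 5        # assumed term length
INDIRECT_FROM = 1999  # terms starting from this year are filled by indirect election
DIRECT_FROM = 2005    # terms starting from this year are filled by direct election


def selection_mode(last_appointment_year, year):
    """How the district head in office during `year` got the job.

    Assumes every term runs its full length and that the successor takes
    office in the same year the previous term ends.
    """
    start, mode = last_appointment_year, "appointed"
    while start + TERM_YEARS <= year:
        start += TERM_YEARS
        if start >= DIRECT_FROM:
            mode = "direct"
        elif start >= INDIRECT_FROM:
            mode = "indirect"
        else:
            mode = "appointed"
    return mode


# Districts differ only in when their last New Order appointment happened.
for appointed in range(1994, 1999):
    print(appointed, selection_mode(appointed, 2006), selection_mode(appointed, 2010))
```

    Under these assumptions, districts whose last New Order appointment fell in 1995 or 1996 have a directly elected head in 2006, the others still have an indirectly elected one, and by 2010 every district has a directly elected head. The only thing separating the two groups is an appointment date set years before anyone knew the rules would change.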

    I know personally of at least five separate papers by five separate scholars that focus on estimating the effect of the direct election of your bupati/walikota on some sort of outcome. Full disclosure: even I have worked on this. I am a fan of a lot of this research: there are some very smart scholars doing some very clever work on Indonesia’s political economy. But what strikes me is how narrow their research questions are. For one, any study using exogenous variation in district head elections is temporally bounded (by 2010 all district heads were directly elected). But more generally, I just do not think it’s likely to explain a lot of interesting things about Indonesian politics. If your goal is to understand Indonesian politics, or to describe Indonesian politics in any sort of sophisticated manner, knowing that the direct election of your district head causes the probability of some outcome to change will just not do much for you. You will have identified a true, but possibly trivial, causal effect.

    The problem, in other words, is that the search for exogenous variation has superseded the study of causal relations that cannot be cleanly identified. One of my colleagues relates the story of his grad school colleagues who instead of learning languages or reading history, “sat around in [BUILDING NAME REDACTED] trying to think up instruments.” Again, I am more than guilty of doing this myself, but when this trend dies down, we may be left with a discipline of political scientists who are unable to say anything about politics. (As a personal aside, I should confess that this is my greatest fear as a researcher, that I am good at research design but just not sensitive enough to say important things about the countries that I care the most about.)

    Of course, there are good arguments in favor of a narrow focus on identification. Trivially true things have the nice feature of at least being true, and it’s not clear how to assess whether we learn anything from claims that are interesting but possibly false. One interpretation of “how science works”–and I do think that we should be in the business of doing science–is through the gradual accumulation of small findings that we are confident about, not through competing grand theories of everything that have shaky empirical support. Take a physics analogy. Not everyone gets to come up with the theory of relativity; in fact, the vast majority of physicists spend their lives making small and careful contributions. In that sense, our focus on more and more narrow sets of questions is the sign of a maturing discipline, not a narrow one.

    But the consequences for Indonesian political studies are striking. Let’s say that someone, like me for instance, got into the business of studying Indonesian politics because we care about Indonesia. If we as political scientists train our students to focus on identifiable questions, we will only be able to study identifiable causal relations. We will not study what most Indonesians care about. Angus Deaton put it best: “we have at least some control over the light but choose to let it fall where it may and then proclaim that whatever it illuminates is what we were looking for all along.” That is bad news for Indonesian political studies in the United States, and it should be cause for concern.

  • Voting for Philippine Independence

    The Philippines used to be an American colony. Its main exports to the mainland–which were not subject to tariffs because, well, the Philippine islands weren’t a different country–were sugar and copra. Philippine sugar (from sugarcane) was cheaper and of higher quality than domestically produced sugar, which comes primarily from sugar beets. Copra is refined into coconut oil, which competed with other vegetable oils, animal oils and fats, and fish oils. In the 1930s, this especially meant cottonseed oil, which was turned into soaps. It also meant butter, because of recent innovations that meant that coconut oil could be partially hydrogenated and turned into margarine.

    Does the desire for protection from tariff-free sugar and copra imports explain the decision to grant the Philippines independence? The point is this: get the Philippines outside of our borders, and we can impose tariffs. Let’s look at cotton, sugar beet, and milk production across the states. Let’s also throw in sugar cane and the percentage of a state’s population that is of Filipino ancestry, and then compare that to Senate votes for independence in the Tydings-McDuffie Act of 1934 (which granted the Philippines self-government, later to become independence). You could do this with House votes on that bill too, but absent data on milk, sugar cane, etc. by congressional district, the results will not be particularly helpful.

    I’ve transformed sugar beet/cane production, cotton production, and milk production to an approximately logarithmic scale. The maps show that sugar beet and cotton production, taken together, unite most of the West with the South, and this is where most of the votes in favor of independence come from.

    Let’s look at this more formally. The dependent variable is the number of a state’s Senate votes in favor of independence (0, 1, or 2). The independent variables are the (transformed) variables above, along with a measure of partisanship in each state’s Senate delegation. I estimate an ordered logistic regression, with the results below.

    DV: Senate Votes for Philippine Independence in 1934, by State

    Variable        Estimate        S.E.     t value
    cotton            0.3725      0.2049       1.818
    sugar beets       0.3652      0.2071       1.763
    sugar cane        6.4251   2.410e-07   2.666e+07
    milk             -0.1963      0.3147     -0.6237
    filipinos      1908.9278   6.972e-04   2.738e+06
    democrat          3.0975       1.231       2.516
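    As a quick arithmetic check on the table, each t value is just the estimate divided by its standard error. The sketch below recomputes them from the reported numbers and attaches a two-sided p-value using a standard normal approximation. That approximation is my assumption for illustration, not necessarily what the original software reported, and the helper names are hypothetical.

```python
import math

# (estimate, standard error) pairs copied from the table above.
rows = {
    "cotton": (0.3725, 0.2049),
    "sugar beets": (0.3652, 0.2071),
    "sugar cane": (6.4251, 2.410e-07),
    "milk": (-0.1963, 0.3147),
    "filipinos": (1908.9278, 6.972e-04),
    "democrat": (3.0975, 1.231),
}


def t_value(estimate, se):
    """Ratio of a coefficient to its standard error."""
    return estimate / se


def two_sided_p(t):
    """Two-sided p-value under a standard normal approximation (an assumption)."""
    return math.erfc(abs(t) / math.sqrt(2))


for name, (est, se) in rows.items():
    t = t_value(est, se)
    print(f"{name:12s} t = {t:10.4g}   p = {two_sided_p(t):.3g}")
```

    Under the normal approximation, cotton and sugar beets fall between the 5% and 10% thresholds, and the partisanship measure clears 5%.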

    These results support the idea that the cotton and sugar beet lobbies mattered; there is not much support for a role for the dairy lobby. The huge coefficients and enormous t values on “sugar cane” and “filipinos” should not be read as ordinary estimates: they reflect the fact that Louisiana (which produced the overwhelming majority of sugar cane) and California (which had the overwhelming majority of Filipinos) voted for independence.
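    For readers who have not met an ordered logit before, here is a minimal sketch of how a coefficient translates into probabilities of 0, 1, or 2 votes. It uses one common parameterization, P(y ≤ k) = logistic(c_k − η), which may not be the one the software behind the table used. The cutpoints are hypothetical, since the table does not report them, and the post does not say how `democrat` is scaled, so “one unit” is only illustrative.

```python
import math


def logistic(v):
    """Standard logistic function."""
    return 1.0 / (1.0 + math.exp(-v))


def ordered_logit_probs(eta, cutpoints):
    """Return [P(y=0), ..., P(y=K)] for linear predictor eta.

    Uses P(y <= k) = logistic(c_k - eta) with increasing cutpoints c_k.
    """
    cumulative = [logistic(c - eta) for c in cutpoints] + [1.0]
    probs, previous = [], 0.0
    for c in cumulative:
        probs.append(c - previous)
        previous = c
    return probs


CUTPOINTS = (0.0, 2.0)  # hypothetical: not reported in the table
B_DEMOCRAT = 3.0975     # coefficient on "democrat" from the table

base = ordered_logit_probs(0.0, CUTPOINTS)            # baseline linear predictor
shifted = ordered_logit_probs(B_DEMOCRAT, CUTPOINTS)  # one more unit of "democrat"
print("baseline:", [round(p, 3) for p in base])
print("shifted: ", [round(p, 3) for p in shifted])
```

    A positive coefficient moves probability mass away from zero votes and toward two votes, but how much depends on where the (unreported) cutpoints sit.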

    The question that this does not answer is why the U.S. did not grant independence to Puerto Rico or Hawaii at the same time. Both of these territories exported tremendous amounts of sugar to the U.S., so my argument would expect that there would be a demand to get them out of the U.S. too. There was an attempt to do so with Puerto Rico in the late 1930s (led by many of the same people), but it seems to have failed, and there was no vote on it. I’m unaware of any similar move for Hawaii.

    Any thoughts on this would be welcome. My hunch is that it has to do with the interaction of the structure of the Filipino sugar industry and the addition of copra as a main export from the Philippines. Puerto Rican and Hawaiian sugar plantations were owned mainly by Americans (contrast that to the majority indigenous Filipino sugar industry) and the other export products produced in PR and Hawaii (coffee and pineapples, respectively) did not compete with anything produced in the U.S. mainland. Input from the world’s leading authority on the expansion of the states westward in the 1800s would be most appreciated.