I’m doing a Zoom talk on Friday afternoon at FLACSO Mexico City this coming Friday. The title of the talk is “Single Country Research in Comparative Politics: The Promises and Pitfalls of Aligning Methodology with Ontology.” There’s no paper yet, but I’m taking this as an opportunity to try out an argument about what has happened to comparative politics in the last 20 years. Here is the argument.
In “Aligning Ontology and Methodology in Comparative Politics“, Peter Hall argued that there was a growing disconnect between the regression-based model of large-n comparison in comparative politics, and the growing body of research on history, sequence, and path-dependence in many of the sophisticated comparative politics works of the late-1990s and early-2000s. He argued that they should be aligned:
I will argue for the usefulness of a method, based on small-N comparison, that has long been available to the field but underappreciated because small-N comparison has too often been seen as a terrain for the application of “weak” versions of the statistical method rather than as one on which a robust but different kind of method can be practiced.
But there is an unspoken assumption that underlies Hall’s argument. Like most everyone reading this piece, Hall assumed that a proper resolution of this tension would necessarily mean aligning method with theory.
I will propose on Friday that in point of fact, the opposite has happened. Twenty years on, comparative politics has aligned theory with method, and so the dominant ontology of comparative politics is one that is amenable to quantitative analysis in the causal inference template. This is an important thing to notice. And it is worth thinking about whether this is a good thing or a bad thing.
There will be plenty of graphs to make the descriptive point, but the bigger point is a conceptual one, so I’ll be walking through how I think about the argument and its stakes in more of a lecture format. The argument draws the arguments I’ve made here (PDF) and here (PDF) and on the thinking I’ve done with various collaborators about the past and future of comparative politics and area studies.
This post is a “data-dump” of additional exploratory analysis. The tl;dr version is that urbanization is another strong predictor of which coalition prevailed in peninsular constituencies, but accounting for urbanization mostly does not wipe out the predictive capacity of ethnic structure.
I want to focus on four variables in particular.
Constituency size (area in km2)
Constituency population density (log of population/km2).
Median income
Unemployment rate
Taking these into account will give us a sense of how much the relationship between ethnicity and election outcomes is plausibly due to things that happened to be correlated with ethnicity: rural and lower-income constituencies in peninsular Malaysia area also those which tend to be majority bumiputera. This is a major inferential challenge that I have written about on this blog and in this piece in the Journal of East Asian Studies (pdf).
What you see below are sixteen graphs. Let me explain. For each variable (see the four subtitles), we have the correlation among peninsular constituencies between that variable and each coalition’s vote share. We also have the predicted probability of victory, derived from a simple multinomial logistic regression where that variable is the only predictor.
The summary message from this graph is pretty simple. There is indeed a strong correlation between constituency size and constituency population density—each a reasonable measure of urbanization—and coalition vote shares. PN did better in rural constituencies with low population densities, and PH did better in urban constituencies with high population densities. Looking at income and unemployment, though, we find no relationship whatsoever.
Because these analyses are only looking at bivariate relationships between one predictor and each election outcome, though, they don’t allow us to compare the strength of various predictors. For that, I estimate a series of OLS regressions of the following form:
We are predicting each coalition’s vote share as a function of ethnicity, the four variables above, and state fixed effects .* Standard errors are clustered at the state level. Here is what I find.
BN Vote Share
PN Vote Share
PH Vote Share
Bumiputera Population Share
0.27***
0.59***
-.86***
(0.04)
(0.03)
(0.03)
Ln(Area)
-2.38
1.00
1.64
(1.23)
(1.18)
(1.74)
Ln(Population/km2)
-5.09*
2.11
3.12*
(1.17)
(1.20)
(0.61)
Unemployment Rate
1.06
-1.44
0.01
(0.72)
(0.94)
(0.50)
Median Income
-0.00
-0.00
0.00*
(0.00)
(0.00)
(0.00)
N
164
164
164
* p < .01, ** p < .01, *** p < .001
The results are very clear: even accounting for urbanization and economic factors, ethnic structure is a very strong and consistent predictor of each coalition’s vote share.
To see how these results compare, I’ve estimated a multinomial logistic regression with all the predictors listed above (and the state fixed effects), and plotted the predicted probability of each coalition winning across the deciles of each predictor.
Ethnic structure remains a very strong predictor of which coalition prevails in peninsular constituencies. But we also wee that PH does well in densely populated places. And interestingly, once we account for ethnic structure and population density, PH also seems to do better in larger districts. Score one for multiple regression.
As a final exercise, let us consider the possibility that the relationship between ethnic structure and electoral outcomes depends on how urban or rural a district is. We can test this possibility by dividing constituencies into quartiles by how densely populated they are, and allowing the relationship between ethnicity and the probability of winning to vary across the quartiles. To read the results below, note that the x-axis tells you the relationship between ethnic structure and vote share for each coalition: points to the right of the line at 0 mean that higher bumiputera population share is associated with a greater chance of victory for that coalition, the reverse for points to the left of 0.
In rural and semi-rural districts, PN’s likelihood of victory is positively related to bumiputera population share, and reverse for PH’s likelihood of victory: PH is less likely to win the higher the bumiputera population share. This is the same as what I found above. But if you look at urban districts, you find no relationship between ethnic structure and the probability of victory for any party: confidence intervals are very large and we cannot reject the null hypothesis of no relationship.
Summing up, here is what we have found:
Urbanization is a very good predictor of vote share and the probability of victory for each of the three coalitions in peninsular elections
Even accounting for urbanization, ethnic structure is still a remarkably strong predictor of election outcomes in the peninsula.
When we allow the relationship between ethnic structure and election outcomes to vary according to how urban that constituency is, we find that that relationship disappears in the most urban peninsular constituencies.
Now we wait to see who forms the government.
NOTES
* State fixed effects are important for accounting for historical patterns of party politics (Kelantan is a PAS stronghold, Johor is an UMNO stronghold, etc.).