Is Quantitative Description without Causal Reasoning Possible?

This week saw the launch of an exciting new journal entitled the Journal of Quantitative Description: Digital Media. Although the bit after the colon delimits the topical scope of this particular journal, it is the bit before the colon that is most exciting and which has elicited wide commentary. JQD:DM promises to publish

quantitative descriptive social science. It does not publish research that makes causal claims.

This is a big statement, because many if not all mainstream social science journals are increasingly consumed by a focus on causal inference using quantitative methods. To be fair, this has probably been true for a long time now. But the revolution in statistical methods for causal inference in the past forty years has given quantitative social scientists a very sophisticated toolkit for understanding the relationship between statistical procedures and causal claims, such that progress in the latter is now catching up with progress in the former.*

I do not think that anyone seriously holds the position that only causal inference is important. Description has always been essential to the scientific and social scientific enterprise: what is the population of Israel? what is the behavior of the cardinal eating from my bird feeder? and so forth. Yet the task of quantitative description raises an interesting question about the role of causal reasoning in making theoretically relevant descriptive statements.

I will make two assumptions as a starting point:

  1. quantitative description is always theoretical
  2. theoretically interesting tasks of quantitative description involve relating one variable to another variable.**

These assumptions are not assumptions about quantitative methods themselves—one could always simply produce descriptive statistical correlations between, say, refrigerators per capita and infant mortality across Indonesian provinces—but rather about the types of quantitative descriptions that are held to advance social scientific knowledge. Assumption 1 tells us that we rely on theory to tell us what is potentially informative about a quantitative description, and Assumption 2 tells us that we should focus on what problems arise when we describe relations among variables.***

Under these maintained assumptions, I think that it follows that all quantitative description is done either in the shadow of causal reasoning, or with implicit restrictions on the system of causal relations that the quantitative description partially captures.

Let’s start with a classic example of what seems to be a good quantitative description: creating an index that measures a latent psychological construct. Bill Liddle, Saiful Mujani, and I did this for my 2018 book Piety and Public Opinion, creating what I called a “piety index” designed to capture individual piety across a sample of Indonesian survey respondents. We made this index from multiple variables, and used theory to restrict “what went into” this index, so Assumptions 1 and 2 hold. Isn’t this just descriptive? It is: but note that the grandfather of latent trait analysis, Spearman (1904), proceeded from a model in which the latent construct caused the observable indicators associated with it. This causal claim feels rather innocuous, but it is causal; and any attempt to relate an index of the form that I created to any other outcomes or correlates must confront some sort of causal model to be interpretable.
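To see that logic in miniature, here is a hedged sketch in Python with simulated data (the items, loadings, and sample size are all invented; this is not the actual piety index): a latent trait causally generates several observed indicators, and a simple standardized-average index built from those indicators tracks the trait closely.

```python
# Hypothetical sketch of the Spearman-style logic: an unobserved latent
# construct causally generates several observed indicators, and a simple
# index built from those indicators tracks the construct.
# Simulated data only; not the actual piety index.
import numpy as np

rng = np.random.default_rng(0)
n = 2000

latent = rng.normal(size=n)                    # the unobserved construct

# Each observed item = loading * latent construct + measurement noise.
loadings = np.array([0.8, 0.7, 0.6, 0.5])      # invented loadings
items = latent[:, None] * loadings + rng.normal(scale=0.6, size=(n, 4))

# A simple "descriptive" index: the average of standardized items.
z = (items - items.mean(axis=0)) / items.std(axis=0)
index = z.mean(axis=1)

# The index recovers the latent cause well (correlation around 0.9 here),
# but its interpretability rests on the causal measurement model above.
print(np.corrcoef(index, latent)[0, 1])
```

The index itself is pure description; the causal measurement model is what licenses us to call it a measure of anything.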

Turn to another example: the cross-national relationship between private gun ownership and state terror (a topic I first addressed thousands of mass shootings ago). There, I produced descriptive correlations between, well, state terror and private gun ownership, but deliberately asserted that

Of course, these are not estimates of the causal effect of gun ownership (or anything else) on state terror. These are conditional correlations, and there are plenty of reasons why we might believe that the causal relations here are more complicated than what this discussion has implied. 

The point I was trying to raise is that we learn things from these correlations even when we are sure that they are not causal. This, I think, is related to the model that JQD:DM seeks to follow.

But this is not a causation-free analysis! It is interesting only insofar as we can relate it to a causal question. We reason through the potential set of causal relations that could have produced that correlation to make sense of what it likely means. A long quote from a follow-up post makes the point (funny enough, it anticipated JQD:DM):

If I were writing an article for a good social science journal, I’d probably stop right here and abandon the project. Thankfully, we have eliminated some of the numerology from quantitative social science in the past two decades, meaning that we cannot wave our magic interpretive wand over a regression table to reach our preferred conclusion. If you want to claim to have identified “the effect of” gun ownership on freedom from state terror, partial correlations will no longer suffice.

But we still learn policy-relevant things from these results even if they do not identify a causal relationship. The first point is to remember that the question of interest is not the average causal effect of gun ownership on state terror (which, for better or for worse, has become the question of interest for quantitative social science research). Instead, our policy question is more squishy: does such widespread gun ownership protect American citizens from tyranny? Here is what we have learned even without an estimate of a causal effect.

1. American citizens aren’t as protected from state terror as we might think.

2. Plenty of countries rate as highly as (or more highly than) the U.S. with lower levels of gun ownership.

3. Plenty of countries with lower levels of gun ownership experience far more state terror.

4. The partial correlation between gun ownership and state terror disappears when you take regime type and economic development into account.

All of these data are hard to square with the idea that the ubiquity of firearms in the U.S. is protecting Americans from state terror. We can construct a theoretical world in which gun ownership at the levels that we see in the United States today is protecting us from tyranny, but that theoretical world must have a lot of curious features to it to also produce the results from yesterday. 

Understanding what the conditional correlation could possibly have meant required imagining some sort of causal system from which the quantitative description—the correlation ρ_{Y,X} is statistically significant, but the correlation ρ_{Y,X|W} is not—emerged.
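To make that reasoning concrete, here is a small simulated sketch in Python (the data-generating process, coefficients, and variable names are all invented, not the actual cross-national data): a common cause W drives both X and Y, so the raw correlation is sizable while the partial correlation after conditioning on W is essentially zero.

```python
# Toy simulation (invented data, not the actual cross-national analysis):
# a confounder W drives both X and Y, so rho_{Y,X} is sizable while the
# conditional correlation rho_{Y,X|W} is approximately zero.
import numpy as np

rng = np.random.default_rng(1)
n = 5000

W = rng.normal(size=n)                 # stand-in for development / regime type
X = 0.9 * W + rng.normal(size=n)       # stand-in for gun ownership
Y = 0.9 * W + rng.normal(size=n)       # stand-in for freedom from state terror

def partial_corr(y, x, w):
    """Correlate y and x after residualizing both on w (with an intercept)."""
    Z = np.column_stack([np.ones_like(w), w])
    resid_y = y - Z @ np.linalg.lstsq(Z, y, rcond=None)[0]
    resid_x = x - Z @ np.linalg.lstsq(Z, x, rcond=None)[0]
    return np.corrcoef(resid_y, resid_x)[0, 1]

print(np.corrcoef(Y, X)[0, 1])   # sizable raw correlation (around 0.45 here)
print(partial_corr(Y, X, W))     # roughly zero once W is conditioned on
```

The data alone cannot tell us that this is the right causal structure; the point is only that a confounding story of this shape would produce exactly the descriptive pattern in question, which is what makes the description interpretable.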

There are other examples that I might provide, but I hypothesize that any quantitative description that is held to advance social scientific knowledge in the ways that the journal hopes will be either multivariate measurements of things or tantalizing correlations among things.

Is this bad or wrong? Does it undermine the purpose of JQD:DM? In both cases the answer is no. I reach a different conclusion: that JQD:DM and any journal like it will always confront lurking criticisms that causal reasoning is somehow being smuggled into the quantitative descriptions that they publish. This is a fine problem to have, but I suspect that even a journal explicitly devoted to quantitative description will struggle to police the boundary between descriptive and causal inference.

By way of conclusion, here is a speculative future for journals like JQD:DM. In many if not most cases, there is a lot to be learned from statistical correlations that cannot be given a strong causal interpretation. The standard in most quantitative social science is to target a causal parameter like an average treatment effect or a dose-response function****. The enterprise “fails” if the design does not allow for that target parameter to be identified, and as my example of the paper that I would abandon hints, researchers often will not even try if they know that it is unidentifiable.

Another approach would be to identify a target parameter, a quantitative descriptive fact that is partially informative about that parameter, and a mapping from the descriptive fact to the parameter using assumptions and logical bounds. JQD:DM and journals like it might foreground this sort of approach as a way of highlighting what we learn from quantitative descriptive exercises. A loosely related approach such as that outlined in Little and Pepinsky (2021) might fit nicely under this model as well.
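To illustrate one familiar flavor of that approach, here is a toy Python sketch with invented numbers (a Manski-style worst-case bound on a population mean with missing outcomes, not drawn from any particular paper): two descriptive facts plus one logical assumption map into bounds on the target parameter.

```python
# Toy sketch with invented numbers: worst-case bounds on a population mean
# when some outcomes are missing. The descriptive facts are the response
# rate and the mean among respondents; the only assumption is that the
# outcome is bounded between 0 and 1.

p_observed = 0.70       # share of units with an observed outcome (descriptive fact)
mean_observed = 0.60    # mean outcome among those units (descriptive fact)

# The unobserved mean could be anywhere in [0, 1], so the population mean
# (the target parameter) is only partially identified:
lower = p_observed * mean_observed + (1 - p_observed) * 0.0
upper = p_observed * mean_observed + (1 - p_observed) * 1.0

print(f"population mean lies in [{lower:.2f}, {upper:.2f}]")   # [0.42, 0.72]
```

The bounds are wide, but they are honest: they report exactly what the descriptive facts, plus a minimal assumption, can and cannot tell us about the parameter.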

NOTES

* To make explicit what I mean in this sentence: we have long had sophisticated statistical tools, but without the theory of causality required to attribute causal meaning to them.

** Examples of univariate quantitative description would be finding answers to the questions of “how many balls are in that urn?” or “what is the GDP of Venezuela?”

*** Importantly, observe that time is a variable. Describing how a single variable differs across time is a task that relates multiple variables to one another.

**** Or, analogously, a sufficient statistic or an identification region.

On the Historiography of Srivijaya

This post is written especially for current students in GOVT 3443/ASIAN 3334, Southeast Asian Politics.

Earlier this semester we briefly discussed the great kingdoms of pre-colonial Southeast Asia, from Dai Viet to the Khmer Empire to Pagan to Majapahit. One kingdom that we mentioned briefly was the Srivijayan Empire, a maritime state whose territory spanned Sumatra and the Malay peninsula. Here is the map that I showed you.

Later we mentioned Srivijaya one more time, in addressing the rise of the Malacca sultanate.

A theme in this part of our lectures was (1) the difficulty of describing the polities of pre-colonial Southeast Asia, owing to the incomplete and fragmentary evidence available to us today, (2) the role of colonial powers and colonial-era scholarship in producing the knowledge that we do have, and (3) the attendant result that it appears that we can only start talking about “politics” in depth in Southeast Asia when we start to get European records.

We tried our best to react against this, noting that in addition to the local evidence that has survived in the form of monuments, temples, constructions, inscriptions, and the like, we do have other records left by Chinese and later Arab and European traders. We discussed the concept of the mandala as the dominant pre-colonial political form, of Zomia as outside of the lowland mandala polities, and the distinction between hulu and hilir. Still, we didn’t have much by way of concrete discussion of any of these empires; for that, you can look to other great classes here for more on these empires’ history, architecture, art, and religion.

However, because I am not a historian of pre-colonial Southeast Asia (and neither are any of you! at least not yet), I presented these pre-colonial empires as basically facts. So, too, did your SarDesai reading. It was not up for debate whether or not the Khmer Empire was a great empire–it was–or whether Ayutthaya was the central political actor in the Chao Phraya valley. It would be unthinkable for me to even debate that.

As it turns out, the same is not true of the Srivijayan Empire.

This blog post by Liam Kelley, a historian of Vietnam, introduces a striking argument that has sparked a serious debate about the status of the Srivijayan Empire. The author’s claim is that the sources that scholars have used to describe Srivijaya as a great empire are talking about something else–in the author’s view, Angkor. The argument is not that there was no such thing as Srivijaya, as there are inscriptions that use the word Srivijaya that have been found in southeastern Sumatra. Rather, the point is that this Srivijaya is not the same thing as the polities described in the important pre-colonial sources that have served as our main evidence for the Srivijayan Empire.

He makes this argument by analyzing the primarily Chinese accounts that serve as the evidentiary basis for describing Srivijaya as an important pre-colonial polity, such as accounts of Chinese traders spending months in Srivijaya learning Sanskrit before later traveling to what is today India. We do not have local evidence of this; we have only the accounts of others, as well as the names of the places that they used, written in the Chinese of the time. Kelley argues that those words (such as Shi-Li-Fo-Shi) do not describe Srivijaya. You can read his posts for all the gory details.*

Remember the important distinction between Srivijaya and the empires of the mainland. Whereas many of those formed around great riverine systems (Mekong, Irrawaddy, Red River, etc.) that allowed the empires to amass large population bases through the intensive cultivation of rice, Srivijaya was a maritime-facing polity. It has been described recently as a thalassocracy: an empire with a maritime focus. Majapahit on Java was a thalassocracy. But Majapahit also left reams of evidence in Java itself of its own existence, and of its own greatness. The same is not true of Srivijaya.

I must insist that I am not qualified to evaluate the argument that Kelley presents. I must also insist that even if it were true that the sources used to describe Srivijaya are actually talking about some other place, this does not logically entail that there was no such thing as Srivijaya: this word appears in inscriptions found in Sumatra, so it describes something.** But there are some important points to take away from this emergent debate, even if it turns out that Kelley is entirely wrong.

First, the evidentiary basis for what we know about Srivijaya is very incomplete. Kelley is not the first to note that the analyses of Srivijaya rest to a large degree on the accounts from others traveling through the region. There is precious little evidence of Srivijaya that comes from the territories where it was located. To say anything about the politics of pre-colonial Southeast Asia in this case requires us to work very hard to assemble an evidentiary base.

Second, the effort to discover and analyze Srivijaya is intimately tied with colonial-era scholarship. I did not fully appreciate, for example, that the first concrete proposal of the existence of a Srivijayan Empire came from George Cœdès–a French archeologist–in 1918. That is not that long ago! He has been described as having “discovered” Srivijaya. A lot of scholarship about Srivijaya rests on his interpretations of words in Old Malay found in contemporary Thailand, and those interpretations are much more contested than I realized.

Third, these facts interact in what might be uncomfortable ways when it comes to post-colonial scholarship and our understanding of the pre-colonial polities of Southeast Asia. The concept of the Srivijayan Empire is important to the concept of Indonesia itself, much like Majapahit is, as a pre-colonial antecedent to the post-colonial state. Much of the post-colonial scholarship on Southeast Asia sought to uncover what John Smail called in 1961 an “autonomous history of Southeast Asia.” That is a history of Southeast Asia that sees the region in its own local terms, rather than merely as a reflection of Indic, Chinese, Arab, and European influences*** as they spread culturally, economically, religiously, and politically throughout the region.

I value this search for an autonomous history of Southeast Asia as well. And yet I am forced to think critically about the possibility of writing that history when the accounts that we use to do so were not produced by Southeast Asians in Southeast Asia.

NOTES

* Also the graphics are great. I wish that these existed as a series of TikToks too.

** But careful. Kelley suggests that Srivijaya describes a person, not a polity. And indeed, to anyone familiar with the Sanskrit influence on naming conventions in Southeast Asia, when you stop to think about it, “Sri Vijaya” sounds like a royal title.

*** Here we grapple with the question of Orientalism in the study of Southeast Asia, because the region itself is subject to the external gaze of others in the East, not just Europeans.