I am writing a book about Malaysia’s ethnic order*, using the Malay world as a way to think more generally about social categories like ethnicity and how to conceptualize them. This book project combines all of my interests: maritime Southeast Asia, political economy, colonial history (Portuguese, Dutch, English, and Japanese), language, analytic philosophy, social ontology, diaspora studies, criticizing things for not really making sense, trying to subvert paradigms, making custom maps, and statistics. The last of these is the subject of this post.
One of the tasks of my book is to demonstrate that my understanding of Malaysia’s ethnic order is not simply a projection of my own biases onto Malaysian history, society, and culture. To do this, I’ve been using survey data collected from Malaysians (and Indonesians… although that’s not really relevant for today’s post) to try to characterize the country’s ethnic order. An ethnic order, for me, is
the set of beliefs, practices, and ancillary institutions that organize and sustain ethnic categories within a given political community
Such an abstract concept is hard to characterize, and hard to survey people about without supplying exactly the answers that I hope that they will provide (thus invalidating the exercise). So what I have done is to survey Malaysians and ask them questions about what comes to mind when they think of various ethnic groups in the country. They provided open-ended responses, which I analyzed through a Structural Topic Model to estimate the relative salience of different types of responses for different types of ethnic groups.
The result is an analysis that says that, for example, when responses are generally focusing on things like language and religion, they are more likely to be talking about Malays versus Chinese. When they’re talking about economic matters, they are more likely to be talking about Chinese than Indians. And so forth. When the book is done, it will explain why such results are consistent with my particular approach to understanding Malayness in Malaysia.**
But with ChatGPT-4 and other large language models (LLMs) completely blowing up everything, it occurred to me this morning that one might wonder what the computer says about my data. I was inspired to do this by reading this paper from Petter Törnberg, which used ChatGPT-4 to predict what the partisan affiliated with the authors of tweets. I’m reporting here what I found from some initial playing around, and it is remarkable, scary, and open to a lot of interpretation (more on this at the end).
Here is what I did: I logged into ChatGPT-4 and opened up my file containing the open-ended survey responses. I then typed the following at the prompt:
I asked a Malaysian survey respondent the following question:
Now I am going to ask you about several different ethnic groups. I would like to you to tell me two (2) things that come to mind when you think about these groups. There is no right or wrong answer; you may think of particular words or phrases, or perhaps nothing at all.
I asked this about ethnic Malays, ethnic Chinese, and ethnic Indians in Malaysia. Imagine that the respondent gave the response “kurang rajin Islam.” Which of the three ethnic groups above do you think this response would be associated with?
The words kurang rajin Islam are the responses that one survey respondent actually gave. It is a combination of two responses, because I asked the respondent this question twice (one response was kurang rajin and the other was Islam). After hitting Enter, the ChatGPT-4 spit out the following:
Based on the response “kurang rajin Islam,” it is likely associated with the ethnic Malay group in Malaysia.
“Kurang rajin” translates to “less diligent” in English, and “Islam” refers to the religion of Islam. The phrase suggests a perception or association related to the Malay community and their commitment or dedication to their religious practices. It’s important to note that this response reflects the perspective of the individual respondent and should not be taken as a generalization for the entire Malay ethnic group.
This is actually bananas. The computer correctly identified which group that response was associated with. And it kept doing it. I asked
Imagine that the respondent gave the response “pemegang ekonomi.” Which of the three ethnic groups above do you think this response would be associated with?
And the computer said
The response “pemegang ekonomi” translates to “economic holders” or “economic owners” in English. Given this response, it is likely associated with the ethnic Chinese group in Malaysia.
The phrase “pemegang ekonomi” suggests a perception or association related to the economic prowess or prominence of the ethnic Chinese community in Malaysia. It reflects the recognition of their significant contributions to the economic sector in the country. However, it is important to note that this response represents the viewpoint of the individual respondent and should not be taken as a generalization for the entire ethnic Chinese group.
It gets worse, but I’ll spare you. And although it doesn’t always work, these initial forays show that LLMs are remarkably accurate at predicting what group a short survey response was referring to in this specific example. You can think of a lot of other ways that one might study social categories using these AI tools.
I don’t know exactly what to do with these results, but I do know that they’re getting at something significant. For one, the AI developers are mostly trying to rein in LLMs to prevent them from saying racist, sexist, and otherwise biased things. But here is a case where we have the reverse: actual statements (and rather explicitly biased in both cases) by real humans that can be classified according to the best guess of who is being referred to. How can this be stopped?
But I think that there’s another level at which to think about this result. That is, knowing that ChatGPT-4 is not thinking, but rather just doing a super-duper prediction model of what words are most likely to be sensible responses to other words, what could one learn from this? Is this evidence of something, and if so, of what? What people think? What people have written down? What Malaysians have written down? What researchers or journalists have written down?
There are even more levels. What about the coarseness of my own prompt (I wrote Malaysia, when I really mean peninsular Malaysia)? Should I give feedback on these results, thus contributing to reinforcement learning? If I write up these results, would I then be further contributing to stereotypes by producing a text that will feed into the same LLMs to produce even more such results? What if my own ethical goals are not to reinforce, but rather to undermine the present social order?
I don’t have any great answer to these questions. I am a pessimist about what LLMs will do to human society who nevertheless marvels at what they can do sometimes.
NOTES
* You can think of an ethnic order as like a “racial order.”
** But not Indonesia. That’s important. And also, not really even Sabah/Sarawak, this is about peninsular Malaysia. That’s also important.