Month: August 2014

  • Text Mining Regime Type: What Could Go Wrong?

    I am 100% fascinated by this post by Jay Ulfelder on using text mining to code regime type. I love how it allows us to think concretely about fuzzy classifications, and to be explicit about uncertainty. If I had to guess, I’d say that this is the future of regime coding.

    I was also a bit dismayed to see that some of my favorite countries score lower on a likelihood-of-democracy score than I would like.

    Image credit: Jay Ulfelder, from http://dartthrowingchimp.wordpress.com/2014/08/25/mining-texts-to-generate-fuzzy-measures-of-political-regime-type-at-low-cost/

    What could explain why? Well, the one thing that we depend on in such exercises is an unbiased (or at least “randomly” biased) source of text. And it could be that there are features of countries that lead sources to discuss them differently. Let’s take an example: the killing of an unarmed black man by a police officer will almost certainly not affect the way that Freedom House discusses the United States. However, a police killing of an unarmed man in a country like Indonesia may affect the way that Freedom House discusses Indonesia. Same with violence against Sikhs in the U.S. versus violence against Christians, Ahmadis, and Shia in Indonesia. The list goes on.

    Now, this need not undermine the exercise. My guess is one could model the differences in the way that the underlying texts represent political conditions in the countries that they cover using a rich set of observables. If we hypothesize that the sources talk about rich countries differently, they talk about Muslim countries differently, and they talk about countries with histories of authoritarianism differently (so that the authoritarian histories have “slow decays”), then perhaps that could be used as input to the classification algorithm. I don’t have access to the Ulfelder, Schrodt, and Ward paper, but I wonder if they’ve explored such issues.

    One thing is for sure: I wish I could raise these questions myself at their presentation. Unfortunately, I am a discussant on a panel at the same time, at the always popular 8AM slot. So, I look forward to reading more.

  • Why Crowd-Sourced Election Monitoring Mattered in Indonesia

    Indonesia’s 2014 election finally came to a close last Friday, as Indonesia’s Constitutional Court rejected the Prabowo-Hatta team’s challenge of the July 9 elections (see Ray Yen at New Mandala for some pictures). This is not an unexpected ruling: we have known for some time that Jokowi-JK won the popular vote by a substantial margin.

    We know this, of course, because of crowd-sourced election monitoring by KawalPemilu.org and other outfits that digitized the tens of thousands of ballot forms that had been scanned by Indonesia’s Electoral Commission (KPU) and placed online. A lot of credit is due to KPU itself for taking the affirmative step to put all of these forms online in the first place. But given the slow pace of KPU in actually releasing the final tallies, KawalPemilu.org gets major kudos for releasing everything, and quickly.

    Let’s take a look at how KPU did. Since July 17, every other day or so I’ve been scraping the village/ward-level vote tallies from the KPU website using a nifty little Python script put together with Seth Soderborg. KPU has 81,000 villages/wards to cover, so it’s not surprising if not everything is ready at once. But a substantial proportion of these results remains missing even today, a month after KPU’s “official” decision on July 22! The slopegraph below illustrates the problem of missing vote returns, over time and by province. The vertical dashed lines are for July 9 (election day), July 22 (the date that the official result was released), and August 22 (the final verdict by the Constitutional Court).

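    [Figure: slopegraph of the percentage of villages/wards with missing vote returns, over time and by province]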
    You can see clearly that even today, KPU has yet to release substantial amounts of data on the final vote returns from provinces like Papua, North Maluku, and South Sumatra. And clearly, they were holding back even more data until the August 22 decision: the data points for August 22 are what I scraped during the North American daytime hours that day, meaning that they were uploaded to the KPU website only hours earlier, right after the Constitutional Court’s decision, Indonesia time.
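
    For readers who want to reproduce this kind of tabulation, here is a minimal sketch of how the share of missing villages/wards per province and scrape date could be computed. It is not the script used for the figure: the record layout (scrape date, province, village ID, tally or None), the function name, and the toy values are all assumptions made up for illustration.

```python
from collections import defaultdict


def missing_share_by_province(records):
    """Return {(scrape_date, province): share of villages/wards with no tally}.

    `records` is a hypothetical layout: an iterable of
    (scrape_date, province, village_id, tally) tuples, where `tally`
    is None when no result had been posted at scrape time.
    """
    totals = defaultdict(int)
    missing = defaultdict(int)
    for scrape_date, province, _village_id, tally in records:
        key = (scrape_date, province)
        totals[key] += 1
        if tally is None:
            missing[key] += 1
    return {key: missing[key] / totals[key] for key in totals}


# Toy records with made-up numbers, not actual KPU data.
sample = [
    ("2014-07-17", "Papua", "v1", None),
    ("2014-07-17", "Papua", "v2", (120, 80)),
    ("2014-08-22", "Papua", "v1", (95, 105)),
    ("2014-08-22", "Papua", "v2", (120, 80)),
]
shares = missing_share_by_province(sample)
for (scrape_date, province), share in sorted(shares.items()):
    print(scrape_date, province, f"{share:.0%}")
```

    Each (date, province) pair then becomes one point on a slopegraph line.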

    Again, KPU gets major credit for putting the scanned forms online. But it was KawalPemilu.org, and everyone who helped count the votes, that confirmed that Jokowi-JK won before anyone else could.

    UPDATE

    Kevin Fogg points out in a comment that the above figure might be misleading in its focus on the percentage of missing villages/wards. It’s also useful to see the total number by province. Here is what that looks like.
    [Figure: total number of villages/wards with missing vote returns, by province]
    What we really want to know, though, is the percentage of Indonesian voters whose votes have yet to be counted. Alas, we can’t count that, not without a separate estimate of the population of each village or ward.
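
    If such an estimate were available, the calculation itself would be simple: weight each village/ward by its number of voters rather than counting each one equally. A rough sketch follows, assuming a hypothetical list of (registered voters, has tally) pairs. The numbers are invented to show how the two measures can diverge.

```python
def uncounted_voter_share(villages):
    """Share of voters living in villages/wards with no posted tally.

    `villages` is a hypothetical iterable of (registered_voters, has_tally)
    pairs; real figures would have to come from a separate population or
    voter-roll estimate.
    """
    total = 0
    uncounted = 0
    for voters, has_tally in villages:
        total += voters
        if not has_tally:
            uncounted += voters
    if total == 0:
        raise ValueError("no voters in input")
    return uncounted / total


# Invented example: half of the villages are missing, but they are small.
toy = [(1000, True), (3000, True), (500, False), (500, False)]
village_share = sum(1 for _, ok in toy if not ok) / len(toy)
voter_share = uncounted_voter_share(toy)
print(village_share, voter_share)
```

    In this toy case half of the villages are missing but only a fifth of the voters are, which is why the village-level percentages above could overstate (or understate) the share of uncounted votes.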