The ten thousand Kims

7 October 2011

A unique project studying Korean names can trace Korean culture back 1500 years.

When Koreans marry, the wife retains her name, which is entered into the husband's copy of his family genealogy (jokbo or chokpo in Korean). The practice, which reflects the Confucian reverence for one's ancestors, has continued for centuries. As you might expect, jokbos are of great interest to historians. Less obviously, they provide a means for three physicists to test their statistical theories.

The physicists are Seung Ki Baek and Petter Minnhagen of Umeå University in Sweden and Beom Jun Kim of Sungkyunkwan University in South Korea. In a paper posted on the arXiv e-print server, they describe their analysis of women's names recorded in 10 jokbos that go back 480 years.

Baek, Minnhagen, and Kim divided the jokbos' 480-year span into 30-year intervals. For each interval they tallied three quantities: M, the number of women who joined the 10 families; N, the number of different family names that those women possessed; and kmax, the number of women who possessed the most common family name.
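The tallying step can be sketched in a few lines of Python. This is only an illustration of the bookkeeping, not the authors' code: the function name `tally_by_interval`, the `(year, family_name)` record format, and the toy records are all assumptions made for the example.

```python
from collections import Counter, defaultdict

def tally_by_interval(records, start_year, interval=30):
    """Group (year, family_name) records into fixed-width intervals.

    Returns {interval_start: (M, N, k_max)}. The record format is a
    hypothetical one chosen for this sketch.
    """
    names_by_interval = defaultdict(Counter)
    for year, family_name in records:
        # Integer division assigns each year to the interval it falls in.
        index = (year - start_year) // interval
        names_by_interval[start_year + index * interval][family_name] += 1

    result = {}
    for interval_start, counts in sorted(names_by_interval.items()):
        M = sum(counts.values())      # women who joined in this interval
        N = len(counts)               # distinct family names among them
        k_max = max(counts.values())  # women sharing the most common name
        result[interval_start] = (M, N, k_max)
    return result

# Invented toy records, not data from the jokbos.
toy_records = [
    (1500, "Kim"), (1510, "Lee"), (1525, "Kim"),
    (1530, "Park"), (1545, "Kim"), (1559, "Kim"),
]
tallies = tally_by_interval(toy_records, start_year=1500)
print(tallies)
```

Each interval ends up with its own triple of M, N, and kmax, which is the form of data the physicists then compared with the model.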

[Photo: the family name "Kim" written in Hangul]

The physicists wondered whether the changing values of M, N, and kmax could be reproduced by the random group formation (RGF) model. As a starting point, the RGF model assumes that groups (in this case, the N groups of women who share a family name) form through a mixing process that maximizes the entropy of a probability distribution (in this case, the probability P_M(k) that a randomly chosen woman from a population of M has a family name that occurs k times).

The number and size of groups predicted by the RGF model depend on the sample size, which is what you'd expect for family names in real life. As more generations are recorded in the jokbos, the number and frequency of different family names increase. What makes the RGF model distinctive is its history independence: For any generation, the frequency distribution of family names retains the same dependence on sample size.

That history independence might seem implausible, given how much famines, wars, industrial revolutions, and other traumas transform societies. To get the idea across, Baek, Minnhagen, and Kim make a comparison with the frequency distribution of words used by an author throughout his or her oeuvre. Because of its length and breadth, Leo Tolstoy's 1440-page novel War and Peace has a different word-frequency distribution than does his 76-page novella The Death of Ivan Ilyich. Nevertheless, you can think of the two distributions as being drawn from a single, very large "meta-book" that characterizes the novelist's choice and use of words. Likewise, Korean family names in the jokbos are drawn from the same "meta-registry" that reflects Korea's enduring culture, provided the RGF model applies.
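The meta-book idea can be illustrated with a small sampling sketch in Python. It draws names from one fixed pool and counts how many distinct names show up in samples of different sizes. The pool of 200 placeholder names and the 1/rank weights are assumptions picked purely for illustration; they are not the RGF distribution and not data from the jokbos.

```python
import random

def distinct_counts(meta_names, weights, sample_sizes, seed=0):
    """Draw one long weighted sample from a fixed pool of names and
    report the number of distinct names in each prefix of it."""
    rng = random.Random(seed)
    draws = rng.choices(meta_names, weights=weights, k=max(sample_sizes))
    # A prefix of length m plays the role of a "book" of size m.
    return {m: len(set(draws[:m])) for m in sample_sizes}

# Hypothetical meta-registry: 200 placeholder names, weight 1/rank.
meta_names = [f"name{i}" for i in range(1, 201)]
weights = [1 / i for i in range(1, 201)]

counts = distinct_counts(meta_names, weights, [10, 100, 1000])
print(counts)
```

The short and long samples show different numbers of distinct names even though the underlying pool never changes, which is the sense in which a novella and a long novel can share one meta-book.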

In fact, it turns out that the RGF model does reproduce how N has varied with M and other patterns derived from the jokbos. What is the origin of the model's success? Baek, Minnhagen, and Kim speculate that the answer lies in the stability of Korean culture:

It seems that some core of the Korean culture has remained intact over at least 1500 years and as both the population and occupied area expanded, it basically swallowed other cultural influences without compromising its core.

One of the RGF model's predictions is that kmax, the number of women who have the most frequently occurring family name, is proportional to the sample size M. According to the model, that proportionality does not hold for other family names. Kim is the most common name in the jokbos and, indeed, in Korea. ("Kim" is the name that appears in the accompanying photo.) By applying the RGF model, Baek, Minnhagen, and Kim estimate that in AD 500 Korea was home to 10 000 Kims.
