The Indus Valley civilization, in what is now eastern Pakistan and northwestern India, flourished circa 2500–1900 BCE. To this day its writing, as in the figure, has not been deciphered. Indeed, scholars are unsure if the Indus script represents a language. Other, superficially similar ancient texts are thought to be either rigidly prescribed strings, such as a hierarchical list of deities, or nonlinguistic strings in which order is unimportant. Now computer scientist Rajesh Rao (University of Washington) and colleagues from several Indian institutions have studied the correlations of neighboring tokens (symbols or words) with a statistical tool—the conditional entropy—that reliably distinguishes natural languages from token strings in which the ordering is rigid or unimportant. The Indus script, they conclude, has the structure of a language. Like the conventional entropy, the conditional entropy involves the logarithm of a probability—in this case the conditional probability that a specified token appears, given its...
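To make the statistic concrete, here is a minimal sketch of a conditional entropy computed over adjacent token pairs. It is not the authors' code. Because the description above is cut off, conditioning each token on the one immediately before it is an assumption, and the function name `conditional_entropy` is hypothetical.

```python
from collections import Counter
from math import log2


def conditional_entropy(tokens):
    """Return H(next | current) in bits for a sequence of tokens.

    Assumption: each token is conditioned on the token immediately
    before it (adjacent pairs only).
    """
    pairs = list(zip(tokens, tokens[1:]))
    if not pairs:
        return 0.0

    pair_counts = Counter(pairs)                   # counts of (current, next)
    first_counts = Counter(x for x, _ in pairs)    # counts of current token
    n = len(pairs)

    h = 0.0
    for (x, _y), c in pair_counts.items():
        p_pair = c / n                    # joint probability of the pair
        p_next_given_x = c / first_counts[x]  # conditional probability
        h -= p_pair * log2(p_next_given_x)
    return h


rigid = list("ABCD" * 50)    # each token fully determines its successor
mixed = list("AABB" * 50)    # each token is followed by A or B about equally
print(conditional_entropy(rigid))
print(conditional_entropy(mixed))
```

In the rigid sequence every token determines the next one, so each conditional probability is 1 and the result is zero. When the successor is close to unpredictable, the value approaches the maximum for the token set. A natural language would presumably fall between those extremes, which is the kind of contrast the study relies on.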
