Essays in English yield information about other languages
Computer scientists at MIT and Israel’s Technion have discovered an unexpected source of information about the world’s languages: the habits of native speakers of those languages when writing in English. The work could enable computers chewing through relatively accessible documents to approximate data that might take trained linguists months in the field to collect. But that data could in turn lead to better computational tools. “These [linguistic] features that our system is learning are of course, on one hand, of nice theoretical interest for linguists,” says Boris Katz, a principal research scientist at MIT’s Computer Science and Artificial Intelligence Laboratory and one of the leaders of the new work. “But on the other, they’re beginning to be used more and more often in applications. Everybody’s very interested in building computational tools for world languages, but in order to build them, you need these features. So we may be able to do much...