First major database of non-native English
After thousands of hours of work, MIT researchers have released the first major database of fully annotated English sentences written by non-native speakers. The researchers who led the project had already shown that the grammatical quirks of non-native speakers writing in English could be a source of linguistic insight. But they hope that their dataset could also lead to applications that would improve computers’ handling of spoken or written language of non-native English speakers. “English is the most used language on the Internet, with over 1 billion speakers,” says Yevgeni Berzak, a graduate student in electrical engineering and computer science, who led the new project. “Most of the people who speak English in the world or produce English text are non-native speakers. This characteristic is often overlooked when we study English scientifically or when we do natural-language processing for English.” Most natural-language-processing systems, which enable smartphone and other computer applications to process requests...