Multiple-choice cloze items are a prominent tool for assessing students' competency in using the vocabulary of a language correctly. Without a proper estimate of that competency, it is hard for a computer-assisted language learning system to provide course material tailored to each individual student's needs. Computer-assisted item generation allows the creation of large-scale item pools and further supports Web-based learning and assessment. With the abundant text resources available on the Web, one can create cloze items that cover a wide range of topics, thereby achieving usability, diversity, and security of the item pool. One can apply keyword-based techniques such as concordancing, which extract sentences from the Web and retain those that contain the desired keyword, to produce cloze items. However, such techniques ignore the fact that many words in natural languages are polysemous, so the recommended sentences typically include a non-negligible number of irrelevant ones. In addition, a substantial amount of labor is required to find the sentences in which the word to be tested really carries the sense of interest. We propose a novel word sense disambiguation-based method for locating sentences in which designated words carry specific senses, and we apply generalized collocation-based methods to select the distractors needed for multiple-choice cloze items. Experimental results indicated that our system was able to produce a usable cloze item for every 1.6 items it returned.
Computational Linguistics and Chinese Language Processing, Vol. 10, No. 3, September 2005, pp. 303-328
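
The keyword-based baseline mentioned in the abstract can be illustrated with a minimal sketch. It is not the system proposed in the paper: the function name `keyword_cloze_candidates` and the sample sentences are hypothetical, and a short in-memory list stands in for sentences extracted from the Web. The sketch keeps every sentence that contains the keyword and blanks the keyword out to form a cloze stem, which shows why a polysemous word such as "bank" yields stems for more than one sense.

```python
import re


def keyword_cloze_candidates(sentences, keyword):
    """Return (stem, original) pairs for sentences containing `keyword`.

    Matching is purely lexical (whole word, case-insensitive), so the
    function cannot tell different senses of the keyword apart.
    """
    pattern = re.compile(r"\b" + re.escape(keyword) + r"\b", re.IGNORECASE)
    candidates = []
    for sentence in sentences:
        if pattern.search(sentence):
            # Blank out only the first occurrence to form the cloze stem.
            stem = pattern.sub("_____", sentence, count=1)
            candidates.append((stem, sentence))
    return candidates


# Hypothetical sample sentences, not taken from the paper.
SAMPLE_SENTENCES = [
    "She deposited the check at the bank before noon.",
    "They had a picnic on the bank of the river.",
    "The committee will meet again next week.",
]

candidates = keyword_cloze_candidates(SAMPLE_SENTENCES, "bank")
for stem, _ in candidates:
    print(stem)
```

Both "bank" sentences are retained even though only one of them would suit an item testing the financial sense. Filtering such candidates by hand is the labor that a sense-aware selection step is meant to reduce.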