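Synonyms are important corpus information for information extraction and semantic classification, but the grounds on which two words are judged synonymous deserve examination. From the standpoint of sense, when a polysemous word is assigned to a particular synonym group, at least one of its senses should be synonymous with the other words in that group. The representative work of this kind is Tongyici Cilin (Mei Jiaju, Zhu Yiming, Gao Yunqi, and Yin Hongxiang, 1983), which divides Chinese synonyms into structured categories. From the standpoint of computational linguistics, synonym relations are derived from the frequencies with which words occur in a corpus, supplemented by machine learning methods that compute synonym similarity. The former approach, expert classification, relies on linguistic intuition; unless the principles behind the synonym categories are defined, later users are likely to be confused about why words were grouped together. The latter approach, machine learning, identifies similar words statistically and therefore lacks semantic discrimination. To understand the conceptual content of synonym groups, this study proposes a relatedness computation based on dictionary definition text: by computing the proportion of definition words that two entries share, we attempt to analyze the definitional concepts the two words have in common. Taking Tongyici Cilin (Extended Version) as an example, we list, from the standpoint of definition content, the words suited to interpreting each group, which highlights the semantics that the category contains. Finally, we compare our results with the similar words obtained from Sketch Engine (Kilgarriff et al., 2004). Although our results are affected by the content of the dictionary definitions, those definitions interpret the concepts shared between words in terms of word meaning better than manual classification principles or numerical data derived from corpus statistics can. We hope that the definition-relatedness method gives a better understanding of the concepts that words share, and that it offers material for reflection on dictionary definitions and entry writing in the semantic computation of synonyms.

Synonym groups can serve as resourceful linguistic metadata for information extraction and word sense disambiguation. Nevertheless, the reasons two words can be categorized into a particular synonym group need further study, especially when no explanation is available as to why the two words are synonymous. Lexical resources, such as the Chinese Synonym Forest (or Tongyici Cilin) (Mei et al., 1983), assemble lexical items into hierarchical categories via manual categorization. Alternatively, statistical measures, such as co-occurrence probability, have been widely adopted to verify synonymous relationships. A purely statistical method, however, does not provide a description that can help interpret why such a synonymous relationship holds. We propose a novel method for studying the shared concepts within a synonym group by comparing the words that co-occur in the dictionary definitions of the group's members. These co-occurring words are treated as representatives of shared concepts and can be used to interpret any hidden meaning among the members of a synonym group. We also compare our results with the thesaurus function in the Sketch Engine (Kilgarriff et al., 2004), which uses statistical data in the form of Sketch scores.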
The results show that our method can produce concept words according to dictionary definitions, but it also has limitations: it works only with a finite number of synonyms and under limited computing resources.
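The idea of comparing the words shared by the dictionary definitions of a synonym group can be illustrated with a minimal sketch. The abstract does not give the exact formula, so the ratio used here (the fraction of group members whose definition contains a word) is an assumption, as are the function names `shared_definition_ratios` and `concept_words`, the `min_ratio` threshold, and the toy English definitions, which stand in for segmented Chinese dictionary definitions.

```python
from collections import Counter


def shared_definition_ratios(definitions):
    """Map each definition word to the fraction of group members
    whose definition contains it.

    `definitions` maps a headword to the list of tokens in its
    dictionary definition (already segmented; hypothetical input).
    """
    n_members = len(definitions)
    counts = Counter()
    for tokens in definitions.values():
        # Count each word once per definition, not once per occurrence.
        counts.update(set(tokens))
    return {word: count / n_members for word, count in counts.items()}


def concept_words(definitions, min_ratio=1.0):
    """Return the definition words shared by at least `min_ratio`
    of the group members, sorted alphabetically.

    The threshold is an illustrative assumption, not the paper's rule.
    """
    ratios = shared_definition_ratios(definitions)
    return sorted(word for word, ratio in ratios.items() if ratio >= min_ratio)


# Toy synonym group with invented, pre-tokenized definitions.
toy_group = {
    "glad": ["feeling", "pleasure", "or", "joy"],
    "happy": ["feeling", "or", "showing", "pleasure"],
    "joyful": ["feeling", "great", "pleasure", "and", "joy"],
}

if __name__ == "__main__":
    # Words present in every member's definition.
    print(concept_words(toy_group))
    # Words present in at least two of the three definitions.
    print(concept_words(toy_group, min_ratio=2 / 3))
```

In practice, function words such as "or" would probably be filtered with a stop list before counting, so that only content words are proposed as shared concepts; that step is omitted here to keep the sketch short.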
中文計算語言學期刊（Special Issue on Chinese Lexical Resources: Theories and Applications）, 18(2), 35-56