National Chengchi University Institutional Repository (NCCUR): Item 140.119/58981
    Please use this identifier to cite or link to this item: https://nccur.lib.nccu.edu.tw/handle/140.119/58981


    Title: 中英文語句語意推論
    Textual Entailment Recognition for Chinese and English
    Authors: 黃瑋杰
    Huang, Wei Jie
    Contributors: 劉昭麟
    Liu, Chao Lin
    黃瑋杰
    Huang, Wei Jie
    Keywords: Entailment Recognition
    Near Synonym Recognition
    Heuristic Functions
    Machine Learning
    Date: 2012
    Issue Date: 2013-07-23 13:20:37 (UTC+8)
    Abstract: Recognizing inference between sentences has become increasingly important in natural language processing research areas such as information retrieval (IR), information extraction (IE), automatic summarization, and intelligent tutoring systems (ITS). The topic has drawn growing attention since the first Recognizing Textual Entailment (RTE) challenge was held in 2005, and the Recognizing Inference in Text (RITE-1) task has since provided an evaluation platform for research on inference in Chinese text. In this research, we built a model that computes entailment relations with a set of heuristic functions designed from textual analysis, and we proposed a method based on E-HowNet for measuring the semantic similarity between Chinese words, which strengthens the model's grasp of sentence meaning and improves its entailment judgments. In addition, following earlier machine learning approaches, we used these functions to extract features covering lexical semantics, syntactic structure, POS tags, lexical coverage ratio, and word dependency relations, and trained classification models with several algorithms, including SVM, J48, and linear regression. Experimental results show that both of our systems perform well at recognizing entailment in Chinese and placed second in NTCIR-10 RITE-2. Our analysis of the machine learning classifiers also shows that Chinese and English corpora have different characteristics in entailment recognition and identifies the more effective feature sets. Finally, we evaluated the systems on reading comprehension tests to gauge their performance on a practical application; the results suggest that an intelligent tutoring system for reading comprehension could be built on top of the entailment system, helping students improve their reading comprehension and helping teachers write better reading tests.
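The abstract names lexical coverage ratio as one of the features fed to the classifiers. As a rough illustration only, the sketch below computes such a ratio over pre-tokenized sentences and applies a fixed cutoff. The function names, the whitespace tokenization, and the 0.8 threshold are assumptions made for this example and are not taken from the thesis.

```python
def coverage_ratio(text_tokens, hypothesis_tokens):
    """Return the fraction of hypothesis tokens that also occur in the text.

    Hypothetical helper: the thesis does not specify this exact formula.
    """
    if not hypothesis_tokens:
        return 0.0
    text_vocab = set(text_tokens)
    covered = sum(1 for tok in hypothesis_tokens if tok in text_vocab)
    return covered / len(hypothesis_tokens)


def predict_entailment(text_tokens, hypothesis_tokens, threshold=0.8):
    """Guess entailment when coverage reaches an assumed threshold."""
    return coverage_ratio(text_tokens, hypothesis_tokens) >= threshold


# Whitespace tokenization is a simplification; Chinese input would
# first need a word segmenter.
text = "the cat sat on the mat".split()
print(coverage_ratio(text, "the cat sat".split()))
print(predict_entailment(text, "the dog sat".split()))
```

In a learned system, a value like this would be one entry in a feature vector alongside the semantic and syntactic features rather than a decision rule on its own.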
    Description: Master's thesis
    National Chengchi University
    Department of Computer Science
    100753014
    101
    Source URI: http://thesis.lib.nccu.edu.tw/record/#G0100753014
    Data Type: thesis
    Appears in Collections: [Department of Computer Science] Theses
