National Chengchi University Institutional Repository (NCCUR): Item 140.119/51591


    Please use this permanent URL to cite or link to this item: https://nccur.lib.nccu.edu.tw/handle/140.119/51591


    Title: 以型態組合為主的關鍵詞擷取技術在學術寫作字彙上的研究
    A pattern approach to keyword extraction for academic writing vocabulary
    Author: 邵智捷
    Shao, Chih Chieh
    Contributors: 劉吉軒
    Liu, Jyi Shane
    邵智捷
    Shao, Chih Chieh
    Keywords: keyword extraction
    English learning
    academic vocabulary
    academic word list
    AWL
    PoS tag patterns
    Date: 2009
    Upload time: 2011-10-11 16:57:35 (UTC+8)
    Abstract:
    Over time, people have come to recognize the importance of recording their knowledge and experience in written literature and preserving it for future research. Today, research papers written in English are the world's principal medium of scholarly communication. Researchers who are not native English speakers often use inappropriate vocabulary or collocations, and as a result they fail to convey their findings accurately or express them poorly. Learning and correctly using the vocabulary and collocations of English academic writing is therefore very important.

    In this study, we built a corpus of real-world academic papers drawn from different countries and research fields. We defined several part-of-speech (PoS) tag patterns and used a keyword extraction technique based on them to capture candidate words commonly used in academic writing, which served as a preliminary vocabulary list. The candidates were then passed to an index analysis model for keywords, which selected a refined set of meaningful candidates according to their index characteristics.
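As a minimal sketch of candidate extraction by PoS tag patterns, the Python below slides each pattern over a sentence that has already been tagged and counts every matching word sequence. The patterns, the Penn Treebank style tags, the function names, and the sample sentence are all assumptions made for illustration; the abstract does not list the patterns actually used in the thesis.

```python
from collections import Counter

# Hypothetical PoS tag patterns, written as Penn Treebank tag prefixes.
# The patterns used in the thesis are not given in the abstract.
PATTERNS = [
    ("NN",),        # single noun
    ("VB",),        # single verb
    ("JJ", "NN"),   # adjective + noun
    ("VB", "NN"),   # verb + noun
]


def matches(tags, pattern):
    """Return True if each tag starts with the corresponding pattern prefix."""
    return len(tags) == len(pattern) and all(
        tag.startswith(prefix) for tag, prefix in zip(tags, pattern)
    )


def extract_candidates(tagged_sentence, patterns=PATTERNS):
    """Count every word n-gram whose tag sequence matches one of the patterns."""
    counts = Counter()
    for pattern in patterns:
        n = len(pattern)
        for i in range(len(tagged_sentence) - n + 1):
            window = tagged_sentence[i:i + n]
            tags = [tag for _, tag in window]
            if matches(tags, pattern):
                counts[tuple(word.lower() for word, _ in window)] += 1
    return counts


# A hand-tagged example sentence (illustrative data, not from the corpus).
sentence = [
    ("We", "PRP"), ("propose", "VBP"), ("a", "DT"), ("novel", "JJ"),
    ("approach", "NN"), ("and", "CC"), ("evaluate", "VBP"),
    ("results", "NNS"),
]
candidates = extract_candidates(sentence)
for words, count in sorted(candidates.items()):
    print(" ".join(words), count)
```

Matching on tag prefixes lets one pattern cover inflected forms such as `NNS` or `VBP`. In practice the tagged input would come from a PoS tagger run over the whole corpus, and the counts would be summed across documents before any filtering.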

    In the experiments, we filtered sample data covering different ranges of fields and applied statistical methods to analyze and verify how common each word is across fields. With an auxiliary filtering mechanism added, we obtained the nouns and verbs commonly used in academic writing. Using this vocabulary as a basis, we then identified the common collocations in the corpus. The resulting vocabulary and collocations are offered as a reference for researchers and students who use English as a foreign language, in the hope of helping them write richer and more accurate research papers.
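One way to check whether a word is common across fields is a chi-square goodness-of-fit test against the assumption that the word occurs at the same rate in every field. The sketch below is only an illustration of that idea: the abstract says "statistical methods" without naming a specific test, so the choice of test, the function names, the three-field setup, and all counts are assumptions.

```python
def chi_square_uniformity(word_counts, corpus_sizes):
    """Chi-square goodness-of-fit statistic for one word.

    Expected counts assume the word occurs at the same rate in every
    field, i.e. in proportion to each field's corpus size.
    """
    total_word = sum(word_counts)
    total_size = sum(corpus_sizes)
    stat = 0.0
    for observed, size in zip(word_counts, corpus_sizes):
        expected = total_word * size / total_size
        stat += (observed - expected) ** 2 / expected
    return stat


# Chi-square critical value for alpha = 0.05 with 2 degrees of freedom.
# Valid only for three fields (df = number of fields - 1).
CRITICAL_0_05_DF2 = 5.991


def is_common_across_fields(word_counts, corpus_sizes,
                            critical=CRITICAL_0_05_DF2):
    """Treat a word as common if uniform usage cannot be rejected."""
    return chi_square_uniformity(word_counts, corpus_sizes) < critical


# Made-up token counts for three equally sized field corpora.
sizes = [100_000, 100_000, 100_000]
analysis_counts = [98, 105, 97]   # spread evenly across the fields
genome_counts = [2, 290, 8]       # concentrated in one field

print(is_common_across_fields(analysis_counts, sizes))
print(is_common_across_fields(genome_counts, sizes))
```

A word that passes the check is a candidate for the cross-field academic vocabulary, while a word concentrated in one field looks like domain terminology. With a different number of fields, the critical value has to be replaced by the one for the matching degrees of freedom.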
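    Description: Master's
    National Chengchi University
    Department of Computer Science
    92753025
    98
    Source: http://thesis.lib.nccu.edu.tw/record/#G0927530254
    Type: thesis
    Appears in collections: [Department of Computer Science] Theses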

    Files in this item:

    File            Size     Format       Views
    53025401.pdf    635Kb    Adobe PDF    21749


    All items in NCCUR are protected by the original copyright.



    Copyright Announcement
    1. The digital content of this website is part of the National Chengchi University Institutional Repository. It is provided free of charge for non-commercial uses such as academic research and public education. Please use it in a proper and reasonable manner and respect the rights of copyright owners. For commercial use, please obtain authorization from the copyright owner in advance.

    2. Every effort has been made in building this website to avoid infringing the rights of copyright owners. If you believe that any material on the website infringes copyright, please contact our staff (nccur@nccu.edu.tw). We will remove the work from the repository and investigate your claim.