    Please use this identifier to cite or link to this item: https://nccur.lib.nccu.edu.tw/handle/140.119/31091


    Title: An Application of Information Extraction in Collecting Financial Data for Business Valuation (應用資訊擷取技術於企業評價財務項資料之取得)
    Authors: Lai, Jhe-Ting (賴哲霆)
    Contributors: Lin, Woo-Tsong (林我聰)
    Seng, Jia-Lang (諶家蘭)
    Lai, Jhe-Ting (賴哲霆)
    Keywords: Information Extraction
    Business Valuation
    Financial Data
    Date: 2006
    Issue Date: 2009-09-14 09:14:11 (UTC+8)
    Abstract: Electronic resources on the Internet have grown rapidly in recent years, and search engines have made document retrieval far more convenient and efficient for users. As both network resources and the number of users keep growing, however, keyword-based retrieval can no longer satisfy users' more specific needs. Information extraction (IE) addresses this gap: it extracts specific important information from retrieved documents, or produces specific relations among pieces of information. IE not only filters unnecessary information out of documents but also produces the specific messages and summaries that users are interested in.
    Business valuation is the process of collecting, analyzing, and applying financial and non-financial information to appraise the value of a business. The results can serve as a basis for business decisions and for pricing intangible assets in transactions. At present, the financial statements, notes to financial statements, and financial news of Taiwan's companies all contain information needed for business valuation, and are presented in HTML and PDF formats. This study therefore takes these three sources as its data and builds an information extraction system for Chinese financial data grounded in business valuation concepts. From these heterogeneous sources, the system extracts the correct financial data items and the business valuation models they correspond to, so that valuation data is extracted automatically. Users can obtain relevant, valid valuation information and learn the valuation models in a short time, which improves the accuracy and efficiency of information processing.
    Description: Master's thesis
    National Chengchi University (國立政治大學)
    Graduate Institute of Management Information Systems (資訊管理研究所)
    94356025
    95
    Source URI: http://thesis.lib.nccu.edu.tw/record/#G0094356025
    Data Type: thesis
    Appears in Collections: [Department of Management Information Systems (資訊管理學系)] Theses


