National Chengchi University Institutional Repository (NCCUR): Item 140.119/57637


    Please use this permanent URL to cite or link to this item: http://nccur.lib.nccu.edu.tw/handle/140.119/57637


    Title: Two Novel Feature Selection Approaches for Web Page Classification
    Authors: Chen, Chih-Ming; Lee, Hahn-Ming; Chang, Yu-Jung
    陳志銘
    Contributor: Graduate Institute of Library, Information and Archival Studies, National Chengchi University
    Keywords: Discriminating power measure; Feature selection; Fuzzy decision making; Web page classification
    Date: 2009-01
    Date uploaded: 2013-04-18
    Abstract: To meet the growing demand, in both quality and quantity, for information from the WWW, efficient automatic Web page classifiers are urgently needed. However, a classifier applied to the WWW faces a huge dimensionality problem, since it must handle millions of Web pages, tens of thousands of features, and hundreds of categories. In practical implementations, reducing this dimensionality is a critical challenge. In this paper, we propose a fuzzy ranking analysis paradigm together with a novel relevance measure, the discriminating power measure (DPM), to reduce the input dimensionality from tens of thousands to a few hundred with a zero rejection rate and a small decrease in accuracy. A two-level promotion method based on fuzzy ranking analysis is proposed to improve the behavior of each relevance measure and to combine those measures into a better evaluation of features. Additionally, the DPM has a low computation cost and emphasizes both positive and negative discriminating features. It also emphasizes classification in parallel order rather than classification in serial order. In our experimental results, the fuzzy ranking analysis is useful for validating the uncertain behavior of each relevance measure. Moreover, the DPM reduces the input dimensionality from 10,427 to 200 with a zero rejection rate and a decline of less than 5% (from 84.5% to 80.4%) in test accuracy. Furthermore, experiments on the China Time and Reuters-21578 datasets show that the DPM substantially improves document classification accuracy. The results also show that the DPM can remove both redundant and noisy features, yielding a better classifier.
    Relation: Expert Systems with Applications, 36(1), 260-272
    Type: article
    DOI link: http://dx.doi.org/10.1016/j.eswa.2007.09.008
    DOI: 10.1016/j.eswa.2007.09.008
    Appears in collections: [Graduate Institute of Library, Information and Archival Studies] Journal Articles
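
    The abstract describes filter-style feature selection: score every term with a relevance measure, rank the terms, and keep only the top few hundred. The sketch below illustrates that idea only. Its scoring rule (the sum, over categories, of the absolute gap between a term's document frequency rate inside and outside the category) is an assumption chosen to reward both positive and negative discriminating features; this record does not give the paper's DPM definition, and the function names and toy data are hypothetical.

```python
from collections import defaultdict


def discriminating_score(docs, labels):
    """Score each term by how differently it is distributed across categories.

    Hypothetical rule (not taken from the paper): for every category, take
    |rate of documents containing the term inside the category
      - rate of documents containing the term outside the category|
    and sum these gaps over all categories.
    """
    categories = sorted(set(labels))
    n_total = len(docs)
    n_in = {c: labels.count(c) for c in categories}

    # df_in[c][t]: number of documents in category c that contain term t
    df_in = {c: defaultdict(int) for c in categories}
    df_all = defaultdict(int)
    for terms, label in zip(docs, labels):
        for t in set(terms):  # count each term once per document
            df_in[label][t] += 1
            df_all[t] += 1

    scores = {}
    for t, total in df_all.items():
        score = 0.0
        for c in categories:
            inside = df_in[c].get(t, 0) / n_in[c]
            n_out = n_total - n_in[c]
            outside = (total - df_in[c].get(t, 0)) / n_out if n_out else 0.0
            score += abs(inside - outside)
        scores[t] = score
    return scores


def select_top_k(scores, k):
    """Keep the k highest-scoring terms (ties broken alphabetically)."""
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [term for term, _ in ranked[:k]]


# Toy corpus: each document is a list of terms with one category label.
docs = [
    ["goal", "match", "the"],
    ["match", "team", "the"],
    ["stock", "market", "the"],
    ["market", "bank", "the"],
]
labels = ["sports", "sports", "finance", "finance"]

scores = discriminating_score(docs, labels)
selected = select_top_k(scores, 2)
print(selected)
```

    On this toy corpus, a term that appears in every document ("the") scores zero and is dropped, while terms confined to one category rank highest. A real pipeline would apply the same ranking step to tens of thousands of terms before training the classifier.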

    Files in this item:

    File          Description   Size    Format      Views
    260-272.pdf                 296Kb   Adobe PDF   584


    All items in NCCUR are protected by original copyright.


