    Please use this identifier to cite or link to this item: http://nccur.lib.nccu.edu.tw/handle/140.119/115382

    Title: Data-Driven Geometric Learning (數據驅動的幾何學習)
    Authors: 周珮婷
    Contributors: Department of Statistics
    Keywords: Distance; Data Cloud Geometry (DCG) tree; Machine Learning
    Date: 2014
    Issue Date: 2017-12-25 15:18:14 (UTC+8)
    Abstract: High-dimensional covariate information provides a detailed description of the individuals involved in a machine learning or classification problem. The inter-dependence patterns among these covariate vectors may be unknown to researchers. This fact is not well recognized in the classic and modern machine learning literature; most popular model-based algorithms rely on some form of dimension reduction or even impose a built-in complexity penalty. This is a defensive attitude toward high dimensionality. In contrast, an accommodating attitude can exploit the potential inter-dependence patterns embedded within the high dimensionality rather than reducing dimensions arbitrarily. In this research project, we adopt the latter attitude throughout by first computing the similarity between data nodes and then discovering pattern information, in the form of ultrametric tree geometry, among almost all the covariate dimensions involved. We then use these patterns to build supervised and semi-supervised learning algorithms. The computations for such discovery are based primarily on a new clustering technique, Data Cloud Geometry (DCG), an unsupervised learning algorithm. Our data-driven learning approach focuses on the central issue of how to adaptively evolve a simple empirical distance into an effective one, in order to facilitate an efficient global feature matrix for learning purposes.
    Relation: Project period: 2014/10/01 to 2015/07/31
    Data Type: report
    Appears in Collections: [Department of Statistics] National Science Council Research Projects
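
    The abstract describes computing pairwise similarities between data nodes and then summarizing them as an ultrametric tree. The sketch below illustrates only that general idea and is not the DCG algorithm from the report: it swaps in the subdominant (single-linkage) ultrametric, a standard construction, as a stand-in. The function names `euclidean` and `subdominant_ultrametric` and the sample points are hypothetical.

```python
import math


def euclidean(a, b):
    """Plain Euclidean distance between two equal-length coordinate tuples."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def subdominant_ultrametric(points):
    """Return the single-linkage (subdominant) ultrametric as a nested list.

    Assumption: this is a generic stand-in, not DCG. The ultrametric distance
    between i and j is the smallest achievable "largest hop" over all paths
    from i to j, computed with a Floyd-Warshall style (min, max) update.
    """
    n = len(points)
    u = [[euclidean(points[i], points[j]) for j in range(n)] for i in range(n)]
    for k in range(n):
        for i in range(n):
            for j in range(n):
                via_k = max(u[i][k], u[k][j])  # largest hop when routing through k
                if via_k < u[i][j]:
                    u[i][j] = via_k
    return u


# Two tight pairs of points, far apart from each other (made-up data).
points = [(0.0, 0.0), (0.0, 1.0), (5.0, 0.0), (5.0, 1.0)]
u = subdominant_ultrametric(points)
for row in u:
    print(row)
```

    In the resulting matrix, points within a pair sit at a small tree distance and every cross-pair distance collapses to the same larger value, which is the two-level tree structure an ultrametric encodes. The distance-learning step described in the abstract would replace the fixed Euclidean distance used here.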

    Files in This Item:

    File: 103-2118-M-004-006.pdf
    Size: 1260 KB
    Format: Adobe PDF
