    Please use this identifier to cite or link to this item: http://nccur.lib.nccu.edu.tw/handle/140.119/53394


    Title: Feature Selection in Genomic Data (II)
    Other Titles: A Statistical Procedure of Feature Selection in a Genomic Study
    Authors: 薛慧敏
    Contributors: Department of Statistics, National Chengchi University
    National Science Council, Executive Yuan
    Keywords: Biotechnology; Joint confidence region; Optimal sample fraction of case to control; Retrospective logistic regression model; Sample size determination; Two-stage sequential procedure
    Date: 2008
    Issue Date: 2012-08-30 09:59:05 (UTC+8)
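    Abstract: Academic year 96. Testing the significance of a single variable in linear discriminant analysis. In binary classification, the AUC (area under the receiver operating characteristic curve) is commonly used to assess the discriminating power of a classification rule. When the data contain a large number of variables, as in gene expression array or protein mass spectrometry data, variable selection is a necessary and important task. Here we assume that the data contain two variables and consider the linear discriminant function. Under the normality assumption, it can be derived that the gain in AUC from adding the second variable is closely related to the ratio of the effective sizes of the two variables and to the correlation coefficient between them. This year's research will propose statistical tests for the significance of this gain. The first is a parametric test under the normal distribution assumption; the second is based on a nonparametric estimator of the AUC and uses a resampling method to determine the corresponding critical values. Computer simulation will be used to study the power of these tests. In the future, we hope to apply these tests to data with many variables in order to carry out variable selection.
    Academic year 97. Feature selection in linear discriminant analysis: application to genomic data. In genomic experiments, such as gene expression array or protein mass spectrometry studies, observations on a large number of feature variables are obtained simultaneously. Detecting features with significantly different expression levels and then developing a classification rule from them are the two main objectives of such experiments. This year's project will develop a feature selection strategy based on the AUC criterion. Using the tests developed in the previous year's project, we will test the ranked features in sequence and select those that significantly increase the AUC for use in the subsequent classification rule. The strategy will be applied to a real protein mass spectrometry data set, through which the performance of different feature selection methods will be compared.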
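As a worked illustration of the normal-theory dependence described in the abstract, the sketch below assumes an equal-covariance bivariate normal model for the two classes. The symbols δ1, δ2, ρ and Δ are notation introduced here for illustration; the report itself may use a different parametrization.

```latex
% Assumed model (not taken from the report): cases and controls are bivariate
% normal with a common covariance matrix; \delta_j is the standardized mean
% difference of y_j, and \rho is the within-class correlation of y_1 and y_2.
\begin{align*}
\mathrm{AUC}_{1}  &= \Phi\!\left(\frac{|\delta_1|}{\sqrt{2}}\right), \\
\mathrm{AUC}_{12} &= \Phi\!\left(\frac{\Delta}{\sqrt{2}}\right), \qquad
\Delta^2 = \frac{\delta_1^2 - 2\rho\,\delta_1\delta_2 + \delta_2^2}{1-\rho^2}, \\
\Delta^2 - \delta_1^2 &= \frac{(\delta_2 - \rho\,\delta_1)^2}{1-\rho^2} \;\ge\; 0 .
\end{align*}
```

Under these assumptions the best linear combination of y1 and y2 gains nothing over y1 alone exactly when δ2 = ρδ1, that is, when the ratio δ2/δ1 equals ρ (for δ1 ≠ 0). This is one way to see why both the ratio of the effective sizes and the correlation enter the AUC increment.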
    YEAR 2007. Testing the significance of a variate in a linear discriminant function. In developing a binary classification rule, the AUC (area under the receiver operating characteristic curve) is a commonly used criterion for assessing discriminating power. When the data set includes numerous candidate variates, as with gene expression arrays or protein mass spectrometry, variable selection is an essential step. For simplicity, this project assumes that two variates, y1 and y2, are available in the data set and considers their linear discriminant functions, c1y1+c2y2. Under normality, the increment in AUC due to the second variate is shown to depend on the effective sizes of the two variates and the correlation between them. To test the significance of including the second variate, we will propose a parametric statistical procedure under normality. We will also develop an alternative procedure, which is based on a nonparametric estimate of the AUC and a resampling method for determining the critical values. The performance of the tests will be investigated through an empirical study.
    YEAR 2008. Feature selection in linear discriminant analysis: application to genomic data. In a genomic experiment, such as one using gene expression arrays or protein mass spectrometry, a large number of features are assayed simultaneously. Identifying differentially expressed features and developing a classification rule based on the selected features for further prediction are the two main objectives. This project aims to develop a feature selection strategy based on the AUC criterion. The features are ordered and sequentially tested using the parametric testing procedure developed in the 2007 project. Features that yield a significant increment in AUC are selected for classification. The strategy will be applied to a real protein mass spectrometry data set and compared with other existing methods.
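The sequential strategy outlined in the abstract can be sketched roughly as follows. This is a minimal illustration and not the report's procedure: it swaps the parametric test for a simple permutation test, which checks the stronger null hypothesis that the candidate feature is unrelated to both the class label and the features already kept. All function names, the rule of keeping the top-ranked feature without testing, and the defaults `alpha=0.05` and `n_perm=200` are assumptions made for the example.

```python
import numpy as np

def empirical_auc(case_scores, control_scores):
    """Mann-Whitney estimate of P(case score > control score); ties count 1/2."""
    diff = np.asarray(case_scores)[:, None] - np.asarray(control_scores)[None, :]
    return float(np.mean(diff > 0) + 0.5 * np.mean(diff == 0))

def lda_auc(x_case, x_control):
    """In-sample empirical AUC of the fitted linear discriminant score."""
    n1, n0 = len(x_case), len(x_control)
    # Pooled within-class covariance; atleast_2d handles the one-feature case.
    pooled = ((n1 - 1) * np.atleast_2d(np.cov(x_case, rowvar=False))
              + (n0 - 1) * np.atleast_2d(np.cov(x_control, rowvar=False)))
    pooled = pooled / (n1 + n0 - 2)
    w = np.linalg.solve(pooled, x_case.mean(axis=0) - x_control.mean(axis=0))
    return empirical_auc(x_case @ w, x_control @ w)

def increment_pvalue(x_case, x_control, kept, candidate, n_perm=200, rng=None):
    """Permutation p-value for the AUC gain from adding `candidate` to `kept`."""
    rng = np.random.default_rng() if rng is None else rng
    cols = kept + [candidate]

    def gain(xc, x0):
        return lda_auc(xc[:, cols], x0[:, cols]) - lda_auc(xc[:, kept], x0[:, kept])

    observed = gain(x_case, x_control)
    n1 = len(x_case)
    pooled = np.vstack([x_case, x_control])
    exceed = 0
    for _ in range(n_perm):
        shuffled = pooled.copy()
        # Shuffle only the candidate column across all subjects.
        shuffled[:, candidate] = rng.permutation(pooled[:, candidate])
        exceed += gain(shuffled[:n1], shuffled[n1:]) >= observed
    return observed, (exceed + 1) / (n_perm + 1)

def sequential_select(x_case, x_control, alpha=0.05, n_perm=200, seed=0):
    """Rank features by single-feature AUC, then add each one that tests significant."""
    rng = np.random.default_rng(seed)
    p = x_case.shape[1]
    single = [lda_auc(x_case[:, [j]], x_control[:, [j]]) for j in range(p)]
    order = [int(j) for j in np.argsort(single)[::-1]]
    kept = [order[0]]  # assumption: the top-ranked feature is kept without testing
    for j in order[1:]:
        _, pval = increment_pvalue(x_case, x_control, kept, j, n_perm, rng)
        if pval <= alpha:
            kept.append(j)
    return kept
```

Calling `sequential_select(x_case, x_control)` on two arrays of shape (subjects, features) returns the column indices kept, in the order they were added. Because the AUC is evaluated on the same data used to fit the discriminant, the estimates are optimistic; a held-out or cross-validated AUC would be a natural refinement.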
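    Relation: Basic research
    Academic grant
    Research period: 9708 to 9807 (ROC calendar, i.e. August 2008 to July 2009)
    Research funding: NT$347 thousand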
    Data Type: report
    Appears in Collections: [Department of Statistics] National Science Council Research Projects

    Files in This Item:

    File            Size    Format
    96211M007.pdf   341Kb   Adobe PDF


    All items in the NCCU Institutional Repository (政大典藏) are protected by copyright, with all rights reserved.

