National Chengchi University Institutional Repository (NCCUR): Item 140.119/53394


    Please use this permanent URL to cite or link to this item: https://nccur.lib.nccu.edu.tw/handle/140.119/53394


    Title: Feature Selection in Genomic Data (II) (基因體資料中的特徵選取(II))
    Other title: A Statistical Procedure of Feature Selection in a Genomic Study
    Author: 薛慧敏
    Contributors: Department of Statistics, National Chengchi University
    National Science Council, Executive Yuan
    Keywords: Biotechnology; Joint confidence region; Optimal sample fraction of case to control; Retrospective logistic regression model; Sample size determination; Two-stage sequential procedure
    Date: 2008
    Upload time: 2012-08-30 09:59:05 (UTC+8)
    Abstract:
    Year 2007 (academic year 96): Testing the significance of a variate in a linear discriminant function. In developing a binary classification rule, the AUC (area under the receiver operating characteristic curve) is a commonly used criterion for assessing discriminating power. When the data set includes numerous candidate variates, as in gene expression arrays or protein mass spectrometry, variable selection is an essential step. For simplicity, this project assumes that two variates, y1 and y2, are available and considers their linear discriminant function, c1y1+c2y2. Under normality, the increment in AUC gained by adding the second variate is shown to depend closely on the ratio of the effective sizes of the two variates and on the correlation between them. To test the significance of including the second variate, we will propose a parametric statistical procedure under normality. We will also develop an alternative procedure, which is based on a nonparametric estimate of the AUC and uses a resampling method to determine the critical values. The power of these tests will be investigated through computer simulation. We expect to apply the tests later to data with many variates for variable selection.

    Year 2008 (academic year 97): Feature selection in linear discriminant analysis: application to genomic data. In a genomic experiment, such as gene expression arrays or protein mass spectrometry, a large number of features are assayed simultaneously. The two main objectives are to identify differentially expressed features and to develop a classification rule based on the selected features for further prediction. This project aims to develop a feature selection strategy based on the AUC criterion. The features are ordered and then tested sequentially using the parametric testing procedure developed in the 2007 project. Features that significantly increase the AUC are selected for the classification rule. The strategy will be applied to a real protein mass spectrometry data set, which will also be used to compare its performance with that of other existing feature selection methods.
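    Relation: Basic research
    Academic grant
    Research period: 9708 to 9807 (ROC calendar: August 2008 to July 2009)
    Research funding: NT$347 thousand
    Data type: report
    Appears in collections: [Department of Statistics] National Science Council Research Projects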
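
The dependence described in the abstract can be illustrated with the standard binormal result for a linear discriminant function. The sketch below is not taken from the report: it assumes both classes are bivariate normal with a common covariance matrix, that y1 and y2 are scaled to unit variance, and that d1 and d2 are the standardized mean differences (the "effective sizes") with correlation rho. Under those assumptions the best linear combination has AUC equal to Phi(D / sqrt(2)), where D is the Mahalanobis distance between the class means. All function names are hypothetical. A Mann-Whitney estimate of the AUC is included as one possible nonparametric counterpart.

```python
import math


def normal_cdf(x: float) -> float:
    """Standard normal CDF, computed from the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def auc_one_variate(d1: float) -> float:
    """AUC of y1 alone under the binormal, equal-variance assumption."""
    return normal_cdf(abs(d1) / math.sqrt(2.0))


def auc_two_variates(d1: float, d2: float, rho: float) -> float:
    """AUC of the best linear combination c1*y1 + c2*y2 (assumed model)."""
    if not -1.0 < rho < 1.0:
        raise ValueError("rho must lie strictly between -1 and 1")
    # Squared Mahalanobis distance between the two class means.
    mahalanobis_sq = (d1 * d1 - 2.0 * rho * d1 * d2 + d2 * d2) / (1.0 - rho * rho)
    return normal_cdf(math.sqrt(mahalanobis_sq / 2.0))


def auc_increment(d1: float, d2: float, rho: float) -> float:
    """Gain in AUC from adding y2 to a rule that already uses y1."""
    return auc_two_variates(d1, d2, rho) - auc_one_variate(d1)


def empirical_auc(cases, controls) -> float:
    """Mann-Whitney estimate: share of (case, control) pairs ranked correctly.

    Ties count as one half.
    """
    wins = sum(
        1.0 if x > y else 0.5 if x == y else 0.0
        for x in cases
        for y in controls
    )
    return wins / (len(cases) * len(controls))


if __name__ == "__main__":
    for d2, rho in [(1.0, 0.0), (0.5, 0.5), (1.0, -0.5)]:
        print(d2, rho, round(auc_increment(1.0, d2, rho), 4))
```

In this model the squared distance gained by adding y2 is (d2 - rho*d1)^2 / (1 - rho^2), so the increment vanishes when d2/d1 equals rho. That is one way to see why the ratio of the effective sizes and the correlation together determine whether the second variate helps.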

    Files in this item:

    File            Size    Format      Views
    96211M007.pdf   341Kb   Adobe PDF   2598


    All items in the NCCU repository are protected by their original copyright.



    Copyright Announcement
    1. The digital content of this website is part of the National Chengchi University Institutional Repository. It is provided free of charge for academic research, public education, and other non-commercial use. Please use it in a proper and reasonable manner and respect the rights of copyright owners. For commercial use, please obtain authorization from the copyright owner in advance.

    2. Every effort has been made in building this website to avoid infringing the rights of copyright owners. If you believe that any material on the website infringes copyright, please contact our staff (nccur@nccu.edu.tw). We will remove the work from the repository and investigate your claim.