    Please use this identifier to cite or link to this item: https://nccur.lib.nccu.edu.tw/handle/140.119/30917


    Title: Feature Selection via Meta-Learning on Proteomic Mass Spectrum Data
    (Original title: 使用Meta-Learning在蛋白質質譜資料特徵選取之探討)
    Authors: 陳詩佳
    Contributors: 郭訓志
    陳詩佳
    Keywords: feature selection
    cascading
    proteomic mass spectrum
    support vector machine
    Date: 2006
    Issue Date: 2009-09-14
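    Abstract: Cancer ranks first among the ten leading causes of death in Taiwan. Because patients who receive timely treatment at an early stage of cancer have a higher survival rate, "early detection, early diagnosis, and early treatment" can lower mortality. This study analyzes preprocessed prostate cancer proteomic mass spectrum data collected by Surface-Enhanced Laser Desorption/Ionization Time-of-Flight Mass Spectrometry (SELDI-TOF-MS). The goal is to combine classifiers through meta-learning and to apply stepwise feature selection, so that the data can be classified with a small set of representative feature variables while reaching high accuracy. Classification accuracy determines the order in which variables enter the stepwise selection. In addition, Elastic Net and the coefficient of determination are used as criteria for ranking the feature variables, in order to mitigate high collinearity among the variables. The study also considers voting methods (majority voting and weighted voting) and cascading, both with multiple classifiers and with a single classifier. The results show that ordering the feature variables by the coefficient of determination and cascading with a Support Vector Machine (SVM) performs well in every classification setting, making it the preferable feature selection approach.

    Keywords: feature selection, cascading, proteomic mass spectrum, meta-learning, support vector machine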
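The combination schemes named in the abstract (majority voting, weighted voting, and cascading) can be illustrated with a minimal Python sketch. This is not the thesis's implementation: the function names, the class labels "cancer" and "healthy", the example weights, and the confidence threshold of 0.8 are all hypothetical, and the cascade assumes the common formulation in which each stage returns a label with a confidence and defers to the next stage when it is not confident enough.

```python
from collections import Counter, defaultdict
from typing import Callable, Hashable, Sequence, Tuple

Label = Hashable
# A stage maps one sample to (predicted label, confidence in [0, 1]).
Stage = Callable[[object], Tuple[Label, float]]


def majority_vote(predictions: Sequence[Label]) -> Label:
    """Return the label predicted by the largest number of classifiers."""
    if not predictions:
        raise ValueError("need at least one prediction")
    return Counter(predictions).most_common(1)[0][0]


def weighted_vote(predictions: Sequence[Label], weights: Sequence[float]) -> Label:
    """Return the label whose supporting classifiers have the largest total weight."""
    if not predictions or len(predictions) != len(weights):
        raise ValueError("predictions and weights must be non-empty and equal in length")
    scores = defaultdict(float)
    for label, weight in zip(predictions, weights):
        scores[label] += weight
    return max(scores, key=scores.get)


def cascade(stages: Sequence[Stage], sample: object, threshold: float = 0.8) -> Label:
    """Return the first sufficiently confident prediction; the last stage always decides."""
    if not stages:
        raise ValueError("need at least one stage")
    label = None
    for stage in stages:
        label, confidence = stage(sample)
        if confidence >= threshold:
            return label  # this stage is confident, so later stages are skipped
    return label  # no stage was confident; fall back to the last stage's answer


# Hypothetical predictions from three classifiers for one spectrum.
votes = ["cancer", "healthy", "cancer"]
majority_result = majority_vote(votes)
# Hypothetical weights, e.g. each classifier's validation accuracy.
weighted_result = weighted_vote(votes, [0.2, 0.9, 0.3])

# Two hypothetical stages: the first is unsure, the second is confident.
unsure_stage = lambda sample: ("cancer", 0.6)
confident_stage = lambda sample: ("healthy", 0.9)
cascade_result = cascade([unsure_stage, confident_stage], sample=None)
```

With these made-up inputs, majority voting follows the two "cancer" votes, while weighted voting follows the single heavily weighted "healthy" vote, which shows how the two voting rules can disagree on the same predictions. In a real setting each stage would wrap a trained classifier (for example an SVM) rather than a constant function.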
    Description: Master's
    National Chengchi University
    Graduate Institute of Statistics
    94354014
    95
    Source URI: http://thesis.lib.nccu.edu.tw/record/#G0094354014
    Data Type: thesis
    Appears in Collections: [Department of Statistics] Theses
