National Chengchi University Institutional Repository (NCCUR): Item 140.119/3862
RC Version 6.0 © Powered By DSPACE, MIT. Enhanced by NTU Library IR team.


    Please use this permanent URL to cite or link to this item: https://nccur.lib.nccu.edu.tw/handle/140.119/3862


    Title: Statistical Machine Learning and Its Applications: A Study of Disease Classification and Data Reduction, Applying Proteomic Databases to Cancer Detection (2/2)
    Other Title: Disease Classification and Data Reduction--- Application to Cancer Detection Based on Proteomic
    Author: 余清祥
    Keywords: Data reduction; Classification; Diagnosis; Simulation
    Date: 2005
    Uploaded: 2007-04-18 16:36:53 (UTC+8)
    Publisher: Taipei: Department of Statistics, National Chengchi University
    Abstract: In a modern society of large, heterogeneous databases, timeliness is often the most important consideration: the goal is to obtain an approximate, acceptable answer in the shortest possible time and provide timely guidance for what follows. For example, based on a cancer patient's specimen report, a physician must quickly decide whether the patient needs immediate surgery or chemotherapy, or perhaps no treatment at all but continued follow-up observation. Since a smaller data volume usually means lower analysis time and cost, data reduction is a natural choice when speed and approximate answers are what matter; common methods include histograms, singular value decomposition (SVD), index trees, sampling, and wavelets. This project uses a proteomic database of prostate-cancer patients, with roughly 300 cases but nearly 50,000 variables, and compares several common data reduction methods, taking correct case classification as the objective. The project is planned over three years. The first year uses manually screened mass-spectrometry data (fewer errors, fewer variables) and compares four common classification methods, namely the support vector machine (SVM), neural networks, classification and regression trees (CART), and logistic regression, to find the best classifier under a binary classification criterion. The second year uses the raw data with about 50,000 variables, still aiming at binary classification, and, together with the better classifiers found earlier, seeks the data reduction method that extracts the most information. The third year attempts to combine the two specimen results of each patient, aiming at multi-class classification to obtain a correct diagnosis.
    It is often necessary to obtain quick approximate answers from large databases (i.e., data reduction), since answering quickly matters and it is acceptable to sacrifice some accuracy of the answer for speed. The reduction step is important in exploratory data analysis, particularly when interactive response times are critical. For example, doctors must decide from a medical exam whether cancer patients need surgery, chemotherapy, or only a thorough physical exam. Popular data reduction methods include the histogram, singular value decomposition (SVD), the index tree, sampling, and the wavelet. We will use data from prostate-cancer patients (proteomic data), comprising records of about 300 patients and almost 50,000 variables. Our goal is to apply data reduction methods that minimize the classification error. The project is divided into three years. The first year explores the performance of frequently used classification methods, such as the support vector machine (SVM), neural networks, classification and regression trees, and logistic regression; we shall use the pre-processed data with only 779 variables, with possible errors corrected manually, and the goal of the first year is binary classification. Data reduction methods will be considered in the second year, where the raw data (about 48,000 variables, errors not corrected) will be used as well. The third year focuses on the diagnosis of patients, and we shall consider methods of combining samples from the same patient.
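    The pipeline the abstract describes (reduce tens of thousands of variables, then classify cases) can be illustrated with a minimal sketch. This is not the project's actual data or code: the synthetic matrix, the choice of 20 SVD components, and the linear-kernel SVM are all illustrative assumptions standing in for the proteomic data and the methods compared in the study.

    ```python
    # Minimal sketch: data reduction by truncated SVD, then binary SVM
    # classification, on synthetic "proteomic-like" data (many more
    # variables than cases). Sizes and parameters are assumptions.
    from sklearn.datasets import make_classification
    from sklearn.decomposition import TruncatedSVD
    from sklearn.model_selection import train_test_split
    from sklearn.svm import SVC

    # Synthetic stand-in: ~300 cases, 2,000 variables (the real data
    # has nearly 50,000 variables for about 300 patients).
    X, y = make_classification(n_samples=300, n_features=2000,
                               n_informative=30, random_state=0)

    # Data reduction: project the 2,000 variables onto 20 SVD components.
    svd = TruncatedSVD(n_components=20, random_state=0)
    X_reduced = svd.fit_transform(X)
    print(X_reduced.shape)  # (300, 20)

    # Binary classification on the reduced representation.
    X_tr, X_te, y_tr, y_te = train_test_split(
        X_reduced, y, test_size=0.3, random_state=0)
    clf = SVC(kernel="linear").fit(X_tr, y_tr)
    print("held-out accuracy:", clf.score(X_te, y_te))
    ```

    The same reduced matrix could be fed to any of the classifiers named in the abstract (neural network, CART, logistic regression), which is what makes the reduction step and the classification step separable for comparison.
    
    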
    Description: Approved budget: NT$323,000
    Data Type: report
    Appears in Collections: [Department of Statistics] National Science Council Research Projects

    Files in This Item:

    942118M004001.pdf: 462 KB, Adobe PDF, 21292 views (View/Open)


    All items in NCCUR are protected by copyright, with all rights reserved.



    Copyright Announcement
    1. The digital content of this website is part of the National Chengchi University Institutional Repository. It is provided free of charge for non-commercial uses such as academic research and public education. Please use the content in a proper and reasonable manner and respect the rights of the copyright owners; for commercial use, please obtain the copyright owner's authorization in advance.

    2. Every effort has been made in building this website to avoid infringing the rights of copyright owners. If you believe that any material on the website nevertheless infringes copyright, please notify our staff (nccur@nccu.edu.tw); the work will be removed from the repository immediately and your claim investigated.