English  |  正體中文  |  简体中文  |  Post-Print筆數 : 27 |  Items with full text/Total items : 111316/142225 (78%)
Visitors : 48387018      Online Users : 763
RC Version 6.0 © Powered By DSPACE, MIT. Enhanced by NTU Library IR team.
Scope Tips:
  • please add "double quotation mark" for query phrases to get precise results
  • please goto advance search for comprehansive author search
  • Adv. Search
    HomeLoginUploadHelpAboutAdminister Goto mobile version
    政大機構典藏 > 資訊學院 > 資訊科學系 > 學位論文 >  Item 140.119/56889
    Please use this identifier to cite or link to this item: https://nccur.lib.nccu.edu.tw/handle/140.119/56889


    Title: 在高度分散式環境下對高維度資料建立索引
    Indexing high-dimensional data in highly distributed environments
    Authors: 黃齡葦
    Huang, Ling Wei
    Contributors: 陳良弼
    Chen , Arbee L.P.
    黃齡葦
    Huang, Ling Wei
    Keywords: 索引
    KNN查詢
    高度分散式環境
    參考點
    Index
    P2P
    Distributed Enviroments
    Reference Point
    Date: 2012
    Issue Date: 2013-02-01 16:53:38 (UTC+8)
    Abstract: 目前,隨著資料急速地增加,大規模可擴充性的高度分散式資料庫服務已逐漸成為一種趨勢。在資料如此分散的環境下,如何讓資料的查詢更有效率,建立一個好的索引扮演著相當重要的角色,加上越來越多的資料庫程式應用像是生物、圖像、音樂和視訊等等,皆是處理高維度的資料,而在這些應用程式中,經常需要做相似資料的查詢,但是在高維度的資料且分散式的資料做相似資料的查詢,需耗費大量的時間與運算成本。
    基於在高度分散式的環境下,針對高維度的資料有效地做KNN的查詢。我們提出一個利用reference point[2,13]的作法RP-CAN( Reference Point-Content Addressable Network )來改善查詢的效率。RP-CAN 主要是結合CAN [14] 的路由協定和使用reference point建立索引的方式來幫助在高度分散式環境下有效率的對高維的資料做查詢處理。
    最後會實作出我們所提出的RP-CAN索引並與RT-CAN[1]做比較。我們發現我們所提出的RP-CAN索引在高維度資料作KNN的查詢時比RT-CAN索引來的有效率。
    There has been an increasing interest in deploying a storage system in a highly distributed environment because of the rapid increasing data. And many database applications such as time series, biological and multimedia database, handle high-dimensional data. In these systems, k nearest-neighbors query is one of the most frequent queries but costly operation that is to find objects in the high-dimensional database that are similar to a given query object. As in conventional DBMS, indexes can indeed improve query performance but cannot deploy directly in highly distributed systems because the environment has become more complex. To efficiently support k nearest-neighbors query, a high-dimensional indexing strategy, is developed for the highly distributed environment.
    In this paper, we propose an efficient indexing strategy, RP-CAN( Reference Point-Content Addressable Network ), to improve the performance of the k nearest-neighbors query in a highly distributed environment. In the end of this paper, we designed an experiment to demonstrate that the performance of RP-CAN is better than RT-CAN in high dimensional space. Thus, our RP-CAN index could efficiently handle the high dimensional data.
    Reference: [1] Jinbao Wang, Sai Wu, Hong Gao, Jianzhong Li, Beng Chin Ooi: Indexing multi-dimensional data in a cloud system. SIGMOD Conference 2010: 591-602
    [2] H. V. Jagadish, Beng Chin Ooi, Kian-Lee Tan, Cui Yu, Rui Zhang: iDistance: An adaptive B+-tree based indexing method for nearest neighbor search. ACM Trans. Database Syst. 30(2): 364-397 (2005)
    [3] Cui Yu, Beng Chin Ooi, Kian-Lee Tan, H. V. Jagadish: Indexing the Distance: An Efficient Method to KNN Processing. VLDB 2001: 421-430
    [4] Nikolaos Kouiroukidis, Georgios Evangelidis: The Effects of Dimensionality Curse in High Dimensional kNN Search. Panhellenic Conference on Informatics 2011: 41-45
    [5] Hoang Tam Vo, Chun Chen, Beng Chin Ooi: Towards Elastic Transactional Cloud Storage with Range Query Support. PVLDB 3(1): 506-517 (2010)
    [6] Sai Wu, Dawei Jiang, Beng Chin Ooi, Kun-Lung Wu: Efficient B-tree Based Indexing for Cloud Data Processing. PVLDB 3(1): 1207-1218 (2010)
    [7] H. V. Jagadish, B. C. Ooi, and Q. H. Vu. Baton: A balanced tree structure for peer-to-peer networks. In VLDB 2005.
    [8] A. Rowstron and P. Druschel. Pastry: Scalable, decentralized object location and routing for large-scale peer-to-peer systems. In International Conference on Distributed Systems Platforms 2001.
    [9] I. Stoica, R. Morris, D. R. Karger, M. F. Kaashoek, and H. Balakrishnan. Chord: A scalable peer-to-peer lookup service for internet applications. In SIGCOMM 2001.
    [10] Ben Y. Zhao, Ling Huang, Jeremy Stribling, Sean C. Rhea, Anthony D. Joseph, John Kubiatowicz: Tapestry: a resilient global-scale overlay for service deployment. IEEE Journal on Selected Areas in Communications 2004: 41-53
    [11] http://www.napster.com/.
    [12] http://www.darkridge.com/~jpr5/doc/gnutella.html.
    [13] https://freenetproject.org/.
    [14] Sylvia Ratnasamy, Paul Francis, Mark Handley, Richard M. Karp, Scott Shenker: A scalable content-addressable network. SIGCOMM 2001: 161-172
    [15] Wenyuan Cai, Shuigeng Zhou, Linhao Xu, Weining Qian, Aoying Zhou: C2: A New Overlay Network Based on CAN and Chord. GCC (1) 2003: 42-50
    [16] H. V. Jagadish, Beng Chin Ooi, Quang Hieu Vu, Rong Zhang, Aoying Zhou: VBI-Tree: A Peer-to-Peer Framework for Supporting Multi-Dimensional Indexing Schemes. ICDE 2006: 34
    [17] Kaushik Chakrabarti, Sharad Mehrotra: Local Dimensionality Reduction: A New Approach to Indexing High Dimensional Spaces. VLDB 2000: 89-100
    [18] Paolo Ciaccia, A. Nanni, Marco Patella: A Query-sensitive Cost Model for Similarity Queries with M-tree. Australasian Database Conference 1999: 65-76
    [19] Beyer K, Goldstein J, Ramakrishnan R, Shaft U. When is nearest neighbors meaningful? In: Beeri C, Buneman P, eds. Proc. of the 7th Int`l Conf. on Database Theory. LNCS 1540, Jerusalem: Springer-Verlag, 1999. 217-235.
    [20] Stefan Berchtold, Daniel A. Keim, Hans-Peter Kriegel: The X-tree : An Index Structure for High-Dimensional Data VLDB 1996: 28-39
    [21] Kaushik Chakrabarti, Sharad Mehrotra: The Hybrid Tree: An Index Structure for High Dimensional Feature Spaces. ICDE 1999: 440-447.
    [22] Antonin Guttman: R-Trees: A Dynamic Index Structure for Spatial Searching. SIGMOD Conference 1984: 47-57
    [23] J. B. MacQueen: Some Methods for classification and Analysis of Multivariate Observations. Proc. of the fifth Berkeley Symposium on Mathematical Statistics and Probability.1967: 281-297.
    [24] Li M, Lee W-C, Sivasubramaniam A. Semantic small world: An overlay network for peer-to-peer search. In: Proceedings of International Conference on Network Protocols (ICNP). Washington D.C.: IEEE Computer Society, 2004, 228–238
    [25] Li M, Lee W-C, Sivasubramaniam A, et al. Ssw: a small world based overlay for peer-to-peer search. IEEE Transaction on Parallel and Distributed Systems, 2008, 19(2): 735–749
    [26] David Novak , Pavel Zezula, M-Chord: a scalable distributed similarity search structure, Proceedings of the 1st international conference on Scalable information systems, 2006, p.19-es
    [27] Li M, Lee W-C, Sivasubramaniam A, J. Zhou. Supporting K nearest neighbors query on high-dimensional data in P2P systems. Frontiers of Computer Science in China, 2008, 2(3):p. 234-247 .
    [28] Tang, Y. Xu, J. Zhou, S. Lee, W. Deng, D. Wang, Y. A Lightweight Multi-Dimensional Index for Complex Queries Over DHTs. IEEE Transactions on Parallel and Distributed Systems,2011, 22: p. 2046-2054
    Description: 碩士
    國立政治大學
    資訊科學學系
    99753036
    101
    Source URI: http://thesis.lib.nccu.edu.tw/record/#G0099753036
    Data Type: thesis
    Appears in Collections:[資訊科學系] 學位論文

    Files in This Item:

    File SizeFormat
    index.html0KbHTML2290View/Open


    All items in 政大典藏 are protected by copyright, with all rights reserved.


    社群 sharing

    著作權政策宣告 Copyright Announcement
    1.本網站之數位內容為國立政治大學所收錄之機構典藏,無償提供學術研究與公眾教育等公益性使用,惟仍請適度,合理使用本網站之內容,以尊重著作權人之權益。商業上之利用,則請先取得著作權人之授權。
    The digital content of this website is part of National Chengchi University Institutional Repository. It provides free access to academic research and public education for non-commercial use. Please utilize it in a proper and reasonable manner and respect the rights of copyright owners. For commercial use, please obtain authorization from the copyright owner in advance.

    2.本網站之製作,已盡力防止侵害著作權人之權益,如仍發現本網站之數位內容有侵害著作權人權益情事者,請權利人通知本網站維護人員(nccur@nccu.edu.tw),維護人員將立即採取移除該數位著作等補救措施。
    NCCU Institutional Repository is made to protect the interests of copyright owners. If you believe that any material on the website infringes copyright, please contact our staff(nccur@nccu.edu.tw). We will remove the work from the repository and investigate your claim.
    DSpace Software Copyright © 2002-2004  MIT &  Hewlett-Packard  /   Enhanced by   NTU Library IR team Copyright ©   - Feedback