    Please use this identifier to cite or link to this item: http://nccur.lib.nccu.edu.tw/handle/140.119/122030


    Title: Data Science as a Foundation towards Open Data and Open Science: The Case of Taiwan Indigenous Peoples Open Research Data (TIPD)
    Authors: 林季平
    Lin, Ji-Ping
    Contributors: Department of Sociology
    Keywords: big data, data science, open data, open science, TIPD, TIPs
    Date: 2017-03
    Issue Date: 2019-01-21 14:03:44 (UTC+8)
    Abstract: This research is an outcome of the joint research program conducted by Academia Sinica and the Council of Indigenous Affairs in 2013-2017. The aims of this paper are threefold: (1) to demonstrate the methods of data science in constructing the Taiwan Indigenous Peoples (TIPs) Open Research Data database (TIPD, see http://TIPD.sinica.edu.tw, and https://osf.io/e4rvz/, identifiers: DOI 10.17605/OSF.IO/E4RVZ, ARK c7605/osf.io/e4rvz) based on Taiwan Household Registration (THR) administrative data; (2) to illustrate automated and semi-automated data processing as methods for constructing effective open data; and (3) to demonstrate appropriate utilization of “old-school” data formats such as multi-dimensional tables as an effective means to overcome legal and ethical issues. The research extracts valuable information embedded in the micro data of THR and enriches the extracted information by cleaning, cleansing, crunching, reorganizing, and reshaping the source data. These data enrichment processes produce a number of data sets that contain no individual information but retain most of the information in the source data. The enriched data sets can thus be released to the public as open data. The open data are constructed systematically, mainly in an automated and partly in a semi-automated way, through the integration of optimized hardware, compiler and script programming languages, computing software, and system script languages. Major outputs of TIPD amount to 31,000 files, totaling around 79 GB. TIPD consists of three categories of open research data: (1) categorical data, (2) household structure and characteristics data, and (3) population dynamics data. The potential contributions of TIPD are moves from “closed” to “open”, from “the elite” to “the ordinary”, from “local” to “global”, and from “macro and static” to “micro and dynamic” research.
    Relation: Proceedings of 2017 International Symposium on Grids & Clouds, PoS (Proceedings of Science), Academia Sinica, pp.-
    Data Type: book/chapter
    Appears in Collections: [Department of Sociology] Books/Book Chapters
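
    The abstract describes turning individual-level micro data into multi-dimensional tables that carry no individual information. The following is a minimal sketch of that general idea, not the TIPD pipeline itself: the record fields (`person_id`, `group`, `sex`, `age`), the sample values, and the helper names `age_band` and `to_multidimensional_table` are all hypothetical and chosen only for illustration.

```python
from collections import Counter

# Hypothetical micro records: one dict per person. Field names and values
# are invented for illustration and are not taken from THR or TIPD.
MICRO_RECORDS = [
    {"person_id": 1, "group": "A", "sex": "F", "age": 34},
    {"person_id": 2, "group": "A", "sex": "M", "age": 36},
    {"person_id": 3, "group": "B", "sex": "F", "age": 71},
    {"person_id": 4, "group": "A", "sex": "F", "age": 38},
]


def age_band(age, width=10):
    """Map an exact age to a band label such as '30-39'."""
    low = (age // width) * width
    return f"{low}-{low + width - 1}"


def to_multidimensional_table(records):
    """Count persons per (group, sex, age band) cell.

    The identifier and the exact age are not carried into the output,
    so each cell holds only an aggregate count.
    """
    return Counter(
        (r["group"], r["sex"], age_band(r["age"])) for r in records
    )


table = to_multidimensional_table(MICRO_RECORDS)
for cell, count in sorted(table.items()):
    print(cell, count)
```

    Each key of `table` is one cell of a three-dimensional cross-tabulation. A real release would likely also need rules for very small cells, which this sketch does not attempt.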



    All items in 政大典藏 are protected by copyright, with all rights reserved.



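    Copyright Policy Statement
    1. The digital content on this website is held in the institutional repository of National Chengchi University and is provided free of charge for public-interest uses such as academic research and public education. Please use the content of this website moderately and reasonably, out of respect for the rights of copyright holders. For commercial use, please first obtain authorization from the copyright holder.
    2. Every effort has been made in building this website to avoid infringing the rights of copyright holders. If any digital content on this website is nonetheless found to infringe a copyright holder's rights, the rights holder is asked to notify the site maintainers (nccur@nccu.edu.tw), who will promptly take remedial measures such as removing the digital work.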