National Chengchi University Institutional Repository (NCCUR): Item 140.119/142127


    Please use this permanent URL to cite or link to this item: https://nccur.lib.nccu.edu.tw/handle/140.119/142127


    Title: 從歷史文件到社會人際脈動:基於歷時性文本進行時序知識圖譜構建
    From Historical Documents to Social Interpersonal Networks: Temporal Knowledge Graph Construction based on Diachronic Documents
    Author: 李婕瑜
    Lee, Chieh-Yu
    Contributors: 黃瀚萱
    Huang, Hen-Hsen
    李婕瑜
    Lee, Chieh-Yu
    Keywords: Natural language processing
    Knowledge graph
    Temporal knowledge graph
    Interpersonal relations extraction
    Link prediction
    Digital humanities
    Date: 2022
    Upload time: 2022-10-05 09:15:43 (UTC+8)
    Abstract: The social network of public figures, made up of people with a strong influence on society, carries rich information about the interpersonal relationships among them. Capturing how those relationships change is indispensable for understanding the state of a society at a given point in time, and a temporal social network can depict these changes and offer a new perspective on social dynamics.

    Moving from unstructured text to a temporal graph involves several tasks. In the proposed architecture, entity recognition first identifies the persons in a document, and relation extraction then determines the relationships between them. As a preprocessing step, this work proposes a hierarchical sentence compression method that helps the relation extraction model extract interpersonal relationships from long documents. Because relation extraction can produce errors and some relations are never mentioned in the text, a graph correction method is proposed to refine the yearly tuples produced by the relation extraction model. Finally, the yearly fact tuples are used to construct a temporal knowledge graph that predicts the relationships between person nodes in the next time unit; this work adjusts the number of message-propagation hops between nodes and adds text-related information to improve that prediction.

    The purpose of this study is to establish a framework for constructing a temporal knowledge graph from historical documents and letters. The framework can be used to analyze and predict dynamic interpersonal relationships and can be applied in interdisciplinary fields such as politics and history to support more efficient and precise research.
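    The abstract describes turning extracted facts into a temporal knowledge graph and predicting relationships in the next time unit. The following is a minimal sketch of that data shape, not the thesis's implementation: the `Fact` type, the function names, and the toy data are all hypothetical, and `persistence_baseline` is a deliberately naive stand-in (carry the latest observed relation forward) rather than the graph model the thesis proposes.

```python
from collections import defaultdict
from typing import Dict, List, NamedTuple, Optional, Set, Tuple

Triple = Tuple[str, str, str]


class Fact(NamedTuple):
    """One extracted fact: (head person, relation, tail person, year)."""
    head: str
    relation: str
    tail: str
    year: int


def build_snapshots(facts: List[Fact]) -> Dict[int, Set[Triple]]:
    """Group facts by year so that each year becomes one graph snapshot."""
    snapshots: Dict[int, Set[Triple]] = defaultdict(set)
    for f in facts:
        snapshots[f.year].add((f.head, f.relation, f.tail))
    return dict(snapshots)


def relation_timeline(
    snapshots: Dict[int, Set[Triple]], head: str, tail: str
) -> List[Tuple[int, str]]:
    """Return (year, relation) pairs for one ordered person pair, oldest first."""
    timeline: List[Tuple[int, str]] = []
    for year in sorted(snapshots):
        for h, r, t in sorted(snapshots[year]):
            if h == head and t == tail:
                timeline.append((year, r))
    return timeline


def persistence_baseline(
    snapshots: Dict[int, Set[Triple]], head: str, tail: str
) -> Optional[str]:
    """Naive baseline (an assumption, not the thesis's model):
    predict that the most recently observed relation persists."""
    timeline = relation_timeline(snapshots, head, tail)
    return timeline[-1][1] if timeline else None


# Hypothetical toy data; person names and relation labels are placeholders.
facts = [
    Fact("person_a", "ally", "person_b", 1950),
    Fact("person_a", "ally", "person_b", 1951),
    Fact("person_a", "rival", "person_b", 1952),
    Fact("person_b", "mentor", "person_c", 1951),
]
snapshots = build_snapshots(facts)
print(relation_timeline(snapshots, "person_a", "person_b"))
print(persistence_baseline(snapshots, "person_a", "person_b"))
```

    Storing one set of triples per year keeps each snapshot independent, so a correction step or a learned predictor could replace the baseline without changing how the facts are stored.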
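    Description: Master's degree
    National Chengchi University
    Department of Computer Science
    109753133
    Source: http://thesis.lib.nccu.edu.tw/record/#G0109753133
    Data type: thesis
    DOI: 10.6814/NCCU202201527
    Appears in collections: [Department of Computer Science] Theses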

    Files in this item:

    File          Description   Size      Format      Views
    313301.pdf                  1553Kb    Adobe PDF   20


    All items in NCCUR are protected by their original copyright.



    Copyright Announcement
    1. The digital content of this website is part of the National Chengchi University Institutional Repository. It is provided free of charge for non-commercial uses such as academic research and public education. Please use it in a proper and reasonable manner and respect the rights of copyright owners. For commercial use, please obtain authorization from the copyright owner in advance.

    2. The NCCU Institutional Repository has made every effort to avoid infringing the rights of copyright owners. If you believe that any material on the website infringes copyright, please notify our staff (nccur@nccu.edu.tw), who will promptly take remedial measures such as removing the work.