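CHEN Tao, LIU Wei, SHAN Rongrong, ZHU Qinghua. Application of Knowledge Graph in Digital Humanities [J]. Journal of Library Science in China, 2019, 45(6): 34-49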
Application of Knowledge Graph in Digital Humanities
Received:April 07, 2019  
Key words:Digital humanities  Knowledge graph  Linked data  Knowledge inference  China Biographical Database (CBDB)
中文关键词:  数字人文  知识图谱  关联数据  知识推理  中国历代人物传记资料库
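Funding: This paper is one of the research outputs of the National Social Science Fund of China project "Research on the Semantic Construction of Image and Text Resources and the Construction of Open Graphs in Digital Humanities" (No. 19BTQ024).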
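Authors and affiliations:
CHEN Tao, Shanghai Library / Institute of Scientific and Technical Information of Shanghai, Shanghai 200031
LIU Wei, Shanghai Library / Institute of Scientific and Technical Information of Shanghai, Shanghai 200031 (E-mail: wliu@libnet.sh.cn)
SHAN Rongrong, Shanghai Library / Institute of Scientific and Technical Information of Shanghai, Shanghai 200031
ZHU Qinghua, School of Information Management, Nanjing University, Nanjing, Jiangsu 210023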
Abstract:
A knowledge graph is a technique that uses computers to store, manage, and present concepts and the relationships among them. It became a research hotspot in both industry and academia as soon as it was proposed. However, understanding of the concept remains confused: people often conflate Knowledge Map (KM), Knowledge Graph (KG), and Graph Database (GD). A knowledge map is better regarded as a bibliometric method, so it is not discussed in detail in this paper. By storage method, knowledge graphs can be divided into semantic knowledge graphs (also called linked data, stored as RDF) and generalized knowledge graphs (stored in graph databases). Linked data focuses on publishing and linking knowledge, while the generalized knowledge graph focuses more on mining and computing over knowledge. The two have both commonalities and differences. This paper analyzes their similarities and differences at the conceptual and technical levels, and argues that linked data, rather than the generalized knowledge graph, is the true continuation and development of Google's Knowledge Graph.
In addition, this paper proposes a system framework for applying knowledge graphs to digital humanities research. It also identifies digital generation, textual conversion, data extraction, and intelligent construction as the main stages of research and development in the humanities. While most humanities research abroad has reached the textual stage, much humanities research in China is still at the digital stage, far from the smart data stage.
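The two storage styles described above can be contrasted with a small sketch. It is illustrative only: the `http://example.org/` URIs, the predicate names, and the `to_property_graph` helper are assumptions made for this example and are not taken from the paper or from any platform it describes. Telling URIs from literals by string prefix is also a simplification; real RDF terms carry their own type.

```python
# Illustrative only: the URIs, predicate names, and helper below are
# hypothetical and are not taken from the CBDBLD platform.
EX = "http://example.org/"

# Semantic knowledge graph (linked data): each fact is one
# (subject, predicate, object) triple of URIs or literals.
rdf_triples = [
    (EX + "person/su_shi", EX + "prop/name", "Su Shi"),
    (EX + "person/su_zhe", EX + "prop/name", "Su Zhe"),
    (EX + "person/su_shi", EX + "rel/siblingOf", EX + "person/su_zhe"),
]


def to_property_graph(triples):
    """Fold triples into a property-graph shape.

    Literal objects become properties on the subject node; URI objects
    become typed edges between two nodes. URIs are detected by string
    prefix, which is a simplification for this sketch.
    """
    nodes, edges = {}, []
    for subject, predicate, obj in triples:
        nodes.setdefault(subject, {})
        if obj.startswith(("http://", "https://")):
            nodes.setdefault(obj, {})
            edges.append({"source": subject, "type": predicate, "target": obj})
        else:
            nodes[subject][predicate] = obj
    return nodes, edges


nodes, edges = to_property_graph(rdf_triples)
print(len(nodes), "nodes,", len(edges), "edge(s)")
```

The same facts appear in both forms; the triple list is convenient for publishing and linking, while the node and edge records are the shape that graph algorithms usually traverse.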

Building on this theoretical foundation for smart data in digital humanities, this paper builds a linked data platform (CBDBLD) for the China Biographical Database (CBDB). The seven-step method adopted in constructing the platform is representative, has been used in many digital humanities research projects, and can guide the semantic construction of digital humanities research in China. The platform contains more than 420,000 biographical records and about 22.7 million triples, and is linked to open datasets such as the Shanghai Library Authority Name Files and VIAF (Virtual International Authority File). The CBDBLD dataset covers nearly 500 kinds of social relations in ten categories. Furthermore, the platform uses the concept of the knowledge graph together with visualization technology to show the rich kinship and social relations between historical figures. The result is a distinctive social network that improves both the user experience and the platform's dynamic interactivity.
Knowledge computing and knowledge reasoning are the core technologies in knowledge graph applications and have been widely studied for generalized knowledge graphs. However, little research has applied them to linked data or digital humanities. Most digital humanities research in China uses linked data technology to publish and display metadata, which can be regarded as the foundation of knowledge graph applications but does not represent knowledge graph research as a whole. In this paper, the CBDBLD platform uses a general rule reasoner to support user-defined, rule-based reasoning, which enables the mining and presentation of implicit relationships between historical figures. Although the current reasoning is relatively simple, it opens a new direction for digital humanities research. The abundant graph mining and graph computing algorithms developed for generalized knowledge graphs can also be applied to linked data, which is the direction of the authors' future research and practice.
Both semantic and generalized knowledge graphs can promote innovation in digital humanities research methods. The combination of the two techniques will become the next hotspot in digital humanities and usher in a new era of digital humanities research. 10 figs. 2 tabs. 25 refs.
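The rule-based mining of implicit relationships mentioned above can be sketched as a tiny forward-chaining loop over triples. This is a stand-in written for illustration, not the reasoner used by the platform: the `infer` function, the rule format, the predicate names, and the sample facts are all assumptions introduced here.

```python
# Illustrative only: facts, predicate names, and the rule format are
# hypothetical and are not drawn from the CBDBLD dataset or its reasoner.

def infer(facts, rules):
    """Forward-chain two-premise rules until no new triple is produced.

    Each rule (p1, p2, p_new) reads: if (a, p1, b) and (b, p2, c) hold
    and a != c, then (a, p_new, c) holds.
    """
    known = set(facts)
    while True:
        derived = {
            (a, p_new, c)
            for (a, p1, b) in known
            for (b2, p2, c) in known
            for (r1, r2, p_new) in rules
            if b == b2 and a != c and (p1, p2) == (r1, r2)
        }
        if derived <= known:  # fixed point: nothing new was derived
            return known
        known |= derived


facts = [
    ("su_xun", "fatherOf", "su_shi"),
    ("su_xun", "fatherOf", "su_zhe"),
    ("su_shi", "childOf", "su_xun"),
    ("su_shi", "fatherOf", "su_mai"),
]

# User-defined rules: explicit relations in, implicit relations out.
rules = [
    ("fatherOf", "fatherOf", "grandfatherOf"),
    ("childOf", "fatherOf", "siblingOf"),
]

inferred = infer(facts, rules)
implicit = sorted(inferred - set(facts))
print(implicit)
```

Only the triples in `implicit` are new; they are relationships that were never stated directly but follow from the stated ones under the given rules.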
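Chinese abstract (English translation):
A knowledge graph is a technique that uses computers to store, manage, and present concepts and their interrelationships. It quickly became a research hotspot in industry and academia once proposed, but understanding of knowledge graphs is still rather confused. By storage method, knowledge graphs can be divided into semantic knowledge graphs (linked data), which are based on RDF storage, and generalized knowledge graphs, which are based on graph databases. Semantic knowledge graphs (linked data) focus on publishing and linking knowledge, whereas generalized knowledge graphs focus more on mining and computing over knowledge; the two share common ground but also differ. This paper analyzes their similarities and differences in detail at the conceptual and technical levels, and points out that the semantic knowledge graph (linked data) is the true continuation and development of Google's Knowledge Graph. It then proposes a system framework for applying knowledge graphs to digital humanities research and, on this basis, builds a linked data platform for the China Biographical Database (CBDBLD). Drawing on the idea of the knowledge graph, the platform presents the rich kinship and social relations between historical figures, forms a distinctive social relationship network, and supports mining and presenting implicit relationships between figures through configurable inference rules. Combining the rich graph computations studied for generalized knowledge graphs with linked data will become the next hotspot in digital humanities research and open a new era for the field. 10 figures. 2 tables. 25 references.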