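ZENG Marcia Lei, WANG Xiaoguang, FAN Wei. Smart Data from Libraries, Archives and Museums and Its Role in Digital Humanities Research [J]. Journal of Library Science in China, 2018, 44(1): 17-34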
Smart Data from Libraries, Archives and Museums and Its Role in Digital Humanities Research
Received: October 09, 2017
Key words: Smart data; Big data; Digital humanities; LAM data; Structured data
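Author Name | Affiliation | E-mail
ZENG Marcia Lei | School of Information, Kent State University, Ohio, USA | mzeng@kent.edu
WANG Xiaoguang | Center for Studies of Information Resources, Wuhan University, Wuhan, Hubei 430072 |
FAN Wei | Department of Information Management Technology, School of Public Administration, Sichuan University, Chengdu, Sichuan 610065 |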
Abstract:
Along with the rapid development of “Big Data” in recent years, an important yet lesser-known concept, “Smart Data”, has also emerged. In the context of the multiple “V”s of Big Data (Volume, Velocity, Variety, Veracity, and Value), the realization of the last “V”, Value, depends on “Smart Data”, i.e., the ability to achieve big insights from trusted, contextualized, relevant, cognitive, predictive, and consumable data at any scale, great or small. This paper first explains the rationale, definition and connotation of the concept of Smart Data, drawing on a literature review and case studies, and explores current approaches based on the content of the Smart Data Conference held in the U.S. in recent years. Smart Data provides value by dealing with the challenges posed by the volume, velocity, variety and veracity of Big Data (and the resulting actionable information), as well as by improving decision making. Smart Data represents the way in which different data sources (including Big Data) are brought together, correlated and contextualized, analyzed and interpreted, in order to feed decision-making and action processes. Furthermore, this paper presents the research fields, resources, and methods collected from the documents of American and European digital humanities research projects in the past seven years, alongside the topics and academic domains of the contributions and presentations at the previous five international Digital Humanities conferences. It reveals how the digital humanities have embodied Smart Data and Big Data concepts and approaches, demonstrating an emerging and significant change in methodology. The evidence indicates that Smart Data has played, and will continue to play, a major role in the digital humanities, and that the data resources owned by libraries, archives and museums (LAMs) are invaluable in all research areas, especially the digital humanities, in the data age.
Consequently, this paper assesses the relationship between digital humanities and libraries, and the relationship between digital humanities research and data resources from LAMs. The paper supplies a number of cases that reveal new ideas for information services, especially the structuring and semantic enrichment of raw data. In addition to showing how the structured data provided by LAMs can greatly enrich knowledge graphs and associated datasets and be used for the development of knowledge bases, the article also addresses the data exchange and in-depth semantic annotation of images, introducing the International Image Interoperability Framework (IIIF) APIs and a study of semantics-based deep image annotation. Lastly, this paper focuses on intangible cultural heritage and the potential for using digital humanities approaches, technologies, and other channels to promote the construction of digital resources for it, in compliance with the requirements of Smart Data. Overall, the literature review, project analyses, and case examples in this paper provide evidence that, by adopting the concepts and methodology of Big Data and the approaches of Smart Data, we can turn unstructured data into structured data in the data organization and integration process, producing data that is machine-processable, reusable for multiple purposes, and efficient to process. Thus, LAMs will be able to bring these rich resources into the mainstream of the digital age. In conclusion, in the Semantic Web and Big Data era, LAMs are not only the providers but also the direct beneficiaries of Smart Data. The development of Smart Data can effectively enable the advancement of digital humanities, while also becoming the most important emerging work of LAMs. 7 figs. 1 tab. 41 refs.
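Chinese abstract (translated):
      In recent years, with the rapid development of “Big Data”, an important yet little-known concept, “Smart Data”, has emerged. Smart Data has played, and will continue to play, a major role in the digital humanities. The data resources held by libraries, archives and museums (LAMs) are invaluable to every field in the data age, especially the digital humanities. If LAMs adopt the models and mindset of Big Data and the implementation approaches of Smart Data, and use the process of organizing and integrating unstructured data into structured data to produce data that is machine-understandable and actionable, reusable for multiple purposes, and efficient to process, then LAMs and related sectors will carry these rich resources into the mainstream of the digital age. After explaining the concept of Smart Data, the shift in methodology, and digital humanities and its relationship with libraries, this paper uses a number of examples to illustrate new ideas for information services, in particular new methods for structuring and semantically enriching textual and non-textual raw data. These examples demonstrate that, in the Semantic Web and Big Data era, LAM institutions are not only providers but also direct beneficiaries of Smart Data, and that building Smart Data can both effectively advance the digital humanities and become the most important emerging work of LAM institutions.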