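LI Hui, CHEN Tao, HOU Junming, LIU Ding, ZHU Qinghua, LIU Wei. Noting the Essentials: An Explorative Tool for Catalog Annotations in Chinese Rare Book Collections [J]. Journal of Library Science in China, 2021, 47(4): 97-112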
Noting the Essentials: An Explorative Tool for Catalog Annotations in Chinese Rare Book Collections
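Chinese title (translated): Noting the Essentials (Gouxuan Tiyao): Construction of an Intelligent Analysis Tool for Catalogs of Ancient Books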
Received: September 29, 2020  Revised: June 25, 2021
DOI:
Keywords: Catalogs in Chinese rare book collections; Tiyao; Network model; Explorative tool; Digital humanities
Chinese keywords (translated): Catalogs of ancient books; Tiyao; Network model; Intelligent analysis tool; Digital humanities
Funding:
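Author Name  Affiliation
LI Hui  Shanghai Library / Institute of Scientific and Technical Information of Shanghai; School of Information Management, Nanjing University; Shanghai 200031
CHEN Tao  Associate Professor, School of Information Management, Sun Yat-sen University, Guangzhou, Guangdong 510006
HOU Junming  Editor, Shanghai Chinese Classics Publishing House, Shanghai 200001
LIU Ding  School of Computer Science and Technology, Tiangong University, Tianjin 300061
ZHU Qinghua  School of Information Management, Nanjing University, Nanjing, Jiangsu 210023
LIU Wei  Shanghai Library / Institute of Scientific and Technical Information of Shanghai, Shanghai 200031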
Abstract:
Catalog annotations in Chinese rare book collections, also known as Tiyaos, contain the essential information about a book, e.g., an introduction to the author, a summary, the nature and style, the version, and a critique of the book. To write a good Tiyao, even the most eminent scholars spent a great deal of time and effort collecting, collating, reviewing, and annotating large-scale book collections. However, for large-scale rare book collections, even with substantial human effort devoted to writing, editing, and recommending Tiyaos, the task remains time-consuming and omissions are inevitable. In this paper, we propose a Tiyao-centric network model that integrates rare books, historical figures, and Tiyaos into one tripartite graph. This network model is not limited by the language or scale of the texts and can be further applied to large-scale catalogues of rare book collections.
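As a rough illustration of the tripartite structure described above, the following minimal Python sketch stores Tiyaos, books, and persons as three node sets in which every edge links a Tiyao to a book or to a person. The class name `TiyaoGraph`, its methods, and the placeholder identifiers are assumptions made for this example; they are not taken from the paper and do not represent the authors' implementation.

```python
from collections import defaultdict
from itertools import combinations


class TiyaoGraph:
    """Hypothetical undirected tripartite graph centred on Tiyaos.

    Every edge joins a Tiyao node to a book node or to a person node;
    books and persons are never linked to each other directly.
    """

    def __init__(self):
        self.books = defaultdict(set)    # tiyao id -> book names it mentions
        self.persons = defaultdict(set)  # tiyao id -> person names it mentions

    def add_tiyao(self, tiyao_id, books=(), persons=()):
        """Add a Tiyao node together with its book and person edges."""
        self.books[tiyao_id].update(books)
        self.persons[tiyao_id].update(persons)

    def person_cooccurrence(self):
        """Project onto persons: count the Tiyaos in which two persons co-occur."""
        counts = defaultdict(int)
        for names in self.persons.values():
            for a, b in combinations(sorted(names), 2):
                counts[(a, b)] += 1
        return dict(counts)

    def tiyaos_sharing_person(self, tiyao_id):
        """Return the other Tiyaos reachable through at least one shared person."""
        mine = self.persons.get(tiyao_id, set())
        return {
            other
            for other, names in self.persons.items()
            if other != tiyao_id and names & mine
        }
```

A caller would register each Tiyao with `add_tiyao("t1", books=[...], persons=[...])` and then query `person_cooccurrence()` to obtain candidate person relations. Whether the paper weights or filters such co-occurrences is not stated in the abstract.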
Based on this model, we construct a Tiyao explorative tool, TiyaoX. Using SPARQL, the tool extracts RDF data from related knowledge bases and provides users with information about the authors or editors of rare books, for instance their occupation, imperial examination record, and biography. Furthermore, the tool automatically extracts the relations between individuals embedded in Tiyaos and enriches existing resources with reliable and valuable relation information. In addition, it leverages the metadata and content features of Tiyaos to recommend potentially interesting, query-relevant Tiyaos (e.g., relevant by Tiyao content, category, book name, or author).
In this paper, we use the Siku Quanshu Zongmu as the dataset. On the one hand, we investigate the latent relations among rare books, historical figures, and Tiyaos. We also carry out separate quantitative analyses of the person names and book names in the four divisions of this dataset. The results demonstrate that Tiyaos in the “Ji” division contain the most names of historical persons and books, and Tiyaos in the “Jing” division contain the fewest. Tiyaos in the “Zi” and “Ji” divisions have the greatest overlap of person names, and Tiyaos in the “Shi” and “Zi” divisions have the greatest overlap of book names. In the constructed network, most key figures are Confucian scholars, book collectors, and bibliographers, and the most important rare books are descriptive catalogues, Confucian classics, and history books. On the other hand, we take advantage of descriptive features in Tiyao content (author introduction, content summary, and critique) and combine them with Tiyao metadata and three text recommendation strategies (cosine similarity, LDA + JS distance, and Word2Vec + RWMD). Our objective is to evaluate the impact of different content features on the accuracy of the recommendation module of TiyaoX. The experimental results demonstrate that the approach that integrates the summary, the critique, and Tiyao metadata as content features performs best. This approach can be extended to knowledge discovery in rare book collections, which provides convenience for related professionals, scholars, and enthusiasts and improves efficiency in practice. 4 figs. 6 tabs. 51 refs.
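To make two of the similarity measures named above concrete, here is a minimal Python sketch of cosine similarity over token counts and of the Jensen-Shannon distance between two topic distributions. It assumes that Tiyao texts have already been tokenized and that topic distributions (for example from an LDA model) are supplied as equal-length probability lists. The function names, the ranking helper `recommend`, and the choice of log base 2 are assumptions for illustration only; the abstract does not specify how TiyaoX computes or combines these scores.

```python
import math
from collections import Counter


def cosine_similarity(tokens_a, tokens_b):
    """Cosine similarity between two token lists using raw term counts."""
    a, b = Counter(tokens_a), Counter(tokens_b)
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def js_distance(p, q):
    """Jensen-Shannon distance: square root of the JS divergence (log base 2)."""
    m = [(x + y) / 2 for x, y in zip(p, q)]

    def kl(u, v):
        # Terms with u_i == 0 contribute nothing; v_i > 0 whenever u_i > 0.
        return sum(ui * math.log2(ui / vi) for ui, vi in zip(u, v) if ui > 0)

    return math.sqrt(max(0.0, (kl(p, m) + kl(q, m)) / 2))


def recommend(query_tokens, corpus, k=3):
    """Rank Tiyaos in `corpus` (id -> token list) by cosine similarity to the query."""
    scored = [(cosine_similarity(query_tokens, toks), tid) for tid, toks in corpus.items()]
    scored.sort(key=lambda s: (-s[0], s[1]))  # highest score first, ties by id
    return [tid for score, tid in scored[:k] if score > 0]
```

In this sketch, restricting the token lists to particular parts of a Tiyao (such as the summary and critique) and appending metadata tokens would be one simple way to vary the content features being compared. That design is an assumption and may differ from the paper's experimental setup.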
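Chinese abstract (translated):
      Catalogs of ancient books distinguish schools of scholarship and trace their origins and development, and are therefore of great value to the study of classical scholarship. This paper proposes a network analysis model for Tiyaos that uses an undirected tripartite graph to integrate information on ancient books, persons, and Tiyaos. On this basis, we build an intelligent analysis tool for catalogs of ancient books. The tool automatically mines the person relations embedded in Tiyaos and links them to existing knowledge bases of historical figures, supplementing those knowledge bases with reliable and valuable relation information. It also combines Tiyao metadata with the semantic features of the Tiyao text in its recommendation algorithms, so that it can recommend Tiyaos related to the queried item by content, division name, book name, or the person responsible for the book. Using the Siku Quanshu Zongmu as the experimental dataset, we first explore, on the basis of the Tiyao network, the intrinsic connections among entities at three levels (persons, books, and Tiyaos) and carry out a quantitative study of the person names and book names that appear in the Tiyaos of the four divisions. Second, starting from three textual features of Tiyaos (author introduction, content summary, and scholarly critique), combined with metadata and three commonly used document recommendation algorithms, we evaluate how different semantic features affect the accuracy of the tool's recommendation function. The experimental results show that using the content summary and scholarly critique as semantic features, combined with metadata, performs well and can be extended to knowledge discovery for ancient books. 4 figures. 6 tables. 51 references.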