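LI Chunqiu, XU Zengxulin, SONG Ningyuan, WANG Xiaoguang. Nanopublication-Based Semantic Organization of Chinese Dissertations [J]. Journal of Library Science in China, 2021, 47(5): 97-115
Nanopublication-Based Semantic Organization of Chinese Dissertations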
Received: March 11, 2021
DOI:
Key words:Nanopublication  Semantic organization  Chinese dissertation  Semantic publishing  Information retrieval
中文关键词:  纳米出版物  语义组织  中文学位论文  语义出版  信息检索
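Funding: This paper is one of the research outcomes of the project "Research on Semantic Organization of Chinese Academic Papers Based on the Nanopublication Model" (No. 310422112), supported by the Fundamental Research Funds for the Central Universities.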
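Author Name    Affiliation
LI Chunqiu    School of Government, Beijing Normal University, Beijing 100875
XU Zengxulin    National Science Library, Chinese Academy of Sciences, Beijing 100190
SONG Ningyuan    School of Information Management, Nanjing University, Nanjing, Jiangsu 210023
WANG Xiaoguang    School of Information Management, Wuhan University, Wuhan, Hubei 430072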
Abstract:
With its advantages in revealing the content of scientific papers and formally describing scientific concepts at fine granularity, the nanopublication has been widely used in semantic publishing and semantic organization. However, because of weaknesses in representing assertion semantics and in realizing semantic linkage between assertions, existing nanopublications fail to reveal the semantic features and structural characteristics of scientific papers across multiple dimensions and granularities, which limits their application and service. In view of this, the research reuses domain ontologies, improves the common nanopublication model, proposes an approach to representing the assertions of scientific papers in a specific domain and of a specific type, and conducts application practice. Focusing on the semantic features and linkage of Chinese dissertations in the information retrieval domain, the research extends the common structure of the nanopublication model, classifies the specific assertion types, and designs nanopublication description models for Chinese dissertations on information retrieval. The research selects a number of Chinese dissertations on information retrieval as experimental samples and creates RDF named graphs and Turtle data for the nanopublications. On this basis, empirical research is carried out through case analysis and dataset application to verify the usability of the proposed models.
The proposed approach to improving nanopublications and extending their description models could provide a reference for applying nanopublications in specific domains and for the semantic organization of Chinese dissertations. The proposed model is well suited to the information retrieval domain: it reveals the semantic characteristics of specific statements about experiment data, such as experiment parameters, experiment models and test collections. The model covers the core classes of information retrieval and formalizes their relationships, which provides a description model for the semantic data needed to automatically extract assertions and semantic relationships. Combined with term recognition, entity extraction, machine learning and data cleaning, the model supports assertion extraction and automatic annotation of Chinese dissertations, and also provides models and methods for the automatic construction of nanopublications. The model has limitations in describing domain-specific semantics when it is applied to creating nanopublications of Chinese dissertations in other domains, which have different structural and semantic features.
Scientific papers written in natural language are semantically complex. It is difficult to identify experiment tasks and procedures, and the necessary experiment assessments are also required. Therefore, future work needs to establish a large-scale, high-quality and interlinked scientific paper corpus based on an improved description model of scientific content, to provide a data foundation for extracting and revealing assertions in scientific papers. A scientific paper is composed of knowledge units with semantic features and logical relationships. Future applications of nanopublications to scientific papers should focus on formal description and semantic relationships at the fine granularity of knowledge units, with the aim of constructing multi-level, multi-granularity and multi-dimensional content datasets of scientific papers. 12 figs. 17 tabs. 17 refs.
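Chinese abstract (translated):
Nanopublications have certain advantages in revealing the content of scientific papers at fine granularity and in describing scientific concepts in a standardized way, and have been applied in fields such as semantic publishing and semantic organization. However, owing to their shortcomings in representing assertion semantics and in realizing semantic links between assertions, existing nanopublications fail to reveal the semantic features and structural characteristics of scientific papers across multiple dimensions and granularities, which limits their application and service. This study reuses domain ontologies and, by improving the common nanopublication model, proposes a method for representing assertions in scientific papers of a specific domain and a specific genre, and explores its application in practice. Addressing the semantic features and semantic links of Chinese dissertations in the information retrieval domain, the paper extends the common nanopublication model, refines the assertion categories of dissertation nanopublications, and constructs a nanopublication model for Chinese dissertations. It then selects several Chinese dissertations in the information retrieval domain as experimental objects and generates RDF named graphs and Turtle data for the nanopublications; on this basis, empirical research is carried out through case analysis and dataset application to verify the applicability of the constructed model. The improvement method and extended model proposed in this study can serve as a reference for applied research on nanopublications in specific domains and for the semantic organization of Chinese dissertations. 12 figures. 17 tables. 17 references.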
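For readers unfamiliar with the data format the abstract refers to, the following is a minimal sketch of how one nanopublication can be written as RDF named graphs. It follows the common nanopublication layout (a head graph pointing to assertion, provenance and publication-info graphs) and emits TriG, the named-graph extension of Turtle. It is not the authors' extended model: the `np:` namespace is the standard nanopublication schema, while the `example.org` IRIs, every `ex:` term (`ex:usesTestCollection`, `ex:usesModel`, `ex:collectionA` and so on) and the helper names `graph_block` and `build_nanopub` are assumptions made purely for illustration.

```python
# Sketch only: one nanopublication written as four named graphs in TriG.
# The np: namespace is the standard nanopublication schema; every ex: term
# and the example.org IRIs are hypothetical placeholders, not the paper's model.

NP_NS = "http://www.nanopub.org/nschema#"
PROV_NS = "http://www.w3.org/ns/prov#"
BASE = "http://example.org/np/"  # hypothetical base IRI


def graph_block(name, triples):
    """Render one TriG named graph as `name { s p o . }`."""
    body = "\n".join(f"  {s} {p} {o} ." for s, p, o in triples)
    return f"{name} {{\n{body}\n}}"


def build_nanopub(local_id, assertion_triples, source, creator):
    """Return TriG text with head, assertion, provenance and pubinfo graphs."""
    np_iri = f"<{BASE}{local_id}>"
    head, assertion, provenance, pubinfo = (
        f"<{BASE}{local_id}#{part}>"
        for part in ("Head", "assertion", "provenance", "pubinfo")
    )
    prefixes = "\n".join([
        f"@prefix np: <{NP_NS}> .",
        f"@prefix prov: <{PROV_NS}> .",
        f"@prefix ex: <{BASE}vocab/> .",
    ])
    blocks = [
        # Head graph: ties the nanopublication to its three content graphs.
        graph_block(head, [
            (np_iri, "a", "np:Nanopublication"),
            (np_iri, "np:hasAssertion", assertion),
            (np_iri, "np:hasProvenance", provenance),
            (np_iri, "np:hasPublicationInfo", pubinfo),
        ]),
        # Assertion graph: the fine-grained statements taken from the paper.
        graph_block(assertion, assertion_triples),
        # Provenance graph: where the assertion came from.
        graph_block(provenance, [(assertion, "prov:wasDerivedFrom", source)]),
        # Publication info graph: who produced this nanopublication.
        graph_block(pubinfo, [(np_iri, "prov:wasAttributedTo", creator)]),
    ]
    return prefixes + "\n\n" + "\n\n".join(blocks) + "\n"


# Hypothetical assertion about an experiment described in a dissertation.
trig = build_nanopub(
    "np1",
    [("ex:experiment1", "ex:usesTestCollection", "ex:collectionA"),
     ("ex:experiment1", "ex:usesModel", "ex:modelA")],
    source="ex:dissertation1",
    creator="ex:curator1",
)
print(trig)
```

Keeping the assertion in its own named graph is what lets the provenance graph make statements about the assertion as a whole, which is the property the abstract relies on when it speaks of linking assertions semantically.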