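YAN Chengxi, FANG Xiaoke. “MtFFO”: Research of a Framework of Knowledge Fusion on Multi-source Thesauri from Open World Perspectives[J]. Journal of Library Science in China, 2017, 43(4): 114-129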
“MtFFO”:Research of A Framework of Knowledge Fusion on Multi-source Thesauri from Open World Perspectives
开放世界视角:面向多源词表的知识融合框架MtFFO研究
Received:March 08, 2017  Revised:March 29, 2017
Key words:Open World Assumption  Thesauri  Metadata  Knowledge fusion
Key words (Chinese):  开放世界假设  词表  元数据  知识融合
Author Name    Affiliation    E-mail
YAN Chengxi    Department of Information Management, Peking University, Beijing 100871
FANG Xiaoke    College of Applied Arts and Science, Beijing Union University, Beijing 100191    xiaoke@buu.edu.cn
Abstract:

    As a normative and structured form of knowledge, thesauri play a significant role in information retrieval, enterprise knowledge management and services, and automated intelligent decision-making. Traditional closed-world thesauri suffer from increasingly severe drawbacks: the high cost of expert-driven construction, low accuracy and narrow coverage of domain knowledge, outdated updating mechanisms, and a weak capacity for knowledge expansion. How to construct a sound, holistic, thesaurus-based system that supports intelligent information and knowledge management applications, with the aim of ordering and sharing knowledge in an open, interconnected environment, has therefore become an important research question. Related research has discussed in depth the structure, data sources, methods and languages of thesaurus integration, but it lacks a comprehensive synthesis of the technology, theory and methods of multi-source thesaurus fusion, especially regarding how to carry out knowledge fusion in an open and interconnected environment.

    Introducing knowledge fusion, this paper first defines the concept and connotation of the “knowledge fusion” paradigm in light of Popper's three worlds theory and distinguishes it from “data integration” and “information integration”, so as to clarify the category and scope of “knowledge fusion”. Second, from the perspective of the Open World Assumption (OWA), it presents a more comprehensive and systematic theoretical framework of knowledge fusion, called “MtFFO”, which draws on the KRAFT framework and other existing models. The framework mainly comprises the external environment, internal core, input unit, key factors and methods. Through a feature comparison of different data sources (databases, metadata, and tables), the External Environment Sub-frame of Information Input (EESII) is expounded as a multi-level knowledge adjustment subsystem. The knowledge flow processed by EESII then undergoes pattern matching and semantic recognition, which control and guide fusion strategies and quality in order to optimize knowledge transformation and composition. The corresponding operations are knowledge mapping and knowledge merging: mapping is divided into two methods, homogeneity-based knowledge aggregation and logical conversion based on a meta-thesaurus, while merging consists of simple merging, reductive merging, full merging, or synonymy and hyponymy merging. The recommended approach weighs the target requirements, the detailed context, the cost and quality of the knowledge units and schemas in the thesauri, and feasibility; combines different matching, mapping and merging techniques; and introduces expert knowledge and reasonable control, so as to improve the efficiency, accuracy and robustness of the fusion system. Another innovation of the “MtFFO” framework, aimed at the difficulty of thesaurus expansion and enrichment, is its introduction of metadata and general knowledge bases which, on the basis of graph models and machine learning algorithms, bring together domain taxonomies, vocabularies and databases for data mining and the discovery of high-quality knowledge units.
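
    As a rough illustration of the matching and merging steps named above, the following Python sketch merges two toy thesauri. The abstract does not define the merge variants, so everything here is an assumption made for illustration rather than the MtFFO method: the `Concept` structure, the use of normalized labels as the matching rule, and the union of synonym and narrower-term sets (loosely in the spirit of “synonymy and hyponymy merging”) are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Concept:
    """One thesaurus entry (hypothetical structure, not taken from the paper)."""
    label: str
    synonyms: set = field(default_factory=set)
    narrower: set = field(default_factory=set)


def normalize(label):
    """Assumed matching key: case-folded label with collapsed whitespace."""
    return " ".join(label.casefold().split())


def merge_thesauri(first, second):
    """Merge two lists of Concept objects.

    Matching: two concepts are treated as the same if any label or synonym
    of one normalizes to a label or synonym already seen.
    Merging: synonym sets and narrower-term sets are unioned.
    Returns a dict mapping the normalized preferred label to a Concept.
    """
    merged = {}  # normalized preferred label -> merged Concept
    index = {}   # any normalized label or synonym -> key in `merged`
    for concept in [*first, *second]:
        names = {concept.label, *concept.synonyms}
        # Matching step: look for an already-registered name.
        key = next(
            (index[normalize(n)] for n in names if normalize(n) in index),
            None,
        )
        if key is None:
            # No match: register the concept as a new entry (copied).
            key = normalize(concept.label)
            merged[key] = Concept(
                concept.label, set(concept.synonyms), set(concept.narrower)
            )
        else:
            # Match: fold the incoming concept into the existing entry.
            target = merged[key]
            target.synonyms |= names
            target.narrower |= concept.narrower
            # The preferred label must not be listed as its own synonym.
            target.synonyms = {
                s for s in target.synonyms
                if normalize(s) != normalize(target.label)
            }
        for name in names:
            index.setdefault(normalize(name), key)
    return merged


if __name__ == "__main__":
    a = [Concept("Knowledge fusion", {"knowledge integration"}, {"thesaurus merging"})]
    b = [Concept("knowledge  Fusion", {"KF"}, {"ontology merging"}), Concept("Metadata")]
    for key, concept in merge_thesauri(a, b).items():
        print(key, sorted(concept.synonyms), sorted(concept.narrower))
```

    In this sketch the two differently cased “knowledge fusion” entries collapse into one concept that carries both synonym sets and both narrower-term sets, while “Metadata” is kept as a separate entry. A real system would replace the string normalization with the schema matching and semantic recognition that the framework describes, and would add conflict detection and quality control.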

    Although “MtFFO” is a theoretical framework that has not yet been validated in a working system, it improves and extends the current theory of knowledge fusion to a certain extent and systematically organizes and integrates a variety of relevant methods and key technologies, thereby providing a theoretical basis and technical reference for solving the problems of thesaurus interoperability, semantic understanding and self-enrichment. 5 figs. 1 tab. 76 refs.

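Chinese abstract (translated):
      To address the heterogeneity of multi-source thesauri and the limits on knowledge expansion, this paper draws on Popper's worlds theory in epistemology to argue for and analyze the concept and validity of the knowledge fusion paradigm and, on the basis of the Open World Assumption, proposes “MtFFO”, a framework for multi-source thesaurus fusion. It describes and analyzes, step by step, the external-environment information input framework (a multi-level adjustment and exchange system for different data units), the knowledge schema matching and conflict and redundancy identification methods of the internal core system, the knowledge mapping and merging strategies, and the quality control and knowledge expansion methods. The MtFFO framework is not only a reasonable supplement to the methodology of knowledge fusion but also provides a theoretical basis and technical reference for constructing and fusing multi-source thesauri in an open environment. 5 figs. 1 tab. 76 refs.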