Page 253 - Journal of Library Science in China, Vol.47, 2021
            252   Journal of Library Science in China, Vol.13, 2021


            fields including the representation of assertion semantics and the realization of semantic
            linkage between assertions, existing nanopublications fail to reveal the semantic features
            and structural characteristics of scientific papers across multiple dimensions and granularities,
            which limits their application and service. In view of this, the research reuses domain ontologies,
            improves the common nanopublication model, proposes an approach to representing the assertions
            of a specific domain and type of scientific paper, and conducts application practice. With a
            focus on the semantic features and linkage of Chinese dissertations in the information retrieval
            domain, the research expands the common structure of the nanopublication model, classifies
            the specific assertion types, and designs nanopublication description models for Chinese
            dissertations on information retrieval. The research selects a number of Chinese
            dissertations on information retrieval as experiment samples, and creates RDF named graphs
            and Turtle data for the nanopublications. On this basis, empirical research is carried out through
            case analysis and data set application in order to further verify the usability of the proposed
            models.
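
The named-graph structure mentioned above can be sketched roughly as follows. This is a minimal illustration, not the authors' description model: it assumes the common nanopublication layout of a head graph plus assertion, provenance and publication-info graphs, and serializes it as TriG, the named-graph extension of Turtle. The `np:`, `prov:` and `dct:` prefixes refer to published vocabularies; every `ex:` and `sub:` term (for example `ex:usesTestCollection`) is a hypothetical placeholder invented for this sketch.

```python
# Published vocabularies: np (Nanopublication schema), prov (PROV-O), dct (Dublin Core terms).
# The ex: and sub: namespaces and all terms under them are hypothetical placeholders.
PREFIXES = {
    "np": "http://www.nanopub.org/nschema#",
    "prov": "http://www.w3.org/ns/prov#",
    "dct": "http://purl.org/dc/terms/",
    "ex": "http://example.org/ir#",
    "sub": "http://example.org/np/1#",
}


def graph_block(name, triples):
    """Render one named graph as a TriG block of 'subject predicate object .' lines."""
    body = "\n".join(f"    {s} {p} {o} ." for s, p, o in triples)
    return f"{name} {{\n{body}\n}}"


def build_nanopub(assertion, provenance, pubinfo):
    """Assemble the four named graphs of one nanopublication into a TriG string."""
    # The head graph declares the nanopublication and links its three parts.
    head = [
        ("sub:np", "a", "np:Nanopublication"),
        ("sub:np", "np:hasAssertion", "sub:assertion"),
        ("sub:np", "np:hasProvenance", "sub:provenance"),
        ("sub:np", "np:hasPublicationInfo", "sub:pubinfo"),
    ]
    prefix_lines = [f"@prefix {k}: <{v}> ." for k, v in PREFIXES.items()]
    blocks = [
        graph_block("sub:Head", head),
        graph_block("sub:assertion", assertion),
        graph_block("sub:provenance", provenance),
        graph_block("sub:pubinfo", pubinfo),
    ]
    return "\n".join(prefix_lines) + "\n\n" + "\n\n".join(blocks) + "\n"


# Hypothetical assertion: an experiment in a dissertation uses a test collection and a model.
trig = build_nanopub(
    assertion=[
        ("ex:experiment1", "ex:usesTestCollection", "ex:collectionA"),
        ("ex:experiment1", "ex:usesModel", "ex:modelB"),
    ],
    provenance=[("sub:assertion", "prov:wasDerivedFrom", "ex:dissertation1")],
    pubinfo=[("sub:np", "dct:creator", "ex:annotator1")],
)
print(trig)
```

Keeping the assertion in its own graph lets the provenance graph make statements about the assertion as a whole, here that it was derived from a particular dissertation.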
              The approach proposed in this research for improving nanopublications and extending their
            description models could serve as a reference for applying nanopublications in specific domains
            and for the semantic organization of Chinese dissertations. The proposed model excels in information
            retrieval by revealing the semantic characteristics of specific statements about experiment data,
            such as experiment parameters, experiment models and test collections. The model covers the core
            classes of information retrieval and formalizes their relationships, which provides a description
            model of the semantic data needed to automatically extract assertions and semantic relationships.
            Combined with term recognition, entity extraction, machine learning and data cleaning, the model
            proposed in this study supports assertion extraction and automatic annotation of Chinese dissertations,
            and also provides models and methods for the automatic construction of nanopublications. The
            model has limitations in describing the specific semantics of other domains when it is applied
            to creating nanopublications for Chinese dissertations with differing structural and semantic
            features.
              Scientific papers written in natural language are semantically complex. It is difficult to
            identify experiment tasks and procedures, and necessary experiment assessments are also
            required. Therefore, in the future, it is necessary to further establish a large-scale, high-quality and
            inter-linked scientific paper corpus based on an innovative description model of scientific content,
            to provide a data foundation for extracting and revealing assertions in scientific papers.
            A scientific paper is composed of knowledge units with semantic features and logical relationships.
            The future application of nanopublications to scientific papers should focus on formal description
            and semantic relationships at the fine granularity of knowledge units, with the aim of
            constructing multi-level, multi-granularity and multi-dimensional content datasets of scientific
            papers.