Page 180 - JOURNAL OF LIBRARY SCIENCE IN CHINA 2018 Vol. 43
            180   Journal of Library Science in China, Vol.9, 2017


            for fine-grained aggregate units of Internet resources, in order to reveal in depth and correlate the
            scattered and varied information snippets, so as to meet the complex information needs of users,
            improve the effectiveness of retrieval and support better knowledge services.
              First, this paper extracted three types of free Internet resources in the field of Library and
            Information Science: OA papers, online encyclopedias, and blogs. Then, a general framework for
            splitting these resources was developed manually from the perspectives of the logical structure
            and the formal structure of text. The logical structure analysis was divided into four levels:
            the chapter level, which is a whole document; the section level, based on the chapter titles given
            by authors; the sentence group level, which includes macro analysis and micro analysis; and the
            chart level. The components of the whole document were fragmented by macro analysis based on genre
            theory, and the information snippets revealing rhetorical intentions and semantic functions were
            then identified using micro analysis. The relationships between aggregate units of different levels
            were analyzed. Moreover, the characteristics and attributes of aggregate units were described and
            classified into 14 elements of access attributes, 3 elements of physical attributes and 2 elements
            of semantic attributes. A metadata schema was developed corresponding to these categories. Lastly,
            to examine the effectiveness of the metadata schema, Access 2013 was used to design and develop a
            database, and five search tasks covering the genre level, section level, sentence group level and
            chart level were set up.
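
              The level hierarchy and the three attribute groups described above can be illustrated with a
            minimal Python sketch. It is not the Access 2013 database used in the study: the class name
            AggregateUnit, the function search, and the sample texts are hypothetical, and the attribute
            dictionaries are left empty because the individual metadata elements are not listed here.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Dict, List, Optional


class Level(Enum):
    """The four logical-structure levels named in the text."""
    CHAPTER = "chapter"                # the whole document
    SECTION = "section"                # based on titles given by authors
    SENTENCE_GROUP = "sentence_group"  # found by macro and micro analysis
    CHART = "chart"


@dataclass(eq=False)
class AggregateUnit:
    """Hypothetical record for one fine-grained aggregate unit."""
    unit_id: str
    level: Level
    text: str
    # Attribute groups from the schema; element names are an assumption
    # left to the caller, since the source does not enumerate them.
    access: Dict[str, str] = field(default_factory=dict)
    physical: Dict[str, str] = field(default_factory=dict)
    semantic: Dict[str, str] = field(default_factory=dict)
    parent: Optional["AggregateUnit"] = field(default=None, repr=False)
    children: List["AggregateUnit"] = field(default_factory=list)

    def add_child(self, child: "AggregateUnit") -> "AggregateUnit":
        """Attach a lower-level unit and return it."""
        child.parent = self
        self.children.append(child)
        return child

    def context(self) -> List["AggregateUnit"]:
        """Return the chain from the whole document down to this unit."""
        chain = []
        node: Optional["AggregateUnit"] = self
        while node is not None:
            chain.append(node)
            node = node.parent
        return list(reversed(chain))


def search(root: AggregateUnit, level: Level, keyword: str) -> List[AggregateUnit]:
    """Return units at one level whose text contains the keyword."""
    hits = []
    stack = [root]
    while stack:
        node = stack.pop()
        if node.level is level and keyword.lower() in node.text.lower():
            hits.append(node)
        # Reverse so that children are visited in document order.
        stack.extend(reversed(node.children))
    return hits


# Hypothetical sample data, for illustration only.
doc = AggregateUnit("d1", Level.CHAPTER, "whole paper")
sec = doc.add_child(AggregateUnit("s1", Level.SECTION, "Methods"))
grp = sec.add_child(
    AggregateUnit("g1", Level.SENTENCE_GROUP, "Genre theory guides the macro analysis.")
)
for hit in search(doc, Level.SENTENCE_GROUP, "genre"):
    print(hit.unit_id, [u.unit_id for u in hit.context()])
```

              Keeping a parent link on every unit is one simple way to return a small snippet together
            with the section and document it came from.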
              The results indicate that logical structures, which reflect the author’s intention, show
            some similarities across different types of Internet resources when the resources share the
            same topic. It is feasible to apply the logical structures of journal papers to other Internet
            genres. The DC and LOM metadata frameworks can be reused in the metadata schema for fine-grained
            aggregate units of Internet resources, although some special characteristics still need to be
            revealed. More importantly, the search experiments indicate that the aggregated search database
            based on the metadata framework proposed in this paper is effective at revealing and correlating
            aggregate units scattered across various sources and of different granularities. Aggregated search
            can support information aggregation while maintaining the whole context of an entire piece of
            information. Therefore, users can judge the relevance of search results more quickly and find the
            required content more effectively.
              As a preliminary study of a metadata schema for fine-grained aggregation units, this research is a
            useful attempt to apply linguistic theories and methods to the organization of Internet resources, and
            also a significant step toward this emerging interdisciplinary research field.
              Future research will improve the framework of fine-grained aggregation units and the metadata
            schema by analyzing other emerging Internet genres. Furthermore, the vocabulary and syntactic
            features of aggregate units need to be analyzed in order to implement fine-grained aggregated
            search intelligently and to construct knowledge repositories automatically.