Page 181 - JOURNAL OF LIBRARY SCIENCE IN CHINA 2018 Vol. 43
               Extended English abstracts of articles published in the Chinese edition of Journal of Library Science in China 2017 Vol.43


               Co-word analysis: Limitations and solutions

               LI Gang & BA Zhichao*
               Co-word analysis is a content analysis technique based on the assumption that the subject of a
               paper can be summarized in a limited number of key terms. If two terms co-occur within one paper,
               the two research topics they represent are related, and a higher co-occurrence frequency
               indicates a stronger correlation between the pair of terms. However, co-word analysis still
               operates on words and is extremely sensitive to the selection of terms, and its quality depends
               on a variety of factors, such as the quality of terms and indexes, the extraction of
               high-frequency terms, and the adequacy of statistical methods. Therefore, it is necessary to
               examine the limitations of co-word analysis at each stage in order to improve and optimize it.
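                 The counting assumption described above can be illustrated with a minimal sketch. The toy
               corpus, the term lists, and the function name cooccurrence_counts are hypothetical and are not
               taken from the study; the sketch only shows how co-occurrence frequencies are obtained when
               each paper is reduced to a small set of key terms.

```python
from collections import Counter
from itertools import combinations

# Hypothetical toy corpus: each paper is represented by its key terms.
papers = [
    ["co-word analysis", "clustering", "bibliometrics"],
    ["co-word analysis", "clustering"],
    ["bibliometrics", "citation analysis"],
]


def cooccurrence_counts(papers):
    """Count how many papers contain each unordered pair of terms."""
    counts = Counter()
    for terms in papers:
        # Deduplicate and sort so that each pair is counted once per paper
        # and is always stored in the same order.
        for pair in combinations(sorted(set(terms)), 2):
            counts[pair] += 1
    return counts


pair_counts = cooccurrence_counts(papers)
# The pair with the highest count is read as the most strongly related one.
strongest_pair, strongest_count = pair_counts.most_common(1)[0]
```

               Because every pair is built from the terms attached to a paper, any change in how those terms
               are chosen changes the counts directly, which is the sensitivity noted in the text.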
                 The co-word analysis conducted in the present study involved six sequential steps: definition
               of the analysis problem, term source selection, high-frequency term extraction, relevance
               calculation between terms, multivariate statistical analysis, and visual presentation of results.
               This paper focuses on these six key steps and demonstrates the main problems at each by
               synthesizing the existing relevant research. The results lead to the following conclusions. 1)
               In term source selection, the biggest problem of early co-word analysis is its sole reliance on
               keywords and index words, which researchers call the “indexer effect”. Keywords are
               uncontrolled vocabulary, so problems of homonymy and synonymy arise. Meanwhile, terms are
               expressed differently in different parts of an analysis unit, and ignoring those differences
               introduces errors into co-word analysis. To address these problems, the semantic structure of
               the text and the fact that terms differ in quality as well as in quantity can be taken into
               account. 2) Researchers engaged in co-word analysis have never been
               free of the pattern of using high-frequency terms for multivariate statistical analysis.
               Extracting only high-frequency terms not only further marginalizes low-frequency terms but
               also isolates high-frequency terms that correlate weakly with the clusters. By taking the
               discipline and the multiple semantic types of terms into account to distinguish how well they
               represent subject areas, we can gain a comprehensive and in-depth understanding of the research
               characteristics of a field. 3) Two co-occurring terms may be correlated with each other directly
               or indirectly, but these semantic relationships between co-occurring terms are not considered
               at all, which may ultimately affect the soundness of the results of co-word analysis. We
               therefore summarize the existing methods for calculating semantic correlation and point out the
               limitations of each. 4) Finally, for multivariate statistical analysis, taking the co-word clustering and
               co-word association analysis methods as examples, we discuss the problems of applying them in
               the new data environment and put forward improvements and suggestions.
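                 The high-frequency term extraction and relevance calculation steps can be sketched as
               follows. This is an illustration under stated assumptions, not the procedure of the study: the
               abstract names neither a frequency threshold nor a relevance measure, so the cut-off
               min_count = 2 and the equivalence index E_ij = c_ij^2 / (c_i * c_j), a measure commonly used
               in co-word studies, are assumed here, and the corpus and function names are hypothetical.

```python
from collections import Counter
from itertools import combinations

# Hypothetical toy corpus: each paper is represented by its key terms.
papers = [
    ["library", "co-word", "clustering"],
    ["library", "co-word"],
    ["library", "indexing"],
    ["co-word", "clustering", "semantics"],
]


def high_frequency_terms(papers, min_count):
    """Return {term: document frequency} for terms in at least min_count papers."""
    freq = Counter(term for terms in papers for term in set(terms))
    return {term: count for term, count in freq.items() if count >= min_count}


def equivalence_index(papers, term_freq):
    """Compute E_ij = c_ij**2 / (c_i * c_j) for pairs of retained terms.

    The equivalence index is an assumed choice of relevance measure.
    """
    pair_counts = Counter()
    for terms in papers:
        # Only retained (high-frequency) terms take part in the pairing.
        kept = sorted(set(terms) & set(term_freq))
        for pair in combinations(kept, 2):
            pair_counts[pair] += 1
    return {
        (a, b): count * count / (term_freq[a] * term_freq[b])
        for (a, b), count in pair_counts.items()
    }


# Assumed threshold: keep terms that occur in at least two papers.
frequent = high_frequency_terms(papers, min_count=2)
relevance = equivalence_index(papers, frequent)
```

               In this toy corpus the terms "indexing" and "semantics" fall below the threshold and vanish
               from every later step, a small-scale picture of the marginalization of low-frequency terms
               described in conclusion 2).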
                 Co-word analysis has been most commonly utilized in mapping or tracing patterns and trends in


               * Correspondence should be addressed to BA Zhichao, Email: bazhichaoty@126.com, ORCID: 0000-0001-5626-5604