Page 111 - Journal of Library Science in China, Vol.45, 2019
110 Journal of Library Science in China, Vol.11, 2019
2.2 Comparative analysis of user interest preferences
The descriptive statistics and correlation analysis above show that user platform
preferences differ considerably across disciplines and journals. What, then, are the similarities
and differences in their user interest preferences? In the following part, user interest preferences
are explored at the indicator level and the content level.
(1) User interest preferences comparison (indicator level)
In previous research, the top 10 (Zhao, 2017), top 20 (X.W. Wang et al., 2013) or top 20% (Chen,
2018) most downloaded articles in a journal or a discipline in a given month or year have been
selected as samples. Likewise, the top 100, top 1000, top 5%, top 10% or top 20% most frequent
entities, such as authors, institutions and keywords, are commonly selected as samples in
scientometrics research. In this study, following the Pareto principle (the 80/20 rule), the top 20%
of papers ranked by CNKI downloads and by website downloads in each journal are selected as
samples, and user interests are compared using the Jaccard similarity coefficient. The Jaccard
similarity coefficient measures the similarity between finite sample sets and is defined as the size
of their intersection divided by the size of their union.
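The sampling and comparison step described above can be sketched as follows. This is a minimal illustration, not the authors' actual procedure: the function names `top_share` and `jaccard`, the paper IDs and the download counts are all hypothetical, and the way the 20% cut-off is rounded is an assumption.

```python
def top_share(downloads, share=0.2):
    """Return the IDs of the top `share` of papers ranked by download count.

    `downloads` maps a paper ID to its download count. The cut-off is
    truncated to a whole number of papers (at least one); this rounding
    rule is an assumption, since the source does not specify one.
    """
    k = max(1, int(len(downloads) * share))
    ranked = sorted(downloads, key=downloads.get, reverse=True)
    return set(ranked[:k])


def jaccard(a, b):
    """Jaccard similarity: size of the intersection over size of the union."""
    union = a | b
    if not union:
        return 0.0
    return len(a & b) / len(union)


# Hypothetical download counts for the same ten papers on two platforms.
cnki = {"p1": 900, "p2": 850, "p3": 400, "p4": 300, "p5": 250,
        "p6": 200, "p7": 150, "p8": 100, "p9": 50, "p10": 10}
website = {"p1": 120, "p2": 30, "p3": 110, "p4": 25, "p5": 20,
           "p6": 18, "p7": 15, "p8": 12, "p9": 8, "p10": 5}

top_cnki = top_share(cnki)        # two papers: p1 and p2
top_website = top_share(website)  # two papers: p1 and p3
similarity = jaccard(top_cnki, top_website)
print(round(similarity, 3))
```

In this toy case the two top-20% sets share one paper out of three distinct papers, so the coefficient is 1/3.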
In Table 5, 15 of the 61 journals have a Jaccard similarity of 0.3 or higher, which means that
roughly half or more (at least about 46%, assuming the two sets are the same size) of the top 20%
papers ranked by CNKI downloads also appear among the top 20% papers ranked by website
downloads in each journal. Another 9 of the 61 journals have a Jaccard similarity of at least 0.25
but less than 0.3, which corresponds to an overlap of 40% to about 46%. The remaining journals
have a Jaccard similarity below 0.25, which corresponds to an overlap of less than 40%. In
summary, user interest preferences differ considerably between journal official websites and the
CNKI platform.
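The link between a Jaccard value and the share of overlapping papers can be made explicit with a short derivation. It assumes, as an illustration rather than something stated in the source, that both top-20% sets of a journal contain the same number of papers n and share k of them; the symbols n, k and p are introduced here only for this purpose.

```latex
% Assumption: |A| = |B| = n, |A \cap B| = k, overlap share p = k/n.
\[
J(A,B) = \frac{|A \cap B|}{|A \cup B|} = \frac{k}{2n - k} = \frac{p}{2 - p},
\qquad
p = \frac{2J}{1 + J}.
\]
% Worked values:
\[
J = 0.25 \;\Rightarrow\; p = 0.40, \qquad
J = 0.30 \;\Rightarrow\; p \approx 0.46, \qquad
J = \tfrac{1}{3} \;\Rightarrow\; p = 0.50 .
\]
```

Under this assumption the mapping is monotonic, so a higher Jaccard threshold always corresponds to a larger shared fraction of top papers.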
(2) User interest preferences comparison (content level)
In this part, given space constraints, only the top 20% of papers ranked by CNKI downloads and
by website downloads in “Library, Information and Archival Science” are selected to analyze user
interest preferences at the content level by co-word analysis. First, the keywords of these papers
are preprocessed (for example, by removing stop words and merging synonyms). Then, the samples
are imported into VOSviewer for co-word analysis; the results are shown in Figures 6 and 7. In
both figures, each node represents a keyword, node size indicates the node's total link strength,
links between nodes are normalized by association strength, and clusters are shown in different
colors. To reveal more detail, an isolated node is removed from Figure 6 (89 nodes are kept).
There are no isolated nodes in Figure 7 (124 nodes in total). In addition, all display parameters,
including node size, font size and link thickness, are configured identically for the two figures.
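The quantities behind such a co-word map (co-occurrence counts, total link strength and association-strength normalization) can be sketched in a few lines. This is an illustrative sketch only: the study itself uses VOSviewer, whose internal computation may differ (for example by a constant factor), and the function names and the three keyword lists below are hypothetical. Association strength is taken here in its commonly described form, the observed co-occurrence count divided by the product of the two keywords' total link strengths.

```python
from collections import Counter
from itertools import combinations


def cooccurrence(papers):
    """Count how often each unordered keyword pair appears in the same paper."""
    pair_counts = Counter()
    for keywords in papers:
        # Deduplicate and sort so each pair has one canonical key.
        for a, b in combinations(sorted(set(keywords)), 2):
            pair_counts[(a, b)] += 1
    return pair_counts


def total_link_strength(pair_counts):
    """Sum, for each keyword, the co-occurrence counts of all its links."""
    strength = Counter()
    for (a, b), count in pair_counts.items():
        strength[a] += count
        strength[b] += count
    return strength


def association_strength(pair_counts, strength):
    """Normalize each link: count / (strength of a * strength of b)."""
    return {
        (a, b): count / (strength[a] * strength[b])
        for (a, b), count in pair_counts.items()
    }


# Hypothetical keyword lists after stop-word removal and synonym merging.
papers = [
    ["library service", "big data"],
    ["big data", "knowledge service"],
    ["library service", "big data", "knowledge service"],
]

counts = cooccurrence(papers)
strength = total_link_strength(counts)
normalized = association_strength(counts, strength)

for pair, value in sorted(normalized.items()):
    print(pair, round(value, 3))
```

A keyword that never co-occurs with another keyword gets no entry in `strength`, which corresponds to the isolated nodes mentioned above.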
Figure 6 has fewer nodes and sparser links than Figure 7. However, there are the same number
of clusters in Figure 6 (internet public opinion, library service, scientometrics, science mapping,
knowledge management and knowledge service) and Figure 7 (library, big data, science mapping,