Page 137 - JOURNAL OF LIBRARY SCIENCE IN CHINA 2018 Vol. 43
ZHAO Xing / Exploring the measurement features of usage data for academic literature 137
probability of user citation behavior after usage is μ, and the ratio of citation data inclusion by WoS
is θ. Then, the relationship between the citation value, C, and the total usage count of academic
literature, N, can be expressed as C = μ·θ·N. At a given moment of measurement, θ is a constant,
and μ is a user behavior parameter that is approximately stable in a large sample.
Combined with the conclusion of Proposition 1 (that is, U = δ·N), we have U = δ·C/(μ·θ) = ρ·C,
where ρ = δ/(μ·θ) is approximately a constant. The demonstration is hereby completed.
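The substitution behind this demonstration can be written out step by step. The following is a sketch that assumes Proposition 1 takes the form U = δ·N, with δ the download export factor of WoS; the closed form for ρ is derived here from that assumption and is not quoted from the text.

```latex
\begin{align}
C &= \mu\,\theta\,N
  \quad\Longrightarrow\quad
  N = \frac{C}{\mu\,\theta}, \\
U &= \delta\,N
   = \delta\,\frac{C}{\mu\,\theta}
   = \rho\,C,
  \qquad \rho = \frac{\delta}{\mu\,\theta}.
\end{align}
```

Under this reading, ρ is approximately constant only to the extent that δ, μ and θ are each stable across the sample being measured.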
Proposition 2 suggests that, in a large-sample analysis, the usage value of academic literature on the
WoS platform is positively correlated with its citation value: the higher the usage count, the higher
the citation count, which gives Deduction 2-1. However, Proposition 2 also implies that the
relationship between the usage and citation counts is jointly influenced by the probability of user
citation behavior, μ, the ratio of citation data inclusion by WoS, θ, and the download export factor
of WoS, δ. For a single piece of literature, there is no necessary correspondence between the usage
and citation counts. Across different disciplines and fields, and even across research topics within
the same discipline or field, the probability of user citation behavior, μ, and the download export
factor of WoS, δ, may differ significantly, so there are quite a few singularities (Deduction 2-2).
Deduction 2-1: The usage count of academic literature, the citation probability and the volume
of the database platform jointly exert a positive influence on the citation count.
Deduction 2-2: For single samples, the trend relationship between usage data and citation data
shows quite a few singularities.
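The two deductions can be illustrated with a small simulation. The following is a minimal sketch under stated assumptions, not the author's method: it uses the toy model U = δ·N and draws citations as Binomial(N, μ·θ), with μ varying from paper to paper while δ and θ are held fixed. The function names, parameter ranges and the median-based definition of a "singularity" are hypothetical choices made for illustration only.

```python
import random


def simulate(n_papers=2000, delta=0.5, theta=0.8, seed=0):
    """Toy model: usage U = delta * N, citations C ~ Binomial(N, mu * theta)."""
    rng = random.Random(seed)
    usage, citations = [], []
    for _ in range(n_papers):
        n_total = rng.randint(10, 500)   # total usage count N (hypothetical range)
        mu = rng.uniform(0.01, 0.10)     # citation probability, varies by paper/topic
        # Each use leads to a WoS-indexed citation with probability mu * theta.
        c = sum(rng.random() < mu * theta for _ in range(n_total))
        usage.append(delta * n_total)
        citations.append(c)
    return usage, citations


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def count_singularities(usage, citations):
    """Count papers with above-median usage but below-median citations."""
    med_u = sorted(usage)[len(usage) // 2]          # upper median of usage
    med_c = sorted(citations)[len(citations) // 2]  # upper median of citations
    return sum(u > med_u and c < med_c for u, c in zip(usage, citations))


if __name__ == "__main__":
    usage, citations = simulate()
    print("Pearson r:", round(pearson(usage, citations), 3))
    print("high-usage, low-citation papers:", count_singularities(usage, citations))
```

In this sketch the aggregate correlation is positive because N drives both quantities, while the spread in μ is what produces individual papers whose usage and citation counts disagree.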
It should be emphasized that, in bibliometrics, almost all theoretical models are approximate laws
that do not necessarily hold for all actual data. Theoretical analysis serves only as a guiding
reference or hypothesis; empirical study provides the core paradigm of bibliometrics. Thus, in the
following sections, empirical discussions are organized from the basic perspectives of measurement
features, bibliometric distribution, comparison with citations, disciplinary cases, etc.
2 Data source and processing
This paper selects the Web of Science platform as its data source and adopts physics, computer
science, economics, and Library and Information Science, respectively, as the representatives of
basic natural sciences, featured engineering sciences, basic liberal arts and featured liberal arts. The
research articles of the four disciplines were taken as the objects of analysis. The data were
collected as follows: retrieve records from the Science Citation Index Expanded (SCI-EXPANDED)
and Social Sciences Citation Index (SSCI) citation index databases under WoS, and filter by
“discipline”, “publication time”, and “literature type” via the “refine” function provided by
the WoS platform; set the publication year to “2013” and the literature type to “Article”; after
data retrieval, download the bibliographic data in full-record format (download time of all data: