Page 142 - JOURNAL OF LIBRARY SCIENCE IN CHINA 2018 Vol. 42
OU Shiyan, TANG Zhengui & SU Feifei / Construction and usage of terminology services for information retrieval 141
system first and the other half evaluated the Baidu Search Engine first. Because the two evaluation orders are evenly distributed, any unfairness that the order of evaluation introduces for individual users is balanced out. The total score of each usage mode is the weighted sum of all the evaluation criteria.
Total score = (B11 × 25% + B12 × 25% + B13 × 50%) × 26.3% + (B21 × 33.3% + B22 × 66.7%) × 14.1% + B3 × 45.5% + B4 × 14.1%
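The formula can be sketched in Python as follows. This is a minimal illustration rather than the authors' code. The function and constant names are hypothetical. The mapping of codes to criteria is an assumption inferred from the row order of Tables 8 and 9: B11 to B13 are taken as the three effectiveness criteria, B21 and B22 as operation time and clicking times, B3 as user satisfaction, and B4 as learnability.

```python
# Weights copied from the total-score formula in the text.
EFFECTIVENESS_WEIGHTS = (0.25, 0.25, 0.50)         # B11, B12, B13
EFFICIENCY_WEIGHTS = (0.333, 0.667)                # B21, B22
TOP_LEVEL_WEIGHTS = (0.263, 0.141, 0.455, 0.141)   # B1, B2, B3, B4


def weighted_sum(scores, weights):
    """Return the sum of score * weight over paired scores and weights."""
    if len(scores) != len(weights):
        raise ValueError("scores and weights must have the same length")
    return sum(s * w for s, w in zip(scores, weights))


def total_score(b11, b12, b13, b21, b22, b3, b4):
    """Compute the total score of one usage mode (hypothetical helper)."""
    effectiveness = weighted_sum((b11, b12, b13), EFFECTIVENESS_WEIGHTS)
    efficiency = weighted_sum((b21, b22), EFFICIENCY_WEIGHTS)
    return weighted_sum((effectiveness, efficiency, b3, b4), TOP_LEVEL_WEIGHTS)


# Mode 1 of the OPAC system (Table 8); B12 is read as 4.13 (assumption).
mode1_opac = total_score(4.13, 4.13, 3.71, 3.88, 4.63, 4.04, 4.21)
print(round(mode1_opac, 2))  # prints 4.08
```

With these inputs the sketch reproduces the weighted sum of 4.08 reported for Mode 1 in Table 8, which suggests the assumed mapping is consistent with the published figures.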
Table 8 shows the evaluation results of the library OPAC system and Table 9 shows the results of
Baidu Search Engine.
Table 8. Evaluation results of usage modes of terminology services of OPAC

| Evaluation criterion | Mode 1 | Mode 2 | Mode 3 | Mode 4 |
|---|---|---|---|---|
| Effectiveness | 3.92 | 3.02 | 3.92 | 3.1 |
| Effectiveness: Representation degree of retrieval requirement | 4.13 | 3.29 | 4.25 | 3.75 |
| Effectiveness: Improvement degree of retrieval results | 4.13 | 3.04 | 3.83 | 3.50 |
| Effectiveness: Freedom of term selection | 3.71 | 2.88 | 3.79 | 2.58 |
| Efficiency | 4.38 | 4.13 | 4.13 | 5 |
| Efficiency: Operation time | 3.88 | 3.46 | 3.54 | 5 |
| Efficiency: Clicking times | 4.63 | 4.46 | 4.42 | 5 |
| User satisfaction | 4.04 | 2.92 | 3.88 | 3.42 |
| Learnability | 4.21 | 3.96 | 4.04 | 4.33 |
| Weighted sum | 4.08 | 3.30 | 3.95 | 3.69 |
Note: The score of each criterion is the average of the 24 human subjects’ scores. The total score of each mode is the weighted sum of all the criteria’s scores.
Table 9. Evaluation results of usage modes of terminology services of Baidu engine

| Evaluation criterion | Mode 1 | Mode 2 | Mode 3 | Mode 4 |
|---|---|---|---|---|
| Effectiveness | 4.04 | 3.15 | 3.85 | 3.03 |
| Effectiveness: Representation degree of retrieval requirement | 4.22 | 3.38 | 4.22 | 3.38 |
| Effectiveness: Improvement degree of retrieval results | 4.13 | 3.38 | 4.00 | 3.58 |
| Effectiveness: Freedom of term selection | 3.91 | 2.92 | 3.58 | 2.58 |
| Efficiency | 4.4 | 4.17 | 4.28 | 5 |
| Efficiency: Operation time | 3.91 | 4.00 | 3.75 | 5 |
| Efficiency: Clicking times | 4.65 | 4.25 | 4.54 | 5 |
| User satisfaction | 3.78 | 3.25 | 3.83 | 3.33 |
| Learnability | 4.09 | 3.92 | 4.00 | 4.08 |
| Weighted sum | 3.98 | 3.45 | 3.92 | 3.59 |
Note: The score of each criterion is the average of the 24 human subjects’ scores. The total score of each mode is the weighted sum of all the criteria’s scores.