Page 132 - JOURNAL OF LIBRARY SCIENCE IN CHINA 2018 Vol. 42

OU Shiyan, TANG Zhengui & SU Feifei / Construction and usage of terminology services for information retrieval  131


               3  Usefulness of terminology services for information retrieval


               In theory, terminology services are clearly useful for information retrieval and can improve the
               performance of an information retrieval system. For example, they can improve the recall of retrieval
               results by expanding query terms with synonyms. However, the extent to which terminology services
               improve retrieval results has seldom been tested quantitatively. In this study, we quantitatively
               measured the impact of the “getSynonyms” service on information retrieval results, using a library
               OPAC system and the Baidu search engine as test targets and the Chinese Thesaurus as the source
               vocabulary of the terminology services. We compared the retrieval results with and without the use of
               terminology services to demonstrate their usefulness for information retrieval.
               We selected the “getSynonyms” service for testing because expanding query terms with synonyms is a
               very general technique in information retrieval, whereas the use of other terminology
               services depends more on users’ own choices and thus varies more between individuals.


               3.1  Experimental setup


               We selected 30 terms from the Chinese Thesaurus for the information retrieval experiments. Some terms
               are descriptors, and some are non-descriptors. Each term has one or more synonyms. We carried
               out two rounds of retrieval experiments. In the first round, we ran each term as a query
               in each of the two information retrieval systems. In the second round, we
               obtained the synonyms of each term through the “getSynonyms” terminology service, combined
               each term with its synonyms using the logical operator “OR” to form a new query, and then repeated
               the retrieval experiments with these new queries. The retrieval results were measured by precision,
               recall and F value. However, the calculation of precision and recall differs slightly between
               in-house information retrieval systems and open Web search engines.
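
               The second-round query construction and the three measures can be illustrated with a minimal
               sketch. It is not the implementation used in this study. The dictionary lookup stands in for the
               “getSynonyms” service, whose interface is not described here; the function names, the sample
               terms and the document identifiers are hypothetical; and the F value is assumed to be the
               balanced F1 measure.

```python
from typing import Dict, Iterable, List, Tuple

# Hypothetical stand-in for the "getSynonyms" terminology service.
# The real service interface is not given in the text; a dict lookup is assumed.
SYNONYMS: Dict[str, List[str]] = {
    "computer": ["electronic computer", "computing machine"],
}


def get_synonyms(term: str) -> List[str]:
    """Return the synonyms of a term, or an empty list if none are known."""
    return SYNONYMS.get(term, [])


def expand_query(term: str) -> str:
    """Combine a term with its synonyms using OR, quoting each phrase."""
    phrases = [term] + [s for s in get_synonyms(term) if s != term]
    return " OR ".join(f'"{p}"' for p in phrases)


def evaluate(retrieved: Iterable[str],
             relevant: Iterable[str]) -> Tuple[float, float, float]:
    """Return (precision, recall, F) for one query.

    F is assumed here to be the balanced F1 = 2PR / (P + R).
    """
    retrieved_set, relevant_set = set(retrieved), set(relevant)
    hits = len(retrieved_set & relevant_set)
    precision = hits / len(retrieved_set) if retrieved_set else 0.0
    recall = hits / len(relevant_set) if relevant_set else 0.0
    total = precision + recall
    f_value = 2 * precision * recall / total if total else 0.0
    return precision, recall, f_value


if __name__ == "__main__":
    print(expand_query("computer"))
    # Made-up document identifiers, for illustration only.
    print(evaluate(["d1", "d2", "d3", "d4"], ["d1", "d2", "d5"]))
```

               In such a sketch, the first round would submit the bare term and the second round the output of
               expand_query, with evaluate applied to both result lists so that the two rounds can be compared.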


               3.2  Experimental results


               For the library OPAC system and Baidu Search Engine, the results of the retrieval experiments and
               analyses are reported in detail as follows.
                 (1) The experimental results of the library OPAC system
                 For an OPAC system, the number of retrieved documents and the number of relevant documents
               among them are both finite, and thus precision can be calculated accurately. However, the
               total number of relevant documents in the document collection cannot be determined directly, and
               thus recall cannot be calculated exactly. Our solution is to obtain the corresponding
               Chinese Library Classification number for each query term and then browse the
               relevant documents in the OPAC system by that classification number to obtain the total