Page 126 - JOURNAL OF LIBRARY SCIENCE IN CHINA 2018 Vol. 42
OU Shiyan, TANG Zhengui & SU Feifei / Construction and usage of terminology services for information retrieval 125
In this system, the function of the terminology registry is to provide an authoritative, continuously
updated source of various vocabularies. It contains three components: vocabulary metadata
registration, vocabulary document uploading and vocabulary validation, and does not provide
vocabulary editing and management components such as term online editing and version tracking.
Vocabulary metadata registration submits vocabulary metadata to the system so that users can search
for vocabularies and find information about them. Vocabulary document uploading places vocabulary
documents in SKOS/RDF format in a centrally controlled repository under unified management.
To ensure the correctness and validity of uploaded vocabularies, each vocabulary
must be validated, and only those that pass validation are
submitted to the RDF triple store. Vocabulary validation is divided into three levels: 1) RDF syntax
validation: to validate whether a vocabulary conforms to the syntax of the RDF/XML serialization
format; 2) SKOS label validation: to validate whether the labels used in a vocabulary are the terms
defined in the SKOS language; 3) SKOS integrity validation: to validate whether data is consistent
with the SKOS model (including the SKOS extended model). In addition, so that the services
always draw on up-to-date vocabularies, the system allows a new version of an uploaded vocabulary
to replace the old version.
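The three validation levels described above can be sketched as follows. This is a minimal illustration using only the Python standard library, not the system's actual validator. It rests on several assumptions: level 1 checks only XML well-formedness and the rdf:RDF root rather than the full RDF/XML grammar, level 2 checks element names in the SKOS namespace against the SKOS vocabulary, and level 3 checks a single SKOS integrity condition (at most one skos:prefLabel per language tag for a concept written as a skos:Concept element). The function name validate_vocabulary and the error message format are hypothetical.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
SKOS = "http://www.w3.org/2004/02/skos/core#"
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# Classes and properties defined in the SKOS namespace.
SKOS_TERMS = {
    "Concept", "ConceptScheme", "Collection", "OrderedCollection",
    "inScheme", "hasTopConcept", "topConceptOf",
    "prefLabel", "altLabel", "hiddenLabel", "notation",
    "note", "changeNote", "definition", "editorialNote",
    "example", "historyNote", "scopeNote",
    "semanticRelation", "broader", "narrower", "related",
    "broaderTransitive", "narrowerTransitive",
    "member", "memberList",
    "mappingRelation", "broadMatch", "narrowMatch",
    "relatedMatch", "exactMatch", "closeMatch",
}


def validate_vocabulary(rdf_xml: str) -> list:
    """Return a list of error strings; an empty list means all checks passed."""
    # Level 1 (simplified): the document must parse as XML with an rdf:RDF root.
    try:
        root = ET.fromstring(rdf_xml)
    except ET.ParseError as exc:
        return [f"syntax: {exc}"]
    if root.tag != f"{{{RDF}}}RDF":
        return ["syntax: root element is not rdf:RDF"]

    errors = []
    skos_prefix = f"{{{SKOS}}}"

    # Level 2: every element in the SKOS namespace must be a defined SKOS term.
    for elem in root.iter():
        if elem.tag.startswith(skos_prefix):
            local = elem.tag[len(skos_prefix):]
            if local not in SKOS_TERMS:
                errors.append(f"label: skos:{local} is not defined in SKOS")

    # Level 3 (one condition only): at most one prefLabel per language tag.
    for concept in root.iter(f"{skos_prefix}Concept"):
        uri = concept.get(f"{{{RDF}}}about", "(blank node)")
        seen_langs = set()
        for label in concept.findall(f"{skos_prefix}prefLabel"):
            lang = label.get(XML_LANG, "")
            if lang in seen_langs:
                errors.append(
                    f"integrity: {uri} has more than one prefLabel for '{lang}'"
                )
            seen_langs.add(lang)
    return errors
```

Under this sketch, a vocabulary would be submitted to the triple store only when validate_vocabulary returns an empty list. A production validator would need a real RDF parser and the remaining SKOS integrity conditions.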
Terminology services are, in essence, Web services. Currently there are two architectural styles
for Web services: SOAP-style Web services and REST-style Web services, each with its own
advantages and disadvantages. The SOAP architecture is a more mature Web service
architecture with a complete set of protocols; it is appropriate for transaction-oriented applications
and has better vendor support. However, it is complicated to implement, which slows
development and lowers efficiency. The REST architecture is a light-weight architectural
style that is simpler to implement and operate than SOAP. It is well suited to resource-oriented
applications, and thus appropriate for scenarios that have high requirements for efficiency
but low requirements for security. In this study, we selected the REST architecture to construct
terminology services. On the one hand, terminology services are essentially resource-providing
(they mainly offer query functions), which is consistent with REST. On the other hand, in
REST-based Web services each URI corresponds to a resource, which is consistent with the basic
idea of RDF.
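The correspondence between a URI and a resource can be illustrated with a small sketch. It is not the paper's service interface: the URI pattern /vocabularies/{vocabulary}/concepts/{id}, the function name handle_get, the JSON response shape, and the in-memory CONCEPTS dictionary (a stand-in for the RDF triple store, filled with placeholder data) are all assumptions made for illustration.

```python
import json
from urllib.parse import unquote, urlsplit

# Hypothetical placeholder data standing in for the RDF triple store.
CONCEPTS = {
    ("demo", "c0"): {"prefLabel": "information science", "broader": []},
    ("demo", "c1"): {"prefLabel": "information retrieval", "broader": ["c0"]},
}


def handle_get(uri: str):
    """Resolve a GET request URI to a (status code, JSON body) pair.

    Assumed URI pattern: /vocabularies/{vocabulary}/concepts/{id}
    """
    # Split the path into its non-empty, percent-decoded segments.
    parts = [unquote(p) for p in urlsplit(uri).path.split("/") if p]
    if len(parts) == 4 and parts[0] == "vocabularies" and parts[2] == "concepts":
        concept = CONCEPTS.get((parts[1], parts[3]))
        if concept is not None:
            return 200, json.dumps(concept, ensure_ascii=False)
    return 404, json.dumps({"error": "resource not found"})


status, body = handle_get("/vocabularies/demo/concepts/c1")
```

Because each concept is addressed by its own URI and read with a plain GET, a query-oriented service of this kind needs no operation envelope, which is the simplicity argument made above for REST.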
This paper focuses on the storage and retrieval of vocabularies and on
the construction of terminology services. Details of the semantic
representation of traditional thesauri and their automatic validation are reported in a previous
study (Ou, 2015).
2.1 Term storage and retrieval
The uploaded SKOS/RDF vocabulary data are stored in an RDF triple store. There are three types