Page 160 - Journal of Library Science in China 2020 Vol.46
Extended English abstracts of articles published in the Chinese edition of Journal of Library Science in China, Vol.46, 2020
The development of the thesaurus is foundational work for the information resource management
of Dunhuang murals. It not only provides a controlled and structured vocabulary
for the standardized organization of Dunhuang mural information resources, but also
provides a clear knowledge system for Dunhuang mural research. To speed up the
construction of the thesaurus and meet the need for dynamic updating of knowledge organization,
thesaurus development should adopt a machine-assisted, semi-automatic cooperative
strategy that gives full play to the respective strengths of machines and experts and
achieves rapid development and maintenance of the thesaurus through human-computer
cooperation.
This paper presents the method and procedure for constructing the Dunhuang mural thesaurus
and publishing it as linked data. A combination of top-down and bottom-up approaches was used to
develop the thesaurus. The top-down approach starts at the higher-level categories, and establishes
a general framework for the thesaurus based on the objectives of the thesaurus. The basic structure
of the thesaurus was developed by referring to the fundamental dictionary of Dunhuang studies
and the Art & Architecture Thesaurus (AAT). The thesaurus contains five facets: agents,
objects, activities, time, and physical attributes. Sub-categories were defined for each facet, for a
total of 25 second-level categories. The bottom-up approach builds up lower-level categories from
concepts extracted from the corpus with the help of natural language processing technology,
followed by manual grouping and classification of the terms, definition of the
semantic relations, and adjustment and optimization of the structure. At present, the Dunhuang mural
thesaurus contains 4276 terms. This paper then studied and implemented linked data
publishing of the Dunhuang mural thesaurus. The thesaurus was semantically represented and
described with the SKOS model, and the terms were mapped to AAT concepts using SPARQL
queries. In total, about 21.2% of the terms were linked to AAT. In addition, a linked data service
platform for the Dunhuang mural thesaurus was built, providing concept browsing, term retrieval,
visualization, open data services, etc.
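The SKOS representation and the AAT mapping step described above can be illustrated with a minimal sketch. It is not the authors' implementation: the namespace, the term identifiers, the example label, and the exact-label matching rule are all assumptions made for illustration, since the abstract does not state how candidate AAT concepts were selected. The query assumes that AAT concepts can be found through skos:inScheme and skos:prefLabel; it is only built as a string here and is not sent to any endpoint.

```python
from dataclasses import dataclass, field
from typing import List, Optional

# Placeholder namespace (assumption); the real thesaurus URIs are not given in the abstract.
BASE = "http://example.org/dunhuang-thesaurus/"
PREFIXES = "@prefix skos: <http://www.w3.org/2004/02/skos/core#> .\n"


@dataclass
class Term:
    """One thesaurus concept with the minimum needed for a SKOS description."""
    term_id: str
    label_en: str
    broader: Optional[str] = None        # term_id of the parent concept, if any
    scope_note: Optional[str] = None
    exact_matches: List[str] = field(default_factory=list)  # e.g. AAT concept URIs


def _escape(text: str) -> str:
    """Escape backslashes and double quotes for use inside a quoted literal."""
    return text.replace("\\", "\\\\").replace('"', '\\"')


def to_skos_turtle(term: Term) -> str:
    """Serialize a single term as a skos:Concept in Turtle (without prefix lines)."""
    lines = [
        f"<{BASE}{term.term_id}> a skos:Concept ;",
        f"    skos:inScheme <{BASE}scheme> ;",
        f'    skos:prefLabel "{_escape(term.label_en)}"@en ;',
    ]
    if term.broader:
        lines.append(f"    skos:broader <{BASE}{term.broader}> ;")
    if term.scope_note:
        lines.append(f'    skos:scopeNote "{_escape(term.scope_note)}"@en ;')
    for uri in term.exact_matches:
        lines.append(f"    skos:exactMatch <{uri}> ;")
    # Replace the trailing " ;" of the last statement with the closing " ."
    lines[-1] = lines[-1][:-2] + " ."
    return "\n".join(lines)


def aat_lookup_query(label_en: str) -> str:
    """Build a SPARQL query for AAT concepts whose preferred label equals the term.

    Case-insensitive exact matching is an assumed strategy, not the paper's.
    """
    wanted = _escape(label_en).lower()
    return (
        "PREFIX skos: <http://www.w3.org/2004/02/skos/core#>\n"
        "SELECT ?concept WHERE {\n"
        "  ?concept skos:inScheme <http://vocab.getty.edu/aat/> ;\n"
        "           skos:prefLabel ?label .\n"
        f'  FILTER (LCASE(STR(?label)) = "{wanted}")\n'
        "} LIMIT 10"
    )


if __name__ == "__main__":
    # Hypothetical term; identifiers and label are illustrative only.
    example = Term("t1", "example term", broader="t0")
    print(PREFIXES + to_skos_turtle(example))
    print(aat_lookup_query(example.label_en))
```

In a workflow of this kind, any concept URI returned by the query would be reviewed by an expert and, if accepted, added to the term's exact_matches so that it is published as a skos:exactMatch link.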
The originality of this research lies in compiling a controlled vocabulary for the
Dunhuang murals with standardized, specific semantic relations and authoritative, multi-source scope
notes. The thesaurus comprehensively covers mural protection and restoration, archaeology,
iconography, human geography and other research perspectives. The construction of the Dunhuang
mural thesaurus lays an important foundation for the semantic annotation, information retrieval and
knowledge organization of Dunhuang mural digital resources. It also serves as an important
example for the development of smart data resources in the field of cultural heritage in China.
Future work includes further studies to improve the thesaurus's structure, categories, and terms.
A systematic evaluation of the thesaurus also remains to be carried out.