Page 246 - Journal of Library Science in China, Vol.47, 2021
                           Extended English abstracts of articles published in the Chinese edition of Journal of Library Science in China, Vol.47, 2021


               version, and the critique of the corresponding book. In order to write a good Tiyao, even the most
               eminent scholars spent a lot of time and effort in collecting, collating, reviewing, and annotating
               large-scale book collections. For large-scale rare-book collections, however, even when
               substantial human effort is devoted to writing, editing, and recommending Tiyaos, the task
               remains time-consuming and omissions are inevitable. In this paper, we propose a Tiyao-centric
               network model, which integrates rare books, historical figures, and Tiyaos into one tripartite graph.
               This network model is not limited by the language or scale of the texts, and can be further applied to
               large-scale catalogues of rare-book collections.
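
A minimal sketch of such a tripartite structure is given below. It assumes, for illustration only, that every edge connects a Tiyao to either a book or a figure; the class name, method names, and node identifiers (`tiyao_1`, `book_A`, `person_X`) are hypothetical placeholders and are not taken from the paper.

```python
from collections import defaultdict


class TripartiteGraph:
    """Tiyao-centric graph: every edge touches exactly one Tiyao node (assumption)."""

    def __init__(self):
        self.node_type = {}                # node id -> "tiyao" | "book" | "figure"
        self.neighbors = defaultdict(set)  # node id -> adjacent node ids

    def add_node(self, node, kind):
        if kind not in ("tiyao", "book", "figure"):
            raise ValueError(f"unknown node type: {kind}")
        self.node_type[node] = kind

    def link(self, tiyao, other):
        # Only Tiyao-book and Tiyao-figure edges are allowed in this sketch.
        if self.node_type.get(tiyao) != "tiyao":
            raise ValueError("first endpoint must be a Tiyao node")
        if self.node_type.get(other) not in ("book", "figure"):
            raise ValueError("second endpoint must be a book or figure node")
        self.neighbors[tiyao].add(other)
        self.neighbors[other].add(tiyao)

    def co_mentioned(self, node):
        """Books and figures that share at least one Tiyao with `node`."""
        result = set()
        for tiyao in self.neighbors[node]:
            result |= self.neighbors[tiyao]
        result.discard(node)
        return result


# Hypothetical placeholder data, not drawn from any real catalogue.
g = TripartiteGraph()
for name, kind in [("tiyao_1", "tiyao"), ("tiyao_2", "tiyao"),
                   ("book_A", "book"), ("book_B", "book"),
                   ("person_X", "figure")]:
    g.add_node(name, kind)
g.link("tiyao_1", "book_A")
g.link("tiyao_1", "person_X")
g.link("tiyao_2", "book_B")
g.link("tiyao_2", "person_X")
```

Here `g.co_mentioned("person_X")` returns both books, because each shares a Tiyao with that figure. Routing every relation through a Tiyao node is one simple way to keep the graph Tiyao-centric.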
                 Based on this model, we construct a Tiyao exploration tool, TiyaoX. Using SPARQL, this
               tool extracts RDF data from related knowledge bases and provides users with information about
               rare-book authors or editors, such as occupation, imperial examination, and biography.
               Furthermore, the tool automatically extracts the individual relations embedded in Tiyaos
               and enriches existing resources with reliable and valuable relation information. In addition, the tool
               leverages the metadata and content features of Tiyaos to recommend potentially interesting,
               query-relevant Tiyaos (e.g., relevant by Tiyao content, category, book name, or author).
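
The kind of SPARQL lookup described above might be assembled as in the following sketch. The `ex:` vocabulary, its property names, and the function name are assumptions made for illustration; the abstract does not name the knowledge bases or their schemas, so a real query would use the property IRIs published by the target knowledge base.

```python
def build_person_query(person_name: str) -> str:
    """Return a SPARQL SELECT query for one person's recorded attributes.

    The ex: vocabulary below is hypothetical; a real knowledge base
    would publish its own property IRIs.
    """
    # Escape backslashes and double quotes so the name is a valid
    # SPARQL string literal.
    safe = person_name.replace("\\", "\\\\").replace('"', '\\"')
    return f"""PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX ex: <http://example.org/vocab#>
SELECT ?occupation ?examination ?biography WHERE {{
  ?person rdfs:label "{safe}" .
  OPTIONAL {{ ?person ex:occupation ?occupation . }}
  OPTIONAL {{ ?person ex:imperialExamination ?examination . }}
  OPTIONAL {{ ?person ex:biography ?biography . }}
}}"""


# "person_X" is a placeholder label, not a real catalogue entry.
query = build_person_query("person_X")
```

Each attribute sits in its own `OPTIONAL` clause so that a person with, say, no recorded examination result still yields a row for the attributes that are present.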
                 In this paper, we use Siku Quanshu Zongmu as the dataset. On the one hand, we investigate
               the latent relations among rare books, historical figures, and Tiyaos, and conduct separate
               quantitative analyses of person names and book names in each of the four divisions of this
               dataset. The results demonstrate that Tiyaos in the “Ji” division contain the most names of
               historical persons and books, and Tiyaos in the “Jing” division contain the fewest. Tiyaos in the
               “Zi” and “Ji” divisions have the greatest overlap of person names, and Tiyaos in the “Shi” and
               “Zi” divisions have the greatest overlap of book names. In the constructed network, most key
               figures are Confucian scholars, book collectors, and bibliographers, and the most important rare
               books are descriptive catalogues, Confucian classics, and history books. On the other hand, we
               take advantage of descriptive features in Tiyao content (author introduction, content summary,
               and critique), and combine them with Tiyao metadata and three text recommendation strategies
               (cosine similarity, LDA + JS distance, and Word2Vec + RWMD). Our objective is to evaluate
               the impact of different content features on the accuracy of the recommendation module of
               TiyaoX. The experimental results demonstrate that the approach integrating summary and
               critique information with Tiyao metadata as content features performs best. This approach can
               be extended to knowledge discovery in rare-book collections, offering convenience to related
               professionals, scholars, and enthusiasts and improving efficiency in practice.
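
To make the feature-combination idea concrete, the sketch below ranks Tiyao records by cosine similarity over a bag of words built from selected fields, and includes a Jensen-Shannon distance helper of the kind that could compare two topic distributions. It is an illustration under stated assumptions, not the TiyaoX implementation: the field names, the toy English records, and whitespace tokenization are all hypothetical, and the LDA and Word2Vec + RWMD steps are omitted.

```python
import math
from collections import Counter


def features(tiyao, fields):
    """Bag-of-words over the selected fields of one Tiyao record."""
    counts = Counter()
    for field in fields:
        counts.update(tiyao.get(field, "").lower().split())
    return counts


def cosine(a, b):
    """Cosine similarity between two term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def js_distance(p, q):
    """Jensen-Shannon distance (base-2 logs) between two probability vectors."""
    def kl(x, y):
        return sum(xi * math.log2(xi / yi) for xi, yi in zip(x, y) if xi > 0)
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    # Clamp at zero to guard against tiny negative rounding errors.
    return math.sqrt(max(0.0, (kl(p, m) + kl(q, m)) / 2))


def recommend(query, candidates, fields, top_k=3):
    """Return candidate ids ordered by descending cosine similarity."""
    q = features(query, fields)
    scored = [(cosine(q, features(c, fields)), c["id"]) for c in candidates]
    return [cid for _, cid in sorted(scored, key=lambda s: (-s[0], s[1]))[:top_k]]


# Hypothetical toy records; real Tiyaos would need Chinese word segmentation.
records = [
    {"id": "t1", "summary": "annotated catalogue of classics",
     "critique": "careful collation", "category": "shi"},
    {"id": "t2", "summary": "catalogue of classics and histories",
     "critique": "careful but incomplete", "category": "shi"},
    {"id": "t3", "summary": "collected poems",
     "critique": "elegant style", "category": "ji"},
]
query = {"id": "q", "summary": "catalogue of classics",
         "critique": "careful", "category": "shi"}
ranked = recommend(query, records, fields=("summary", "critique", "category"))
```

Changing the `fields` tuple (for example, dropping `"category"` or adding an author-introduction field) is how different content-feature combinations could be compared under the same similarity measure.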