Page 184 - Journal of Library Science in China, Vol.47, 2021

ZHANG Wei, WANG Hao, DENG Sanhong & ZHANG Baolong / Sentiment term extraction and application of Chinese ancient poetry text for digital humanities


               1 Related works


               1.1 Knowledge organization and humanistic connotation of Chinese ancient poetry

               Ancient poetry is an important carrier of our poetic and literary heritage and contains a wealth of
               humanistic knowledge. Current knowledge organization of ancient poetry texts mostly focuses
               on automated classification along dimensions such as poetic style [6], poetic subject matter [7], and
               poetic rhythm [8]. However, most of these studies address the external structure of the poetry
               text, whereas mining the inner meaning of ancient poems and analyzing their emotions is more
               humanistic in nature. In this regard, WU Bin et al. classified poems into positive, neutral, and
               negative emotional polarity based on a transfer learning model [9]. To improve the accuracy of
               emotional classification, TANG et al. proposed a sentiment analysis scheme for poetry of the Tang
               Dynasty based on CNNs and GRUs; they adopted a multi-channel processing model to extract
               semantic features and verified the effectiveness of the scheme by comparing it with
               mainstream deep learning models [10]. However, all of the above studies perform document-level
               sentiment polarity analysis, which makes it difficult to gain insight into the fine-grained
               humanistic knowledge in the poems. In contrast, sentiment terms allow a finer analysis of the
               humanistic content of ancient poetry, but problems caused by the condensed style of ancient
               poetry, such as limited term scale, coarse sentiment granularity, and insufficient learning of
               text features, need to be solved first. The appreciation text of a poem [11] is a modern literary
               text that analyzes the connotations of ancient poetry; it can significantly expand the number of
               domain terms and refine the granularity of emotional knowledge. Based on these considerations,
               we introduce a modern appreciation approach that integrates the appreciation texts of ancient
               poetry with the corresponding poems, and we devote ourselves to word-level analysis of the
               humanistic connotation of ancient poetry texts.


               1.2 Construction methods of domain sentiment lexicons


               A sentiment lexicon is a word-granularity tool for knowledge organization and sentiment
               analysis [12]. At present, the absence of a sentiment lexicon for Chinese ancient poetry makes
               automatic construction imperative. The core construction methods include: 1) Semantic knowledge
               base methods, which build a sentiment lexicon on top of general sentiment dictionaries (such
               as HowNet) by mining word relationships (cognitive and transitive, subordinate) [13]; although
               this approach is fast and practical, it suffers from poor domain applicability. 2) Domain
               corpus methods, which aim to automatically extract sentiment words from domain texts and mainly
               include the connection relationship method [14], statistical methods (pointwise mutual
               information, chi-square statistics, etc.) [15], and representation learning methods (such as
               Word2Vec) [16]. Among them, the language rules that the connection relationship method relies on
               are likely to limit the coverage of sentiment words [17]; statistical methods can hardly avoid
               inherent defects in the recognition of low-frequency