Page 186 - Journal of Library Science in China, Vol.47, 2021

ZHANG Wei, WANG Hao, DENG Sanhong & ZHANG Baolong / Sentiment term extraction and application of Chinese ancient poetry text for digital humanities


                 2 Data and methods


                 2.1 Research framework


                 Based on the character tagging model, this study proposes a “cold-start” method for
               automatically acquiring a learning corpus, aiming to solve the problem of tagging sentiment
               terms in ancient poetry corpora. On this basis, the linguistic knowledge of Chinese characters
               is incorporated into machine learning and deep learning algorithms to realize the automatic
               extraction and application of sentiment terms in ancient poetry texts, as shown in Figure 1.
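
                 The cold-start idea of labelling a corpus from a lexicon, without manual annotation, can be
               illustrated with a minimal sketch. It is not the authors' implementation. The forward
               longest-match strategy, the reading of labels as 1 (lexicon term) and 0 (other text), the
               B/I/E/S/O tag set, and the names split_by_lexicon and to_char_tags are all assumptions made
               for illustration.

```python
from typing import List, Set, Tuple


def split_by_lexicon(text: str, lexicon: Set[str]) -> List[Tuple[str, int]]:
    """Split text into (fragment, label) pairs by forward longest-match.

    Assumed labelling: 1 = fragment is a lexicon term, 0 = any other text.
    """
    max_len = max((len(term) for term in lexicon), default=0)
    fragments: List[Tuple[str, int]] = []
    buffer = ""  # accumulates characters that match no lexicon term
    i = 0
    while i < len(text):
        match = ""
        # Try the longest candidate first, then shorter ones.
        for size in range(min(max_len, len(text) - i), 0, -1):
            candidate = text[i:i + size]
            if candidate in lexicon:
                match = candidate
                break
        if match:
            if buffer:
                fragments.append((buffer, 0))
                buffer = ""
            fragments.append((match, 1))
            i += len(match)
        else:
            buffer += text[i]
            i += 1
    if buffer:
        fragments.append((buffer, 0))
    return fragments


def to_char_tags(fragments: List[Tuple[str, int]]) -> List[Tuple[str, str]]:
    """Convert labelled fragments to character-level tags.

    Assumed tag set: B/I/E mark the begin/inside/end of a term,
    S marks a single-character term, O marks everything else.
    """
    tags: List[Tuple[str, str]] = []
    for fragment, label in fragments:
        if label == 0:
            tags.extend((ch, "O") for ch in fragment)
        elif len(fragment) == 1:
            tags.append((fragment, "S"))
        else:
            tags.append((fragment[0], "B"))
            tags.extend((ch, "I") for ch in fragment[1:-1])
            tags.append((fragment[-1], "E"))
    return tags


# Hypothetical toy lexicon and poem line, for illustration only.
toy_lexicon = {"愁", "相思", "思"}
fragments = split_by_lexicon("此物最相思", toy_lexicon)
char_tags = to_char_tags(fragments)
```

                 Longest-match is chosen here so that a multi-character term such as “相思” is preferred
               over the single character “思” it contains; other matching strategies would work equally
               well for the purpose of this illustration.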







































                    Figure 1. The research framework of sentiment term extraction and application in ancient poetry texts


                 As shown in Figure 1, the research framework comprises three modules. 1) Cold-start.
               First, collect and organize sentiment vocabulary lists from multiple sources and integrate them
               into a more knowledge-expandable domain sentiment lexicon; then merge the modern
               appreciation text (hereinafter referred to as “appreciation”) with the corresponding original text
               of ancient poetry (hereinafter referred to as “poetry”) to form the corpus of ancient poetry texts,
               and match lexicon terms against it to divide it into 0-1 text fragments; use the character tagging