Page 186 - Journal of Library Science in China, Vol.47, 2021
ZHANG Wei, WANG Hao, DENG Sanhong & ZHANG Baolong / Sentiment term extraction
and application of Chinese ancient poetry text for digital humanities
2 Data and methods
2.1 Research framework
Based on the character tagging model, this study proposes a “cold-start” method for automatically
acquiring a learning corpus, aiming to solve the problem of tagging sentiment terms in ancient
poetry corpora. On this basis, the linguistic knowledge of Chinese characters is incorporated into
machine learning and deep learning algorithms to realize the automatic extraction and application
of sentiment terms in ancient poetry texts, as shown in Figure 1.
Figure 1. The research framework of sentiment term extraction and application in ancient poetry texts
As shown in Figure 1, the research framework mainly includes three modules. 1) Cold-start.
First, collect and organize multi-source sentiment vocabulary lists and integrate them
into a domain sentiment lexicon that is more readily expandable with new knowledge; then merge the
modern appreciation text (hereinafter referred to as “appreciation”) with the corresponding original
text of ancient poetry (hereinafter referred to as “poetry”) to form the corpus of ancient poetry texts,
and match lexicon terms against this corpus to divide it into 0-1 text fragments; use the character tagging