Page 198 - Journal of Library Science in China, Vol.47, 2021
ZHANG Wei, WANG Hao, DENG Sanhong & ZHANG Baolong / Sentiment term extraction
and application of Chinese ancient poetry text for digital humanities
From Figure 7, we can see that: 1) On the whole, as the number of migrations increases, the
model performs progressively better in sentiment term extraction and exceeds the baseline
under the different criteria. 2) Analysis of the original term extraction results shows that: in
terms of P-value, the baseline always remains the highest; in terms of R-value, the model has been
higher than the baseline since TL1; in terms of F1-value, the model surpasses the baseline from TL4
and then remains stable at a high level (95.63%). From that point on, the deep learning model
significantly outperforms the machine learning model in raw term extraction. 3) In terms of
R-value, the model's TL1 result is much better than CF, rising by 7.36% to 88.35%, the highest
R-value in this paper's experiments; in terms of F1-value, with the P-value and R-value offsetting
each other, the model still outperforms CF at TL1 and steadily improves to TL6 (85.43%).
4) TL5 reaches the peak F1-value under the different criteria, while the TL6 results are slightly
inferior to those of TL5, which indicates that the model is beginning to overfit; TL5 is therefore
the optimal model.
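The selection rule described in point 4 (keep the migration round with the peak F1-value and treat a later drop as a sign of overfitting) can be sketched as follows. This is only an illustration: the function names are hypothetical, and the scores in the usage example are placeholders, not values read from Figure 7.

```python
def select_best_round(f1_by_round):
    """Return (round_name, f1) for the round with the highest F1.

    Assumes a non-empty dict in training order; the earliest round wins ties.
    """
    best = max(f1_by_round.values())
    for name, score in f1_by_round.items():
        if score == best:
            return name, score


def declines_after_peak(f1_by_round):
    """True if any round after the peak scores lower than the peak."""
    scores = list(f1_by_round.values())
    peak = scores.index(max(scores))
    return any(s < scores[peak] for s in scores[peak + 1:])


# Placeholder scores for illustration only (NOT the paper's figures).
example_scores = {"TL1": 0.80, "TL2": 0.82, "TL3": 0.84,
                  "TL4": 0.85, "TL5": 0.86, "TL6": 0.855}
best_round, best_f1 = select_best_round(example_scores)
overfitting_hint = declines_after_peak(example_scores)
```

With these placeholder scores the sketch picks the fifth round and flags the decline at the sixth, which mirrors the reasoning used to choose TL5.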
(3) Variability analysis of term extraction across different ancient poetry text corpora. The
test results of the optimal model TL5 are split into poetic and appreciation sequences, and the
metrics are calculated for each separately. In addition, to verify the effectiveness of the corpus
in this paper, the authors retrain the deep learning model on a new training set constructed from
17,365 external poems and test it on the split poetic sequences; the optimal extraction results are
reached after twenty rounds of iterations, as shown in Table 5.
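The split-and-score step can be sketched as below. This is a minimal illustration, not the authors' evaluation script: the `Sequence` record, its `source` tag, and the choice of micro-averaging over term occurrences are all assumptions made for the example.

```python
from collections import Counter
from dataclasses import dataclass
from typing import List


@dataclass
class Sequence:
    source: str           # hypothetical tag: "poetry" or "appreciation"
    gold: List[str]       # annotated sentiment terms
    predicted: List[str]  # terms extracted by the model


def prf(sequences):
    """Micro-averaged precision, recall and F1 over term occurrences."""
    tp = n_pred = n_gold = 0
    for seq in sequences:
        gold, pred = Counter(seq.gold), Counter(seq.predicted)
        tp += sum((gold & pred).values())  # multiset intersection = matches
        n_pred += sum(pred.values())
        n_gold += sum(gold.values())
    p = tp / n_pred if n_pred else 0.0
    r = tp / n_gold if n_gold else 0.0
    f1 = 2 * p * r / (p + r) if p + r else 0.0
    return p, r, f1


def prf_by_source(sequences):
    """Group test sequences by source and score each group separately."""
    groups = {}
    for seq in sequences:
        groups.setdefault(seq.source, []).append(seq)
    return {source: prf(group) for source, group in groups.items()}
```

Calling `prf_by_source` on the full test output yields one (P, R, F1) triple per source, which corresponds to reporting the poetry and appreciation rows separately.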
Table 5. Extraction results of affective terms for different ancient poetry text corpora
                                          Original evaluation criteria      Discriminative evaluation criteria
Train set            Test set             P/%    R/%    F1/%   New terms    P/%    R/%    F1/%   New terms
Poetry+appreciation  Poetry+appreciation  95.92  95.35  95.63  941          81.94  89.23  85.43  836
Poetry+appreciation  Appreciation         95.85  95.26  95.56  888          82.32  89.34  85.68  805
Poetry+appreciation  Poetry               96.69  96.25  96.46  53           90.74  92.61  91.66  38
Poetry+appreciation  Poetry               98.47  98.14  98.31  27           95.81  94.18  94.99  25
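As a consistency check on Table 5, the sketch below recomputes each F1-value from the reported P and R using the standard harmonic-mean formula F1 = 2PR / (P + R). The numbers are copied from the table; the function and variable names are illustrative. Small differences are expected because P and R are themselves rounded to two decimals.

```python
def f1(p, r):
    """Harmonic mean of precision and recall (same units as the inputs)."""
    return 2 * p * r / (p + r) if p + r else 0.0


# (P, R, reported F1) in percent, copied from Table 5.
# The first four rows use the original criteria, the last four the
# discriminative criteria, in the table's row order.
TABLE5 = [
    (95.92, 95.35, 95.63),
    (95.85, 95.26, 95.56),
    (96.69, 96.25, 96.46),
    (98.47, 98.14, 98.31),
    (81.94, 89.23, 85.43),
    (82.32, 89.34, 85.68),
    (90.74, 92.61, 91.66),
    (95.81, 94.18, 94.99),
]


def max_f1_deviation(rows):
    """Largest absolute gap between recomputed and reported F1."""
    return max(abs(f1(p, r) - reported) for p, r, reported in rows)


deviation = max_f1_deviation(TABLE5)

# Precision gap under the discriminative criteria between the poetry test
# set (90.74) and the appreciation test set (82.32).
precision_gap = round(90.74 - 82.32, 2)  # 8.42 percentage points
```

Every recomputed F1-value stays within 0.02 points of the reported one, so the table is internally consistent up to rounding.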
From Table 5, we can see that: 1) For the original terms, the results on the poetry and
appreciation texts are similar, and both maintain good recognition performance; for the
distinguishing terms, the results on the poetry are significantly better than those on the
appreciation, and the difference lies mainly in the P-value (8.42 percentage points, 90.74% versus
82.32%), which indicates that the condensed nature of the poetry corpus is more advantageous for
the distinguishing terms. 2) Compared with the corpus of this paper, the model trained on external
poetry achieves better accuracy, which shows that a more precise corpus environment is more
suitable for extracting emotion terms from poetry; the shortcoming is that its new term
recognition is slightly inferior, which indicates that the richer text