Page 200 - Journal of Library Science in China, Vol.47, 2021
ZHANG Wei, WANG Hao, DENG Sanhong & ZHANG Baolong / Sentiment term extraction 199
and application of Chinese ancient poetry text for digital humanities
As can be seen from Table 6, the domain neologisms extracted by the machine learning and deep
learning models generally carry a clear sentiment tendency. The former can identify verses with
emotional connotations in the poems, and its terms are more cohesive; it can also extract long
terms from the appreciation texts, which keeps domain knowledge identification coherent and
refines the emotional granularity of the terms. The latter builds on the former and adds many new
imagery words with humanistic connotations: “彩凤” symbolizes the desire for beautiful things,
“千里山河” expresses lamentation for the rivers and mountains, “春泉” conveys praise of vitality,
“唐军” expresses praise of power, and “宦官” criticizes the power of treacherous officials. This
indicates that the deep learning model is better at recognizing deep emotional knowledge in
Chinese texts.
4 Application and analysis of sentiment terms in ancient poems for digital
humanities
The digital application of sentiment terms in the domain of ancient poetry aims to realize sentiment
analysis and knowledge services for humanities information resources through term association.
The best-performing model, TL5, is applied to extract the sentiment terms from the whole corpus.
Given the constraints that the rules of this paper place on new terms, the new terms are merged
directly with the domain terms without checking their correctness. After de-duplication, 14,599
distinct terms are obtained, which form humanistic sentiment knowledge at the word granularity
level. Based on this, the following application and analysis are carried out in three aspects: term
retrieval, granularity mining and portrait construction.
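The merge-and-de-duplicate step described above can be illustrated with a minimal sketch. The function name `merge_terms` and the sample lists are hypothetical and are not taken from the paper; the sketch only assumes that both term sets are available as lists of strings and that, as stated above, no correctness check is applied to the new terms.

```python
def merge_terms(domain_terms, new_terms):
    """Merge two term lists and drop duplicates, keeping first-seen order.

    Hypothetical helper: the new terms are accepted as-is, mirroring the
    decision to merge them without a separate correctness check.
    """
    seen = set()
    merged = []
    for term in list(domain_terms) + list(new_terms):
        term = term.strip()          # ignore stray whitespace around a term
        if term and term not in seen:
            seen.add(term)
            merged.append(term)
    return merged


# Illustrative input only; the real term lists come from the corpus.
domain_terms = ["彩凤", "春泉"]
new_terms = ["春泉", "唐军", " 宦官 "]
vocabulary = merge_terms(domain_terms, new_terms)
print(len(vocabulary), vocabulary)
```

Keeping first-seen order is a design choice made here for reproducibility; a plain set would serve equally well if order does not matter.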
4.1 Humanities sentiment terms retrieval
Humanities sentiment term retrieval aims to query the knowledge associated with terms in the
emotional knowledge base according to user needs. By automatically matching the index values of
sentiment terms and poems, a total of 216,522 binary relationships between poems and words
were obtained, on which term retrieval was then conducted. The two main retrieval modes are “search
term by the poem” (see Figure 8) and “search poem by the term” (see Figure 9).
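As a rough illustration of the two retrieval modes, the sketch below joins term records to poems on a shared index value and builds one lookup table per direction. It assumes each extracted term carries the index of the poem it came from. The names `build_indexes`, `poem_titles` and `term_records`, the sample indices, and the second poem title are all hypothetical; the paper does not specify its matching procedure at this level of detail.

```python
from collections import defaultdict


def build_indexes(poem_titles, term_records):
    """Build poem-term relations and the two lookup tables.

    poem_titles  : dict mapping a poem index to its title (assumed layout)
    term_records : iterable of (poem index, sentiment term) pairs
    """
    relations = set()
    for poem_id, term in term_records:
        if poem_id in poem_titles:              # keep only matched indices
            relations.add((poem_titles[poem_id], term))

    terms_by_poem = defaultdict(set)            # "search term by the poem"
    poems_by_term = defaultdict(set)            # "search poem by the term"
    for title, term in relations:
        terms_by_poem[title].add(term)
        poems_by_term[term].add(title)
    return relations, terms_by_poem, poems_by_term


# Illustrative data only; "示例诗" is a made-up placeholder title.
poem_titles = {1: "长恨歌", 2: "示例诗"}
term_records = [(1, "惋惜"), (1, "哀叹"), (2, "惋惜"), (1, "惋惜"), (3, "绝色")]

relations, terms_by_poem, poems_by_term = build_indexes(poem_titles, term_records)
print(sorted(terms_by_poem["长恨歌"]))   # terms linked to one poem
print(sorted(poems_by_term["惋惜"]))     # poems linked to one term
```

Storing the relations as a set of pairs removes repeated poem-term matches, so each binary relationship is counted once.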
In Figure 8, the poem-word relationships are stored in the OWL ontology language, and the ten
poems with the largest number of words are displayed using Protégé. The middle layer is the poetic
text and the outermost layer is the sentiment terms extracted from it; there are 6,386 associations
in total. The sentiment terms of a target poem can thus be found by using the title of the poem as
the query. For example, searching with “长恨歌” returns sentiment terms such as “绝色”, “宠爱”,
“荒废”, “祸国殃民”, “愤慨”, “受害者”, “惋惜” and “哀叹”, which on the one hand reveals the lust for
beauty of Emperor Xuanzong of the Tang Dynasty and the abuse of favor by Yang Guifei, and on the other