Page 183 - Journal of Library Science in China, Vol.47, 2021




            “humanity” and “computability”. The former emphasizes the humanistic emotion and connotation
            of thinking inherited from cultural resources, while the latter advocates the deep calculation of data
            to mine the core knowledge implicit in humanistic materials. Since text is the fundamental carrier
            of humanistic knowledge, textual knowledge mining driven by semantic parsing technology and
            humanistic emotion is becoming increasingly important.
              Within China's cultural heritage, ancient poetry occupies an extremely important literary
            position. It records ancient people's knowledge of matters such as political background, historical
            events and folk customs, and it carries a wealth of emotional knowledge. At present, the emotional
            connotation of domain-specific texts has not been effectively mined or utilized. The same is true
            of the knowledge organization of humanistic emotional information, whose sources and structures
            vary, from the perspective of semantic association. In this regard, some scholars have attempted
            to classify the emotional polarity of ancient poems automatically [4], but chapter-level polarity
            analysis is still insufficient for mining fine-grained emotional knowledge. At present, there
            is no well-developed emotion dictionary for Chinese ancient poetry, and the emotion terms
            collected in the existing Chinese emotion dictionaries can hardly be called complete or accurate [5].
            To achieve a more accurate sentiment analysis of ancient poetry, it is first necessary to
            extract sentiment terms from domain-specific texts automatically and to optimize the extraction
            performance. The sentiment terms involved are sentiment expressions with humanistic
            connotations in the domain of ancient poetry, including single Chinese characters, words and phrases.
            The key problems include: 1) since the texts of ancient poems are mostly original Chinese texts
            based on single characters, their highly condensed style limits the scale of term extraction, the
            sentiment granularity and the learning effect of text features, so it is necessary to extend the content
            of the ancient poetry text; 2) large-scale term extraction is mainly realized by machine learning (ML),
            but the lack of a labeled corpus in the ancient poetry domain makes it difficult to start the learning
            task; 3) the rise of BERT has challenged the traditional ML algorithms represented by CRFs, but
            its word vector mapping (Char2Vec) inspires the latter to introduce linguistic features of the
            structure and content of Chinese characters to optimize the extraction of sentiment terms.
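
To make the third problem concrete, the following is a minimal sketch of how character-level linguistic features might be prepared for a CRF tagger. It is an illustration under stated assumptions, not the feature set used in this study: the `RADICALS` table is a hypothetical three-entry stand-in for a full character dictionary, and the feature names (`char`, `radical`, `prev`, `next`) are chosen here for illustration only.

```python
from typing import Dict, List

# Hypothetical radical lookup (assumption): a real system would load a
# complete character dictionary instead of this three-entry example.
RADICALS: Dict[str, str] = {"愁": "心", "悲": "心", "思": "心"}


def char_features(chars: List[str], i: int) -> Dict[str, str]:
    """Build a feature dict for the character at position i."""
    ch = chars[i]
    return {
        "char": ch,                          # the character itself (content)
        "radical": RADICALS.get(ch, "UNK"),  # structural component, if known
        "prev": chars[i - 1] if i > 0 else "<BOS>",
        "next": chars[i + 1] if i < len(chars) - 1 else "<EOS>",
    }


def sentence_features(sentence: str) -> List[Dict[str, str]]:
    """Turn a line of poetry into one feature dict per character."""
    chars = list(sentence)
    return [char_features(chars, i) for i in range(len(chars))]
```

A list of per-character feature dicts of this kind is the usual input shape for linear-chain CRF toolkits; which structural and content features actually help is the empirical question the study sets out to examine.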
              Based on the core emotional content (humanity) in the ancient poetry text, this study adopts key
            semantic techniques (computability) in information extraction. On the one hand, we introduce
            modern appreciation texts. We integrate each appreciation text into the poem it evaluates, and
            extend the emotional knowledge in the original text through the appreciation text to form the
            ancient poetry text within the scope of this study. On the other hand, we propose a method for
            automatically annotating word sequences to obtain a learning corpus in a cold-start setting (no
            learning corpus). On this basis, linguistic knowledge in the domain of ancient poetry is integrated
            into the CRFs ML model to examine the effectiveness of linguistic features of Chinese characters
            for the extraction of sentiment terms. We also compare this method with the DL model
            BERT-BiLSTM-CRFs, and finally extend the best-performing model to the automated extraction and
            application of large-scale sentiment terms in ancient poetry texts.
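
The automatic annotation step is not detailed in this section. One common way to obtain a tagged corpus without manual labels is to project a seed lexicon onto raw text with forward maximum matching and emit character-level BIO tags. The sketch below assumes that approach; the function name `bio_annotate` and the seed terms in the usage example are hypothetical and are not taken from this study.

```python
from typing import List, Set


def bio_annotate(text: str, lexicon: Set[str]) -> List[str]:
    """Tag each character as B (term start), I (term inside) or O (outside),
    using forward maximum matching against a seed lexicon."""
    tags = ["O"] * len(text)
    max_len = max((len(term) for term in lexicon), default=0)
    i = 0
    while i < len(text):
        matched = 0
        # Try the longest candidate first so longer terms win over shorter ones.
        for n in range(min(max_len, len(text) - i), 0, -1):
            if text[i:i + n] in lexicon:
                matched = n
                break
        if matched:
            tags[i] = "B"
            for j in range(i + 1, i + matched):
                tags[j] = "I"
            i += matched
        else:
            i += 1
    return tags


# Usage with a hypothetical two-term seed lexicon:
print(bio_annotate("独怆然而涕下", {"怆然", "涕下"}))
# ['O', 'B', 'I', 'O', 'B', 'I']
```

Tags produced this way are noisy, since they inherit every gap and ambiguity of the seed lexicon, which is one reason a learned sequence model is still trained on top of them.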