Page 187 - Journal of Library Science in China, Vol.47, 2021




            model to map the text into a character-role space to explore the features of Chinese characters;
            finally, the tagged corpus is divided for the follow-up learning tasks. 2) Machine learning
            and deep learning models. For machine learning, this paper introduces the linguistic
            features of Chinese characters into the feature space to train the CRFs model and
            improve its performance. For deep learning, it integrates BERT language knowledge
            to optimize the neural network, which involves Char2Vec mapping of the deep features of
            domain Chinese characters, BiLSTM encoding of context information with network parameter
            training, and CRFs decoding to constrain the label order, among other steps. Finally, the
            trained model predicts the test set to obtain the labeled sequences. 3) Digital humanities
            application. The predicted sequences are mapped to the character-role space, domain terms
            and new terms are extracted and integrated into a humanistic sentiment terminology for the
            field of ancient poetry, and the digital applications of this terminology are then explored
            in terms of term retrieval, granularity mining, and poet portraits.
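
            The mapping from a predicted label sequence back to terms can be illustrated with a
            minimal sketch. It assumes a hypothetical B/I/E/S/O character-role tag set (begin, inside,
            end, single-character term, outside); the tag scheme actually used in this paper is not
            given in this section, and the function name extract_terms and the sample line are
            illustrative only.

```python
from typing import List, Tuple


def extract_terms(chars: List[str], tags: List[str]) -> List[Tuple[str, int, int]]:
    """Collect candidate terms from per-character role tags.

    Assumed (hypothetical) tag set: B = term begin, I = term inside,
    E = term end, S = single-character term, O = outside any term.
    Returns (term, start, end) tuples with an exclusive end index.
    """
    if len(chars) != len(tags):
        raise ValueError("chars and tags must have the same length")

    terms: List[Tuple[str, int, int]] = []
    start = None  # index of the currently open B tag, if any
    for i, tag in enumerate(tags):
        if tag == "S":
            terms.append((chars[i], i, i + 1))
            start = None
        elif tag == "B":
            start = i
        elif tag == "E" and start is not None:
            terms.append(("".join(chars[start:i + 1]), start, i + 1))
            start = None
        elif tag == "O":
            start = None  # discard an unfinished span
        # "I" simply continues an open span
    return terms


# Illustrative input: a five-character line with one three-character span tagged.
sample_chars = list("低头思故乡")
sample_tags = ["O", "O", "B", "I", "E"]
sample_terms = extract_terms(sample_chars, sample_tags)
```

            Spans that open with B but never reach E are dropped rather than guessed, so only
            well-formed label sequences contribute terms in this sketch.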


            2.2 Data sources and preprocessing

            The corpus of ancient poetry texts in this paper mainly consists of poems and their appreciation.
            In Chinese poetry culture, Tang Dynasty poetry has had an important influence on the politics,
            people, customs and culture of later generations [29]. This paper takes the Dictionary of
            Appreciation of Tang Poetry as the source material for analysis. As a work of literary research
            that includes both the original text of the poems and modern appreciative analysis, it
            can reflect the implicit emotional knowledge in the poetry more comprehensively and accurately.
            The digital text of this book comes from the Literature 100 website (http://www.wenxue100.com), as
            shown in Figure 2.




















                        Figure 2. Corpus of ancient poetry texts and emotional knowledge terms (excerpt)


              Through data cleaning and structural organization of the text, a total of 1,374 valid poems and