Page 199 - Journal of Library Science in China, Vol.47, 2021




            features in appreciation are more favorable for emotion knowledge discovery. Moreover, according
            to the author, the number of distinguishing terms in the comprehensive training set (12,151) is
            roughly twice that in the external poetry training set (6,259), so the former helps the model
            learn the features more fully. Introducing appreciation texts not only effectively optimizes
            the number and granularity of sentiment terms, but also facilitates the knowledge discovery of
            new sentiment words.


            3.3 Discovery of new emotional terms in ancient poetry texts

            The deeper value of machine learning techniques in term extraction lies in the knowledge
            discovery of unlogged terms. Based on the rules for new term identification in this paper, the
            author extracts new sentiment terms using the optimal machine learning (CF) and deep learning
            (TL5) models, obtaining 1,057 and 1,531 candidate new words, respectively. The maximum length
            of the new words identified by machine learning is 8 characters, and that of deep learning is 4.
            In addition, the number of two-character new words identified by the latter (1,101) is more than
            twice that of the former (481), and the number of four-character new words it identified (339)
            is also well ahead of the former (190). On this basis, the author first counted the long
            neologisms extracted by machine learning, then extracted from the set of neologisms recognized
            by deep learning the high-frequency terms not contained in the former, and distinguished the
            source texts of the terms, as shown in Table 6.


            Table 6. Candidates for new emotional terms in ancient poetry text (partial)
            Machine learning-based long new terms                             Deep learning-based expanded new terms
            Candidates (poetry)  Length  Candidates (appreciation)  Length    Candidates (poetry)  Frequency  Candidates (appreciation)  Frequency
            有才过屈宋           5       王孙公子肆无忌惮           8         彩凤                 2          沦谪                       11
            可笑不自量           5       不甘于无所作为             7         肺腑无言             1          秀艳                       7
            挺立不动膝           5       惶惶若丧家无主             7         东风日暖             1          离席                       5
            自献自为酬           5       花心如痴如醉               6         千里山河             1          末世                       5
            艰极泰循             4       依依惜别之情               6         薄俗                 1          欢情                       3
            举目凄凉             4       风格清新俊逸               6         浮世                 1          轻扬                       3
            恨不胜               3       不期然而然的               6         多雨                 1          亡灵                       3
            自作为               3       变得模糊不清               6         惜金                 1          春光易逝                   2
            惊满座               3       决胜千里之外               6         春泉                 1          慷慨不平                   2
            神通力               3       明珠交相辉耀               6         娉娉                 1          忍辱饮恨                   2
            辛苦力               3       放荡无不拘的               6         濡染                 1          名盛一时                   2
            奔亡                 2       寄情山水自然               6         惨澹                 1          征人思妇                   2
            忍耻                 2       美人迟暮之恨               6         愬武                 1          有志难伸                   2
            惜惧                 2       文采风流而生               6         私债                 1          阶下囚                     2
            冤谪                 2       鸣声有如不如               6         角逐                 1          被贬                       2
            抚膺                 2       痛感无能为力               6         霸图                 1          唐军                       2
            威怒                 2       活生生感染力               6         丧逝                 1          宦官                       2
            恳苦                 2       年华迟暮之悲               6         兵荒                 1          哀怆                       2
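
            The comparison behind Table 6 can be illustrated with a minimal Python sketch. It is not the
            author's implementation: the function names, the `min_freq` threshold, and the toy inputs are
            assumptions introduced here for illustration only. The paper does not state how "high-frequency"
            is defined, and the overlapping term in the toy deep learning set is invented to show the filter
            at work. Term length is counted in Chinese characters, as in the Length columns of Table 6.

```python
from collections import Counter


def length_distribution(terms):
    """Count how many candidate terms have each length (in characters)."""
    return Counter(len(term) for term in terms)


def expanded_terms(dl_freq, ml_terms, min_freq=2):
    """Return (term, frequency) pairs found by the deep learning model
    but absent from the machine learning candidates, keeping only terms
    with frequency >= min_freq, sorted by descending frequency.

    min_freq is a hypothetical threshold; the paper does not give one.
    """
    ml_set = set(ml_terms)
    kept = [
        (term, freq)
        for term, freq in dl_freq.items()
        if term not in ml_set and freq >= min_freq
    ]
    return sorted(kept, key=lambda pair: (-pair[1], pair[0]))


# Toy inputs: a few terms copied from Table 6.
ml_candidates = ["有才过屈宋", "王孙公子肆无忌惮", "奔亡", "忍耻"]

# The frequency for "奔亡" is invented so that the two sets overlap.
dl_candidates = {"沦谪": 11, "秀艳": 7, "彩凤": 2, "奔亡": 3}

ml_lengths = length_distribution(ml_candidates)
print(max(ml_lengths))  # longest machine learning candidate, in characters
print(expanded_terms(dl_candidates, ml_candidates))
```

            Keeping the machine learning candidates in a set makes the "not contained in the former" check
            a constant-time lookup, which matters little for about a thousand candidates but keeps the
            sketch simple.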