Page 195 - Journal of Library Science in China, Vol.47, 2021

194   Journal of Library Science in China, Vol.13, 2021



              As can be seen from Figure 5, after domain and emotion constraints are applied to the
            B/P features of Chinese characters, model performance in both cases rises to a level comparable
            to the baseline. For P, after parameter tuning, the 25 most frequent domain pinyin were taken
            as domain common pinyin (F_P) and the 60 most frequent emotional pinyin as emotional common
            pinyin (E_P); with these settings the F1 values of the model all exceeded the baseline, which
            verifies the positive effect of pinyin features. For B, the radicals of the Chinese characters
            in the domain text and the emotional word set were counted and, after parameter tuning, the
            top 115 domain radicals were taken as domain common radicals (F_B) and the top 20 emotional
            radicals as emotional common radicals (E_B); even so, the optimal F1 value remains slightly
            below the baseline. To explore the reason, the numbers of extracted terms were analyzed, as
            shown in Table 2.
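
            The top-k selection described above can be sketched as follows. This is a minimal
            illustration, not the authors' implementation: the function names, the placeholder
            symbol "O", and the idea of masking non-common features are assumptions made for the
            example, and the input is assumed to be a list of pinyin or radical tokens that has
            already been extracted from the text.

```python
from collections import Counter


def common_features(feature_tokens, k):
    """Return the set of the k most frequent feature values (pinyin or radicals)."""
    counts = Counter(feature_tokens)
    return {feature for feature, _ in counts.most_common(k)}


def constrain(features, common, placeholder="O"):
    """Keep a feature only if it is in the common set; otherwise mask it.

    The masking strategy and the placeholder value are assumptions for illustration.
    """
    return [f if f in common else placeholder for f in features]


# Hypothetical usage: k = 25 for F_P, 60 for E_P, 115 for F_B, 20 for E_B.
domain_pinyin_tokens = ["xin", "hao", "xin", "ma", "xin", "hao"]
f_p = common_features(domain_pinyin_tokens, k=2)
masked = constrain(["xin", "ma", "hao"], f_p)
```

            Here k is the tuned parameter; the values 25, 60, 115 and 20 reported in the text would
            be passed as k for F_P, E_P, F_B and E_B respectively.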


            Table 2. Extraction results of the number of sentiment terms based on constrained feature extension

                          Original evaluation criteria                 Discriminative evaluation criteria
              Feature     Original   Recognized   Correct   New        Original   Recognized   Correct   New
                          term       term         term      term       term       term         term      term
              C           63 151     62 519       60 005    796        7 045      6 409        5 628     647
              F_P         63 151     62 618       60 057    865        7 045      6 560        5 682     742
              E_P         63 151     62 592       60 051    893        7 045      6 584        5 698     747
              F_B         63 151     62 601       60 039    894        7 045      6 598        5 702     752
              E_B         63 151     62 632       60 057    912        7 045      6 608        5 706     761
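
            To make the counts in Table 2 easier to compare, precision, recall and F1 can be
            derived from them as sketched below. The sketch assumes the standard definitions
            (precision = correct / recognized, recall = correct / original); the paper's exact
            evaluation procedure is not stated here, so the derived values need not coincide with
            the F1 figures reported in the text. The function name prf is hypothetical.

```python
def prf(original, recognized, correct):
    """Precision, recall and F1 from term counts (standard definitions assumed)."""
    precision = correct / recognized
    recall = correct / original
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Counts under the original evaluation criteria, copied from Table 2:
# feature -> (original terms, recognized terms, correct terms)
original_criteria = {
    "C":   (63151, 62519, 60005),
    "F_P": (63151, 62618, 60057),
    "E_P": (63151, 62592, 60051),
    "F_B": (63151, 62601, 60039),
    "E_B": (63151, 62632, 60057),
}

for feature, counts in original_criteria.items():
    p, r, f = prf(*counts)
    print(f"{feature}: P={p:.4f} R={r:.4f} F1={f:.4f}")
```

            The same function applies to the discriminative criteria by substituting the right-hand
            columns of the table.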


              As can be seen from Table 2, the models that use the F/E constraints on the P/B features
            produced a significant improvement in the recall of term extraction. 1) Under both evaluation
            criteria, the numbers of correctly recognized terms and newly recognized terms for F_B and E_B
            are generally higher than those for F_P and E_P, indicating that the radical features are
            better at recalling emotional terms and discovering new terms. 2) E_B outperforms F_B in both
            original term recall and new term recognition, indicating that the constraint of emotional
            radicals is more effective. Inspired by this, the twenty Chinese radicals in the E_B set were
            examined, and an experiment was run with {“忄”, “心”} as the constraint feature. The F1 value
            reaches 95.54%, the second highest in the overall experiment, behind only the domain feature;
            F1_distinct also reaches 83.69%, remarkably higher than that of the baseline, B, F_B and E_B,
            which verifies the effectiveness of the radical feature. On this basis, incremental experiments
            were conducted in which the other 18 radicals were added to the set, with {“忄”, “心”} as the
            benchmark; the results are shown in Figure 6.