overfitting of the model caused by insufficient or excessive training rounds, I set the number of
training rounds for a single model to 10, calculate the recognition performance of each migration
over multiple training rounds, and then generalize the optimal model for the domain; the hyperparameter configuration used for training is shown in Table 3.
Table 3. Hyperparameter configuration of BERT-BiLSTM-CRFs model training
Hyperparameter name    Hyperparameter value
batch_size             64
max_seq_length         128
learning_rate          2e-5
dropout_rate           0.5
lstm_size              128
num_train_epochs       10
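For concreteness, the fixed settings in Table 3 can be gathered into a single training configuration. The sketch below is illustrative only: the dictionary values come from Table 3, while select_domain_optimal_model and the train_single_migration callable it expects are hypothetical placeholders for the actual training entry point of the BERT-BiLSTM-CRFs model.

# Hyperparameter values taken from Table 3; the surrounding code is an assumed sketch.
TRAIN_CONFIG = {
    "batch_size": 64,
    "max_seq_length": 128,
    "learning_rate": 2e-5,
    "dropout_rate": 0.5,
    "lstm_size": 128,
    "num_train_epochs": 10,  # fixed at 10 rounds per model to limit under-/overfitting
}

def select_domain_optimal_model(migrations, train_single_migration):
    """Train one model per migration with the fixed hyperparameters and keep the
    model with the best recognition performance on the validation set.
    train_single_migration is a hypothetical callable returning (model, dev_score)."""
    results = [(train_single_migration(m, TRAIN_CONFIG), m) for m in migrations]
    (best_model, best_score), best_migration = max(results, key=lambda r: r[0][1])
    return best_model, best_migration, best_score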
(1) Char2Vec training based on the BERT model. First, the learning corpus is further segmented,
and 1% of the overall corpus is drawn from the training set as the validation set. Then, the pre-
trained model released by Google is used as the initial migration model. Finally, Char2Vec is
implemented through word embedding of the corpus used in this paper and fine-tuning of the
pre-trained model; the results are shown in Table 4.
Table 4. Word embedding of BERT-based ancient poetry text corpus (example)
tokens 纵 横 计 不 就 , 慷 慨 志 犹 存 …
input_ids 101 5288 3566 6369 679 2218 117 2724 2717 2562 4310 …
input_mask 1 1 1 1 1 1 1 1 1 1 1 …
segment_ids 0 0 0 0 0 0 0 0 0 0 0 …
label_ids 8 7 2 4 7 2 4 7 2 4 4 …
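To make the mapping in Table 4 concrete, the sketch below shows how one line of poetry could be converted into the input_ids, input_mask, segment_ids and label_ids rows. It is a minimal illustration assuming the Hugging Face transformers library and the bert-base-chinese checkpoint rather than the original Google TensorFlow pipeline, and the label scheme and label2id mapping in the example call are hypothetical stand-ins for the tag set used in this paper.

from transformers import BertTokenizerFast

MAX_SEQ_LENGTH = 128  # max_seq_length from Table 3
tokenizer = BertTokenizerFast.from_pretrained("bert-base-chinese")

def convert_line(chars, labels, label2id):
    """Convert one character-split poem line and its per-character labels
    into the four feature rows shown in Table 4."""
    enc = tokenizer(
        chars,
        is_split_into_words=True,   # the corpus is already split into characters
        padding="max_length",
        truncation=True,
        max_length=MAX_SEQ_LENGTH,
    )
    # Chinese characters map 1:1 to WordPiece tokens; special tokens ([CLS], [SEP],
    # padding) receive the conventional ignore index -100 in this sketch, whereas the
    # paper's own pipeline uses its own label ids for them (cf. Table 4).
    label_ids = [label2id[labels[i]] if i is not None else -100
                 for i in enc.word_ids()]
    return {"input_ids": enc["input_ids"],
            "input_mask": enc["attention_mask"],
            "segment_ids": enc["token_type_ids"],
            "label_ids": label_ids}

# Example call with a hypothetical label scheme:
# convert_line(list("纵横计不就"), ["B-EMO", "I-EMO", "O", "O", "O"],
#              {"O": 0, "B-EMO": 1, "I-EMO": 2})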
(2) Calculation of emotional term extraction results. Using the optimal machine learning model,
CF, as the baseline, the BERT-BiLSTM-CRFs deep learning model is trained and tested with the
classifier obtained from each migration, and the results are shown in Figure 7.
Figure 7. Calculation of emotional term extraction results based on BERT-BiLSTM-CRFs: (a) calculation of original term extraction results; (b) calculation of differentiated term extraction results
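As a reference for how the per-migration results summarized in Figure 7 can be scored against the baseline, the sketch below computes precision, recall and F1 over sets of extracted emotional terms. It is an illustrative evaluation routine under the assumption that gold and predicted terms are collected as (line id, term text) pairs; it is not taken from the paper's own evaluation code.

def term_prf(gold_terms, pred_terms):
    """Precision, recall and F1 over sets of extracted emotional terms,
    where each term is identified by a (line_id, term_text) pair."""
    gold, pred = set(gold_terms), set(pred_terms)
    true_positives = len(gold & pred)
    precision = true_positives / len(pred) if pred else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1

# Hypothetical usage: score each migration's extractions against the gold terms,
# and do the same for the CF baseline, to obtain the comparison shown in Figure 7.
# p, r, f1 = term_prf(gold_terms, migration_terms)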