Zhang Chengzhi, Su Xinning. Automatic Indexing Model Based on Conditional Random Fields [J]. Journal of Library Science in China, 2008, 34(5): |
Automatic Indexing Model Based on Conditional Random Fields |
|
DOI: |
Key words: keyword extraction, conditional random fields, automatic indexing |
Chinese keywords: keyword-extraction indexing, conditional random fields, automatic indexing |
Funding: |
|
Abstract: |
The conditional random fields (CRF) model is a state-of-the-art sequence labeling method that can exploit document features more fully and effectively. Because keyword extraction can be treated as a sequence labeling problem, a keyword extraction model based on CRF is proposed and implemented. Experimental results show that the CRF model outperforms other machine learning methods, such as support vector machines and multiple linear regression, on the keyword extraction task. 1 fig. 3 tabs. 32 refs. |
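Chinese abstract: |
The conditional random fields (CRF) model is a probabilistic graphical model. To make effective use of the features of the objects being indexed, and considering that keyword-extraction indexing can be converted into a sequence labeling problem, this paper proposes an automatic keyword-extraction indexing model based on conditional random fields. Experimental results show that, in improving keyword-extraction indexing performance, the model outperforms other machine learning methods such as support vector machines and multiple linear regression, and is the best method to date for solving sequence labeling problems. However, the model by itself cannot resolve the problems caused by synonyms and near-synonyms in the samples; the lexical semantics present in the training set and in the indexing process need further consideration in order to improve indexing quality. 1 fig. 3 tabs. 32 refs. |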
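
The abstract's central idea, casting keyword extraction as sequence labeling, can be illustrated with a minimal Python sketch. The tag set (`B-KW`, `I-KW`, `O`), the function names `bio_labels` and `token_features`, and the features themselves are assumptions chosen for illustration; the abstract does not state which tag scheme or feature set the authors used. The sketch only prepares the per-token labels and features that a CRF trainer would consume, and does not train a model.

```python
from typing import Dict, List, Sequence


def bio_labels(tokens: Sequence[str], keywords: Sequence[Sequence[str]]) -> List[str]:
    """Tag each token as B-KW (keyword start), I-KW (keyword continuation), or O.

    The B/I/O scheme is an assumption for illustration, not the paper's tag set.
    """
    labels = ["O"] * len(tokens)
    lowered = [t.lower() for t in tokens]
    # Match longer keywords first so a short keyword cannot split a longer one.
    for kw in sorted(keywords, key=len, reverse=True):
        kw_lower = [w.lower() for w in kw]
        n = len(kw_lower)
        if n == 0:
            continue
        for i in range(len(tokens) - n + 1):
            span_free = all(label == "O" for label in labels[i:i + n])
            if lowered[i:i + n] == kw_lower and span_free:
                labels[i] = "B-KW"
                for j in range(i + 1, i + n):
                    labels[j] = "I-KW"
    return labels


def token_features(tokens: Sequence[str], i: int) -> Dict[str, object]:
    """Hypothetical per-token features (surface form, relative position, neighbours)."""
    return {
        "lower": tokens[i].lower(),
        "position": i / len(tokens),
        "prev": tokens[i - 1].lower() if i > 0 else "<BOS>",
        "next": tokens[i + 1].lower() if i < len(tokens) - 1 else "<EOS>",
    }


tokens = "Keyword extraction with conditional random fields".split()
keywords = [["conditional", "random", "fields"], ["keyword", "extraction"]]
labels = bio_labels(tokens, keywords)
features = [token_features(tokens, i) for i in range(len(tokens))]
```

Here `labels` is `["B-KW", "I-KW", "O", "B-KW", "I-KW", "I-KW"]`. Pairs of `features` and `labels` like these are the kind of training input a CRF toolkit expects; at prediction time, the keywords would be read back from the predicted `B-KW`/`I-KW` runs.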