Zhang Chengzhi, Su Xinning. Automatic Indexing Model Based on Conditional Random Fields [J]. Journal of Library Science in China, 2008, 34(5):
Automatic Indexing Model Based on Conditional Random Fields
基于条件随机场的自动标引模型研究
  
DOI:
Key words: keyword extraction, conditional random fields, automatic indexing
Chinese keywords: 抽词标引, 条件随机场, 自动标引
Funding:
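Author Name        Affiliation
Zhang Chengzhi     Department of Information Management, Nanjing University of Science and Technology
Su Xinning         Department of Information Management, Nanjing University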
Abstract:
The Conditional Random Fields (CRF) model is a state-of-the-art sequence labeling method that can exploit document features more fully and effectively. Because keyword extraction can be treated as a sequence labeling problem, a CRF-based keyword extraction model is proposed and implemented. Experimental results show that the CRF model outperforms other machine learning methods, such as support vector machines and multiple linear regression, on the keyword extraction task. 1 fig. 3 tabs. 32 refs.
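To illustrate how keyword extraction can be recast as sequence labeling, the following is a minimal sketch and not the authors' implementation. It assumes a BIO tag scheme (`B-KW`, `I-KW`, `O`) and a handful of simple token features; the abstract does not specify the tag set or the features actually used, so the names `bio_labels` and `token_features` and the feature choices are hypothetical.

```python
from typing import Dict, List, Sequence


def bio_labels(tokens: Sequence[str], keywords: Sequence[Sequence[str]]) -> List[str]:
    """Tag each token as B-KW (keyword start), I-KW (inside a keyword) or O (outside)."""
    labels = ["O"] * len(tokens)
    lowered = [t.lower() for t in tokens]
    # Match longer keywords first so a long phrase is not claimed by a shorter one.
    for kw in sorted(keywords, key=len, reverse=True):
        kw_lower = [w.lower() for w in kw]
        n = len(kw_lower)
        if n == 0:
            continue
        for i in range(len(tokens) - n + 1):
            span_is_free = all(lab == "O" for lab in labels[i:i + n])
            if span_is_free and lowered[i:i + n] == kw_lower:
                labels[i] = "B-KW"
                for j in range(i + 1, i + n):
                    labels[j] = "I-KW"
    return labels


def token_features(tokens: Sequence[str], i: int) -> Dict[str, object]:
    """Illustrative per-token features (assumed, not taken from the paper)."""
    tok = tokens[i]
    return {
        "lower": tok.lower(),
        "length": len(tok),
        "relative_position": i / len(tokens),
        "prev": tokens[i - 1].lower() if i > 0 else "<BOS>",
        "next": tokens[i + 1].lower() if i < len(tokens) - 1 else "<EOS>",
    }


# Toy document with its manually assigned keywords.
tokens = "conditional random fields improve automatic indexing".split()
keywords = [["conditional", "random", "fields"], ["automatic", "indexing"]]

labels = bio_labels(tokens, keywords)
features = [token_features(tokens, i) for i in range(len(tokens))]
```

Pairs of `features` and `labels` built this way from indexed documents could serve as training sequences for a CRF toolkit; at prediction time, contiguous `B-KW`/`I-KW` runs would be read back as extracted keywords.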
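Chinese abstract:
      The Conditional Random Fields (CRF) model is a probabilistic graphical model. To make effective use of the features of the objects being indexed, and considering that keyword-extraction indexing can be recast as a sequence labeling problem, this paper proposes an automatic keyword-extraction indexing model based on conditional random fields. Experimental results show that, in improving keyword-extraction indexing performance, the model outperforms other machine learning methods such as support vector machines and multiple linear regression, and that it is the best method to date for sequence labeling problems. However, the model by itself cannot yet resolve the problems caused by synonyms and near-synonyms in the samples; the lexical semantics of the training set and of the indexing process need further consideration in order to improve indexing quality. 1 fig. 3 tabs. 32 refs.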
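For reference, the probabilistic graphical model named in the abstract is commonly written in its linear-chain form as below. This is the standard textbook formulation, given here on the assumption that the paper uses a linear-chain CRF; the abstract does not state the variant. Here $x$ is the token sequence, $y$ the label sequence, $f_k$ the feature functions and $\lambda_k$ their learned weights.

```latex
p(y \mid x) = \frac{1}{Z(x)} \exp\!\left( \sum_{t=1}^{T} \sum_{k} \lambda_k \, f_k(y_{t-1}, y_t, x, t) \right),
\qquad
Z(x) = \sum_{y'} \exp\!\left( \sum_{t=1}^{T} \sum_{k} \lambda_k \, f_k(y'_{t-1}, y'_t, x, t) \right)
```

Indexing then amounts to choosing the label sequence $y$ that maximizes $p(y \mid x)$ for a given document.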