CharBiLSTM-Att-CRF Model for Low-Resource Named Entity Recognition

CharBiLSTM-Att-BCRF Model for Low Resource Named Entity Recognition

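Abstract: When labeled data are scarce, existing models are limited by the small amount of training data, and their parameters do not fit as well as expected, which leads to poor recognition performance on low-resource named entity recognition tasks. This paper uses K-fold cross-validation so that the model fits the data better. In addition, building on the BiLSTM-CRF model, it integrates multi-layer character feature information and a self-attention mechanism and, combined with K-fold cross-validation, constructs the CharBiLSTM-Att-CRF model. On 20% of the CONLL2003 dataset and 20% of the BC5CDR dataset, the proposed CharBiLSTM-Att-CRF model improves the F1 score over the BiLSTM-CRF model by 7.00% and 4.08%, respectively. The model adapts well to low-resource named entity recognition tasks.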
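
As a rough illustration of the K-fold cross-validation mentioned above, the sketch below splits a small labeled set into K folds so that every sample is used for validation exactly once. The abstract does not say how the folds are built or how the per-fold results are combined, so the function name `k_fold_splits`, the shuffling seed, and the example sizes are all assumptions made for illustration, not the paper's procedure.

```python
import random


def k_fold_splits(n_samples, k, seed=0):
    """Yield (train_idx, val_idx) index lists for k-fold cross-validation.

    Hypothetical helper: the name, signature, and seed are illustrative only.
    """
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Spread the remainder so that fold sizes differ by at most one.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        val_idx = indices[start:start + size]
        train_idx = indices[:start] + indices[start + size:]
        yield train_idx, val_idx
        start += size


# Assumed example: 10 annotated sentences split into 5 folds.
folds = list(k_fold_splits(n_samples=10, k=5))
for fold_id, (train_idx, val_idx) in enumerate(folds):
    print(f"fold {fold_id}: train={len(train_idx)} val={len(val_idx)}")
```

In a low-resource setting, each fold would train one model on `train_idx` and evaluate it on `val_idx`, which lets all of the limited annotations contribute to both training and validation.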

English abstract: When labeled data are scarce, existing models are limited by the amount of training data, and their parameters do not fit as well as expected, resulting in poor recognition performance on low-resource named entity recognition tasks. A new loss function that incorporates the Bernoulli distribution is proposed to make the model fit the data better. In addition, building on the BiLSTM-CRF model, this paper integrates multi-layer character feature information and a self-attention mechanism, and combines them with the new Bernoulli-based loss function to construct the BiLSTM-Att-BCRF model. On 20% of the CONLL2003 dataset and 20% of the BC5CDR dataset, the F1 score of the proposed BiLSTM-BCRF model increased by 7.00% and 4.08%, respectively. The model adapts well to low-resource named entity recognition tasks.

Version history

[V1] 2022-07-19 06:53:04 chinaXiv:202207.00144V1
Peer review status
Pending review