Project author: tianlinyang

Project description: Transition-based NER system
Language: Python
Repository: git://github.com/tianlinyang/stack-lstm-ner.git
Created: 2017-12-30T09:04:41Z
Project page: https://github.com/tianlinyang/stack-lstm-ner


stack-lstm-ner

PyTorch implementation of Transition-based NER system [1].

Requirements

  • Python 3.x
  • PyTorch 0.3.0

Task

Given a sentence, assign a tag to each word. A classical application is Named Entity Recognition (NER). Here is an example:

  John  lives in New   York
  B-PER O     O  B-LOC I-LOC

Corresponding sequence of actions:

  SHIFT
  REDUCE(PER)
  OUT
  OUT
  SHIFT
  SHIFT
  REDUCE(LOC)
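
For intuition, the mapping from a BIO tag sequence to this action sequence can be sketched as below. This is an illustrative snippet, not code from this repository, and the function name tags_to_actions is made up for the example.

def tags_to_actions(tags):
    """Convert a BIO tag sequence into SHIFT / REDUCE / OUT actions.

    Illustrative sketch only: every word inside an entity is SHIFTed onto
    the stack, REDUCE(type) closes the entity, and 'O' words emit OUT.
    """
    actions = []
    open_type = None                      # entity type currently on the stack
    for tag in tags:
        if tag == "O":
            if open_type is not None:     # close the pending entity first
                actions.append("REDUCE(%s)" % open_type)
                open_type = None
            actions.append("OUT")
        else:                             # 'B-XXX' or 'I-XXX'
            prefix, ent_type = tag.split("-", 1)
            if prefix == "B" and open_type is not None:
                actions.append("REDUCE(%s)" % open_type)
            actions.append("SHIFT")
            open_type = ent_type
    if open_type is not None:             # entity ending at the sentence end
        actions.append("REDUCE(%s)" % open_type)
    return actions

print(tags_to_actions(["B-PER", "O", "O", "B-LOC", "I-LOC"]))
# ['SHIFT', 'REDUCE(PER)', 'OUT', 'OUT', 'SHIFT', 'SHIFT', 'REDUCE(LOC)']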

Data format

The training data must be in the following format (identical to the CoNLL 2003 dataset).

A default test file is provided to help you get started.

  John B-PER
  lives O
  in O
  New B-LOC
  York I-LOC
  . O
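
If it helps, here is a minimal sketch of a reader for this two-column, blank-line-separated format. It is not part of the repository; read_conll and the file path are illustrative names.

def read_conll(path):
    """Read a two-column file into sentences of (word, tag) pairs.

    Illustrative helper, not part of this repository: tokens sit one per
    line, and a blank line ends the current sentence.
    """
    sentences, current = [], []
    with open(path, encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if not line:                  # blank line closes the sentence
                if current:
                    sentences.append(current)
                    current = []
                continue
            fields = line.split()
            current.append((fields[0], fields[-1]))
    if current:                           # file may not end with a blank line
        sentences.append(current)
    return sentences

# e.g. read_conll("data/train.txt") -> [[('John', 'B-PER'), ('lives', 'O'), ...], ...]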

Training

To train the model, run train.py with the following parameters:

  --rand_embedding   # use this to randomly initialize the word embeddings
  --emb_file         # path to the pre-trained word embedding file
  --char_structure   # character-level encoder, 'lstm' or 'cnn'
  --train_file       # path to the training file
  --dev_file         # path to the development file
  --test_file        # path to the test file
  --gpu              # GPU id; set to -1 for CPU mode
  --update           # optimizer, 'sgd' or 'adam'
  --batch_size       # batch size, default 100
  --singleton_rate   # rate at which low-frequency words are replaced with '<unk>'
  --checkpoint       # path for checkpoints and the saved model
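
For example, a typical run might look like the following. The file paths and option values here are placeholders for illustration, not files shipped with the repository.

  python train.py \
      --emb_file glove.6B.100d.txt \
      --char_structure lstm \
      --train_file data/train.txt \
      --dev_file data/dev.txt \
      --test_file data/test.txt \
      --update sgd \
      --batch_size 100 \
      --gpu 0 \
      --checkpoint ./checkpoints/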

Decoding

To tag a raw file, simply run predict.py with the following parameters:

  --load_arg          # path to the saved json file with all arguments
  --load_check_point  # path to the saved model
  --test_file         # path to the test file
  --test_file_out     # path to the tagged output file
  --batch_size        # batch size
  --gpu               # GPU id; set to -1 for CPU mode

Please be aware that when using the model in stack_lstm.py, --batch_size must be 1.
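
For example (the paths below are placeholders, and --batch_size is set to 1 in line with the note above):

  python predict.py \
      --load_arg ./checkpoints/args.json \
      --load_check_point ./checkpoints/model.model \
      --test_file data/raw.txt \
      --test_file_out data/raw_tagged.txt \
      --batch_size 1 \
      --gpu -1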

Result

When models are trained only on the CoNLL 2003 English NER dataset, the results are summarized below.

  Model                Variant                              F1      Time(h)
  Lample et al. 2016   pretrain                             86.67
                       pretrain + dropout                   87.96
                       pretrain + dropout + char            90.33
  Our Implementation   pretrain + dropout
                       pretrain + dropout + char (BiLSTM)
                       pretrain + dropout + char (CNN)

Author

Huimeng Zhang: zhang_huimeng@foxmail.com

References

[1] Lample et al., Neural Architectures for Named Entity Recognition, 2016