Transition-based NER system
A PyTorch implementation of a transition-based NER system [1].
Given a sentence, the model assigns a tag to each word; the classic application is Named Entity Recognition (NER). Here is an example:

```
John  lives in New   York
B-PER O     O  B-LOC I-LOC
```

The corresponding sequence of actions is:

```
SHIFT
REDUCE(PER)
OUT
OUT
SHIFT
SHIFT
REDUCE(LOC)
```
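As a minimal sketch of the transition system (not code from this repo; `tags_to_actions` is a hypothetical helper), the mapping from a BIO tag sequence to such an action sequence works like this: `OUT` moves an O-tagged word directly to the output, `SHIFT` pushes an entity word onto the stack, and `REDUCE(type)` pops the stacked words into a completed entity.

```python
def tags_to_actions(tags):
    """Convert a BIO tag sequence into SHIFT/REDUCE/OUT transition actions."""
    actions = []
    for i, tag in enumerate(tags):
        if tag == 'O':
            actions.append('OUT')    # move the word directly to the output
        else:
            actions.append('SHIFT')  # push the word onto the stack
            # close the entity once the next tag no longer continues it
            next_tag = tags[i + 1] if i + 1 < len(tags) else 'O'
            if not next_tag.startswith('I-'):
                actions.append('REDUCE(%s)' % tag.split('-')[1])
    return actions

print(tags_to_actions(['B-PER', 'O', 'O', 'B-LOC', 'I-LOC']))
# ['SHIFT', 'REDUCE(PER)', 'OUT', 'OUT', 'SHIFT', 'SHIFT', 'REDUCE(LOC)']
```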
The training data must be in the same format as the CoNLL 2003 dataset: one word per line followed by its tag, with a blank line separating sentences. A default test file is provided to help you get started.

```
John B-PER
lives O
in O
New B-LOC
York I-LOC
. O
```
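For illustration, a minimal reader for this format might look like the sketch below (`read_conll` is a hypothetical helper, not part of the repo):

```python
def read_conll(path):
    """Read a CoNLL-style file into a list of (words, tags) sentence pairs."""
    sentences, words, tags = [], [], []
    with open(path, encoding='utf-8') as f:
        for line in f:
            line = line.strip()
            if not line:  # a blank line marks the end of a sentence
                if words:
                    sentences.append((words, tags))
                    words, tags = [], []
                continue
            columns = line.split()
            words.append(columns[0])   # the word is the first column
            tags.append(columns[-1])   # the NER tag is the last column
    if words:  # flush the final sentence if there is no trailing blank line
        sentences.append((words, tags))
    return sentences
```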
To train the model, run `train.py` with the following parameters:

```
--rand_embedding   # use this if you want to randomly initialize the embeddings
--emb_file         # path to the word embedding file
--char_structure   # character-level structure, 'lstm' or 'cnn'
--train_file       # path to the training file
--dev_file         # path to the development file
--test_file        # path to the test file
--gpu              # GPU id; set to -1 to use CPU mode
--update           # optimizer, 'sgd' or 'adam'
--batch_size       # batch size, default 100
--singleton_rate   # rate at which low-frequency words are replaced with '<unk>'
--checkpoint       # path for checkpoints and the saved model
```
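For example, a training run might look like the following; the data and embedding paths are placeholders for your own files:

```
python train.py --train_file data/eng.train \
                --dev_file data/eng.testa \
                --test_file data/eng.testb \
                --emb_file data/glove.6B.100d.txt \
                --char_structure lstm \
                --update sgd \
                --gpu 0 \
                --checkpoint ./checkpoint/
```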
To tag a raw file, simply run `predict.py` with the following parameters:

```
--load_arg           # path to the saved JSON file with all arguments
--load_check_point   # path to the saved model
--test_file          # path to the test file
--test_file_out      # path to the test file output
--batch_size         # batch size
--gpu                # GPU id; set to -1 to use CPU mode
```
Please be aware that when using the model in `stack_lstm.py`, `--batch_size` must be 1.
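For example, a tagging run might look like this; the checkpoint and file names are placeholders:

```
python predict.py --load_arg ./checkpoint/args.json \
                  --load_check_point ./checkpoint/model.model \
                  --test_file data/raw_text.txt \
                  --test_file_out data/raw_text.tagged \
                  --gpu 0
```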
When trained only on the CoNLL 2003 English NER dataset, the results are summarized below.
| Model | Variant | F1 | Time (h) |
|---|---|---|---|
| Lample et al. 2016 | pretrain | 86.67 | |
| | pretrain + dropout | 87.96 | |
| | pretrain + dropout + char | 90.33 | |
| Our Implementation | pretrain + dropout | | |
| | pretrain + dropout + char (BiLSTM) | | |
| | pretrain + dropout + char (CNN) | | |
Huimeng Zhang: zhang_huimeng@foxmail.com
[1] Guillaume Lample, Miguel Ballesteros, Sandeep Subramanian, Kazuya Kawakami, and Chris Dyer. Neural Architectures for Named Entity Recognition. NAACL 2016.