stan-cn-nlp: an API wrapper based on the Stanford NLP packages, for the convenience of Chinese users
An API wrapper around the Stanford NLP packages for the convenience of Chinese
users. This package is based on the stan-cn-* family.
It bundles segmentation, named entity recognition (NER) and tagging together.
If you only need one of them, you can use stan-cn-seg, stan-cn-ner or
stan-cn-tag separately.
The original Stanford CoreNLP packages in Maven Central ship with default
language settings for English only. If you are working with Simplified Chinese,
you still need to download the Chinese models and adjust some configuration files.
This is not much work on a single machine, but the effort multiplies when you
deploy these packages to a server cluster.
Whether you run a single node or a server farm, it is convenient to have
packages whose default settings already point to the Chinese language
models. That is what we provide.
Comments, reviews, bug reports and patches are welcome.
The current version is 0.0.4, based on Stanford CoreNLP 3.2.0 with minor fixes.
Include the dependency below in your build:
Maven:
<dependency>
  <groupId>com.guokr</groupId>
  <artifactId>stan-cn-nlp</artifactId>
  <version>0.0.4</version>
</dependency>
Leiningen:
[com.guokr/stan-cn-nlp "0.0.4"]
sbt:
libraryDependencies += "com.guokr" % "stan-cn-nlp" % "0.0.4"
We keep the API very simple to reduce complexity. To use your own settings:
new SegWrapper(settings).segment(text);
new NerWrapper(settings).recognize(text);
new TagWrapper(settings).tag(text);
Or, if you want to use the default language models, just use:
__PKG__.INSTANCE.segment(text);
__PKG__.INSTANCE.recognize(text);
__PKG__.INSTANCE.tag(text);
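The two calling styles above (a wrapper built from explicit settings, and a shared INSTANCE with defaults) can be illustrated with a minimal, self-contained sketch. This is not the package's implementation: `Segmenter`, `ToySegWrapper`, `Defaults` and the `delimiter` settings key are hypothetical names made up for illustration, the toy segmenter simply splits text into single characters, and this README does not state the real settings type or return types, so `Properties` and `String` are assumptions.

```java
import java.util.Properties;

public class DefaultInstanceSketch {

    /** Hypothetical stand-in for a wrapper such as SegWrapper; not the real class. */
    interface Segmenter {
        String segment(String text);
    }

    /** Toy segmenter: emits every non-whitespace character as its own token. */
    static final class ToySegWrapper implements Segmenter {
        private final String delimiter;

        ToySegWrapper(Properties settings) {
            // "delimiter" is an invented key; fall back to a space when it is absent.
            this.delimiter = settings.getProperty("delimiter", " ");
        }

        @Override
        public String segment(String text) {
            StringBuilder out = new StringBuilder();
            text.codePoints()
                .filter(cp -> !Character.isWhitespace(cp))
                .forEach(cp -> {
                    if (out.length() > 0) {
                        out.append(delimiter);
                    }
                    out.appendCodePoint(cp);
                });
            return out.toString();
        }
    }

    /** Enum singleton that builds the wrapper once, using default settings. */
    enum Defaults implements Segmenter {
        INSTANCE;

        private final Segmenter delegate = new ToySegWrapper(new Properties());

        @Override
        public String segment(String text) {
            return delegate.segment(text);
        }
    }

    public static void main(String[] args) {
        // Four Chinese characters ("ni hao shi jie"), written as Unicode escapes.
        String text = "\u4f60\u597d\u4e16\u754c";

        // Style 1: explicit settings passed to the wrapper.
        Properties custom = new Properties();
        custom.setProperty("delimiter", "/");
        System.out.println(new ToySegWrapper(custom).segment(text));

        // Style 2: shared instance with default settings.
        System.out.println(Defaults.INSTANCE.segment(text));
    }
}
```

The point of the second style is that the wrapper is constructed once and then reused, which is a plausible reason to offer an INSTANCE entry point when models are expensive to load; whether the real package is built this way is an assumption.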
Please follow the steps below to try it out:
Before releasing this package to Maven Central, please execute the commands below:
GPLv2, the same as the license of the Stanford CoreNLP package.