Project author: wannaphong

Project description:
Thai Word Segmentation using TCC + Bidirectional RNNs
Language: Python
Repository: git://github.com/wannaphong/NokCut.git
Created: 2018-11-30T06:30:32Z
Project page: https://github.com/wannaphong/NokCut

License: Apache License 2.0


NokCut

Thai Word Segmentation using TCC + Bidirectional RNNs (PyTorch)

Credit: code from "A Beginner’s Guide to Deep NLP with PyTorch" by Dr. Prachya Boonkwan.

Colab Notebook : https://colab.research.google.com/drive/1WS08VsjlZGAmCGsoI7AlRm-Do3zo-b-g

Install

  pip install nokcut

Trained on the BEST I Corpus training set (90% training, 10% test).

  epoch: 6
  loss: 0.017879242024514966
  f1: 98.47012481095481
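
The 90%/10% split mentioned above could be produced along these lines. This is only a sketch: the helper name `split_corpus`, the shuffling step, and the fixed seed are assumptions made for illustration, and the README does not say how NokCut actually divided the data.

```python
import random


def split_corpus(lines, train_fraction=0.9, seed=0):
    """Hypothetical helper: shuffle corpus lines and split them into two parts."""
    shuffled = list(lines)
    # Assumption: a seeded shuffle, so the same split is reproduced on every run.
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


# Placeholder data standing in for lines of the training corpus.
sample = [f"line {i}" for i in range(100)]
train_lines, held_out_lines = split_corpus(sample)
print(len(train_lines), len(held_out_lines))
```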

Results on the BEST I Corpus test set

  F-measure: 96.94929
  Recall: 122271.00000/125850.00000 = 97.15614
  Precision: 122271.00000/126387.00000 = 96.74333
  Number of incorrect: 3579.00000 words
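
The figures above follow from three counts: 122271 correctly segmented words, 125850 words in the reference segmentation, and 126387 words in the predicted segmentation. The sketch below recomputes them. The function and variable names are illustrative and are not part of NokCut, and the last digit may differ slightly from the reported values because of rounding.

```python
def precision_recall_f1(correct, n_reference, n_predicted):
    """Word-level precision, recall and F-measure, as percentages."""
    recall = 100.0 * correct / n_reference
    precision = 100.0 * correct / n_predicted
    # The F-measure is the harmonic mean of precision and recall.
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Counts taken from the results listed above.
correct, n_reference, n_predicted = 122271, 125850, 126387

precision, recall, f1 = precision_recall_f1(correct, n_reference, n_predicted)
print(f"Precision: {precision:.5f}")
print(f"Recall: {recall:.5f}")
print(f"F-measure: {f1:.5f}")

# The reported "number of incorrect" equals the reference words not recovered.
missed = n_reference - correct
print(f"Missed reference words: {missed}")
```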

Mr. Wannaphong Phatthiyaphaibun
wannaphong@kkumail.com