Multimodal Transformer for Korean Sentiment Analysis with Audio and Text Features
Train / Dev / Test Split
Preprocess
Place the downloaded dataset as follows:
korean-audiotext-transformer/
└── data/
    └── 4.1 감정분류용 데이터셋/
        ├── 000/
        ├── 001/
        ├── 002/
        ├── ...
        ├── 099/
        ├── participant_info.xlsx
        ├── rename_file.sh
        ├── Script.hwp
        └── test.py
Convert Script.hwp to script.txt
cd 'data/4.1 감정분류용 데이터셋'
hwp5txt Script.hwp --output script.txt
Generate {train, dev, test}.pkl
python preprocess.py \
raw_path='./data/4.1 감정분류용 데이터셋' \
script_path='./data/4.1 감정분류용 데이터셋/script.txt' \
save_path='./data' \
train_size=.8
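As a rough illustration of what `train_size=.8` implies, the sketch below splits item indices into train, dev, and test sets. How `preprocess.py` divides the remaining 20% is not stated here; the even dev/test split, the shuffling, and the helper name `split_indices` are all assumptions made for this example.

```python
import random

def split_indices(n_items, train_size=0.8, seed=0):
    """Shuffle item indices and cut them into train/dev/test lists.

    Assumption: whatever is left after the train portion is divided
    evenly between dev and test. preprocess.py may do this differently.
    """
    indices = list(range(n_items))
    random.Random(seed).shuffle(indices)

    n_train = round(n_items * train_size)
    n_dev = (n_items - n_train) // 2

    train = indices[:n_train]
    dev = indices[n_train:n_train + n_dev]
    test = indices[n_train + n_dev:]
    return train, dev, test

train, dev, test = split_indices(100, train_size=0.8)
print(len(train), len(dev), len(test))
```

With 100 items this yields 80 train, 10 dev, and 10 test indices, and no index appears in more than one split.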
Preprocessed Output (train.pkl)
person_idx | audio | sentence | emotion |
---|---|---|---|
0 | [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, … | 오늘 입고 나가야지. | 행복 |
2 | [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, … | 오늘 입고 나가야지. | 행복 |
7 | [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, … | 오늘 입고 나가야지. | 행복 |
12 | [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, … | 오늘 입고 나가야지. | 행복 |
17 | [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, … | 오늘 입고 나가야지. | 행복 |
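To inspect a generated split, something along these lines may help. The on-disk layout is not documented beyond the columns shown above, so this sketch assumes only that each `.pkl` file is a standard Python pickle; the helper `load_split` and the stand-in rows are hypothetical and exist just to make the example runnable.

```python
import os
import pickle
import tempfile

def load_split(path):
    """Load one pickled split (train, dev, or test) from disk."""
    with open(path, "rb") as f:
        return pickle.load(f)

# Stand-in rows that mimic the columns shown above; not real dataset content.
rows = [
    {"person_idx": 0, "audio": [0, 0, 0], "sentence": "오늘 입고 나가야지.", "emotion": "행복"},
]

with tempfile.TemporaryDirectory() as tmp_dir:
    demo_path = os.path.join(tmp_dir, "train.pkl")
    with open(demo_path, "wb") as f:
        pickle.dump(rows, f)
    loaded = load_split(demo_path)

print(loaded[0]["sentence"], loaded[0]["emotion"])
```

For the real files the call would be `load_split('./data/train.pkl')`. If the splits were saved as pandas DataFrames, pandas must be installed for unpickling to succeed. Only unpickle files from a source you trust.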
conda create -n <your_env_name> python=3.6
conda activate <your_env_name>
conda install pip
pip install -r requirements.txt
Download the fine-tuned BERT model and place it as follows:
korean-audiotext-transformer/
└── KoBERT/
    ├── args.bin
    ├── config.json
    ├── model.bin
    ├── tokenization.py
    └── vocab.list
python train.py \
--data_path='./data' \
--bert_path='./KoBERT' \
--save_path='./result' \
--attn_dropout=.2 \
--relu_dropout=.1 \
--emb_dropout=.2 \
--res_dropout=.1 \
--out_dropout=.1 \
--n_layers=2 \
--d_model=64 \
--n_heads=8 \
--lr=1e-5 \
--epochs=10 \
--batch_size=64 \
--clip=1.0 \
--warmup_percent=.1 \
--max_len_audio=400 \
--sample_rate=48000 \
--resample_rate=16000 \
--n_fft_size=400 \
--n_mfcc=40
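The `--warmup_percent=.1` flag suggests the learning rate ramps up over the first 10% of training steps. The exact schedule in `train.py` is not shown here; the sketch below assumes a linear warmup followed by a linear decay to zero, and the function name `lr_at_step` and the step count are illustrative only.

```python
def lr_at_step(step, total_steps, base_lr=1e-5, warmup_percent=0.1):
    """Learning rate at a given optimizer step.

    Assumption: linear warmup to base_lr, then linear decay to zero.
    train.py may implement a different schedule.
    """
    warmup_steps = max(1, int(total_steps * warmup_percent))
    if step < warmup_steps:
        # Ramp up from 0 to base_lr during warmup.
        return base_lr * step / warmup_steps
    # Decay from base_lr down to 0 over the remaining steps.
    remaining = max(1, total_steps - warmup_steps)
    return base_lr * max(0.0, (total_steps - step) / remaining)

total_steps = 1000  # hypothetical: epochs * batches per epoch
for step in (0, 50, 100, 550, 1000):
    print(step, lr_at_step(step, total_steps))
```

With these settings the rate peaks at `1e-5` at step 100 and falls back to zero at step 1000.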
python train.py --only_audio \
--n_layers=4 \
--n_heads=8 \
--lr=1e-3 \
--epochs=10 \
--batch_size=64
python train.py --only_text \
--n_layers=4 \
--n_heads=8 \
--lr=1e-3 \
--epochs=10 \
--batch_size=64
python eval.py [--FLAGS]
Emotion | Total | 공포 (Fear) | 놀람 (Surprise) | 분노 (Anger) | 슬픔 (Sadness) | 중립 (Neutral) | 행복 (Happiness) | 혐오 (Disgust) |
---|---|---|---|---|---|---|---|---|
F1-score | 33.95 | 75.00 | 33.33 | 44.44 | 22.22 | 18.18 | 44.44 | 0.00 |
Emotion | Total | 공포 (Fear) | 놀람 (Surprise) | 분노 (Anger) | 슬픔 (Sadness) | 중립 (Neutral) | 행복 (Happiness) | 혐오 (Disgust) |
---|---|---|---|---|---|---|---|---|
F1-score | 35.28 | 31.84 | 42.68 | 24.71 | 47.32 | 35.80 | 44.52 | 20.12 |
Emotion | Total | 공포 (Fear) | 놀람 (Surprise) | 분노 (Anger) | 슬픔 (Sadness) | 중립 (Neutral) | 행복 (Happiness) | 혐오 (Disgust) |
---|---|---|---|---|---|---|---|---|
F1-score | 52.54 | 44.18 | 34.44 | 50.95 | 81.81 | 34.28 | 65.93 | 56.19 |
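The Total column is consistent with an unweighted (macro) average of the seven per-class F1 scores. This is an observation from the numbers rather than something stated here, and the helper `macro_f1` is a name chosen for this check. The snippet reproduces the Total for the second and third tables from their per-class values; the first table agrees to within rounding of its per-class values.

```python
def macro_f1(per_class_f1):
    """Unweighted mean of per-class F1 scores."""
    return sum(per_class_f1) / len(per_class_f1)

# Per-class F1 scores copied from the second and third tables, in column order.
second_table = [31.84, 42.68, 24.71, 47.32, 35.80, 44.52, 20.12]
third_table = [44.18, 34.44, 50.95, 81.81, 34.28, 65.93, 56.19]

print(f"{macro_f1(second_table):.2f}")  # 35.28
print(f"{macro_f1(third_table):.2f}")   # 52.54
```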