Project author: abhinavg97

Project description:
Multilabel Aspect Prediction using Graph Convolutional Networks
Language: Python
Repository URL: git://github.com/abhinavg97/Learnable_DropEdge.git
Created: 2020-08-05T08:57:53Z
Project community: https://github.com/abhinavg97/Learnable_DropEdge

License: MIT License



Aspect Based Sentiment Analysis with GCN

This project is a work in progress. It aims to perform sentiment analysis using a text GCN.

Currently done:

  • Aspect terms are identified from user opinions.
  • Dependency parsing is used to capture syntactic structure.
  • A Graph Convolutional Network is used to capture dependencies between aspects and opinions.
  • Stratified split is used to ensure even distribution of aspect classes among train, validation and test data.
  • For predicting the aspect terms, MultilabelClassification from the simpletransformers library is used as the baseline.
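
The dependency-parsing and GCN steps listed above can be illustrated with a minimal sketch. It is not the code from this repository: the helper names, the toy features and weights, and the hand-written parse are all assumptions, and the layer follows the standard GCN propagation rule ReLU(D^{-1/2} (A + I) D^{-1/2} H W) rather than anything confirmed by the source.

```python
import math

def adjacency_from_heads(heads):
    """Build an undirected adjacency matrix with self-loops.

    heads[i] is the index of token i's syntactic head (hypothetical input
    format); the root token points to itself.
    """
    n = len(heads)
    adj = [[0.0] * n for _ in range(n)]
    for child, head in enumerate(heads):
        adj[child][child] = 1.0   # self-loop
        adj[child][head] = 1.0    # edge child -> head
        adj[head][child] = 1.0    # edge head -> child (graph is undirected)
    return adj

def normalize(adj):
    """Symmetric normalization: entry (i, j) is divided by sqrt(deg_i * deg_j)."""
    deg = [sum(row) for row in adj]
    n = len(adj)
    return [[adj[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)]
            for i in range(n)]

def matmul(a, b):
    """Plain-Python matrix product of two lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]

def gcn_layer(adj_norm, features, weights):
    """One graph-convolution layer: ReLU(adj_norm @ features @ weights)."""
    hidden = matmul(matmul(adj_norm, features), weights)
    return [[max(0.0, value) for value in row] for row in hidden]

# Hand-written (assumed) parse of "food was great": "was" (index 1) is the root.
heads = [1, 1, 1]
adj_norm = normalize(adjacency_from_heads(heads))
features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]  # toy 2-d token features
weights = [[1.0, -1.0], [0.5, 0.5]]               # toy 2x2 weight matrix
output = gcn_layer(adj_norm, features, weights)
```

Each row of `output` is the updated representation of one token after mixing in its syntactic neighbours. In practice the heads would come from a dependency parser and the weights would be learned.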

Backlog:

  1. Connect the adjacency-matrix update code to the main pipeline

Datasets

Six datasets are used to evaluate our model.

All datasets are cleaned with the text processing pipeline described in the paper. The pipeline is also described in the utils folder of the absa_gnn module in this repository.

The cleaned data is stored in the data folder of this repository. The format of the data is [text labels].

Text contains the cleaned text from the datasets mentioned above. Labels contain a multi-hot vector as described in the paper.
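
As an illustration of the multi-hot label format, here is a minimal sketch. The class names, the sample text, and the `multi_hot` helper are hypothetical; the actual classes and on-disk layout are documented in the data folder.

```python
def multi_hot(labels, classes):
    """Return a 0/1 vector with a 1 at the position of every label present."""
    index = {name: i for i, name in enumerate(classes)}
    vector = [0] * len(classes)
    for label in labels:
        vector[index[label]] = 1
    return vector

# Hypothetical aspect classes and sample, for illustration only.
classes = ["food", "price", "service"]
sample_text = "great food but slow service"
sample_labels = multi_hot(["food", "service"], classes)

# One record in the [text labels] shape described above.
record = [sample_text, sample_labels]
```

A sample that mentions several aspects gets a 1 in each matching position, which is what makes the task multilabel rather than multiclass.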

For detailed information about the files in each dataset folder, see the data folder.

Please cite us if you find the cleaned datasets helpful in your work.

Folder Structure

Browse the corresponding folders in the absa_gnn module for details.

Setup

Prerequisites

  • Python3

Install the virtual environment package for Python

  1. $ sudo apt install python3-venv

Install the Java dependency

  1. $ sudo apt install openjdk-8-jre-headless

In case pip install gives wheel-related errors:

  1. $ sudo update-alternatives --set java /usr/lib/jvm/java-8-openjdk-amd64/jre/bin/java

This is required by the language-check package.

Clone the repository

  1. $ git clone https://github.com/abhinavg97/ABSA_GNN.git
  2. $ cd ABSA_GNN

Create and Activate the virtual environment

  1. $ python3 -m venv venv
  2. $ source venv/bin/activate

Install

  1. $ pip install -r requirements.txt

Install the spaCy dependencies

  1. $ sudo apt install python3-tk
  2. $ python -m spacy download en_core_web_lg
  3. $ python -m spacy download en_core_web_sm

Using the scripts

Run our model

  1. $ python main.py

Run baseline

  1. $ python baseline.py

Logging

Logging is handled by PyTorch Lightning, which uses TensorBoard by default.

Visualize the metrics:

  1. $ tensorboard --logdir lightning_logs/

or

  1. $ python3 -m tensorboard.main --logdir lightning_logs/

Containerize the application

  1. $ docker image build -t image_name:tag .
  2. $ docker container run --name absa_gnn --mount source=volume_name,target=/usr/src/app image_name:tag

The mounted volume is stored on the host under /var/lib/docker/volumes/

Note: You need sudo permissions to access this directory.

Citation

@misc{
  author = {Gupta, Abhinav and Ghosh, Samujjwal and Konjengbam, Anand},
  title = {ABSA GNN},
  year = {2020},
  publisher = {GitHub},
  journal = {GitHub repository},
  howpublished = {\url{https://github.com/abhinavg97/ABSA_GNN}}
}