- This is a college course project on collecting social data from the Internet for an NLP emotion classification task.
- Techniques applied:
- Data collection and cleaning
- Annotation guidelines
- Word tokenization, deep learning models
- About
- Table of contents
- Data source
- Experiment pipelines
- Data details
- Code
- Presentation slides and Report
- References
- Cite us
- Label distribution
- Word count
- Comment length distribution over labels
- Annotation agreement results over 5 rounds
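Annotation agreement between two annotators in a round could be measured along the following lines. This is a minimal sketch, not the project's actual method: the choice of Cohen's kappa, the function name `cohen_kappa`, and the example labels are all assumptions made for illustration. The metric and label set the project really used are described in the report.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators labeling the same items.

    Assumption: the agreement metric used in the project may differ;
    this function is only an illustrative sketch.
    """
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("Both annotators must label the same non-empty item list.")

    n = len(labels_a)
    # Observed agreement: fraction of items given the same label by both.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n

    # Expected agreement by chance, from each annotator's label frequencies.
    count_a = Counter(labels_a)
    count_b = Counter(labels_b)
    expected = sum(count_a[label] * count_b[label] for label in count_a) / (n * n)

    if expected == 1.0:
        # Both annotators used a single identical label for every item.
        return 1.0
    return (observed - expected) / (1 - expected)


# Hypothetical labels for one annotation round (not taken from the dataset).
round_a = ["positive", "negative", "neutral", "positive"]
round_b = ["positive", "negative", "positive", "positive"]
kappa = cohen_kappa(round_a, round_b)
print(f"Cohen's kappa: {kappa:.3f}")
```

Running the same function once per round gives one score per round, which makes it easy to see whether agreement improves as the guidelines are revised.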
- Feature extraction, model training, and the remaining steps in this repo are implemented in Google Colab.
- All code is organized in `name.ipynb` files.
- All references are cited in the report file.
@INPROCEEDINGS{9997964,
author={Van Duong, Binh and Nguyen, An Trong and Ha, Chien Nhu and Duong, Hong-Hanh Thi and Tran, My-Linh Thi and Do, Trong-Hop},
booktitle={2022 25th Conference of the Oriental COCOSDA International Committee for the Co-ordination and Standardisation of Speech Databases and Assessment Techniques (O-COCOSDA)},
title={UIT-VLFC: Vietnamese Lipstick Feedbacks Corpus},
year={2022},
volume={},
number={},
pages={1-5},
doi={10.1109/O-COCOSDA202257103.2022.9997964}}