binhfdv/DS202_Data-Science-thesis-1

DS202 / Data science thesis 1

About

  • This is a college course project on collecting social data from the Internet for an NLP emotion classification task.
  • Techniques applied:
      • Data collection and cleaning
      • Annotation guidelines
      • Word tokenization and deep learning models

Table of contents

Data source

Experiment pipelines

Data details

  • Label distribution

  • Word count

  • Comment length distribution over labels

  • Annotation agreement results over 5 rounds
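
As a rough illustration of how agreement between two annotators could be measured in one round, here is a minimal sketch using Cohen's kappa. The metric, the function name `cohen_kappa`, and the label values are assumptions made for this example; the agreement measure actually used in the project is described in the report.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators labeling the same items.

    Illustrative helper only, not taken from the project notebooks.
    """
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("Both annotators must label the same non-empty set of items.")

    n = len(labels_a)
    # Observed agreement: fraction of items given the same label by both annotators.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n

    # Expected chance agreement from each annotator's label frequencies.
    count_a = Counter(labels_a)
    count_b = Counter(labels_b)
    expected = sum(count_a[label] * count_b[label] for label in count_a) / (n * n)

    if expected == 1.0:
        # Both annotators used a single identical label for every item.
        return 1.0
    return (observed - expected) / (1 - expected)


# Hypothetical labels for six comments in one annotation round.
annotator_1 = ["positive", "negative", "neutral", "positive", "negative", "positive"]
annotator_2 = ["positive", "negative", "positive", "positive", "neutral", "positive"]

kappa = cohen_kappa(annotator_1, annotator_2)
print(f"Cohen's kappa: {kappa:.3f}")
```

Running the same calculation after each round would give one score per round, which is the kind of series an agreement plot over 5 rounds summarizes. With more than two annotators, a multi-rater measure such as Fleiss' kappa would be the usual alternative.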

Code

  • Feature extraction, model training, and related steps in this repo are implemented in Google Colab.
  • All code is organized in name.ipynb files.

Presentation slides and Report

References

  • All references are cited in the report file.

Cite us

@INPROCEEDINGS{9997964,
  author={Van Duong, Binh and Nguyen, An Trong and Ha, Chien Nhu and Duong, Hong-Hanh Thi and Tran, My-Linh Thi and Do, Trong-Hop},
  booktitle={2022 25th Conference of the Oriental COCOSDA International Committee for the Co-ordination and Standardisation of Speech Databases and Assessment Techniques (O-COCOSDA)}, 
  title={UIT-VLFC: Vietnamese Lipstick Feedbacks Corpus}, 
  year={2022},
  volume={},
  number={},
  pages={1-5},
  doi={10.1109/O-COCOSDA202257103.2022.9997964}}
