DS202 / Data science thesis 1

About

This is a college course project on collecting social data from the Internet for an NLP emotion classification task.
Techniques applied:

Data collection and cleaning

Annotation guidelines

Word tokenization, deep learning models

Data source

Experiment pipelines

Data details

Label distribution

Word count

Comment length distribution over labels

Annotation agreement results over 5 rounds
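
The statistics listed above (label distribution, comment length per label, annotation agreement) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the toy comments, the label names, the whitespace-based length count, and the choice of Cohen's kappa as the agreement measure. The actual corpus, tokenizer, and agreement metric are described in the report file, not here.

```python
from collections import Counter, defaultdict
from statistics import mean

# Hypothetical toy data: (comment, label) pairs. Not taken from the real corpus.
samples = [
    ("son dep lam", "positive"),
    ("mau khong giong hinh", "negative"),
    ("giao hang nhanh", "positive"),
    ("binh thuong", "neutral"),
]


def label_distribution(pairs):
    """Count how many comments carry each label."""
    return Counter(label for _, label in pairs)


def mean_length_by_label(pairs):
    """Average comment length per label.

    Length is counted in whitespace-separated tokens, which is a
    simplification; a dedicated Vietnamese word tokenizer would
    group syllables into words differently.
    """
    lengths = defaultdict(list)
    for text, label in pairs:
        lengths[label].append(len(text.split()))
    return {label: mean(values) for label, values in lengths.items()}


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators who labeled the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("both annotators must label the same non-empty item list")
    n = len(labels_a)
    # Fraction of items on which the two annotators agree.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Agreement expected by chance, from each annotator's label frequencies.
    count_a, count_b = Counter(labels_a), Counter(labels_b)
    expected = sum(count_a[k] * count_b[k] for k in count_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both annotators used a single identical label
    return (observed - expected) / (1 - expected)


if __name__ == "__main__":
    print(label_distribution(samples))
    print(mean_length_by_label(samples))
    print(cohen_kappa(["p", "p", "n", "n"], ["p", "n", "n", "n"]))
```

Cohen's kappa only covers a pair of annotators; with more annotators a multi-rater measure such as Fleiss' kappa would be the usual substitute.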

Code

Feature extraction, model training, and the remaining steps in this repo are implemented in Google Colab.
All code is organized in name.ipynb files.

Presentation slides and Report

References

All references are cited in the report file.

Cite us

@INPROCEEDINGS{9997964,
  author={Van Duong, Binh and Nguyen, An Trong and Ha, Chien Nhu and Duong, Hong-Hanh Thi and Tran, My-Linh Thi and Do, Trong-Hop},
  booktitle={2022 25th Conference of the Oriental COCOSDA International Committee for the Co-ordination and Standardisation of Speech Databases and Assessment Techniques (O-COCOSDA)}, 
  title={UIT-VLFC: Vietnamese Lipstick Feedbacks Corpus}, 
  year={2022},
  volume={},
  number={},
  pages={1-5},
  doi={10.1109/O-COCOSDA202257103.2022.9997964}}

