Disclosure: Submitted as an assignment for LSE's MY474 Classification Challenge, as a requirement of the MSc Applied Social Data Science degree.
This project classifies personal attacks in Wikipedia talk page comments. The competition page on Kaggle can be accessed here.
Task: Build a classifier that performs well on a test set whose true labels are obfuscated to prevent cheating. All of the comments in the training and test sets were hand-labeled by folks at Jigsaw.
Content warning: This competition makes use of data from a project to automate moderation of toxic speech online. Many comments in this dataset contain hate speech and upsetting content. Please take care as you work on this assignment.
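As a rough illustration of the task described above, the sketch below trains a tiny bag-of-words Naive Bayes classifier using only the Python standard library. It is not the approach used in this submission. The class name `NaiveBayes`, the helper `tokenize`, the toy comments, and the label convention (1 = personal attack, 0 = not an attack) are all assumptions made for the example, and the toy comments are not taken from the competition data.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


class NaiveBayes:
    """Hypothetical multinomial Naive Bayes with add-one smoothing."""

    def fit(self, texts, labels):
        self.classes = sorted(set(labels))
        self.n_docs = len(labels)
        self.doc_counts = Counter(labels)
        # Per-class word frequencies.
        self.word_counts = {c: Counter() for c in self.classes}
        for text, label in zip(texts, labels):
            self.word_counts[label].update(tokenize(text))
        # Shared vocabulary across all classes.
        self.vocab = set()
        for counts in self.word_counts.values():
            self.vocab.update(counts)
        self.total_words = {
            c: sum(self.word_counts[c].values()) for c in self.classes
        }
        return self

    def log_scores(self, text):
        # Words never seen in training are ignored.
        tokens = [t for t in tokenize(text) if t in self.vocab]
        scores = {}
        for c in self.classes:
            score = math.log(self.doc_counts[c] / self.n_docs)
            denom = self.total_words[c] + len(self.vocab)
            for t in tokens:
                score += math.log((self.word_counts[c][t] + 1) / denom)
            scores[c] = score
        return scores

    def predict(self, text):
        scores = self.log_scores(text)
        return max(scores, key=scores.get)


# Toy data (assumed): 1 = personal attack, 0 = not an attack.
train_texts = [
    "thanks for fixing the citation",
    "you are an idiot and a fool",
    "I added a source to the article",
    "shut up you stupid fool",
]
train_labels = [0, 1, 0, 1]

model = NaiveBayes().fit(train_texts, train_labels)
pred_attack = model.predict("you are a stupid idiot")
pred_ok = model.predict("thanks for the source")
print(pred_attack, pred_ok)
```

A baseline like this mainly gives a reference point. A competitive entry would likely need richer features and held-out validation, since the test labels are not available.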