Objectivity/subjectivity dataset built from Reddit comments. Other data sources may be added later, but the current focus is on classifying the Reddit comments correctly, since that dataset is already large.
-
The Reddit Comments May 2015 dataset is available on Kaggle: https://www.kaggle.com/reddit/reddit-comments-may-2015
-
parse_reddit_db.py looks for database.sqlite at a custom path, so if you use this script to classify the comments, adjust the SQLite connection path to match where you put your database file. The zip and database files are large, so they are not included in the repository.
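
One way to avoid editing a hard-coded path is to read the location from an argument or an environment variable. The sketch below is illustrative only and is not how parse_reddit_db.py currently works: the `REDDIT_DB_PATH` variable and the `resolve_db_path` and `open_comments_db` helpers are hypothetical names introduced here.

```python
import os
import sqlite3
from pathlib import Path

# Hypothetical environment variable; parse_reddit_db.py does not read it.
DB_PATH_ENV = "REDDIT_DB_PATH"
# Assumed default: database.sqlite in the current working directory.
DEFAULT_DB_PATH = Path("database.sqlite")


def resolve_db_path(path=None):
    """Pick the database path from the argument, the environment, or the default."""
    candidate = Path(path or os.environ.get(DB_PATH_ENV, DEFAULT_DB_PATH))
    # sqlite3.connect() silently creates an empty database when the file is
    # missing, so check for the file first and fail with a clear message.
    if not candidate.is_file():
        raise FileNotFoundError(f"No SQLite database found at {candidate}")
    return candidate


def open_comments_db(path=None):
    """Open a connection to the comments database at the resolved path."""
    return sqlite3.connect(str(resolve_db_path(path)))
```

With something like this in place, the path could be passed directly, for example `open_comments_db("/data/reddit/database.sqlite")`, or set once through the environment variable. The explicit file check is there because a wrong path would otherwise produce a new empty database rather than an error.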