Identifying sarcastic posts on social media has recently attracted considerable attention, particularly because sarcastic comments, such as tweets, frequently use positive words to convey negative or undesirable characteristics. The need for a tool that can recognize sarcasm automatically and effectively is greater than ever. This paper presents a machine learning approach to detecting sarcastic sentences on news websites. The task was treated as a binary classification problem, and a series of ML classifiers were chosen as candidate models. Their performance was then analyzed and compared using accuracy scores and F1-measures.
SARCASTIC HEADLINES | NON-SARCASTIC HEADLINES |
---|---|
The idea in this project was to treat sarcasm detection as a standard classification problem. Because a labeled training dataset is available, the aim was to build supervised classification models that assign the data to one of two classes: sarcastic or non-sarcastic. A machine learning-based approach was chosen, and the following models were built:
- Logistic Regression
- Decision Tree
- Naive Bayes
- Random Forest
- Support Vector Machine
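
As a rough illustration of how the five candidate models listed above could be trained and compared, here is a minimal sketch using scikit-learn. It is not the project's actual code: the toy headlines and labels are invented for illustration, and the TF-IDF features, the multinomial variant of Naive Bayes, the linear SVM, the 75/25 split, and all hyperparameters are assumptions rather than choices confirmed by this write-up.

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, f1_score
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC
from sklearn.tree import DecisionTreeClassifier

# Hypothetical toy data, invented for illustration only
# (1 = sarcastic, 0 = non-sarcastic). A real run would load the labeled dataset.
headlines = [
    "area man heroically finishes entire bag of chips",
    "nation thrilled to sit through another four hour meeting",
    "local cat graciously allows owner to keep living in apartment",
    "report finds mondays still universally beloved",
    "man bravely replies all to company wide email",
    "study confirms nobody has ever read the terms of service",
    "teen generously agrees to look up from phone during dinner",
    "printer decides today is a good day to stop working",
    "city council approves budget for new public library",
    "scientists discover new species of frog in rainforest",
    "stock markets close higher after interest rate decision",
    "heavy rain causes flooding in coastal towns",
    "university announces scholarship program for engineering students",
    "airline adds direct flights between two major cities",
    "health officials urge residents to get flu vaccine",
    "museum opens exhibition on ancient roman history",
]
labels = [1] * 8 + [0] * 8

# Assumed 75/25 split, stratified so both classes appear in each part.
X_train, X_test, y_train, y_test = train_test_split(
    headlines, labels, test_size=0.25, stratify=labels, random_state=0
)

# The five candidate classifiers; hyperparameters are illustrative defaults.
candidates = {
    "Logistic Regression": LogisticRegression(max_iter=1000),
    "Decision Tree": DecisionTreeClassifier(random_state=0),
    "Naive Bayes": MultinomialNB(),
    "Random Forest": RandomForestClassifier(n_estimators=100, random_state=0),
    "Support Vector Machine": LinearSVC(),
}

results = {}
for name, clf in candidates.items():
    # TF-IDF text features are an assumption; the vectorizer is fit on
    # the training split only, inside the pipeline.
    model = make_pipeline(TfidfVectorizer(), clf)
    model.fit(X_train, y_train)
    test_pred = model.predict(X_test)
    results[name] = {
        "train_accuracy": accuracy_score(y_train, model.predict(X_train)),
        "test_accuracy": accuracy_score(y_test, test_pred),
        "test_f1": f1_score(y_test, test_pred, zero_division=0),
    }

for name, scores in results.items():
    print(
        f"{name:<24} train acc={scores['train_accuracy']:.2f}  "
        f"test acc={scores['test_accuracy']:.2f}  test F1={scores['test_f1']:.2f}"
    )
```

With so few examples, the scores this sketch prints are not meaningful; it only shows the shape of the comparison. Wrapping the vectorizer and classifier in one pipeline keeps test-set vocabulary out of the fitted features.
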
The table below shows the best results in terms of train accuracy and test accuracy obtained by each of the models: