Assignment-11-Text-Mining-01-Elon-Musk

Perform sentiment analysis on the Elon Musk tweets (Exlon-musk.csv)

Text Preprocessing:

  1. Remove leading and trailing whitespace characters from each tweet
  2. Remove empty strings, which Python treats as False in a boolean context
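The two preprocessing steps above, plus the join that follows, can be sketched as below. The sample rows are hypothetical stand-ins for the tweet column of the CSV, and `str.strip()` with no argument is assumed to be the intended way to trim each tweet.

```python
# Hypothetical sample rows standing in for the tweet column of the CSV.
raw_tweets = ["  Great launch today!  ", "", "\n", "@SpaceX nice work\t"]

# 1. Strip leading and trailing whitespace from every tweet.
stripped = [tweet.strip() for tweet in raw_tweets]

# 2. Drop empty strings; an empty string is falsy, so `if tweet` filters it out.
cleaned = [tweet for tweet in stripped if tweet]

# Join the remaining tweets into one text for the later steps.
text = " ".join(cleaned)
```

Filtering after stripping matters here: a row such as "\n" only becomes an empty string once the whitespace is removed.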

Joining the list into one string/text

Remove Twitter username handles (@usernames) from the tweet text
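A minimal sketch of handle removal with a regular expression. The pattern and the helper name `remove_handles` are assumptions made for illustration; the original notebook may use a library helper instead.

```python
import re

# Assumed pattern: an "@" followed by letters, digits or underscores.
HANDLE_RE = re.compile(r"@\w+")

def remove_handles(text: str) -> str:
    """Delete @username handles and collapse the whitespace left behind."""
    without_handles = HANDLE_RE.sub("", text)
    # split() + join() also re-joins the pieces into one clean string.
    return " ".join(without_handles.split())
```

For example, `remove_handles("@SpaceX great work @NASA team")` gives `"great work team"`. Note that this simple pattern would also clip the domain part of an e-mail address, which is usually acceptable for tweet text.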

Joining the list into one string/text again

Remove Punctuation

Remove https links and other URLs within the text
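A possible sketch of these two cleaning steps using only the standard library. The URL pattern and the function names are assumptions. In this sketch URLs are removed before punctuation, because stripping punctuation first would turn "https://t.co/x" into "httpstcox", which the URL pattern no longer matches.

```python
import re
import string

# Assumed URL pattern: anything starting with http://, https:// or www.
URL_RE = re.compile(r"https?://\S+|www\.\S+")
# Translation table that deletes every ASCII punctuation character.
PUNCT_TABLE = str.maketrans("", "", string.punctuation)

def remove_urls(text: str) -> str:
    """Delete URLs from the text."""
    return URL_RE.sub("", text)

def remove_punctuation(text: str) -> str:
    """Delete ASCII punctuation characters from the text."""
    return text.translate(PUNCT_TABLE)

def clean(text: str) -> str:
    """Remove URLs, then punctuation, then collapse repeated whitespace."""
    no_urls = remove_urls(text)
    no_punct = remove_punctuation(no_urls)
    return " ".join(no_punct.split())
```

`string.punctuation` covers ASCII punctuation only, so typographic quotes or emoji in tweets would need separate handling.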

Converting into Text Tokens

Tokenization

Remove Stopwords

Normalize the data
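The tokenization, stopword removal and normalization steps could look roughly like this. The stopword set below is a small illustrative assumption, not a full list such as the one NLTK ships, and lowercasing is taken here as one plausible reading of "normalize".

```python
import re

# Small illustrative stopword set (an assumption, not a complete list).
STOPWORDS = {"a", "an", "the", "is", "are", "to", "of", "and", "in", "for", "on", "it"}

def tokenize(text: str) -> list[str]:
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())

def remove_stopwords(tokens: list[str]) -> list[str]:
    """Keep only the tokens that are not in the stopword set."""
    return [token for token in tokens if token not in STOPWORDS]
```

Lowercasing inside `tokenize` means "The" and "the" are treated as the same stopword.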

Stemming (Optional)

Lemmatization

Feature Extraction

  1. Using BoW CountVectorizer
  2. CountVectorizer with N-grams (Bigrams & Trigrams)
  3. TF-IDF Vectorizer
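To show what the three feature extraction options compute, here is a plain-Python sketch. It is not scikit-learn's `CountVectorizer` or `TfidfVectorizer`; the function names are hypothetical. The TF-IDF part assumes the smoothed form idf = ln((1 + n) / (1 + df)) + 1 followed by L2 normalization, which to my understanding matches scikit-learn's documented defaults.

```python
import math
from collections import Counter

def ngrams(tokens, n):
    """Return the n-grams of a token list as space-joined strings."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def bag_of_words(docs, ngram_range=(1, 1)):
    """Count terms per document for every n in the inclusive ngram_range."""
    low, high = ngram_range
    counts = []
    for tokens in docs:
        terms = []
        for n in range(low, high + 1):
            terms.extend(ngrams(tokens, n))
        counts.append(Counter(terms))
    return counts

def tfidf(counts):
    """Weight term counts by smoothed idf, then L2-normalize each document."""
    n_docs = len(counts)
    # Document frequency: in how many documents each term appears.
    df = Counter(term for doc in counts for term in doc)
    weighted = []
    for doc in counts:
        row = {t: c * (math.log((1 + n_docs) / (1 + df[t])) + 1) for t, c in doc.items()}
        norm = math.sqrt(sum(w * w for w in row.values())) or 1.0
        weighted.append({t: w / norm for t, w in row.items()})
    return weighted
```

Calling `bag_of_words(docs, ngram_range=(2, 3))` yields the bigram and trigram counts, while `tfidf(bag_of_words(docs))` down-weights terms that appear in many tweets.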

Generate Word Cloud

Named Entity Recognition (NER)

Emotion Mining - Sentiment Analysis
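One common way to approach this step is lexicon-based scoring, sketched below. The lexicon and its scores are hypothetical placeholders; a real analysis would load a published word list, and nothing here is confirmed as the method used in the original notebook.

```python
# Tiny hypothetical lexicon: word -> sentiment score (placeholder values).
LEXICON = {"great": 2, "good": 1, "love": 3, "bad": -1, "terrible": -3, "fail": -2}

def sentiment_score(tokens):
    """Sum the lexicon scores of the tokens; unknown words count as 0."""
    return sum(LEXICON.get(token, 0) for token in tokens)

def sentiment_label(score):
    """Map a numeric score to a coarse sentiment label."""
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"
```

Scoring each tweet separately, rather than the joined text, keeps one strongly worded tweet from dominating the overall result.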