The dataset consist of sentences gathered from reviews of various e-commerce website, comments on public facebook pages, and twitter accounts. Each row would ideally consist of a single sentence and have a corresponding sentiment attach to it, which would be either Negative, Positive or Neutral. There are more than 20,000 sentences and they have been manually tagged.
Importing the libraries like tinker,numply,sklean,etc. We have to imported everything that we will be using for given task that is sentiment analysis on roman urdu
Importing dataset taken from UCI https://archive.ics.uci.edu/ml/datasets/Roman+Urdu+Data+Set This data set contains three columns two are useful one is junk column which will not be used anywhere
Looking at the data using functions like head,colunm,row,etc.
Lets apply some cleaning on text, we have only allowed English character no special characters are allowed and all text are converted to lower case and stopwords of roman urdu are deleted.
We have used 80% of data for training and 20% for testing
This matrix shows how many correct and wrong prediction were made for each of the category of labels.
Checking the accuracy scorce of the classifiers.
Visualising the matrix using the matplotlab libaray.
On running the program name “home.py” we will see this interface
User can get the two interfaces. One is to check about the developer and other is to take comment or review as input. On clicking the button click to know about developer, user landed to the file called “about_us.py”. Interface is
Then the user click the return to home page button and landed to the previous page. After that on clicking the button called “Take User Input”, they landed to the file called “PredictUserInput.py”. The user interface is like this
User can type their comment on the box and click the predict sentiment button to see the sentiment.
They can use other buttons according to their requirements. Then click the EXIT button to exit from the program.