**Run the Python notebook first to create the pickle files.**
The frontend is implemented with Flask. The HTML pages call the Python scripts through the Flask API.
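
As a rough illustration of how an HTML form can reach Python code through Flask, here is a minimal sketch. The route name `/predict`, the form field `text`, and the `predict_category` helper are all hypothetical; in the real app the helper would presumably call the pickled vectorizer and model instead of a keyword rule.

```python
from flask import Flask, jsonify, request

app = Flask(__name__)


def predict_category(text: str) -> str:
    """Hypothetical stand-in for the pickled vectorizer + model."""
    return "sport" if "match" in text.lower() else "other"


@app.route("/predict", methods=["POST"])
def predict():
    # Assumption: the HTML form posts a field named "text".
    text = request.form.get("text", "")
    return jsonify({"category": predict_category(text)})


if __name__ == "__main__":
    app.run(debug=True)
```

An HTML form with `action="/predict"` and `method="post"` would then trigger the Python function when submitted.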
Inside the Python notebook: the model predicts the category of a news article using machine learning and natural language processing. I compared two algorithms, Multinomial Naive Bayes and Logistic Regression, on how accurately they classify the news category. I split the dataset into 70% training data and 30% test data. Multinomial NB reached an accuracy of 93.86%, and Logistic Regression reached 94.01%.
Tools Used:
- Python
- NumPy
- scikit-learn
- Flask API
Method:
- Import the dataset into the python notebook.
- Import all the necessary python libraries.
- Store the 'text' column in X and 'category' column in Y.
- Split X and Y into 70% training and 30% test sets using the train_test_split() method.
- Vectorize X using TfidfVectorizer() and encode Y using LabelEncoder().
- Create and train MultinomialNB and LogisticRegression models on the training data.
- Test the trained models on the test set created in the split step above.
- Print the accuracy of both models.
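
The steps above can be sketched roughly as follows. This is not the notebook's actual code: the tiny `texts` and `categories` lists are made up to stand in for the real 'text' and 'category' columns, and `random_state`, `stratify`, and `max_iter` are my own choices for a reproducible toy run.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import MultinomialNB
from sklearn.preprocessing import LabelEncoder

# Made-up toy data standing in for the 'text' and 'category' columns.
texts = [
    "team wins the football match", "striker scores in cup final",
    "coach praises players after game", "tennis champion wins title",
    "club signs new goalkeeper", "new phone has faster chip",
    "software update fixes security bug", "startup launches cloud service",
    "laptop maker reveals new model", "app adds video chat feature",
]
categories = ["sport"] * 5 + ["tech"] * 5

# 70% train / 30% test split.
X_train, X_test, y_train, y_test = train_test_split(
    texts, categories, test_size=0.3, random_state=42, stratify=categories
)

# Fit the vectorizer and the encoder on the training data only.
vectorizer = TfidfVectorizer()
X_train_vec = vectorizer.fit_transform(X_train)
X_test_vec = vectorizer.transform(X_test)

encoder = LabelEncoder()
y_train_enc = encoder.fit_transform(y_train)
y_test_enc = encoder.transform(y_test)

# Train both models and record their accuracy on the test set.
models = {
    "MultinomialNB": MultinomialNB(),
    "LogisticRegression": LogisticRegression(max_iter=1000),
}
accuracies = {}
for name, model in models.items():
    model.fit(X_train_vec, y_train_enc)
    accuracies[name] = accuracy_score(y_test_enc, model.predict(X_test_vec))
    print(f"{name}: {accuracies[name]:.2%}")
```

With data this small the printed accuracies are not meaningful; the sketch only shows the order of the steps.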
I also use pickle to store the trained models, so they do not have to be retrained every time I run the code.
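
A minimal sketch of saving and reloading an object with pickle is shown below. The helper names `save_object` and `load_object` and the file name `model.pkl` are hypothetical, and a plain dictionary stands in for a trained model; a fitted scikit-learn estimator or vectorizer could be passed the same way.

```python
import pickle
import tempfile
from pathlib import Path


def save_object(obj, path):
    """Serialize any picklable object (e.g. a trained model) to disk."""
    with open(path, "wb") as f:
        pickle.dump(obj, f)


def load_object(path):
    """Load a previously pickled object back into memory."""
    with open(path, "rb") as f:
        return pickle.load(f)


# Stand-in for a trained model; the file name is a hypothetical example.
stand_in_model = {"name": "MultinomialNB", "classes": ["sport", "tech"]}

with tempfile.TemporaryDirectory() as tmp_dir:
    model_path = Path(tmp_dir) / "model.pkl"
    save_object(stand_in_model, model_path)
    restored = load_object(model_path)

print(restored == stand_in_model)
```

Only unpickle files you created yourself or otherwise trust, because loading a pickle can execute arbitrary code.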