Detecting Unreliable News with Fine-Tuned RoBERTa

This project fine-tunes RoBERTa to detect unreliable news articles. The application classifies an article as reliable or unreliable using a pre-trained transformer model fine-tuned on a labeled dataset. The backend is a Flask app deployed on Render.com, and the fine-tuned model is stored in a Hugging Face Space.

Features

- Model: roberta-base fine-tuned for binary classification.
- Dataset: Labeled news articles with three fields: title, text, and a reliability label.
- Deployment: Flask-based API hosted on Render; model stored on Hugging Face for easy access.
- Evaluation: 99.35% accuracy and 99.34% F1 score on the validation set.

Installation

Clone the repository:

```
git clone https://github.com/username/unreliable-news-detector.git
cd unreliable-news-detector
```

Install dependencies:
```
pip install -r requirements.txt
```
Set up Hugging Face authentication:
```
export HF_TOKEN=your_huggingface_token
```

Configure Flask environment variables:

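```
export FLASK_APP=app.py
export FLASK_ENV=development
```
Note: FLASK_ENV was deprecated in Flask 2.2 and removed in Flask 2.3. On those versions, set FLASK_DEBUG=1 instead to enable debug mode.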

Usage

Run the Flask server:

```
flask run
```

Access the API at http://localhost:5000 or the deployed version on Render.

API Endpoints

POST /predict
Input: JSON with title and text.
Output: Predicted label (0 = Reliable, 1 = Unreliable) and confidence score.
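As an illustration, a client could call the endpoint roughly as follows. This is a minimal sketch using only the Python standard library; the helper names `build_predict_request` and `predict` are hypothetical, and the exact field names in the response are not documented here, so the sketch returns the decoded JSON as-is.

```python
import json
import urllib.request


def build_predict_request(title, text, base_url="http://localhost:5000"):
    """Build a POST request for /predict with a JSON body of title and text."""
    payload = json.dumps({"title": title, "text": text}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/predict",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def predict(title, text, base_url="http://localhost:5000"):
    """Send the request and return the decoded JSON response as a dict.

    Requires the Flask server to be running at base_url.
    """
    request = build_predict_request(title, text, base_url)
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.loads(response.read().decode("utf-8"))


# Example call (only works while the server is up):
#   result = predict("Example headline", "Example article body.")
```

To target the deployed version, pass the Render URL as `base_url`.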

Project Workflow

Data Preprocessing:
- Cleaned text by removing special characters and URLs.
- Combined the title and text fields into a single input so the model sees both.
- Removed duplicates and balanced classes.
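The preprocessing steps above could look roughly like this. It is a sketch under stated assumptions, not the notebook's actual code: the exact set of "special characters" removed and the balancing method (random downsampling here) are assumptions, and all function names are hypothetical.

```python
import random
import re

URL_PATTERN = re.compile(r"https?://\S+|www\.\S+")
# Assumption: keep letters, digits, whitespace, and basic punctuation only.
SPECIAL_PATTERN = re.compile(r"[^A-Za-z0-9\s.,!?'-]")
WHITESPACE_PATTERN = re.compile(r"\s+")


def clean_text(raw):
    """Remove URLs and special characters, then collapse whitespace."""
    text = URL_PATTERN.sub(" ", raw)
    text = SPECIAL_PATTERN.sub(" ", text)
    return WHITESPACE_PATTERN.sub(" ", text).strip()


def build_examples(rows):
    """Combine title and text, clean them, and drop exact duplicates.

    rows: iterable of dicts with "title", "text", and "label" keys.
    Returns a list of (content, label) tuples.
    """
    seen = set()
    examples = []
    for row in rows:
        content = clean_text(f"{row['title']} {row['text']}")
        if content and content not in seen:
            seen.add(content)
            examples.append((content, row["label"]))
    return examples


def balance_classes(examples, seed=42):
    """Downsample every class to the size of the smallest one (assumed method)."""
    by_label = {}
    for content, label in examples:
        by_label.setdefault(label, []).append((content, label))
    if not by_label:
        return []
    smallest = min(len(group) for group in by_label.values())
    rng = random.Random(seed)
    balanced = []
    for group in by_label.values():
        balanced.extend(rng.sample(group, smallest))
    rng.shuffle(balanced)
    return balanced
```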
Data Splitting:
- 80/20 split for training and validation.
- Ensured no overlap between sets.
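One simple way to get an 80/20 split with a guaranteed-disjoint validation set is to shuffle once and slice. The sketch below assumes the examples are already deduplicated, hashable tuples; whether the original split was stratified is not stated, so this version is a plain random split, and the function names are hypothetical.

```python
import random


def split_train_validation(examples, validation_fraction=0.2, seed=42):
    """Shuffle with a fixed seed, then slice into (train, validation)."""
    shuffled = list(examples)
    random.Random(seed).shuffle(shuffled)
    n_validation = round(len(shuffled) * validation_fraction)
    validation = shuffled[:n_validation]
    train = shuffled[n_validation:]
    return train, validation


def check_no_overlap(train, validation):
    """Raise ValueError if any example appears in both sets."""
    overlap = set(train) & set(validation)
    if overlap:
        raise ValueError(f"{len(overlap)} examples appear in both sets")


# Toy data standing in for (content, label) pairs.
examples = [(f"article {i}", i % 2) for i in range(100)]
train, validation = split_train_validation(examples)
check_no_overlap(train, validation)
```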
Model Training:
- Used Hugging Face's Trainer API.
- Fine-tuned RoBERTa for 2 epochs with a learning rate of 2e-5.
Evaluation:
- Metrics: Accuracy and F1 Score.
- Plotted confusion matrix and performance graphs.

Model Training

Hyperparameters:

- Learning Rate: 2e-5
- Batch Size: 16
- Epochs: 2
- Weight Decay: 0.01
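These values map onto Hugging Face `TrainingArguments` roughly as shown below. This is a sketch, not the notebook's configuration: applying the batch size of 16 per device to both training and evaluation is an assumption, and the output directory name is hypothetical.

```python
# Hyperparameters from the list above, keyed by TrainingArguments field names.
HYPERPARAMETERS = {
    "learning_rate": 2e-5,
    "per_device_train_batch_size": 16,  # assumption: 16 is the per-device size
    "per_device_eval_batch_size": 16,   # assumption: same size for evaluation
    "num_train_epochs": 2,
    "weight_decay": 0.01,
}


def build_training_arguments(output_dir="roberta-unreliable-news"):
    """Create TrainingArguments from the hyperparameters above.

    transformers is imported lazily so the dictionary can be inspected
    without the library installed.
    """
    from transformers import TrainingArguments

    return TrainingArguments(output_dir=output_dir, **HYPERPARAMETERS)
```

The resulting object would then be passed to `Trainer` together with the model and the tokenized training and validation datasets.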

Performance:

- Training Loss: 0.0556
- Validation Accuracy: 99.35%
- Validation F1 Score: 99.34%

Results

Confusion Matrix:

| | Predicted Reliable | Predicted Unreliable |
|---|---|---|
| Actual Reliable | 2055 | 11 |
| Actual Unreliable | 15 | 1944 |
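The reported validation metrics can be recomputed from these four counts. The sketch below assumes that "Unreliable" (label 1) is the positive class, which is consistent with the reported F1 score.

```python
# Counts from the confusion matrix, with Unreliable (label 1) as positive.
true_negative = 2055   # reliable articles predicted reliable
false_positive = 11    # reliable articles predicted unreliable
false_negative = 15    # unreliable articles predicted reliable
true_positive = 1944   # unreliable articles predicted unreliable

total = true_negative + false_positive + false_negative + true_positive
accuracy = (true_positive + true_negative) / total
precision = true_positive / (true_positive + false_positive)
recall = true_positive / (true_positive + false_negative)
f1 = 2 * precision * recall / (precision + recall)

print(f"accuracy={accuracy:.4f}")  # accuracy=0.9935
print(f"f1={f1:.4f}")              # f1=0.9934
```

Both values agree with the 99.35% accuracy and 99.34% F1 score listed under Performance.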

Visualization:

Plotted training loss and evaluation accuracy over epochs.
The plots suggest the model generalizes well to the held-out validation set.

Deployment

Backend:

Flask REST API serving predictions via POST requests.
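A prediction endpoint of this kind might be structured as in the sketch below. It is not the code in app.py: the `handle_predict` and `create_app` names, the `classify` callable (which would wrap the fine-tuned model and return a label and confidence), the response keys, and the error messages are all assumptions made for illustration.

```python
def handle_predict(payload, classify):
    """Validate a /predict payload and return (response_body, http_status).

    classify is a callable taking the combined text and returning
    (label, confidence); it stands in for the fine-tuned model.
    """
    if not isinstance(payload, dict):
        return {"error": "Request body must be a JSON object."}, 400
    title = payload.get("title")
    text = payload.get("text")
    if not isinstance(title, str) or not isinstance(text, str) or not text.strip():
        return {"error": "'title' and 'text' must be strings, and 'text' must not be empty."}, 400
    label, confidence = classify(f"{title} {text}")
    return {"label": label, "confidence": confidence}, 200


def create_app(classify):
    """Wire handle_predict into a Flask app (Flask is imported lazily)."""
    from flask import Flask, jsonify, request

    app = Flask(__name__)

    @app.route("/predict", methods=["POST"])
    def predict():
        # silent=True returns None instead of raising on a malformed body.
        body, status = handle_predict(request.get_json(silent=True), classify)
        return jsonify(body), status

    return app
```

Keeping validation in a plain function makes it testable without starting a server or loading the model.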

Hosting:

The backend is deployed on Render.com.
Model weights are stored in a Hugging Face Space for easy access.

Acknowledgments

Hugging Face for providing pre-trained RoBERTa and the Trainer API.
Render.com for hosting the Flask backend.

SevilayMuni/Flask-App-Roberta-Detect-News