
A project that applies machine learning to solve a real-world challenge: what cryptocurrencies are available on the trading market and how they can be grouped using classification.


Karla-Flores/Predicting-Credit-Risk--Supervised-Machine-Learning


Predicting Credit-Risk--Supervised Machine Learning


Background

In this assignment, machine learning models are built that attempt to predict whether a loan from LendingClub will become high risk.

LendingClub is a peer-to-peer lending services company that allows individual investors to partially fund personal loans, as well as buy and sell notes backing the loans on a secondary market. LendingClub offers its historical data through an API.

This data is used to create machine learning models that classify the risk level of given loans. Specifically, a Logistic Regression model is compared with a Random Forest Classifier.

Logistic Regression Model

Training Score: 0.6509031198686371
Testing Score: 0.5165886856656742

Random Forest Classifier Model

Training Score: 1.0
Testing Score: 0.6433432581880051

Logistic Regression Model (scaled data)

Training Score: 0.7078817733990148
Testing Score: 0.767333049766057
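
The scaling step is not described in detail here. One common choice, assumed in this sketch, is standardization: each feature is shifted to zero mean and divided by its standard deviation, with the statistics computed on the training rows only and then reused for the test rows. The feature names and values below are hypothetical and are not taken from the LendingClub data; scikit-learn's `StandardScaler` performs the same computation.

```python
from statistics import fmean, pstdev


def fit_scaler(rows):
    """Return per-column means and standard deviations from the training rows."""
    columns = list(zip(*rows))
    means = [fmean(col) for col in columns]
    # Population standard deviation; a constant column falls back to 1.0
    # so the division in transform() is always defined.
    stds = [pstdev(col) or 1.0 for col in columns]
    return means, stds


def transform(rows, means, stds):
    """Standardize rows using statistics that were fitted elsewhere."""
    return [[(x - m) / s for x, m, s in zip(row, means, stds)] for row in rows]


# Hypothetical features: [loan_amount, interest_rate]
X_train = [[1000.0, 0.05], [3000.0, 0.15], [5000.0, 0.10]]
X_test = [[7000.0, 0.20]]

# Fit on the training data only, then apply the same statistics to the
# test data so no information from the test set leaks into the model.
means, stds = fit_scaler(X_train)
X_train_scaled = transform(X_train, means, stds)
X_test_scaled = transform(X_test, means, stds)
```

Without scaling, a feature measured in thousands dominates one measured in fractions, which tends to hurt models such as Logistic Regression more than tree-based models.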

Random Forest Classifier Model (scaled data)

Training Score: 1.0
Testing Score: 0.6420672054444917
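
A quick way to compare the four runs is the difference between training and testing score. The sketch below uses the scores reported above, rounded to four decimals, and assumes they are mean accuracy values (which is what a scikit-learn classifier's `score` method returns). The dictionary keys and the helper name `generalization_gap` are illustrative only.

```python
# Scores reported above, rounded to four decimals: (training, testing).
scores = {
    "Logistic Regression": (0.6509, 0.5166),
    "Random Forest": (1.0, 0.6433),
    "Logistic Regression (scaled)": (0.7079, 0.7673),
    "Random Forest (scaled)": (1.0, 0.6421),
}


def generalization_gap(train_score, test_score):
    """A large positive gap suggests the model memorized the training data."""
    return train_score - test_score


gaps = {
    name: round(generalization_gap(train, test), 4)
    for name, (train, test) in scores.items()
}

# Print the models from largest to smallest gap.
for name, gap in sorted(gaps.items(), key=lambda item: item[1], reverse=True):
    print(f"{name:30s} gap = {gap:+.4f}")
```

Both Random Forest runs show a gap of roughly 0.36, while the scaled Logistic Regression run is the only one whose testing score exceeds its training score.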

Summary

  • Logistic Regression fails to identify most high-risk customers: its recall is 0.30, meaning a large share of high-risk loans are misclassified as low risk (false negatives).
  • Random Forest Classifier has a perfect training score (1.0) but a testing score of only 0.64. This gap indicates that the model is overfitting the training data.
  • Random Forest Classifier on the scaled data shows the same pattern: the training score is 1.0 and the testing score is 0.64. The observed recall is 0, so in addition to overfitting, the model produces a high number of false negatives.
  • Logistic Regression on the scaled data has a training score of 0.71 and a testing score of 0.77. Although it does not predict perfectly, it has the highest testing score of the four models. This is supported by a recall of 0.72 for high-risk clients, which reflects a drop in false negatives.
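
The recall values quoted in the summary measure how many truly high-risk loans a model catches. A minimal sketch of that calculation follows, assuming `high_risk` is treated as the positive class. The labels are a hypothetical toy example, not the project's data; `sklearn.metrics.recall_score` computes the same quantity.

```python
def recall(y_true, y_pred, positive="high_risk"):
    """Recall = true positives / (true positives + false negatives)."""
    pairs = list(zip(y_true, y_pred))
    true_positives = sum(1 for t, p in pairs if t == positive and p == positive)
    false_negatives = sum(1 for t, p in pairs if t == positive and p != positive)
    actual_positives = true_positives + false_negatives
    # Return 0.0 when the positive class never occurs, to avoid dividing by zero.
    return true_positives / actual_positives if actual_positives else 0.0


# Hypothetical labels: four loans are truly high risk, two are low risk.
y_true = ["high_risk", "high_risk", "high_risk", "high_risk", "low_risk", "low_risk"]
y_pred = ["high_risk", "low_risk", "high_risk", "high_risk", "low_risk", "high_risk"]

high_risk_recall = recall(y_true, y_pred)
print(f"High-risk recall: {high_risk_recall:.2f}")
```

Every high-risk loan predicted as low risk is a false negative and lowers recall, which is why a model can post a reasonable accuracy score and still miss most of the risky loans.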
