Skip to content

Latest commit

 

History

History
129 lines (68 loc) · 2.96 KB

README.md

File metadata and controls

129 lines (68 loc) · 2.96 KB

Customer-Churn

Business Value

Customer churn is a major problem and one of the most important concerns for companies due to the direct effect on the revenues. Therefore it is important to develop means to predict potential customer to churn. Hence finding factors that increase customer churn is important to take necessary actions to reduce this churn.

Problem Statement

To predict customer churn based on various variables like customer account information and customer activity.

Data

Each row of data represents a customer and each column contains's a customer's attributes.

Customers who left/churned : Exited

Demographic Information of customers : Geography , Gender , Age

Customer Account Information :Tenure , HasCrCard , Balance , IsActiveMember , EstimatedSalary , NumOfProducts , CreditScore

Approach

  • Loading Data

  • Data Exploration

  • Spliting Data for Train, test and Validation

  • Data Visualization

    • Univariate
    • Bivariate
  • Finding Missing Values

  • Label Encoding

  • One Hot Encoding of Categorical Values

  • Feature Scaling and Normalization

  • Feature Selection

  • Training Model

    • Logistic Regression
    • SVM
    • Decision Tree

In order to measure the performance of the model, the Area Under Curve (AUC) standard measure, and Accuracy is adopted

Visualizing Data

Data Distribution

Product Distribution

Salary Distribution

Tenure Distribution

Balance Distribution

Customer Age vs Customer Churn

Account Balance vs Customer Churn

Correlation Heat Map of All Features

Feature Selection

Selected Features

Model Building and Training

  1. Logistic Regression

Training Data

Validating Data

  1. SVM Model

Training Data

Validating Data

  1. Decision Tree Model

Training Data

Validating Data

Comparing All Clasifiers

From the Model Comparison we see that Decision Tree Model has better Area Under curve and Accuracy over the other two models.

Conclusion

The precision of the model on previously unseen test data is slightly higher with regard to predicting 1's i.e. those customers that churn. However, in as much as the model has a high accuracy, it still misses some of those who end up churning. The model could be improved by providing and retraining the model with more data over time. :-)