Developed a machine learning pipeline to predict customer churn with over 90% accuracy, leveraging data preprocessing, feature engineering, and Random Forest modelling. Conducted exploratory data analysis to uncover key drivers of churn, such as customer recency and cohorts from first transactions.

ivanseldas/microcredit-churn-classifier

 
 


Microcredit Churn Classifier

Welcome to my Churn Prediction project, where I explored the challenge of identifying customers at risk of leaving a business. By applying machine learning techniques, I built a pipeline capable of delivering accurate predictions and valuable insights to support decision-making.

[Image: cohort_analysis_wallpaper]

  1. Exploratory Data Analysis (EDA):

    • Group customers into cohorts by first transaction and measure each cohort's retention rate: image
  2. Feature Engineering:

    • Define churn as more than 90 days without activity
    • Aggregate the transactional data into a customer-level DataFrame for churn prediction; the correlation between the resulting features is shown here: image
  3. Machine Learning Models:

    • Developed and evaluated Logistic Regression and Random Forest Classifier models.
    • Both models achieved over 90% accuracy; they were also evaluated with F1-score, the ROC curve (AUC), and a confusion matrix.

    RandomForestClassifier Confusion Matrix: image

    ROC Curve: image
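
The cohort and churn-labelling steps above can be sketched in a few lines. This is a minimal illustration in plain Python, not the notebook's actual code: the function names, the `(customer_id, date)` row layout, the monthly cohort granularity, the snapshot date, and the sample rows are all assumptions. Only the 90-day threshold comes from the steps above.

```python
from collections import defaultdict
from datetime import date

CHURN_THRESHOLD_DAYS = 90  # from the README: churn is > 90 days without activity


def build_customer_features(transactions, snapshot_date):
    """Collapse (customer_id, transaction_date) rows into one record per customer."""
    dates_by_customer = defaultdict(list)
    for customer_id, tx_date in transactions:
        dates_by_customer[customer_id].append(tx_date)

    features = {}
    for customer_id, dates in dates_by_customer.items():
        first_tx, last_tx = min(dates), max(dates)
        recency_days = (snapshot_date - last_tx).days
        features[customer_id] = {
            "cohort": first_tx.strftime("%Y-%m"),  # assumed: monthly cohorts
            "recency_days": recency_days,
            "n_transactions": len(dates),
            "churned": recency_days > CHURN_THRESHOLD_DAYS,
        }
    return features


def churn_rate_by_cohort(features):
    """Share of churned customers in each first-transaction cohort."""
    totals, churned = defaultdict(int), defaultdict(int)
    for row in features.values():
        totals[row["cohort"]] += 1
        churned[row["cohort"]] += int(row["churned"])
    return {cohort: churned[cohort] / totals[cohort] for cohort in totals}


# Hypothetical sample data, not taken from the project's dataset.
transactions = [
    ("a", date(2023, 1, 5)),
    ("a", date(2023, 6, 20)),
    ("b", date(2023, 1, 17)),
    ("b", date(2023, 2, 2)),
    ("c", date(2023, 3, 9)),
]
features = build_customer_features(transactions, snapshot_date=date(2023, 7, 1))
cohort_churn = churn_rate_by_cohort(features)
```

In this sketch the label is computed from the same snapshot as `recency_days`, so recency determines churn by construction. A training setup would normally compute features from an earlier window than the one used for the label.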

Results

  1. Model Comparison:

    • The RandomForestClassifier performs slightly better, with both models achieving over 90% accuracy.
  2. Data Limitations:

    • Despite the high accuracy, the models show no clear sign of overfitting, which suggests they generalize well.
    • However, the dataset covers only one year, which may limit how well it captures long-term churn trends.
  3. Feature Importance:

    • Recency strongly correlates with churn, aligning with the 90-day threshold.
  4. Cohort Impact:

    • Customer cohorts from first transactions significantly affect churn predictions, highlighting the importance of segmentation.
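
To make the model comparison above easier to interpret, the sketch below shows how accuracy, precision, recall, and F1 all derive from the four cells of a binary confusion matrix. The helper name and the counts are hypothetical and are not the project's results. The argument order matches what scikit-learn's `confusion_matrix(y_true, y_pred).ravel()` returns for a binary problem.

```python
def classification_metrics(tn, fp, fn, tp):
    """Derive common metrics from binary confusion-matrix counts."""
    total = tn + fp + fn + tp
    accuracy = (tp + tn) / total
    # Guard against division by zero when a class is never predicted or present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Hypothetical counts for an imbalanced test set (15 churners out of 100).
metrics = classification_metrics(tn=80, fp=5, fn=5, tp=10)
```

With these illustrative counts accuracy is 0.90 while F1 for the churn class is about 0.67, which is why reporting F1 and ROC AUC alongside accuracy is useful when churners are a minority.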
