Loan Defaulter Prediction with PySpark. This repository contains PySpark code for predicting loan defaulters using a Random Forest classifier. The project is split into two parts:
-Train Model
This script loads the loan dataset, preprocesses it, trains a Random Forest classifier with cross-validation for hyperparameter tuning, and saves the trained model for later use.
-Test Model
After training the model, use this script to load the trained model and make predictions on synthetic test data. It demonstrates how to preprocess test data and use the trained model for prediction.