ds-for-telco

Created by Juliet Houghland + Sandy Ryza (juliet@cloudera.com)
The source notebook demonstrates how to build a churn prediction model with Spark, using Spark MLlib's Pipeline API for cross-validation and model tuning. The Pipeline API is available in PySpark version 1.6 or higher.
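
As a rough illustration of what "Pipeline API for cross-validation and model tuning" can look like in PySpark, here is a minimal sketch. It is not taken from the notebook: the column names, the choice of a random forest classifier, and the grid values are all assumptions made for the example, and the label column is assumed to be a string column.

```python
# Hypothetical column names; the notebook's actual schema is not shown in this README.
LABEL_COL = "churned"
NUMERIC_COLS = [
    "account_length",
    "total_day_minutes",
    "number_customer_service_calls",
]

# Illustrative hyperparameter candidates, not tuned values from the notebook.
PARAM_GRID = {"numTrees": [10, 50], "maxDepth": [3, 5]}


def build_cross_validator(num_folds=3):
    """Return an unfitted CrossValidator wrapping a churn-classification Pipeline.

    Requires an active SparkSession/SparkContext when called.
    """
    # Imported here so this module can be loaded where PySpark is not installed.
    from pyspark.ml import Pipeline
    from pyspark.ml.classification import RandomForestClassifier
    from pyspark.ml.evaluation import BinaryClassificationEvaluator
    from pyspark.ml.feature import StringIndexer, VectorAssembler
    from pyspark.ml.tuning import CrossValidator, ParamGridBuilder

    # Stage 1: map the string label to a numeric "label" column.
    label_indexer = StringIndexer(inputCol=LABEL_COL, outputCol="label")
    # Stage 2: pack the numeric columns into a single feature vector.
    assembler = VectorAssembler(inputCols=NUMERIC_COLS, outputCol="features")
    # Stage 3: the classifier (assumed here; other classifiers plug in the same way).
    classifier = RandomForestClassifier(labelCol="label", featuresCol="features")

    pipeline = Pipeline(stages=[label_indexer, assembler, classifier])

    # Every combination of the grid values is evaluated on each fold.
    grid = (
        ParamGridBuilder()
        .addGrid(classifier.numTrees, PARAM_GRID["numTrees"])
        .addGrid(classifier.maxDepth, PARAM_GRID["maxDepth"])
        .build()
    )

    evaluator = BinaryClassificationEvaluator(
        labelCol="label", metricName="areaUnderROC"
    )

    return CrossValidator(
        estimator=pipeline,
        estimatorParamMaps=grid,
        evaluator=evaluator,
        numFolds=num_folds,
    )
```

Given training and test DataFrames (say `train_df` and `test_df`, both hypothetical names), `build_cross_validator().fit(train_df)` would return the best model found across the grid, and calling `.transform(test_df)` on that model would add prediction columns. Wrapping the whole Pipeline in the CrossValidator, rather than only the classifier, means the feature stages are refit on each fold.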

Status: Demo Ready
Use Case: Telco Churn Prediction

Steps:

  1. Open a terminal and run setup.sh
  2. Create a Python Session and run setup.py
  3. In your Python session, run ds-for-telco.py
  4. When finished, run cleanup.sh in the terminal

Recommended Session Sizes: 2 CPU, 4 GB RAM

Estimated Runtime:
ds-for-telco.py: approximately 1 minute

Recommended Jobs/Pipeline:
None

Demo Script:
TBD

Related Content:
http://blog.cloudera.com/blog/2016/02/how-to-predict-telco-churn-with-apache-spark-mllib/