- As data engineers, we want to build an end-to-end pipeline within Snowflake and reduce the overall cost and technology footprint, so that there are fewer points of failure.
- On one hand we have customer billing and demographics data stored in a semi-structured format (Parquet); on the other we have unstructured data in the form of emails.
- We want to simplify the data pipeline by ingesting and processing the data within Snowflake, closer to where it lives. However, there are multiple developer personas on our team: some know SQL, while others are Java and Python professionals.
- We also need to make data scientists' lives easier by cleaning, formatting, and transforming the data, and by creating a feature store they can reference for churn analysis.
- We have selected Feast to build and deploy our feature store for the offline store implementation.
We will deploy the Feast feature store, configure 'feature views' for the features created in Snowflake, and define a 'feature service' to deliver those features through the offline feature store for model training.
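As a rough sketch, the feature repository definitions might look like the following. The entity, table, column, and service names (`customer`, `CUSTOMER_FEATURES`, `churn_model_features`, etc.) are illustrative assumptions rather than the exact definitions used in the notebooks, and the `Field`-based API shown requires a recent Feast release.

```python
from datetime import timedelta

from feast import Entity, FeatureService, FeatureView, Field
from feast.infra.offline_stores.snowflake_source import SnowflakeSource
from feast.types import Float32, Int64

# Entity representing a customer; the join key name is a placeholder.
customer = Entity(name="customer", join_keys=["CUSTOMER_ID"])

# Snowflake table/view holding the engineered features
# (database/schema/table names are placeholders).
customer_features_source = SnowflakeSource(
    database="CHURN_DB",
    schema="PUBLIC",
    table="CUSTOMER_FEATURES",
    timestamp_field="EVENT_TIMESTAMP",
)

# Feature view exposing a few illustrative churn-related features.
customer_features_fv = FeatureView(
    name="customer_features",
    entities=[customer],
    ttl=timedelta(days=365),
    schema=[
        Field(name="TENURE_MONTHS", dtype=Int64),
        Field(name="MONTHLY_CHARGES", dtype=Float32),
        Field(name="TOTAL_CHARGES", dtype=Float32),
    ],
    source=customer_features_source,
)

# Feature service bundling the views the training job will request.
churn_feature_service = FeatureService(
    name="churn_model_features",
    features=[customer_features_fv],
)
```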
- In notebook 01, we will use Snowpark for Python to load the raw data into Snowflake and transform it to create customer entities (see the ingestion sketch below).
- In notebook 02, we will set up Feast, configure its integration with Snowflake, define and apply the feature repository, and test the deployment (see the configuration sketch below).
- In notebook 03, we will create a training DataFrame from the Feast offline feature store, train and evaluate the model, and deploy it to Snowflake as a UDF for batch inference (see the training and UDF sketch below).
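For notebook 01, a minimal Snowpark for Python sketch of the ingestion and transformation step could look like this. The stage, table, and column names (`@RAW_STAGE`, `RAW_CUSTOMER_BILLING`, `CUSTOMER_FEATURES`, `TENURE_MONTHS`, ...) are placeholders, and the connection parameters would normally come from a config file or environment variables.

```python
from snowflake.snowpark import Session
import snowflake.snowpark.functions as F

# Placeholder connection parameters; use your own account details or a secrets manager.
connection_parameters = {
    "account": "<account_identifier>",
    "user": "<user>",
    "password": "<password>",
    "role": "<role>",
    "warehouse": "<warehouse>",
    "database": "CHURN_DB",
    "schema": "PUBLIC",
}
session = Session.builder.configs(connection_parameters).create()

# Load the semi-structured billing/demographics Parquet files from a stage.
raw_df = session.read.parquet("@RAW_STAGE/billing/")
raw_df.write.mode("overwrite").save_as_table("RAW_CUSTOMER_BILLING")

# Transform the raw records into a customer-level entity table with an
# event timestamp that the Feast feature view can use.
customer_features = (
    session.table("RAW_CUSTOMER_BILLING")
    .with_column("TOTAL_CHARGES", F.col("TOTAL_CHARGES").cast("float"))
    .group_by("CUSTOMER_ID")
    .agg(
        F.max("TENURE_MONTHS").alias("TENURE_MONTHS"),
        F.avg("MONTHLY_CHARGES").alias("MONTHLY_CHARGES"),
        F.sum("TOTAL_CHARGES").alias("TOTAL_CHARGES"),
    )
    .with_column("EVENT_TIMESTAMP", F.current_timestamp())
)
customer_features.write.mode("overwrite").save_as_table("CUSTOMER_FEATURES")
```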
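For notebook 02, the Feast setup typically amounts to writing a `feature_store.yaml` that points the offline store at Snowflake and running `feast apply` against the feature repository. The project name, paths, and connection values below are placeholders, and the exact schema key can vary between Feast versions.

```python
from pathlib import Path
import subprocess

# Minimal feature_store.yaml pointing the Feast offline store at Snowflake.
# All connection values are placeholders; the schema key may be written as
# `schema` or `schema_` depending on the Feast version in use.
feature_store_yaml = """\
project: customer_churn
registry: registry.db
provider: local
offline_store:
    type: snowflake.offline
    account: <account_identifier>
    user: <user>
    password: <password>
    role: <role>
    warehouse: <warehouse>
    database: CHURN_DB
    schema_: PUBLIC
"""

repo_path = Path("feature_repo")
repo_path.mkdir(exist_ok=True)
(repo_path / "feature_store.yaml").write_text(feature_store_yaml)

# The entity / feature view / feature service definitions from the earlier sketch
# would live in a Python file inside the repo, e.g. feature_repo/features.py.
# `feast apply` registers them and deploys the feature store infrastructure.
subprocess.run(["feast", "apply"], cwd=repo_path, check=True)
```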
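For notebook 03, a hedged sketch of building the training set from the offline store and deploying the trained model as a Snowflake UDF might look like the following. It reuses the Snowpark `session` and the placeholder feature/table names from the sketches above, assumes a labels table named `CUSTOMER_CHURN_LABELS`, and uses XGBoost purely as an example model.

```python
import pandas as pd
from feast import FeatureStore
from snowflake.snowpark.types import FloatType, IntegerType
from xgboost import XGBClassifier

store = FeatureStore(repo_path="feature_repo")

# Entity dataframe: the customers and as-of timestamps to fetch features for,
# plus the churn label. Feast expects the timestamp column to be named
# `event_timestamp`. `session` is the Snowpark session from the ingestion sketch.
entity_df = (
    session.table("CUSTOMER_CHURN_LABELS")
    .to_pandas()
    .rename(columns={"EVENT_TIMESTAMP": "event_timestamp"})
)

# Point-in-time join against the offline store to build the training DataFrame.
training_df = store.get_historical_features(
    entity_df=entity_df,
    features=store.get_feature_service("churn_model_features"),
).to_df()

feature_cols = ["TENURE_MONTHS", "MONTHLY_CHARGES", "TOTAL_CHARGES"]
model = XGBClassifier().fit(training_df[feature_cols], training_df["CHURNED"])

# Deploy the trained model as a permanent Python UDF for batch inference.
def predict_churn(tenure_months: int, monthly_charges: float, total_charges: float) -> float:
    row = pd.DataFrame([[tenure_months, monthly_charges, total_charges]], columns=feature_cols)
    return float(model.predict_proba(row)[0, 1])

session.udf.register(
    func=predict_churn,
    name="PREDICT_CHURN",
    return_type=FloatType(),
    input_types=[IntegerType(), FloatType(), FloatType()],
    packages=["pandas", "scikit-learn", "xgboost"],
    is_permanent=True,
    stage_location="@UDF_STAGE",
    replace=True,
)
```

Once registered, the UDF can be called from SQL for batch scoring, e.g. `SELECT CUSTOMER_ID, PREDICT_CHURN(TENURE_MONTHS, MONTHLY_CHARGES, TOTAL_CHARGES) FROM CUSTOMER_FEATURES;`.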
Run the notebooks in order from 01 to 03; we will go through the implementation of each step in the machine learning pipeline.
We will discuss:
- Data Preparation
- Extracting training data from offline Feast feature store
- Feature Engineering
- Feature Selection
- Model Training
- Checking model predictions using the Feast online feature store (see the sketch after this list)
- Obtaining Predictions / Scoring
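For the online-store check, a minimal sketch, assuming the feature repository above is also configured with an online store (Feast defaults to a local SQLite online store) and the same placeholder feature names, could look like this:

```python
from datetime import datetime
from feast import FeatureStore

store = FeatureStore(repo_path="feature_repo")

# Materialize the latest feature values from the offline (Snowflake) store
# into the online store so they can be read at serving time.
store.materialize_incremental(end_date=datetime.utcnow())

# Fetch features for a single customer; the CUSTOMER_ID value is a placeholder.
online_features = store.get_online_features(
    features=[
        "customer_features:TENURE_MONTHS",
        "customer_features:MONTHLY_CHARGES",
        "customer_features:TOTAL_CHARGES",
    ],
    entity_rows=[{"CUSTOMER_ID": 12345}],
).to_dict()
print(online_features)
```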