DS200.L21 / Big data

About

This is a college course project on applying big data tools to real-life problems.
The project uses Apache Spark to classify credit scores.

Table of contents

DS200.L21 / Big data

About
Table of contents
Data source
Experiment pipelines
Feature extraction pipelines
Code
Presentation slides and Report
References

Data source

klp's creditscring challenge for students

Experiment pipelines

Feature extraction pipelines

Code

Feature extraction, model training, and the other steps in this repo are implemented in Google Colab.
All code is organized in name.ipynb files.

Presentation slides and Report

References