Skip to content

dataiku-research/performance_prediction_under_shift

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

34 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Performance Prediction Under Dataset Shift

Performance Prediction with Confidence Interval

This repository is the official implementation of Performance Prediction Under Dataset Shift.

Requirements

To install requirements:

pip install -r requirements.txt

Datasets are stored via LFS in this github repository.

Comparison of Performance Predictors

To run the benchmark in the paper, run this command:

python run_benchmark.py 

This script will 1. generate training and test shifted datasets, 2. train several performance predictors models to compare, 3. produce paper figures.

Results

Use the notebook Performance Prediction Under Dataset Shift.ipynb to load the results and generate the tables in the paper.

Use the notebook Performance Prediction With Confidence Interval.ipynb to generate confidence intervals for performance predictions with the method proposed in the paper.

MAE of Accuracy Prediction on Unseen Perturbations

Dataset ATC ExpertRF (amazon) ExpertRF (naver) ErrorPredictorRF
adult 0.031 0.013 0.012 0.001
artificial_characters 0.056 0.046 0.051 0.010
bank 0.036 0.001 0.001 0.000
bng_ionosphere 0.131 0.217 0.132 0.050
bng_zoo 0.062 0.130 0.136 0.011
default_of_credit_card_clients 0.132 0.033 0.036 0.006
heart 0.071 0.029 0.032 0.004
jsbach_chorals_modified 0.027 0.118 0.091 0.002
SDSS 0.090 0.108 0.141 0.023
video_games 0.039 0.010 0.010 0.002

Releases

No releases published

Packages

No packages published