Adding PSI for continuous data #329
Codecov Report
Attention: Patch coverage is
Additional details and impacted files
@@ Coverage Diff @@
## main #329 +/- ##
==========================================
+ Coverage 83.40% 84.69% +1.28%
==========================================
Files 100 100
Lines 7245 8931 +1686
Branches 1275 1730 +455
==========================================
+ Hits 6043 7564 +1521
- Misses 905 1016 +111
- Partials 297 351 +54
Small comment: you may want to consider numpy.histogram_bin_edges instead of manually implementing the Freedman-Diaconis rule. It has an option to use FD specifically as well, but maybe use We are also using it for JS here
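For reference, a minimal sketch of what that suggestion could look like. The array name `reference` and the synthetic data are made up for illustration; only `numpy.histogram_bin_edges` and its `bins="fd"` option come from NumPy itself, and this is not taken from the PR's code.

```python
import numpy as np

# Hypothetical reference sample; in the PR this would be the baseline feature values.
rng = np.random.default_rng(0)
reference = rng.normal(loc=0.0, scale=1.0, size=1000)

# Let NumPy choose the bin edges with the Freedman-Diaconis estimator
# instead of computing the IQR-based bin width by hand.
edges = np.histogram_bin_edges(reference, bins="fd")
n_bins = len(edges) - 1

print(f"{n_bins} bins spanning [{edges[0]:.3f}, {edges[-1]:.3f}]")
```

The returned edges cover the range of the input sample, so they can be passed straight to `np.histogram(..., bins=edges)` for both the reference and the current data.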
Maybe also add some numerical tests in This helps us verify that the behavior is correct now and doesn't change over time!
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions. |
Added Population Stability Index (PSI) for continuous data.
I used 0.25 as the alerting threshold, citing this: https://www.risk.net/journal-of-risk-model-validation/7725371/statistical-properties-of-the-population-stability-index#:~:text=In%20practice%2C%20the%20following%20%E2%80%9Crule,or%20type%20II%20error%20rates.
Used the Freedman-Diaconis rule to determine bin size.
Also updated the comments in drift/methods to say "drift metric" instead of "performance metrics".
I recommend looking at the calculations and math closely to see whether they make sense.
I'll add it for categorical data next.
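To make the math easier to review, here is a rough sketch of the usual PSI calculation for continuous data, sum over bins of (current - reference) * ln(current / reference), with bins taken from the reference sample. This is not the code in this PR: the names `psi_continuous`, `ALERT_THRESHOLD`, and `eps`, the clipping of out-of-range values into the outer bins, and the small floor used to avoid log(0) are all assumptions made for illustration.

```python
import numpy as np

ALERT_THRESHOLD = 0.25  # threshold mentioned in the PR description


def psi_continuous(reference, current, bins="fd", eps=1e-6):
    """Sketch of PSI between two continuous samples (hypothetical helper)."""
    reference = np.asarray(reference, dtype=float)
    current = np.asarray(current, dtype=float)

    # Bin edges come from the reference sample only (Freedman-Diaconis by default).
    edges = np.histogram_bin_edges(reference, bins=bins)

    # Assumption: current values outside the reference range are counted
    # in the first or last bin rather than dropped.
    current = np.clip(current, edges[0], edges[-1])

    ref_counts, _ = np.histogram(reference, bins=edges)
    cur_counts, _ = np.histogram(current, bins=edges)

    # Convert counts to proportions and floor them so the log is defined
    # for empty bins (assumption; other smoothing schemes are possible).
    ref_frac = np.clip(ref_counts / ref_counts.sum(), eps, None)
    cur_frac = np.clip(cur_counts / cur_counts.sum(), eps, None)

    return float(np.sum((cur_frac - ref_frac) * np.log(cur_frac / ref_frac)))


rng = np.random.default_rng(42)
baseline = rng.normal(0.0, 1.0, size=5000)
shifted = rng.normal(2.0, 1.0, size=5000)

psi_same = psi_continuous(baseline, baseline)
psi_shifted = psi_continuous(baseline, shifted)
print(psi_same, psi_shifted, psi_shifted > ALERT_THRESHOLD)
```

Each term (a - b) * ln(a / b) is non-negative, so the sum is zero only when the binned distributions match. A fixed numerical test along these lines (identical samples, clearly shifted samples) may be a reasonable starting point for the regression tests requested above.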