revert reliability binary cls docs
ZebinYang committed Jun 5, 2024
1 parent c41bb21 commit c757dc4
Showing 2 changed files with 3 additions and 23 deletions.
10 changes: 1 addition & 9 deletions docs/_build/html/_sources/guides/testing/reliability.rst.txt
@@ -91,7 +91,7 @@ In the plot above, we can see that the marginal bandwidth of `hr` exceeds the th

Reliability for Binary Classification
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
In PiML, the reliability assessment for binary classification tasks is based on Venn-Abers Predictors [Vladimir2015]_. Unlike the regression case, the test set is split into two subsets: 60% for calibration and 40% for evaluating the calibration results. Additionally, isotonic regression is employed to calibrate the predicted probabilities. The reliability diagram visually illustrates the calibration of the predicted probabilities, showing how well they align with the observed frequencies. The Brier score quantifies the accuracy of the predicted probabilities as the mean squared difference between the predicted probabilities and the actual outcomes.
In PiML, the reliability assessment for binary classification tasks is currently under development and not yet fully mature. As a temporary approach, we use the quantity :math:`\sqrt{\hat{p}(1-\hat{p})}`, where :math:`\hat{p}` is the predicted probability, to quantify the uncertainty associated with each prediction. Additionally, isotonic regression is employed to calibrate the predicted probabilities. The reliability diagram visually illustrates the calibration of the predicted probabilities, showing how well they align with the observed frequencies. The Brier score quantifies the accuracy of the predicted probabilities as the mean squared difference between the predicted probabilities and the actual outcomes.
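
The quantities named in the paragraph above can be illustrated with a small, dependency-free sketch. This is not PiML's implementation: the function names (``prediction_uncertainty``, ``brier_score``, ``reliability_bins``), the equal-width binning, and the toy data are assumptions made for illustration, and the isotonic calibration step is left out.

```python
import math


def prediction_uncertainty(p_hat):
    """Return sqrt(p * (1 - p)) for each predicted probability p."""
    return [math.sqrt(p * (1.0 - p)) for p in p_hat]


def brier_score(p_hat, y_true):
    """Mean squared difference between predicted probabilities and 0/1 outcomes."""
    return sum((p - y) ** 2 for p, y in zip(p_hat, y_true)) / len(y_true)


def reliability_bins(p_hat, y_true, n_bins=5):
    """Summarize predictions in equal-width probability bins.

    Returns one (mean predicted probability, observed frequency, count)
    tuple per non-empty bin; these are the points of a reliability diagram.
    """
    acc = [[0.0, 0.0, 0] for _ in range(n_bins)]
    for p, y in zip(p_hat, y_true):
        idx = min(int(p * n_bins), n_bins - 1)  # p == 1.0 goes to the last bin
        acc[idx][0] += p
        acc[idx][1] += y
        acc[idx][2] += 1
    return [(s / n, pos / n, n) for s, pos, n in acc if n > 0]


# Hypothetical toy data: predicted probabilities and observed labels.
p_hat = [0.1, 0.5, 0.9, 0.7, 0.3]
y_true = [0, 1, 1, 0, 0]

unc = prediction_uncertainty(p_hat)
score = brier_score(p_hat, y_true)
bins = reliability_bins(p_hat, y_true)

print("uncertainty:", [round(u, 3) for u in unc])
print("Brier score:", round(score, 3))
print("reliability bins:", bins)
```

The uncertainty term peaks at 0.5 when the predicted probability is 0.5 and shrinks to 0 as the prediction approaches 0 or 1, so it depends only on the prediction itself and not on any calibration data.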

Distance of Reliable and Un-reliable Data
""""""""""""""""""""""""""""""""""""""""""""""""""""""
@@ -169,11 +169,3 @@ Examples
The second example below demonstrates how to use PiML’s high-code APIs for the TaiwanCredit dataset from the UCI repository. This dataset comprises the credit card details of 30,000 clients in Taiwan from April 2005 to September 2005, and more information can be found on the TaiwanCreditData website. The data can be loaded directly into PiML, although it requires some preprocessing. The FlagDefault variable serves as the response for this classification problem.

* :ref:`sphx_glr_auto_examples_4_testing_plot_3_reliability_cls.py`


.. topic:: References

.. [Vladimir2015] Vovk, Vladimir, Ivan Petej, Valentina Fedorova.
`Large-scale probabilistic predictors with and without guarantees of validity
<https://arxiv.org/pdf/1511.00213.pdf>`_,
Advances in Neural Information Processing Systems 28 (2015).
16 changes: 2 additions & 14 deletions docs/_build/html/guides/testing/reliability.html
@@ -215,8 +215,6 @@
</ul>




</ul>
</div>
</div>
@@ -342,7 +340,8 @@ <h3><span class="section-number">6.4.1.3. </span>Marginal Bandwidth<a class="hea
</section>
<section id="reliability-for-binary-classification">
<h2><span class="section-number">6.4.2. </span>Reliability for Binary Classification<a class="headerlink" href="#reliability-for-binary-classification" title="Permalink to this heading"></a></h2>
<p>In PiML, the reliability assessment for binary classification tasks is based on Venn-Abers Predictors <a class="reference internal" href="#vladimir2015" id="id1"><span>[Vladimir2015]</span></a>. Unlike the regression case, the test set is split into two subsets: 60% for calibration and 40% for evaluating the calibration results. Additionally, isotonic regression is employed to calibrate the predicted probabilities. The reliability diagram visually illustrates the calibration of the predicted probabilities, showing how well they align with the observed frequencies. The Brier score quantifies the accuracy of the predicted probabilities as the mean squared difference between the predicted probabilities and the actual outcomes.</p>
<p>In PiML, the reliability assessment for binary classification tasks is currently under development and not yet fully mature. As a temporary approach, we use the quantity <span class="math notranslate nohighlight">\(\sqrt{\hat{p}(1-\hat{p})}\)</span>, where <span class="math notranslate nohighlight">\(\hat{p}\)</span> is the predicted probability, to quantify the uncertainty associated with each prediction. Additionally, isotonic regression is employed to calibrate the predicted probabilities. The reliability diagram visually illustrates the calibration of the predicted probabilities, showing how well they align with the observed frequencies. The Brier score quantifies the accuracy of the predicted probabilities as the mean squared difference between the predicted probabilities and the actual outcomes.
</p>
<section id="id2">
<h3><span class="section-number">6.4.2.1. </span>Distance of Reliable and Un-reliable Data<a class="headerlink" href="#id2" title="Permalink to this heading"></a></h3>
<p>To obtain the distributional distance between reliable and unreliable data, set <code class="docutils literal notranslate"><span class="pre">show</span></code> to “reliability_distance”. It’s important to note that the <code class="docutils literal notranslate"><span class="pre">alpha</span></code> argument is not utilized for classifiers.</p>
@@ -445,17 +444,6 @@ <h2><span class="section-number">6.4.3. </span>Examples<a class="headerlink" hre
<li><p><a class="reference internal" href="../../auto_examples/4_testing/plot_3_reliability_cls.html#sphx-glr-auto-examples-4-testing-plot-3-reliability-cls-py"><span class="std std-ref">Reliability: Classification</span></a></p></li>
</ul>
</aside>
<aside class="topic">
<p class="topic-title">References</p>
<div role="list" class="citation-list">
<div class="citation" id="vladimir2015" role="doc-biblioentry">
<span class="label"><span class="fn-bracket">[</span><a role="doc-backlink" href="#id1">Vladimir2015</a><span class="fn-bracket">]</span></span>
<p>Vovk, Vladimir, Ivan Petej, Valentina Fedorova.
<a class="reference external" href="https://arxiv.org/pdf/1511.00213.pdf">Large-scale probabilistic predictors with and without guarantees of validity</a>,
Advances in Neural Information Processing Systems 28 (2015).</p>
</div>
</div>
</aside>
</section>
</section>
