revert reliability binary cls docs

SelfExplainML · Jun 5, 2024 · c757dc4
1 parent c41bb21
commit c757dc4

Showing 2 changed files with 3 additions and 23 deletions.
diff --git a/docs/_build/html/_sources/guides/testing/reliability.rst.txt b/docs/_build/html/_sources/guides/testing/reliability.rst.txt
@@ -91,7 +91,7 @@ In the plot above, we can see that the marginal bandwidth of `hr` exceeds the th
 
 Reliability for Binary Classification
 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
-In PiML, the reliability assessment for binary classification tasks is based on Venn-Abers Predictors [Vladimir2015]_. Different from the regression case, test set is split into 2 subsets, including 60% for calibration and 40% for testing the performance of calibration results. Additionally, isotonic regression is employed to calibrate the predicted probabilities. The reliability diagram is used to visually illustrate the calibration of the predicted probabilities, showing how well they align with the observed frequencies. The Brier score, on the other hand, is utilized to quantify the accuracy of the predicted probabilities by calculating the mean squared difference between the predicted probabilities and the actual outcomes.
+In PiML, the reliability assessment for binary classification tasks is currently under development and not yet fully mature. As a temporary approach, we utilize the formula :math:`\sqrt{\hat{p}(1-\hat{p})}` to quantify the uncertainty associated with each prediction. Additionally, isotonic regression is employed to calibrate the predicted probabilities. The reliability diagram is used to visually illustrate the calibration of the predicted probabilities, showing how well they align with the observed frequencies. The Brier score, on the other hand, is utilized to quantify the accuracy of the predicted probabilities by calculating the mean squared difference between the predicted probabilities and the actual outcomes.
 
 Distance of Reliable and Un-reliable Data 
 """"""""""""""""""""""""""""""""""""""""""""""""""""""
@@ -169,11 +169,3 @@ Examples
   The second example below demonstrates how to use PiML’s high-code APIs for the TaiwanCredit dataset from the UCI repository. This dataset comprises the credit card details of 30,000 clients in Taiwan from April 2005 to September 2005, and more information can be found on the TaiwanCreditData website. The data can be loaded directly into PiML, although it requires some preprocessing. The FlagDefault variable serves as the response for this classification problem.
 
  * :ref:`sphx_glr_auto_examples_4_testing_plot_3_reliability_cls.py`
-
-
-.. topic:: References
-
-    .. [Vladimir2015] Vovk, Vladimir, Ivan Petej, Valentina Fedorova. 
-               `Large-scale probabilistic predictors with and without guarantees of validity 
-               <https://arxiv.org/pdf/1511.00213.pdf>`_,
-               Advances in Neural Information Processing Systems 28 (2015).
diff --git a/docs/_build/html/guides/testing/reliability.html b/docs/_build/html/guides/testing/reliability.html
@@ -215,8 +215,6 @@
                 </ul>
 
 
-
-
                 </ul>
               </div>
         </div>
@@ -342,7 +340,8 @@ <h3><span class="section-number">6.4.1.3. </span>Marginal Bandwidth<a class="hea
 </section>
 <section id="reliability-for-binary-classification">
 <h2><span class="section-number">6.4.2. </span>Reliability for Binary Classification<a class="headerlink" href="#reliability-for-binary-classification" title="Permalink to this heading">¶</a></h2>
-<p>In PiML, the reliability assessment for binary classification tasks is based on Venn-Abers Predictors <a class="reference internal" href="#vladimir2015" id="id1"><span>[Vladimir2015]</span></a>. Different from the regression case, test set is split into 2 subsets, including 60% for calibration and 40% for testing the performance of calibration results. Additionally, isotonic regression is employed to calibrate the predicted probabilities. The reliability diagram is used to visually illustrate the calibration of the predicted probabilities, showing how well they align with the observed frequencies. The Brier score, on the other hand, is utilized to quantify the accuracy of the predicted probabilities by calculating the mean squared difference between the predicted probabilities and the actual outcomes.</p>
+<p>In PiML, the reliability assessment for binary classification tasks is currently under development and not yet fully mature. As a temporary approach, we utilize the formula <span class="math notranslate nohighlight">\(\sqrt{\hat{p}(1-\hat{p})}\)</span> to quantify the uncertainty associated with each prediction. Additionally, isotonic regression is employed to calibrate the predicted probabilities. The reliability diagram is used to visually illustrate the calibration of the predicted probabilities, showing how well they align with the observed frequencies. The Brier score, on the other hand, is utilized to quantify the accuracy of the predicted probabilities by calculating the mean squared difference between the predicted probabilities and the actual outcomes.
+</p>
 <section id="id2">
 <h3><span class="section-number">6.4.2.1. </span>Distance of Reliable and Un-reliable Data<a class="headerlink" href="#id2" title="Permalink to this heading">¶</a></h3>
 <p>To obtain the distributional distance between reliable and unreliable data, set <code class="docutils literal notranslate"><span class="pre">show</span></code> to “reliability_distance”. It’s important to note that the <code class="docutils literal notranslate"><span class="pre">alpha</span></code> argument is not utilized for classifiers.</p>
@@ -445,17 +444,6 @@ <h2><span class="section-number">6.4.3. </span>Examples<a class="headerlink" hre
 <li><p><a class="reference internal" href="../../auto_examples/4_testing/plot_3_reliability_cls.html#sphx-glr-auto-examples-4-testing-plot-3-reliability-cls-py"><span class="std std-ref">Reliability: Classification</span></a></p></li>
 </ul>
 </aside>
-<aside class="topic">
-<p class="topic-title">References</p>
-<div role="list" class="citation-list">
-<div class="citation" id="vladimir2015" role="doc-biblioentry">
-<span class="label"><span class="fn-bracket">[</span><a role="doc-backlink" href="#id1">Vladimir2015</a><span class="fn-bracket">]</span></span>
-<p>Vovk, Vladimir, Ivan Petej, Valentina Fedorova.
-<a class="reference external" href="https://arxiv.org/pdf/1511.00213.pdf">Large-scale probabilistic predictors with and without guarantees of validity</a>,
-Advances in Neural Information Processing Systems 28 (2015).</p>
-</div>
-</div>
-</aside>
 </section>
 </section>
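
Note on the restored text: the paragraph added in this commit names two quantities, the per-prediction uncertainty :math:`\sqrt{\hat{p}(1-\hat{p})}` and the Brier score. The sketch below illustrates both in plain Python, assuming binary labels coded as 0/1. The helper names ``bernoulli_std`` and ``brier_score`` and the sample values are hypothetical and are not part of the PiML API; the isotonic calibration step mentioned in the docs is not shown.

```python
import math


def bernoulli_std(p_hat):
    """Return sqrt(p * (1 - p)) for each predicted probability p.

    Hypothetical helper: this is the standard deviation of a Bernoulli(p)
    variable, largest at p = 0.5 and zero at p = 0 or p = 1.
    """
    return [math.sqrt(p * (1.0 - p)) for p in p_hat]


def brier_score(p_hat, y_true):
    """Mean squared difference between predicted probabilities and 0/1 outcomes."""
    if len(p_hat) != len(y_true):
        raise ValueError("p_hat and y_true must have the same length")
    return sum((p - y) ** 2 for p, y in zip(p_hat, y_true)) / len(p_hat)


# Illustrative values only (not taken from any PiML example).
p_hat = [0.1, 0.5, 0.9]
y_true = [0, 1, 1]

uncertainty = bernoulli_std(p_hat)
score = brier_score(p_hat, y_true)
print(uncertainty, score)
```

Under this formula the most uncertain prediction is the one closest to 0.5, and a lower Brier score indicates predicted probabilities closer to the observed outcomes.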