Add files via upload
JOSS paper somewhat redrafted.
danielhatton authored Jul 4, 2022
1 parent 02994b0 commit bb410ec
bibliography: paper.bib
## Scope

This software is relevant to a situation in which:
* the value of some dependent variable depends on the values of zero
or more independent variables;
* there exists a data set of empirical measurements of the value of
the dependent variable, each accompanied by empirical measurements
of corresponding values of all the independent variables;
* there exists at least one theoretical model for predicting the value
of the dependent variable from the values of the independent
variables;
* each model contains zero or more adjustable parameters,
i.e. constant quantities which affect the mapping from independent
variable values to dependent variable value, whose exact values are
not known a priori; and
* one wishes to infer from the data the values of the parameters
in each model, and, if there is more than one model, which model is
most probably true.
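
The situation can be made concrete with a minimal sketch (in Python, purely for illustration; the data, model, and parameter names below are invented for the example and are not part of `leastsqtobayes`):

```python
import numpy as np

# Hypothetical instance of the situation described above: measurements
# of a dependent variable y at known values of a single independent
# variable x, with measurement uncertainties, and one candidate model
# y = a*x + b containing two adjustable parameters a and b whose
# values are not known a priori.
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])        # independent variable
y = np.array([1.1, 2.9, 5.2, 7.1, 8.8])        # measured dependent variable
sigma_y = np.full_like(y, 0.2)                 # measurement uncertainties

def model(x, a, b):
    """Candidate model: predicts the dependent variable from x."""
    return a * x + b
```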

## Action

In Bayesian parameter estimation, beliefs about parameter values, in
light of data, can [@Jeffreys:1931:SI; @Jeffreys:1932:TEL] be
summarized in a posterior expectation value and standard error. If
the model has more than one parameter, the posterior standard error
has [@Jeffreys:1931:SI; @Jeffreys:1932:TEL] two variants: a
conditional standard error, based on the assumption that the other
parameters take their posterior expectation values; and a marginal
standard error, which involves a probability-weighted average over all
values of the other parameters. In Bayesian model comparison, the
goodness of fit of a model to data is [@Jeffreys:1935:STS] measured by
a quantity known as the marginal likelihood, which automatically
embodies [@MacKay:1992:BI] Occam's razor and is therefore suitable for
direct comparison with other models.
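
For a posterior distribution that is approximately Gaussian, the distinction between the two kinds of standard error reduces to matrix algebra on the posterior covariance matrix; the following sketch (illustrative only, with an invented covariance matrix) shows both:

```python
import numpy as np

# Illustrative sketch with an invented posterior covariance matrix C
# over two correlated parameters.  For an approximately Gaussian
# posterior, the marginal standard error of parameter i is
# sqrt(C[i, i]), while the conditional standard error (other
# parameters held at their posterior expectations) is
# 1/sqrt(inv(C)[i, i]).
C = np.array([[0.04, 0.03],
              [0.03, 0.09]])

marginal_se = np.sqrt(np.diag(C))
conditional_se = 1.0 / np.sqrt(np.diag(np.linalg.inv(C)))

# The conditional standard error never exceeds the marginal one; the
# two coincide only when the parameters are uncorrelated.
```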

However, software packages and libraries which perform least-squares
fitting [qv. @Legendre:1805:NMD] are much easier to obtain than ones
which perform Bayesian parameter estimation and model comparison.
Least-squares fitting defines a badness-of-fit measure called
chi-squared for a model _with specific values of its
parameters_. Software typically outputs the
minimum value of chi-squared with respect to the parameters; the
parameter values that achieve that minimum chi-squared; and some or
all of the elements of the Hessian of chi-squared with respect to the
parameters at that point in parameter space, intended for use in
estimating the standard errors of the parameters by heuristic methods.
Software documentation also often suggests heuristic ways of
attempting model comparison with these outputs. This is convenient,
but lacks clear epistemological underpinning.
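
As a concrete (and purely illustrative) example of these typical outputs, for a hypothetical straight-line model fitted to invented data, the minimum of chi-squared, the parameter values attaining it, and the Hessian at that point can be computed as:

```python
import numpy as np

# Invented data and a straight-line model y = a*x + b, illustrating
# the three typical least-squares outputs described above.
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = np.array([1.1, 2.9, 5.2, 7.1, 8.8])
sigma = np.full_like(y, 0.2)

def chi_squared(params):
    a, b = params
    return np.sum(((y - (a * x + b)) / sigma) ** 2)

# For a model linear in its parameters, the minimising parameter
# values follow from the normal equations; the design-matrix columns
# are the derivatives of the model with respect to a and b.
A = np.column_stack([x, np.ones_like(x)])
W = np.diag(1.0 / sigma ** 2)
best = np.linalg.solve(A.T @ W @ A, A.T @ W @ y)

chi2_min = chi_squared(best)          # minimum value of chi-squared
hessian = 2.0 * A.T @ W @ A           # Hessian of chi-squared at the minimum
# (constant everywhere here, because the model is linear in a and b)
```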

By a happy coincidence, however, these typical outputs of
least-squares fitting software contain enough information to obtain,
versions) of the parameters, and to the marginal likelihood. This
calculation is not entirely straightforward, and that is where the
present package, `leastsqtobayes`, comes in. It is written in a
combination of three languages: Gnuplot [@Williams:2015:GRM], which
has powerful (indeed uniquely suitable in its handling of measurement
uncertainties) built-in capabilities for least-squares fitting; Octave
[@Eaton:2012:GOR], which has the matrix and scalar arithmetic
capabilities for the calculations; and Perl [@Wall:2022:P], with the
text-processing capabilities to facilitate intercommunication between
Gnuplot and Octave code.
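
The glue role played by Perl can be sketched as follows (in Python rather than Perl, purely for illustration; the report text is a simplified stand-in, not Gnuplot's exact output format):

```python
import re

# Sketch of the intercommunication idea: scrape best-fit parameter
# values from a Gnuplot-fit-style report and re-emit them as
# assignments ready to splice into an Octave-style script.  The
# "name = value +/- error" layout below is an assumed simplification
# of Gnuplot's fit report.
report = """\
a = 1.96 +/- 0.05
b = 1.10 +/- 0.12
"""

pattern = re.compile(r"^(\w+)\s*=\s*([-+0-9.eE]+)\s*\+/-\s*([-+0-9.eE]+)")
assignments = []
for line in report.splitlines():
    match = pattern.match(line)
    if match:
        name, value, error = match.groups()
        assignments.append(f"{name} = {value};")

octave_snippet = "\n".join(assignments)   # text destined for Octave
```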

`leastsqtobayes` takes as input an empirical data set; a formula
representing a model; and specifications of the prior
parameters; and the marginal likelihood.
## Parameter estimation step

Several recent workers [e.g. @Fenton:2022:UVA; @Albert:2022:BAE;
@Gerster:2022:EBC], in common with the present author
[@Hatton:2003:SPE; @Sammonds:2017:MSI], have, in attempting to infer
posterior expectations and standard errors of parameters in a model
from data, found themselves unable to obtain those values with "off
the peg" least-squares fitting processes, and have had to resort to
bespoke computational approaches, with all the risks and duplication
of effort in software quality control this implies. This has been
particularly prevalent in cases where the distinction between
conditional and marginal standard errors matters.

## Model comparison step

@Dunstan:2022:ECB argue that all users of least-squares fitting should
supply the value of the marginal likelihood for each model they fit,
and note the current ubiquity of applications of least-squares fitting
in which no such value is supplied. They further point out that a
reason for general non-reporting of marginal likelihood values is the
perceived computational complexity of obtaining those values.

formulae having been in the open literature for decades,
@Dunstan:2022:ECB attribute the perceived complexity of computing the
marginal likelihood, which they believe leads to the absence of
marginal likelihood computations in most applications of least-squares
fitting, to a widespread failure to use these formulae.

# From the need to the software

However, @Dunstan:2022:ECB leave as an exercise for the reader finding
how to extract, from the outputs of standard least-squares fitting
software, the information required as input to the @Lindley:1980:ABM
and @Kass:1995:BF formulae, and how to perform the actual computation.
Both are somewhat challenging, the former because Gnuplot outputs its
results in a format that is not very readily interoperable with other
systems, and the latter because of the need to compute the determinant
of the Hessian. That is where the present software, `leastsqtobayes`,
comes in: it resolves the former issue using the text-processing
capabilities of Perl, and the latter using the matrix algebra
capabilities of Octave. The final output to the user includes all the
quantities for which an inferential need is identified above.
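
The core of that computation can be sketched as follows (a hedged illustration of the Laplace-approximation idea, not the exact @Lindley:1980:ABM or @Kass:1995:BF formulae as implemented in `leastsqtobayes`; all numerical values are invented):

```python
import numpy as np

# With -ln(likelihood) = chi2/2 plus a data-dependent constant common
# to all models, a Gaussian (Laplace) approximation about the
# chi-squared minimum gives, up to that common constant,
#   ln Z ~ -chi2_min/2 + ln prior(theta_hat)
#          + (k/2) ln(2*pi) - (1/2) ln det(H/2)
# where H is the Hessian of chi-squared at the minimum and k is the
# number of parameters.  All numbers below are invented.
chi2_min = 2.30
hessian = np.array([[750.0, 250.0],
                    [250.0, 125.0]])     # Hessian of chi-squared
prior_at_min = 0.01                      # prior density at the minimum
k = hessian.shape[0]

log_Z = (-0.5 * chi2_min
         + np.log(prior_at_min)
         + 0.5 * k * np.log(2.0 * np.pi)
         - 0.5 * np.log(np.linalg.det(hessian / 2.0)))
```

The determinant of the Hessian is the quantity whose computation motivates handing this step to Octave's matrix algebra in the actual package.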
