Introductory, model family, benchmarks, and technical specification vignettes #16
Conversation
This vignette features an example of fitting, plotting, and assessing a simple model using sgdnet, as well as a brief background.
This vignette provides a description of all the model families, as well as an example of a model fit for each of them.
…o algorithm-vignette
Merge branch 'algorithm-vignette' of https://github.com/jolars/sgdnet into algorithm-vignette # Conflicts: # data-raw/benchmarks.R
We are actually using vehicle, not segment, so change this
tests/testthat/test-families.R
@@ -13,7 +13,7 @@ test_that("test that all combinations run without errors", {
   family = c("gaussian", "binomial", "multinomial"),
   intercept = TRUE,
   sparse = FALSE,
-  alpha = c(0, 0.5, 1),
+  alpha = c(0, 0.75, 1),
Why this change?
I test with alpha at 0.5 elsewhere, so I just figured it might not hurt to change it up.
Fair enough.
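For context, a sketch of the combination grid this test appears to loop over (the `expand.grid()` framing is my guess at the test's structure, not a quote of it):

```r
# Hypothetical reconstruction of the test's settings grid: expand.grid()
# crosses every setting so each family/penalty combination is run once.
grid <- expand.grid(
  family = c("gaussian", "binomial", "multinomial"),
  intercept = TRUE,
  sparse = FALSE,
  alpha = c(0, 0.75, 1),
  stringsAsFactors = FALSE
)
nrow(grid)  # 3 families x 3 alpha values = 9 combinations
```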
vignettes/algorithm.Rmd
@schmidt2017. SAGA handles both strongly and non-strongly convex objectives -- even in the composite case -- making it applicable to a wide range of problems such as generalized linear regression with elastic net regularization, which is the problem
"generalized linear models" is a bit more common terminology. Cite Zou and Hastie (2005) for the elastic net.
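For context, the composite problem SAGA is being applied to here is the average of smooth per-sample losses plus the elastic net penalty of Zou and Hastie (2005), with the nonsmooth part handled through its proximal operator:

$$
\min_{\beta} \; \frac{1}{n} \sum_{i=1}^n f_i(\beta)
+ \lambda \left( \frac{1 - \alpha}{2} \lVert \beta \rVert_2^2
+ \alpha \lVert \beta \rVert_1 \right)
$$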
vignettes/algorithm.Rmd
Before the algorithm is set loose on data, its parameters have to be set up properly and its data (possibly) preprocessed. For illustrative purposes, we will proceed by example and use Gaussian univariate multiple regression
I'd drop `multiple` here - it's easily confused with `family = "mgaussian"`, and the fact that you are using sparsity / regularization should make it implicitly clear that we have more than one predictor.
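To ground the example, here is a minimal sketch of the kind of Gaussian fit the vignette walks through, assuming `sgdnet()` mirrors the glmnet-style interface this PR describes (family, alpha, and an automatically computed lambda path):

```r
library(sgdnet)

set.seed(1)
x <- matrix(rnorm(100 * 5), 100, 5)     # 100 observations, 5 predictors
y <- drop(x %*% rnorm(5)) + rnorm(100)  # Gaussian response

# Elastic net fit; alpha = 0.5 mixes the ridge and lasso penalties equally.
fit <- sgdnet(x, y, family = "gaussian", alpha = 0.5)
```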
vignettes/algorithm.Rmd
```{r}
sgdnet_sd <- function(x) sqrt(sum((x - mean(x))^2)/length(x))
n <- nrow(x)
x_bar <- colSums(x)/n
```
colMeans(x)
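Applied, the preprocessing step might look like the following vectorized sketch (the function name is mine; only the `colMeans()` substitution is the reviewer's suggestion):

```r
# Center each column and scale by the population (divide-by-n) standard
# deviation, as the vignette describes.
standardize <- function(x) {
  n <- nrow(x)
  x_bar <- colMeans(x)                              # column means
  x_sd <- sqrt(colSums(sweep(x, 2, x_bar)^2) / n)   # biased SD per column
  sweep(sweep(x, 2, x_bar), 2, x_sd, "/")           # center, then scale
}
```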
vignettes/algorithm.Rmd
We then construct the $\lambda$ path to be a $\log_e$-spaced sequence starting at $\lambda_{\text{max}}$ and finishing at $\frac{\lambda_{\text{max}}}{\lambda_{\text{min ratio}}}$. Thus we have
`lambda_max` times `lambda_min_ratio`, not divided by.
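In code, the corrected path might be computed as follows (names are illustrative, not the package's internals):

```r
# Log-spaced sequence from lambda_max down to lambda_max * lambda_min_ratio.
lambda_path <- function(lambda_max, lambda_min_ratio = 1e-4, nlambda = 100) {
  exp(seq(log(lambda_max), log(lambda_max * lambda_min_ratio),
          length.out = nlambda))
}
```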
vignettes/introduction.Rmd
## Background and motivation

**sgdnet** was developed as a Google Summer of Code 2018 project. Its goal was to be
I think you can leave GSoC mentions until the end as a funding acknowledgement. I expect you'll keep maintaining and improving this past the end of GSoC-2018
Yup!
vignettes/models.Rmd
where $\mathcal{L}(\beta_0,\beta; \mathbf{y}, \mathbf{X})$ is the log-likelihood of the model, $\lambda$ is the regularization strength, and $\alpha$ is the elastic net mixing parameter, such that $\alpha = 1$ results in the lasso and $\alpha = 0$ in the ridge penalty.
Citations
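For reference, one standard way of writing the objective these symbols describe, following the glmnet formulation [@friedman2010]:

$$
\min_{\beta_0, \beta} \;
-\mathcal{L}(\beta_0, \beta; \mathbf{y}, \mathbf{X})
+ \lambda \left( \frac{1 - \alpha}{2} \lVert \beta \rVert_2^2
+ \alpha \lVert \beta \rVert_1 \right)
$$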
vignettes/models.Rmd
$$
\text{Pr}(Y_i = c) =
\frac{e^{\beta_{0_c}+\beta_c^\intercal \mathbf{x}_i}}{\sum_{k = 1}^m{e^{\beta_{0_k}+\beta_k^\intercal \mathbf{x}_i}}},
$$
Maybe have `K` for the number of classes instead of `m`.
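With that suggestion applied, the softmax would read:

$$
\text{Pr}(Y_i = c) =
\frac{e^{\beta_{0_c} + \beta_c^\intercal \mathbf{x}_i}}{\sum_{k = 1}^{K} e^{\beta_{0_k} + \beta_k^\intercal \mathbf{x}_i}},
\qquad c = 1, \dots, K.
$$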
vignettes/models.Rmd
which is the overspecified version of this model. As in **glmnet** [@friedman2010], which much of this packages functinality is modeled after, however,
functionality
vignettes/models.Rmd
[@friedman2010], which much of this packages functinality is modeled after, however, we rely on the regularization of model to take care of this [@hastie2015, pp. 36-7].
Clarify this a bit more. (I believe you are saying that regularization implicitly induces a sum-to-zero constraint)
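A minimal statement of that point: the softmax probabilities are unchanged when every class's coefficients are shifted by a common vector,

$$
\frac{e^{(\beta_{0_c} + u_0) + (\beta_c + \mathbf{u})^\intercal \mathbf{x}_i}}{\sum_{k=1}^{K} e^{(\beta_{0_k} + u_0) + (\beta_k + \mathbf{u})^\intercal \mathbf{x}_i}}
= \frac{e^{\beta_{0_c} + \beta_c^\intercal \mathbf{x}_i}}{\sum_{k=1}^{K} e^{\beta_{0_k} + \beta_k^\intercal \mathbf{x}_i}}
\quad \text{for any } (u_0, \mathbf{u}),
$$

and among these equivalent solutions the quadratic part of the penalty is smallest for the one whose class coefficients average to zero, which is how the regularization resolves the overspecification.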
This is a good start - I only reviewed the vignettes since I think the other changes are already under review as #15. I particularly appreciated the technical write up - I finally (sort of) get the point of the lagged update structure.
Did you want another round of comments on this?
Hm, thanks but let's wait a little. I noticed that I had missed seeing to a couple of your comments too.
Merge branch 'master' of github.com:jolars/sgdnet into algorithm-vignette # Conflicts: # tests/testthat/test-families.R
README.Rmd
## Versioning

**eulerr** uses [semantic versioning](https://semver.org/).
Wrong package name
README.Rmd
[ridge](https://en.wikipedia.org/wiki/Tikhonov_regularization) and [lasso](https://en.wikipedia.org/wiki/Lasso_(statistics)) penalties.

**sgdnet** automatically fits the model across an automatically computed
Don't use "automatically" twice in one sentence.
inst/COPYRIGHTS
@@ -19,7 +19,8 @@ along with this program. If not, see <https://www.gnu.org/licenses/>.
 # scikit-learn
 
 Parts of this package are modified translations of source code from the
-Python package scikit-learn, which is licensed under the New BSD License
+Python package scikit-learn (including the contributed module lightning),
+which is licensed under the New BSD License
Possibly add URLs here
vignettes/sgdnet.bib
volume = {2},
shorttitle = {{{SAGA}}},
abstract = {In this work we introduce a new optimisation method called SAGA in the spirit of SAG, SDCA, MISO and SVRG, a set of recently proposed incremental gradient algorithms with fast linear convergence rates. SAGA improves on the theory behind SAG and SVRG, with better theoretical convergence rates, and has support for composite objectives where a proximal operator is used on the regulariser. Unlike SDCA, SAGA supports non-strongly convex problems directly, and is adaptive to any inherent strong convexity of the problem. We give experimental results showing the effectiveness of our method.},
journal = {Advances in Neural Information Processing Systems},
NIPS isn't a journal. Does Zotero support the `InProceedings` type?
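For reference, an `@inproceedings` entry along those lines might look like this (only the standard bibliographic facts are taken as given):

```bibtex
@inproceedings{defazio2014,
  title     = {{SAGA}: A Fast Incremental Gradient Method with Support
               for Non-Strongly Convex Composite Objectives},
  author    = {Defazio, Aaron and Bach, Francis and Lacoste-Julien, Simon},
  booktitle = {Advances in Neural Information Processing Systems},
  year      = {2014}
}
```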
vignettes/sgdnet.bib
abstract = {We analyze the stochastic average gradient (SAG) method for optimizing the sum of a finite number of smooth convex functions. Like stochastic gradient (SG) methods, the SAG method's iteration cost is independent of the number of terms in the sum. However, by incorporating a memory of previous gradient values the SAG method achieves a faster convergence rate than black-box SG methods. The convergence rate is improved from $O(1/\sqrt{k})$ to $O(1/k)$ in general, and when the sum is strongly-convex the convergence rate is improved from the sub-linear $O(1/k)$ to a linear convergence rate of the form $O(\rho^k)$ for $\rho < 1$. Further, in many cases the convergence rate of the new method is also faster than black-box deterministic gradient methods, in terms of the number of gradient evaluations. This extends our earlier work Le Roux et al. (Adv Neural Inf Process Syst, 2012), which only lead to a faster rate for well-conditioned strongly-convex problems. Numerical experiments indicate that the new algorithm often dramatically outperforms existing SG and deterministic gradient methods, and that the performance may be further improved through the use of non-uniform sampling strategies.},
language = {en},
number = {1-2},
journal = {Math. Program.},
Full journal name
vignettes/models.Rmd
penalty.

When the lasso penalty is in place, the regularization imposed on the coefficients takes the shape of an octahedron.
Not generally (only in three dimensions; in general the $\ell_1$ ball is a cross-polytope).
vignettes/introduction.Rmd
The goal of **sgdnet** is to be a one-stop solution for fitting elastic net-penalized generalized linear models. This is of course not novel in and of itself. Several packages,
I think this is too broad - I think it's unlikely we'll outperform glmnet in the $p \gg n$ setting or the $p \approx n \gg 1$ setting. I'd focus our attention on the $n \gg p$ setting.
vignettes/algorithm.Rmd
By default, the feature matrix in **sgdnet** is centered and scaled to unit variance. Here, however, we used the *biased* population standard deviation and divide by $n$ rather than $n-1$.
I'd move this to a footnote and rephrase. We're using the sample standard deviation here since we are only concerned with fitting the model to the data at hand. Since we aren't concerned with a population, biasedness (or not) is irrelevant
vignettes/algorithm.Rmd
The step size, $\gamma$, in SAGA is constant throughout the algorithm. For non-strongly convex objectives it is set to $1/(3L)$, where $L$ is the *Lipschitz* constant. Since we are picking a single sample at a time,
Be precise here -
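The comment is cut off here, but one way to make the claim precise (my reconstruction, not the reviewer's words) is to define $L$ as the maximum of the per-sample gradient Lipschitz constants,

$$
\lVert \nabla f_i(\beta) - \nabla f_i(\beta') \rVert_2 \le L_i \lVert \beta - \beta' \rVert_2,
\qquad L = \max_i L_i,
$$

so that for the squared-error loss $f_i(\beta) = \frac{1}{2}(\mathbf{x}_i^\intercal \beta - y_i)^2$ one gets $L_i = \lVert \mathbf{x}_i \rVert_2^2$, and $\gamma = 1/(3L)$.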
vignettes/algorithm.Rmd
nz <- Nonzeros(X[j]) # nonzero indices of X[j]

# Perform just-in-time updates of coefficients if X is sparse
X[j] <- LaggedUpdate(j,
For simplicity, I might focus on the dense (no lagged update) case first and then describe the lagged updates near the end.
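Following that suggestion, a dense (no lagged updates) SAGA step for the squared-error loss could be sketched like this in R; this is illustrative pseudocode in spirit, not the package's actual internals:

```r
# One dense SAGA step: beta are the coefficients, g_mem the n-by-p table of
# stored per-sample gradients, g_bar their running mean, gamma the step size.
saga_step <- function(beta, x, y, g_mem, g_bar, gamma) {
  n <- nrow(x)
  i <- sample(n, 1)                              # pick one sample at random
  g_new <- (sum(x[i, ] * beta) - y[i]) * x[i, ]  # gradient of 0.5*(x_i'b - y_i)^2
  # SAGA direction: new gradient, minus its stored value, plus the table mean
  beta <- beta - gamma * (g_new - g_mem[i, ] + g_bar)
  g_bar <- g_bar + (g_new - g_mem[i, ]) / n      # keep the mean in sync
  g_mem[i, ] <- g_new                            # refresh the gradient table
  list(beta = beta, g_mem = g_mem, g_bar = g_bar)
}
```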
…es, remove file field, add doi to friedman2010
I'll merge this one in now and upload the pkgdown site.
This pull request adds three vignettes to the package and updates one.
These are still a work in progress so I don't expect to merge this one right now, but I'd love for you to pitch in if you have any feedback.
Edit (18/13/8): I have now also generated a pkgdown site for the project, which I intend to link to as the final work product of GSoC.