fastadi submission #3
Comments
Update: I've dealt with both of these issues.
Thanks @alexpghayes. I still see some remaining
Created on 2020-10-07 by the reprex package (v0.3.0)
All of these are addressed in my commit from above, but perhaps not in ways that
oh great, then that'll give me a chance to compare currently coded
Reporting back: It was a bug in
That's a little weird since
Thanks @alexpghayes for your submission and for the improvements that have already been made to the package. Don't worry about the remaining

Given that, we would now like to proceed to the formal review stage, for which members of the project's advisory board @bbolker and @topepo have kindly agreed to review your package. They are asked to perform a two-stage review: the first part involves assessing the package against the standards as currently drafted, and the second is a more "traditional" review. We hope that, by the time we proceed to this second component, many aspects which might otherwise have arisen within a "traditional" unstructured review will already have been covered, thereby making the review process notably easier.

Our review system will ultimately perform much of the preceding automated assessment prior to actual submission, and reviewers will be provided with a high-level interactive graphical overview of the package's functions and their inter-relationships. Until the system reaches that point, reviewers can clone Alex's repo from github.com/RoheLab/fastadi, then run the following three lines in the
That should give you an interactive version something like this:

Instructions for review

@bbolker and @topepo, could you please now assess the package against the standards. Please do this in two phases:
In each case, please only note those standards which you judge the package not to conform to, along with a description of what you would expect this particular software package to do in order to conform to each standard. When you do that, please provide sufficient information on which standard you are referring to. (The standards themselves are all enumerated, but not yet at a necessarily stable state, so please provide enough information for anyone to clearly know which standard you are referring to regardless of potential changes in nomenclature.)

Please also note, as a separate list, all those standards which you think should not apply to this package, along with brief explanations of why.

Importantly, to aid us in refining the standards which will ultimately guide the peer review of statistical software, we also ask you to please consider whether you perceive any aspects of software (design, functionality, algorithmic implementations or applications, testing, and any other aspects you can think of) which you think might be able to be addressed by standards, and yet which are not addressed by our standards in their current form.

In particular, we note that the nominated category "Dimensionality Reduction, Clustering, and Unsupervised Learning" only partially describes @alexpghayes's package, notably because our categories effectively aim to describe the general aims of software, whereas in this case that category applies to much of the methodology, while the actual aim remains arguably beyond that scope. We will therefore be particularly interested in hearing your thoughts on the applicability or otherwise of the category-specific standards in this case.

To sum up, please post the following in this issue:
Once you've done that, we'll ask you to proceed to a more general review of the software, for which we'll provide more detail at that time. Thanks all for agreeing to be part of this!

Due date

We would like to have this review phase completed within 4 weeks, so by the 13th of November 2020. We accordingly suggest that you aim to have the first of the two tasks completed within two weeks, by the 30th of October. Could you both please also record approximately how much time you have spent on each review stage. Thank you!
Update for reviewers: @bbolker and @topepo, note that this repo now includes an R package which enables you to get a pre-formatted checklist for your reviews (inspired by, and with gratitude to, co-board member @stephaniehicks) by running the following lines:

```r
remotes::install_github("ropenscilabs/statistical-software-review")
library(statsoftrev) # the name of the package
rssr_standards_checklist(category = "unsupervised")
```

That will produce a markdown-formatted checklist in your clipboard, ready to paste where you like. Ping @noamross so you'll be notified of these conversations.
OK, will do. Note that I had to use

Also, there's a typo "funtions" in the
General Standards

Documentation
In vignettes/README/DESCRIPTION only; could be more prominent?

Statistical Terminology
Not sure about this. Uses standard definitions (the bulk of this package is mathematical and algorithmic rather than statistical).

Function-level Documentation
Supplementary Documentation
No associated pubs [yet? vignette looks like a ms in early stages of prep: readme says "the vignettes are currently scratch work for reference by the developers and are not yet ready for general consumption."]
[README says "In simulations fastadi often outperforms softImpute by a small margin.", but I don't know if/where this code lives]

Input Structures

Uni-variate (Vector) Input
Tabular Input
Software only accepts sparse matrices.

Missing or Undefined Values

Inf or NA values give

Output Structures
Testing

Test Data Sets
Responses to Unexpected Input
Algorithm Tests
Extended tests
Dimensionality Reduction, Clustering, and Unsupervised Learning Standards

Input Data Structures and Validation
Pre-processing and Variable Transformation
Documentation points out that results will be unreliable if a rank is chosen for approximation that is approximately equal to, or greater than, the rank of the input. (Testing the rank of a large matrix is expensive and probably impractical for typical input data; don't know if it is worth pointing out.)

Algorithms

Labelling
Prediction
Group Distributions and Associated Statistics
Return Results
Reporting Return Results
Documentation

Visualization
Testing
Input Scaling
Output Labelling
Prediction
Batch Processing
MINOR comments
This feedback is very useful, thank you!
Thanks @alexpghayes for volunteering to submit your fastadi package for trial "soft submission" to rOpenSci's new system for peer review of statistical software. This issue includes output from our automated assessment and reporting tools developed as part of this new system. These currently function as the following two distinct components (which will be integrated later): packgraph and autotest.
Created on 2020-10-06 by the reprex package (v0.3.0)
Most of these diagnostic messages are about the admissible ranges of single-value parameters, and can probably be safely ignored, or may indicate a minor need to tweak the documentation of those parameters to state their expected or admitted ranges. The parsing of description entries by autotest to estimate stated ranges is currently fairly crude, so updates to the documentation on your side to address these could provide useful test cases for refining those procedures.

Those aside, the only issues are generally a lack of control for the lengths of parameters presumed to be single-valued, and matching of character arguments regardless of case. If you could please ping here once you've addressed those, we'll post an updated autotest output and proceed to subsequent stages of the review process. Thanks!

Further information on autotest output
The output of autotest includes a column `yaml_hash`. This refers to the yaml specification used to generate the autotests, which can be generated locally by running `examples_to_yaml (<path>/<to>/<package>)`. Those specifications contain the `yaml_hash` values, and finding the matching value will show you the base code used to trigger the diagnostic messages. The `operation` column should then provide a sufficient description of what has been mutated with regard to the structure defined in the yaml.