Expansion of yspec range capabilities #147

Open
jenna-a2ai opened this issue Oct 26, 2023 · 1 comment

Comments

@jenna-a2ai

Hi Kyle,

This is Jenna from A2-AI. We’re using yspec more and more in our day-to-day workflow for quality control and really appreciate how well designed the package is! We have identified a gap in the functionality for checking that column entries fall within specified ranges. We were wondering if it would be possible to expand the capabilities of ys_document() and ys_check() to include discontinuous ranges, i.e. ranges that are the union of an interval and one or more discrete values. For example, we might use -999 as a flag in a column that otherwise has a known range, and we'd like to include this isolated value in the yspec range.

Consider the following Test.yml and its ys_document() output:

AMT:
  range: [[0, 100], -999]

library(yspec)
yspec::ys_document("Test.yml")

[Screenshot: ys_document() output for Test.yml]
Then Details might say: numeric; valid range: [0 to 100] or {-999}

Suppose instead that we’d like to specify multiple flags:

AMT:
  range: [[0, 100], -999, -99]

In which case, Details might say numeric; valid range: [0 to 100] or {-999, -99}
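To make the proposed semantics concrete, here is a minimal base R sketch. It is not how yspec currently works, and the helper names `in_range_spec()` and `format_range_spec()` are hypothetical. It assumes the parsed range arrives as a list in which length-2 elements are closed intervals and length-1 elements are isolated values.

```r
# Hypothetical helpers, NOT part of yspec: a sketch of how a mixed range
# such as [[0, 100], -999, -99] could be checked and described.

# TRUE for each element of x that lies in any interval or equals any
# isolated value in `spec`. NA inputs stay NA.
in_range_spec <- function(x, spec) {
  ok <- rep(FALSE, length(x))
  for (item in spec) {
    if (length(item) == 2) {
      # Length-2 element: closed interval [lower, upper]
      ok <- ok | (x >= item[1] & x <= item[2])
    } else {
      # Length-1 element: isolated valid value
      ok <- ok | (x == item[1])
    }
  }
  ok
}

# Build the text proposed for the Details column.
format_range_spec <- function(spec) {
  is_interval <- vapply(spec, function(item) length(item) == 2, logical(1))
  parts <- vapply(
    spec[is_interval],
    function(item) paste0("[", item[1], " to ", item[2], "]"),
    character(1)
  )
  values <- unlist(spec[!is_interval])
  if (length(values) > 0) {
    parts <- c(parts, paste0("{", paste(values, collapse = ", "), "}"))
  }
  paste(parts, collapse = " or ")
}

spec <- list(c(0, 100), -999, -99)
in_range_spec(c(50, -999, 150, -99), spec)
format_range_spec(spec)
```

With this representation, 50, -999 and -99 pass the check while 150 fails, and the description reads `[0 to 100] or {-999, -99}`, matching the Details text suggested above.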

Alternatively, we’ve considered a flag specifier item in the case of multiple flags.

Consider the following yaml:

AMT:
  range: [0, 100]
  flags: {"missing": -999, "outlier": -888}

This doesn’t render when ys_document() is called because flags isn't a defined item. If it did, the output could appear as key-value pairs in the same way set values do.

VARIABLE LABEL DETAILS FLAGS
AMT AMT numeric; valid range: [0 to 100] -999 = missing, -888 = outlier

or

VARIABLE LABEL DETAILS
AMT AMT numeric; valid range: [0 to 100]
flags: -999 = missing, -888 = outlier

Additionally, these flags could be included as valid values in range checks for ys_check().
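A rough base R sketch of the flag idea follows, assuming the `flags` item would parse into a named list. Neither `flags` nor the helpers `check_with_flags()` and `format_flags()` exist in yspec; they are hypothetical and only illustrate the behavior we have in mind.

```r
# Hypothetical helpers, NOT part of yspec or ys_check(): flag values are
# treated as valid in addition to the closed interval given by `range`.
check_with_flags <- function(x, range, flags = NULL) {
  in_range <- x >= range[1] & x <= range[2]
  is_flag <- x %in% unlist(flags)
  in_range | is_flag
}

# Render flags as "value = name" pairs, mirroring how set values display.
format_flags <- function(flags) {
  paste(unlist(flags), names(flags), sep = " = ", collapse = ", ")
}

flags <- list(missing = -999, outlier = -888)
amt <- c(10, -999, 250, -888)

check_with_flags(amt, range = c(0, 100), flags = flags)
format_flags(flags)
```

Here only 250 fails the check, and the flags render as `-999 = missing, -888 = outlier`, as in the tables above. Keeping flags separate from `range` means the interval keeps its plain `[lower, upper]` form while each flag value carries a label.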

Let us know what you think and whether this is something we could help implement. We’re also curious whether Metrum is currently using other approaches to address these challenges.

Thanks,

Jenna

cc: @dpastoor

@dpastoor
Contributor

dpastoor commented Nov 6, 2023

@jennaelwing I'm sitting here with kyle at acop talking about this :-) we just missed reviewing this!
