Add an `<epidemic>` class #159

pratikunterwegs · 2024-01-29T08:43:07Z

This draft PR fixes #156 by adding an <epidemic> class. This is scheduled to be merged after #158.

<epidemic> is a list-based class with four elements: the function name, the model data, the model inputs (parameters and composable elements), and the version hash for reproducibility or comparability.

Changes:

- All model functions now return <epidemic> as output;
- All helper functions previously operating on data.frame model outputs now operate on <epidemic> (e.g. epidemic_size());
- get_parameter() supports <epidemic> and allows accessing the model data ("data") and the parameters and composable inputs ("parameters");
- All vignettes, examples, and tests now return, expect, and use <epidemic> outputs.
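
For readers following the thread, a minimal sketch of what a list-based class with these four elements could look like. The constructor, the accessor, the element names, and the placeholder function name are all hypothetical and are not taken from the PR's code; only the "data" and "parameters" selectors echo the description above.

```r
# Hypothetical constructor: element names are assumptions, not the PR's code
new_epidemic_sketch <- function(fn_name, data, parameters, hash = NULL) {
  stopifnot(
    is.character(fn_name),
    is.data.frame(data),
    is.list(parameters)
  )
  structure(
    list(
      fn_name = fn_name,       # name of the model function that was called
      data = data,             # model output as a data.frame
      parameters = parameters, # parameters and composable inputs
      hash = hash              # version hash, may be NULL
    ),
    class = "epidemic_sketch"
  )
}

# Hypothetical accessor mirroring the "data" / "parameters" split
get_element_sketch <- function(x, what = c("data", "parameters")) {
  stopifnot(inherits(x, "epidemic_sketch"))
  what <- match.arg(what)
  x[[what]]
}

output <- new_epidemic_sketch(
  fn_name = "some_model_function", # placeholder name
  data = data.frame(time = 0:2, infectious = c(10, 15, 25)),
  parameters = list(transmission_rate = 1.3, recovery_rate = 0.5)
)

# Pull out the model inputs or the model data
get_element_sketch(output, "parameters")$recovery_rate
get_element_sketch(output, "data")
```

With this shape, any helper that previously took a data.frame has to unwrap the "data" element first, which is the interoperability cost raised later in the review.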

Bisaloo

I don't think it's the right approach to solve the reproducibility issue. This is not something that should be addressed at the package-level, but rather in trainings, such as the one Andree produced on research compendia.
This gives users a false sense of security and could incentivize them to compare outputs in situations where they should not be compared. If the time frame between two simulations is such that the package has been updated, it is much safer to encourage users to re-run all simulations, as other elements in the whole stack may have changed.

Additionally, this leads to a more complex return object. This PR moves away from the nice simple data.frame, which is immediately clear to e.g. Excel users, to a more complex list. The change in the output type is also a step back in terms of interoperability IMO, since many generic data science packages work directly with data.frames.

As stated on Slack, there could be a case for including some metadata in the output if this greatly facilitates downstream analysis via, or the integration with, scenarios. But this is a very slippery slope that comes at the cost of increased complexity and a steeper learning curve for the package. Doing it for reproducibility reasons is not correct IMO. As such:

- the included info should be limited to the bare minimum. I don't think the package version belongs here
- this needs to be synced with a more precise design plan for scenarios

Bisaloo · 2024-01-31T12:22:25Z

R/epidemic_class.R

+
+  # get package version or hash
+  # NOTE: is NULL when installed from local version
+  hash <- utils::packageDescription("epidemics")[["RemoteSha"]]


This doesn't exist for CRAN packages. Having worked on R packages for reproducibility in the past, I can say with certainty this whole topic is much more complex than one might expect, and full of edge cases.
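
To illustrate the point, a small sketch of how the lookup behaves when the field is absent. `describe_build_sketch()` is a hypothetical helper, not part of {epidemics}; it uses {stats}, which ships with R, as a stand-in for a package that was not installed from a remote repository.

```r
# Hypothetical helper: report the RemoteSha field alongside the version string
describe_build_sketch <- function(pkg) {
  sha <- utils::packageDescription(pkg)[["RemoteSha"]]
  list(
    remote_sha = sha, # NULL when the DESCRIPTION has no RemoteSha field
    version = as.character(utils::packageVersion(pkg))
  )
}

# {stats} is a base package, so its DESCRIPTION carries no RemoteSha field
build <- describe_build_sketch("stats")
is.null(build$remote_sha)
build$version
```

Any code relying on the hash therefore needs an explicit branch for the `NULL` case, and the version string alone does not identify a development build.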

pratikunterwegs · 2024-01-31T13:40:49Z

Thanks - this is still a draft and there will be an opportunity to discuss this, among other things, in the upcoming P3 development meeting. I'll update this PR to reflect any decisions we make.

pratikunterwegs · 2024-01-31T15:56:54Z

Thanks for the feedback @Bisaloo - this was discussed in the (just concluded) meeting with @rozeggo, @BlackEdder, @TimTaylor and @bahadzie.

In summary we have decided against the structure of <epidemic> shown in this draft, but an <epidemic> class is likely to be implemented in the foreseeable future.

A discussion of, and decisions about, the new structure of <epidemic> and {epidemics} are available as a link on Slack - any feedback there is welcome. I will open issues based on meeting decisions by mid-day tomorrow, leaving time for any feedback (and I can edit them based on any feedback given later too).

pratikunterwegs · 2024-01-31T16:08:32Z

> I don't think it's the right approach to solve the reproducibility issue. This is not something that should be addressed at the package-level, but rather in trainings, such as the one Andree produced on research compendia.
> This gives users a false sense of security and could incentivize them to compare outputs in situations where they should not be compared. If the time frame between two simulations is such that the package has been updated, it is much safer to encourage users to re-run all simulations, as other elements in the whole stack may have changed.

We have decided against including any versioning and agree that environment management is a user task.

> Additionally, this leads to a more complex return object. This PR moves away from the nice simple data.frame, which is immediately clear to e.g. Excel users, to a more complex list. The change in the output type is also a step back in terms of interoperability IMO, since many generic data science packages work directly with data.frames.

I agree and think users should be responsible for parameter management. Keeping in mind that not all users might be able to do this successfully, we have decided to structure model outputs as a class with encapsulated parameters, but hopefully one that can be handled by methods for data.frames - possibly a nested data.frame, data.table, or tibble. Feedback on this is welcome.
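
As one possible reading of the nested data.frame option, a rough sketch in base R: a one-row data.frame whose ordinary columns hold the parameters and whose list-column holds the model output. The column names and values are made up for illustration, and this is not a decided design.

```r
# Made-up model output for illustration
timeseries <- data.frame(
  time = 0:2,
  susceptible = c(990, 985, 975),
  infectious = c(10, 15, 25)
)

# One row per model run: parameters as ordinary columns
output <- data.frame(transmission_rate = 1.3, recovery_rate = 0.5)

# The model output is stored in a list-column next to its parameters
output$data <- list(timeseries)

# The object is still a data.frame, so data.frame tools apply
is.data.frame(output)
nrow(output)

# Recover the time series for the first (and only) run
output$data[[1]]
```

The parameters stay attached to the output, while the object itself remains something data.frame-based tooling can work with.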

> As stated on Slack, there could be a case for including some metadata in the output if this greatly facilitates downstream analysis via, or the integration with, scenarios. But this is a very slippery slope that comes at the cost of increased complexity and a steeper learning curve for the package. Doing it for reproducibility reasons is not correct IMO. As such:

I agree - but see above: we have decided that we should better facilitate parameter management and scenario comparability for users. Again, any feedback on whether to have an <epidemic> class, and on its eventual structure, is welcome.

pratikunterwegs added 30 commits January 25, 2024 12:41

Add population_change Cpp struct

8a2f7b9

Struct member fn to calc current pop change

a9eeda1

Add pop change mechanic to diphtheria model

9d4b0ff

Population change Cpp struct uses Rcpp objs

5cf793f

Diphtheria model src file + RcppExports updated

5f43382

Diphtheria args checker handles pop_changes

46421ff

Update diphtheria model R frontend and docs

e694bf8

Fix formatting

62199ff

Correct calculation of infectious/hospitalised

d334140

Correct pop change time comparison

f79973f

Test population change for diphtheria model

d6effe4

Misc tests and checks for diphtheria model

3074972

Improve diphtheria checks and tests

b08b524

Add diphtheria model vignette, fixes #157

1a94499

Fix mistake in ebola vignette text

ac6ff57

Update WORDLIST

218aecb

Add <epidemic> class, WIP #156

c0d1344

Unify <epidemic> related docs

00854d6

get_parameter() operates on <epidemic>

8c28b65

Update helper fns for <epidemic> class

ff7db2b

Update model fns to return <epidemic>

9ef4f8a

Update misc Rd files

89d7436

Add tests for <epidemic> class

0473c81

Update all tests to account for <epidemic> return

ecb1fbf

Update or add model fn output snapshots

c36f439

Update NAMESPACE, WORDLIST, and site reference

f507d49

Explicitly namespace head()

1999722

new_infections() returns data.frame not data.table

9c076a1

Update vignettes

417b706

Update Rd files

d4706cb

pratikunterwegs self-assigned this Jan 30, 2024

pratikunterwegs and others added 2 commits January 31, 2024 09:49

Update Readme.Rmd with <epidemic> class

0a24884

Automatic readme update

afc3ad3

Bisaloo requested changes Jan 31, 2024

pratikunterwegs closed this Jan 31, 2024

pratikunterwegs deleted the feature/epidemic-class branch February 13, 2024 11:10
