DESIGN: Save package (versions) instead of unmarshal function #11

Open
sebffischer opened this issue Feb 22, 2024 · 0 comments

sebffischer commented Feb 22, 2024

One idea would be to change the default marshal() implementation so that it no longer stores the unmarshal function in the marshalled object and instead records the required packages (and their versions).
Specifically, I am referring to this line: https://github.com/HenrikBengtsson/marshal/blob/72e976c61f00861ceb0b4c0c0fd7016b4eee417a/R/marshal.default.R#L30

This would mean that, instead of only having to implement the marshal.<class> method, one would have to implement both marshal.<class> and unmarshal.<class>. The unmarshal generic could then verify that the packages required to unmarshal the object are loaded (including the package that contains the unmarshal method) before dispatching to the method.
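
A minimal sketch of what this could look like, assuming the required packages are stored as an attribute on the marshalled object. The names `my_marshal`, `my_unmarshal`, `toy_model` and the `required_packages` attribute are hypothetical and only serve the illustration; they are not part of the marshal package.

```r
# Hypothetical stand-ins for the generics; NOT the marshal package's own code.
my_marshal <- function(x, ...) UseMethod("my_marshal")

my_unmarshal <- function(x, ...) {
  # Before dispatching, make sure every recorded package can be loaded.
  # requireNamespace() loads the namespace if possible and returns FALSE otherwise.
  pkgs <- attr(x, "required_packages")
  absent <- pkgs[!vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)]
  if (length(absent) > 0) {
    stop("Cannot unmarshal; missing package(s): ", paste(absent, collapse = ", "))
  }
  UseMethod("my_unmarshal")
}

# Toy class: the marshal method records package names, not a function.
my_marshal.toy_model <- function(x, ...) {
  structure(
    list(payload = unclass(x)),
    class = "toy_model_marshalled",
    required_packages = "stats"  # assumption: the package providing the methods
  )
}

# The matching unmarshal method lives in the package, not in the object.
my_unmarshal.toy_model_marshalled <- function(x, ...) {
  structure(x$payload, class = "toy_model")
}

model <- structure(list(coef = c(1, 2, 3)), class = "toy_model")
restored <- my_unmarshal(my_marshal(model))
```

With this layout the marshalled object carries only data and package names, and a missing package is reported by the generic before any method runs.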

The advantages of this approach would be:

  • Memory efficiency: saving the required packages (and versions) uses less space than saving the unmarshal function. Especially with the --with-keep.source option, the size of marshalled objects can explode with the current approach (admittedly, one could also remove the srcrefs manually to address this). This is somewhat reminiscent of the R6 problems we faced in mlr3; our workaround there is similar to what I am suggesting here, i.e. store the methods in the package rather than alongside the objects.
  • Better error messages: one could also store the package versions and make this part of the standard (e.g. a compatibility matrix is stored alongside the package implementing the (un)marshal methods and can be consulted when calling (un)marshal). Here we would have to take care with what happens when an object written by package version A is read with package version B, where B < A: version B cannot know whether it can read the format written by version A, because A did not exist when B was released.
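
The version case distinction above could be sketched roughly as follows. The function `check_writer_version` and its `oldest_readable` argument (a one-row stand-in for a compatibility matrix) are hypothetical, and the version numbers are made up for illustration.

```r
# Hypothetical helper: classify an object by the package version that wrote it.
#   written         - version recorded in the marshalled object
#   installed       - version of the package doing the unmarshalling
#   oldest_readable - oldest writer version the installed package claims to read
check_writer_version <- function(written, installed, oldest_readable) {
  written <- package_version(written)
  installed <- package_version(installed)
  oldest_readable <- package_version(oldest_readable)
  if (written > installed) {
    # Written by a release newer than the installed one: the installed
    # package cannot know whether the format changed in the meantime.
    "unknown"
  } else if (written < oldest_readable) {
    "incompatible"
  } else {
    "compatible"
  }
}

check_writer_version("1.2.0", installed = "1.1.0", oldest_readable = "1.0.0")
check_writer_version("1.0.5", installed = "1.1.0", oldest_readable = "1.0.0")
```

In a real package the installed version would presumably come from `utils::packageVersion()`, and the "unknown" case is where a design decision (error, warning, or best-effort read) would be needed.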

The disadvantages would be:

  • The package that implements the (un)marshal methods must be loaded, which is not the case right now.
    However, if this were to become a standard, the package that implements (un)marshal would usually also be the package that implements the actual marshalling functionality.
  • With the current approach, the unmarshal function is guaranteed to be the same one that was used to marshal the object.
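
To make the trade-off concrete, here is a rough size comparison. It uses a simplified stand-in for an object that carries its own unmarshal function, not the marshal package's actual structure; the field names `unmarshal` and `required_packages` are assumptions. Parsing with `keep.source = TRUE` mimics a function that retains its source references.

```r
# Build a function that keeps its source references (a "srcref" attribute).
restore <- eval(parse(
  text = "function(x) {\n  # rebuild the object from its payload\n  x$payload\n}",
  keep.source = TRUE
))
environment(restore) <- baseenv()  # avoid serializing an unrelated environment

# Stand-in for the current approach: the function travels with the object,
# so the reader is guaranteed to use exactly the writer's function.
with_function <- list(payload = 1:3, unmarshal = restore)

# Same object after stripping the source references manually.
without_srcref <- list(payload = 1:3, unmarshal = utils::removeSource(restore))

# Stand-in for the proposed approach: only package names are stored.
with_packages <- list(payload = 1:3, required_packages = "stats")

# Size in bytes of the serialized representation.
size <- function(x) length(serialize(x, connection = NULL))
sizes <- c(
  with_function = size(with_function),
  without_srcref = size(without_srcref),
  with_packages = size(with_packages)
)
print(sizes)
```

The absolute numbers depend on the R version and on the function stored, so only the ordering of the three sizes is meaningful here.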
sebffischer changed the title from "Idea: Save package (versions) instead of unmarshal function" to "DESIGN: Save package (versions) instead of unmarshal function" on Feb 22, 2024