DESIGN: Save package (versions) instead of unmarshal function #11

sebffischer · 2024-02-22T10:15:36Z

I think it might be worth changing the default marshal() implementation so that the marshalled object no longer stores the unmarshal function, but instead records the required packages (and their versions).
Specifically, I am referring to this line: https://github.com/HenrikBengtsson/marshal/blob/72e976c61f00861ceb0b4c0c0fd7016b4eee417a/R/marshal.default.R#L30

This would mean that, instead of only having to implement the marshal.<class> method, one would have to implement both marshal.<class> and unmarshal.<class>. The unmarshal generic could then verify that the packages required to unmarshal the object are loaded (including the package that contains the unmarshal method) before dispatching to the method.
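To make the proposal concrete, here is a minimal sketch of how it could look. All names in it (`marshal_sketch`, `unmarshal_sketch`, the class `my_model`, and the `payload`/`packages` fields) are hypothetical and are not part of the marshal package; it only illustrates storing package names and versions instead of a function, and checking them before dispatch.

```r
# Hypothetical sketch of the proposed design; not the marshal package API.

marshal_sketch <- function(x, ...) UseMethod("marshal_sketch")

# The marshal method stores the payload plus the packages (and versions)
# needed to restore it, but no unmarshal function.
marshal_sketch.my_model <- function(x, ...) {
  structure(
    list(
      payload = serialize(unclass(x), connection = NULL),
      # "stats" stands in for whichever package would provide the unmarshal method
      packages = c(stats = as.character(utils::packageVersion("stats")))
    ),
    class = c("my_model_marshalled", "marshalled_sketch")
  )
}

# The unmarshal generic checks that every required package can be loaded
# before it dispatches to the class-specific method.
unmarshal_sketch <- function(x, ...) {
  pkgs <- names(x$packages)
  ok <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  if (!all(ok)) {
    stop("Packages required to unmarshal are not available: ",
         paste(pkgs[!ok], collapse = ", "), call. = FALSE)
  }
  UseMethod("unmarshal_sketch")
}

unmarshal_sketch.my_model_marshalled <- function(x, ...) {
  structure(unserialize(x$payload), class = "my_model")
}

# Round trip with a toy object
model <- structure(list(coef = c(1, 2)), class = "my_model")
restored <- unmarshal_sketch(marshal_sketch(model))
identical(restored, model)
```

In this sketch the marshalled object carries only a raw vector and a small named character vector, so its size no longer depends on the unmarshal function or its srcrefs.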

The advantages of this approach would be:

Memory efficiency: saving the required packages (and versions) uses less space than saving the unmarshal function. Especially with the --with-keep.source option, the size of marshalled objects can explode with the current approach (admittedly, one could also just remove the srcrefs manually to address the latter problem). This is somewhat reminiscent of the R6 problems we were facing in mlr3, and our workaround there is similar to what I am suggesting here, i.e. not storing the methods alongside the objects but in the package.
Better error messages: one could also store the package versions and make this part of the standard (e.g. a compatibility matrix is stored alongside the package implementing the (un)marshal methods and can be consulted when calling (un)marshal). Here we would have to take care of what happens when an object written by package version A is read with package version B, where B < A: version B cannot know whether it can read the format written by version A, because A did not exist when B was released.
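The B < A case could be caught with a simple version comparison. The following is only a sketch under assumptions: `check_marshal_version` is a hypothetical helper, not part of the marshal package, and it covers just the "installed version is older than the writing version" case rather than a full compatibility matrix.

```r
# Hypothetical helper; not part of the marshal package.
# Compares the version recorded at marshal time with the installed version.
check_marshal_version <- function(pkg, written_with,
                                  installed = as.character(utils::packageVersion(pkg))) {
  written_with <- package_version(written_with)
  installed <- package_version(installed)
  if (installed < written_with) {
    # The older installed version cannot know the newer format.
    stop(sprintf(
      "Object was marshalled with %s %s, but version %s is installed.",
      pkg, as.character(written_with), as.character(installed)
    ), call. = FALSE)
  }
  invisible(TRUE)
}

# Passes: installed version is at least the version used for writing
check_marshal_version("stats", written_with = "1.0.0", installed = "1.2.0")

# Fails with an informative error: installed version is older
try(check_marshal_version("stats", written_with = "2.0.0", installed = "1.0.0"))
```

Whether an older version should refuse outright or attempt to read anyway is a design decision the sketch does not settle; it simply refuses.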

The disadvantages would be:

The package that implements the (un)marshal methods must be loaded when unmarshalling, which is not the case right now.
However, if this were to become a standard, the package that implements (un)marshal would usually also be the package that implements the actual functionality being marshalled.
With the current approach, the unmarshal function is guaranteed to be the same one that was used when the object was marshalled.

sebffischer changed the title ~~Idea: Save package (versions) instead of unmarshal function~~ DESIGN: Save package (versions) instead of unmarshal function Feb 22, 2024
