-
Notifications
You must be signed in to change notification settings - Fork 3
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: Add asdict() #109
feat: Add asdict() #109
Conversation
Note - I've tried changing the type hint to |
I'm sorry for the delay, today had a lot of non-work emergencies. This is my suggestion:
You need to add the
I created and pushed a branch that does this: tb-add-asdict. The tests pass. |
Thanks! I wanted to avoid another I found a solution using |
035630d
to
16d3f60
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've never heard of TypeGuard
before, that's cool. I think that's probably the recipe to clean up a lot of the type ignore statements currently in inspect.
My only remaining suggestion is that I think asdict
could be a member function of the Metric
class, with all the statements just acting on self
.
I had the same thought, but I'm of two minds. I strongly prefer being consistent with established convention when possible, and both However, packaging it as a method removes the need for an import (and possibly makes it more discoverable). Curious what @nh13 @tfenne @clintval think (NB: if we were to make |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see why this convenience function is desired. It is useful when you have a Metric
and you don't know if it is an attrs-defined or dataclass-defined instance. In my experience, I usually know which flavor I'm dealing with and use the respective import accordingly:
attr.asdict(metric)
dataclasses.asdict(metric)
I lean towards letting the user import their specific "as dict" implementation for their use case over providing another import to do functionally the same thing but I won't let that opinion of mine block this PR! Here's another idea though, what about adding an __iter__(self)
dunder method on the base Metric class so we can start doing dict(metric)
instead, which uses a built-in? I'm also a bigger fan of as_dict()
instead of asdict()
if we're allowed to vote on function naming too!
Will this function be needed when we eventually remove attrs support? Should we consider not adding it to the public API because eventually all Metrics should be using @dataclass
and can use the corresponding dataclasses.asdict
built-in?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is useful when you have a Metric and you don't know if it is an attrs-defined or dataclass-defined instance. In my experience, I usually know which flavor I'm dealing with and use the respective import accordingly:
Exactly, it's a convenience function to abstract the concern away.
IMO if we intend to support both attr.s
and dataclass
, then we should have an API that works with Metric
(as a pseudo-alias for the union of AttrsInstance
and DataclassInstance
).
This is also intended to facilitate the MetricWriter
in #107 and other such utilities which may not be able to assume which import to use.
what about adding an iter(self) dunder method on the base Metric class so we can start doing dict(metric) instead, which uses a built-in?
See below comment - dict()
and asdict()
do different things, and I think the latter implementation is preferable here.
I'm also a bigger fan of as_dict() instead of asdict() if we're allowed to vote on function naming too!
I agree that snakecasing is generally preferable, but I have a stronger preference for not conflicting with the established naming convention from dataclass
and attr.s
Will this function be needed when we eventually remove attrs support? Should we consider not adding it to the public API because eventually all Metrics should be using @DataClass and can use the corresponding dataclasses.asdict built-in?
At that time we could simply import dataclasses.asdict
into this module to avoid breakage? Or replace from fgpyo.util.metric import asdict
with from dataclasses import asdict
(which I consider another argument in favor of leaving the naming as is)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Make sense. I'm onboard.
My approval contingent on cleaning up the actually reachable-unreachable branches with a TypeError
. Thanks Matt!
|
||
Returns: | ||
A dictionary representation of the given metric. | ||
""" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This overlooks the format_value
method on Metric
that is used to format values when written to a file.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@nh13 the omission was deliberate. I left a comment on the topic in the PR description, but given how much conversation this one has attracted it was easy to overlook 🙂
NB: it may be valuable to permit formatting/casting the values as part of this function (e.g. by adding a parameter
format: bool = False
), but to do so we'll probably want to extractMetric.format_value()
to a standalone function instead of a classmethod
I do not think we should have an asdict()
function that changes the value types by default. This function is primarily intended as a dispatcher, selecting the correct (dataclasses
or attr.s
) function depending on how the Metric
in question was decorated.
Ideally, this function will be deprecated once we drop support for attrs
. As I mentioned in my comments above to Clint, when that happens it would be preferable to be able to transparently replace this with dataclasses.asdict
.
At that time we could simply import
dataclasses.asdict
into this module to avoid breakage? Or replacefrom fgpyo.util.metric import asdict
withfrom dataclasses import asdict
I am open to adding an argument to optionally support formatting (e.g. format_values: bool = False
), with the stipulation that it should be False
by default and the caveat that I expect it to add debt and increase friction when we deprecate attr.s
.
However, I'd prefer to leave as is, and then call format_value
on the resulting dict when necessary, e.g.
metric_dict: dict[str, str] = {key: format_value(val) for key, val in asdict(metric)}
Thoughts?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let’s omit it but document it clearly.
I do like that the format_value is a class method so that any new Metric type can perform custom formatting but overriding the class method. I think passing in a parsing function, like defopt, in rare situations causes a conflict when we want to format the type differently.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like the idea of overriding the classmethod, but wouldn't consider doing so in practice. Given that it specifies formatting behavior for a wide variety of primitive and compound types, it doesn't seem to lend itself to easy extension or modification. (Since in order to override formatting for one type, you would have to override them all.)
Codecov ReportAll modified and coverable lines are covered by tests ✅
Additional details and impacted files@@ Coverage Diff @@
## main #109 +/- ##
==========================================
+ Coverage 88.53% 88.74% +0.20%
==========================================
Files 16 16
Lines 1727 1750 +23
Branches 321 372 +51
==========================================
+ Hits 1529 1553 +24
+ Misses 132 131 -1
Partials 66 66 ☔ View full report in Codecov by Sentry. |
fix: typeguard import doc: update docstring refactor: import Instance types refactor: import Instance types
* feat: add asdict * fix: typeguard fix: typeguard import doc: update docstring refactor: import Instance types refactor: import Instance types * fix: 3.8 compatible Dict typing * refactor: make instance checks private * tests: add coverage * fix: typeerror * doc: clarify that asdict does not format values
* feat: add asdict * fix: typeguard fix: typeguard import doc: update docstring refactor: import Instance types refactor: import Instance types * fix: 3.8 compatible Dict typing * refactor: make instance checks private * tests: add coverage * fix: typeerror * doc: clarify that asdict does not format values
* feat: add asdict * fix: typeguard fix: typeguard import doc: update docstring refactor: import Instance types refactor: import Instance types * fix: 3.8 compatible Dict typing * refactor: make instance checks private * tests: add coverage * fix: typeerror * doc: clarify that asdict does not format values
I'd like to add a function to convert a
Metric
instance to a dict. (In support of #107 )I'm currently receiving the following mypy errors, which I suspect are due to a bug in how
Metric
is typed - as mypy appears to be inferring thatMetric
is a union ofDataclassInstance
andtype[DataclassInstance]
, when it should only be the former.NB: it may be valuable to permit formatting/casting the values as part of this function (e.g. by adding a parameter
format: bool = False
), but to do so we'll probably want to extractMetric.format_value()
to a standalone function instead of a classmethodEdit: copying my explanation of the updated solution from Slack:
For the curious - the issue was two-fold, and due to my own incorrect typing.
dataclasses.is_dataclass
accepts either an instance or a class object, and has aTypeGuard
to narrow the type of the argument toDataclassInstance | type[DataclassInstance]
. I had to add a helper function to override this guard and narrow the type further toDataclassInstance
.Meanwhile,
attr.has
only accepts a class object. I was passing an instance, which was the source of one type error. Fixing this by callingattr.has(metric.__class__
was insufficient, because this did not narrow the type of the metric instance, so I added a similar helper forAttrsInstance
.