match inner and outer keys #46

Open
wants to merge 2 commits into
base: main
2 changes: 1 addition & 1 deletion Project.toml
@@ -1,7 +1,7 @@
name = "KeyedDistributions"
uuid = "2576fb08-064d-4cab-b15d-8dda7fcb9a6d"
authors = ["Invenia Technical Computing Corporation"]
-version = "0.1.11"
+version = "0.1.12"

[deps]
AutoHashEquals = "15f4f7f2-30c1-5605-9d31-71845cf9641f"
12 changes: 12 additions & 0 deletions src/KeyedDistributions.jl
@@ -54,6 +54,12 @@ for T in (:Distribution, :Sampleable)
"lengths of key vectors $key_lengths must match " *
"size of distribution $(_size(d))"
))
+if d isa Distribution && mean(d) isa KeyedArray && !(axiskeys(mean(d)) == keys)
@glennmoy (Member), Nov 8, 2022:

Won't we run into the same issue with cov, or with any other function that might return a KeyedArray?

Would it be better to rekey the underlying parameters and just mask the exception altogether?

I feel like wrapping a MvN around a KeyedArray, then wrapping that in a KeyedDistribution is akin to rekeying anyway?

Collaborator, Author:

It looks like we won't actually need to look at cov:

julia> mvn = MvNormal(ka, sigma);

julia> mean(mvn)
1-dimensional KeyedArray(NamedDimsArray(...)) with keys:
   t  3-element Vector{String}
And data, 3-element Vector{Float64}:
 ("a")  0.3981265148452625
 ("b")  0.23250652731094346
 ("c")  0.09181862970601418

julia> cov(mvn)
3×3 Matrix{Float64}:
 0.0702741  0.0       0.0
 0.0        0.689994  0.0
 0.0        0.0       0.800233

# which is because of

julia> cholesky(sigma)
Cholesky{Float64, Matrix{Float64}}
U factor:
3×3 UpperTriangular{Float64, Matrix{Float64}}:
 0.265093  0.0       0.0
  ⋅        0.830659  0.0
  ⋅         ⋅        0.894558

And yes, this

> Would it be better to rekey the underlying parameters and just mask the exception altogether?

is a good idea.
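
For anyone wanting to reproduce the session above: ka and sigma are not defined in the snippet, so the setup below is an assumption (a random keyed mean along a dimension named t, and a plain diagonal covariance matrix). It is only a minimal sketch of the same check, not the exact values shown above.

```julia
using AxisKeys: KeyedArray, wrapdims, axiskeys
using Distributions: MvNormal
using LinearAlgebra: Diagonal
using Statistics: mean, cov

# Assumed setup: the original comment does not show how `ka` and `sigma` were built.
ka = wrapdims(rand(3); t=["a", "b", "c"])
sigma = Matrix(Diagonal(rand(3) .+ 0.1))  # plain, unkeyed, positive-definite matrix

mvn = MvNormal(ka, sigma)

# The mean is stored as passed in, so it still carries the inner keys ...
mean_is_keyed = mean(mvn) isa KeyedArray
inner_keys = axiskeys(mean(mvn))

# ... while the covariance was built from a plain Matrix and comes back unkeyed.
cov_is_keyed = cov(mvn) isa KeyedArray
```

If sigma were itself a KeyedArray the result for cov could differ; that case is not covered in the session above.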

@mzgubic (Collaborator, Author), Nov 10, 2022:

> Would it be better to rekey the underlying parameters and just mask the exception altogether?

I think this is actually quite hard. We would have to recreate the underlying distribution with keyless, unnamed arrays, and the issue is that constructors differ between distributions. We could possibly do something nasty like keyless_unname.(fieldnames...) and call the default constructor, but that sounds like asking for trouble.

Does that leave us with throwing an exception? Or should we simply continue ostriching the issue of mismatching inner and outer keys?

Member:

I think it would be good to prevent confusion where possible, so I'd support doing both of the following:

  1. add the mean sanity check to give a warning/exception generally
  2. rekey for common distributions such as MvNormal (with warning)

Support can then be extended to other distributions as the need arises, and we can document the care that should be taken when using a general distribution with keyed fields.
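
A rough sketch of what point 2 could look like for MvNormal only. The helper name strip_inner_keys is hypothetical and not part of KeyedDistributions; it assumes AxisKeys.keyless_unname (mentioned earlier in this thread) returns the plain parent array, and that rebuilding from mean and cov is acceptable for MvNormal.

```julia
using AxisKeys: AxisKeys, KeyedArray, wrapdims
using Distributions: MvNormal
using Statistics: mean, cov

# Hypothetical helper (not in KeyedDistributions): rebuild an MvNormal from plain
# arrays so that only the outer KeyedDistribution keys remain.
function strip_inner_keys(d::MvNormal)
    mean(d) isa KeyedArray || return d  # no inner keys, nothing to do
    @warn "Dropping keys of the underlying MvNormal; only the outer keys are kept"
    return MvNormal(AxisKeys.keyless_unname(mean(d)), cov(d))
end

# Fallback: distributions without a dedicated method are returned untouched,
# so the general sanity check (point 1) still applies to them.
strip_inner_keys(d) = d

# Usage with an assumed keyed mean and identity covariance.
ka = wrapdims(rand(3); t=["a", "b", "c"])
mvn = MvNormal(ka, [1.0 0.0 0.0; 0.0 1.0 0.0; 0.0 0.0 1.0])
plain = strip_inner_keys(mvn)
```

The result could then be passed on, e.g. KeyedDistribution(strip_inner_keys(mvn), ["a", "b", "c"]). Going through mean and cov sidesteps the field-by-field reconstruction, but it only works where the constructor accepts exactly those two quantities, which is why this would need one method per supported distribution.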

+    throw(ArgumentError(
+        "Distribution keys $(axiskeys(mean(d))) do not match " *
+        "KeyedDistribution keys $(keys)"
+    ))
+end
L = Tuple(:_ for _ in 1:length(key_lengths))
return new{F, S, typeof(d), L}(d, keys)
end
@@ -65,6 +71,12 @@ for T in (:Distribution, :Sampleable)
"lengths of key vectors $key_lengths must match " *
"size of distribution $(_size(d))"
))
+if d isa Distribution && mean(d) isa KeyedArray && !(named_axiskeys(mean(d)) == named_keys)
+    throw(ArgumentError(
+        "Distribution keys $(named_axiskeys(mean(d))) do not match " *
+        "KeyedDistribution keys $(named_keys)"
+    ))
+end

return new{F, S, typeof(d), keys(named_keys)}(d, values(named_keys))
end
8 changes: 8 additions & 0 deletions test/runtests.jl
@@ -347,6 +347,14 @@ using Test

@test d([1]) == d[[1]] == KeyedDistribution(GenericMvTDist(3, m[[1]], submat(W, [1])), [1])
end

+@testset "construct with distribution backed by KeyedArray" begin
+    ka = wrapdims(rand(3); t=["a", "b", "c"])
+    mvn = MvNormal(ka, ones(3))
+    @test_throws ArgumentError KeyedDistribution(mvn, ["a", "b", "not c"])
+    @test_throws ArgumentError KeyedDistribution(mvn; t=["a", "b", "not c"])
+    @test_throws ArgumentError KeyedDistribution(mvn; not_t=["a", "b", "c"])
+end
end

@testset "NamedDims functions" begin