-
Notifications
You must be signed in to change notification settings - Fork 251
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[WIP] Density geometry revamp #1157
base: master
Are you sure you want to change the base?
Conversation
We now have the ability to generate conditional density distributions! using RDatasets, Gadfly
df = dataset("ggplot2", "diamonds")
plot(df, x=:Carat, color=:Cut, Geom.density(scale=:count, position=:fill)) @Mattriks and @bjarthur do you all know how I can adjust the x and y extents from an |
#781 is related |
working on #1152. Note: This is a WIP and currently completely breaks `Geom.density` and `Geom.violin` has several regressions.
this is necessary for allowing user control over ordering in stacked density plots
My original implementation was too clever in that it figured out the orientation of density and violin plots automatically. The logic ended up being quite convoluted and so I switched back to using the standard `orientation` logic and an explicit flag for whether a density plot is a violin or not. This commit also adds the ability to stack either raw densities and to create conditional density distributions.
There are no set of defaults that apply to both density and violin geometries so it's better if they each have their respective defaults
Simple KDEs should now be working
There's a bug in |
- adds support for horizontal violins - removes manual control over splitting (temp until position code is rewritten)
[ci skip]
Estimate the density of `x` at `n` points, and put the result in `x` and `y`. | ||
Smoothing is controlled by `bandwidth`. Used by [`Geom.density`](@ref Gadfly.Geom.density). | ||
""" | ||
const density = DensityStatistic |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
to be consistent with the rest of the API, i'd suggest keeping const density = DensityStatistic
and moving the docstrings for struct DensityStatistic
to the retainedStat.density
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will do
A general statistic for density plots (e.g. KDE plots and violin plots). | ||
See [`Geom.density`](@ref Gadfly.Geom.density) or [`Geom.violin`](@ref | ||
Gadfly.Geom.violin) for more details. | ||
""" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
in #1116 i purposely tried to teach new users the GoG way of thinking by making the Geom docstrings explicitly defer explanation of details to the corresponding Stat docstrings for "derived" geometries, not the other way around as you've done here. i feel this is important as it really is a different way of doing things that many find hard to grasp. does that make sense?
also, it would be good to be explicit about which aesthetics this statistic transforms, again so that the flow of data into making a graph is transparent. so in this case x
is transformed into x
and y
and is grouped by color
. in #1116 i defined aes2str
, to which one can input the output of {input,output}_aesthetics()
, to help ensure that docs don't get out of sync with code. might not be useful in this case because of the grouping aesthetic in :color
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Makes sense. I'll correct this.
|
||
```@example | ||
using Gadfly, RDatasets | ||
plot(dataset("ggplot2", "diamonds"), x=:Carat, color=:Cut, Geom.density(position=:fill), Guide.title("Conditional density estimate"), Coord.cartesian(ymax=1.0, xmax=5)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this long line will likely create a horizontal slider in the generated doc html. a hard line break would be nice.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good catch, I had already corrected this locally.
Geom.violin[(; bandwidth, adjust, kernel, trim, order)] | ||
|
||
Draws a violin plot which is a combination of [`Geom.density`](@ref) and | ||
[`Geom.boxplot`](@ref). This plot type is useful for comparing differences in |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i don't see a call to Geom.boxplot in the Geom.violin code. is this docstring correct in this regard?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I meant it stylistically, not technically, but that could change (see #1157 (comment))
nothing # hide | ||
``` | ||
![](diamonds_violin1.svg) | ||
""" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
no other Geom docstrings currently have examples. i'd suggest moving this to the gallery.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this would be nice to change, but it depends on JuliaDocs/Documenter.jl#736 anyway so I'll remove this for now.
""" | ||
Geom.density(; bandwidth, adjust, kernel, trim, scale, position, orientation, order) | ||
|
||
Draws a kernel density estimate. This is a cousin of [`Geom.histogram`](@ref) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it would be nice to document how Geom.density(Stat.identity) behaved. in other words, what aesthetics does Geom.density directly use.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point, I'll add this.
""" | ||
const density = DensityGeometry | ||
|
||
element_aesthetics(::DensityGeometry) = Symbol[] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
element_aesthetics should contain :x
, :y
, and :color
, no?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If I don't leave this blank, they are filled with autogenerated values so it's impossible to give useful error messages using Gadfly.assert_aesthetics_defined
. I wasn't sure how to get around this so I leave this blank and figure out errors later: https://github.com/GiovineItalia/Gadfly.jl/pull/1157/files#diff-9ec506bf78232ae17d082c22c2e66449R616
looking forward to having this functionality! in general it looks great. my only big question is whether separate Geoms are necessary (density and violin) with their own typedefs and render functions, or whether Stat.density could be combined with Geom.line, and Geom.density and Geom.violin could simply be aliases. sorry for all the hassle resolving the conflicts with #1116. |
also, does this break anything? is it backwards compatible with the former Geom.density and Geom.violin? if not, depwarns are warranted. and we should think about whether to merge it before or after fixing #1130 |
Thanks for the review. I won't have time to fix this for a bit and I want to make sure we get it right before merging.
AFAIK, there shouldn't be any regressions or changes to the API. There are however some changes to defaults that I think are better than the current ones. The primary change is that density plots by default now clip to the range of the data (this the same as
This might be achievable. The primary difficulty is that these two geoms are grouped in very different ways.
They are both essentially I think in general we should think about separating out the positioning code into structs because a lot of it is boilerplate. The same positioning code should work for bars, boxplots, violins, etc. This is what |
do you want to merge this before or after we support 0.7? there will likely be merge conflicts if we wait. |
I don't have time right now to revamp this PR so I'm okay with letting it get a bit stale and then finalize it a bit later. There are enough small edge cases that I need to handle that it's better to sit on this till it's ready. |
Split violin plot is a very useful function and it seems it's still not part of |
agreed! @tlnagy ?? |
Fixes #1152
Contributor checklist:
NEWS.md
squash
'ed orfixup
'ed junk commits with git-rebaseImplemented changes
Geom.density
: custom kernels, custom scaling of densities, stacking, trimmed distributions, adjustments of the computed bandwidth, vertical and horizontal orientationsGeom.violin
also gets all these features plus the ability to create split violins andgroup by color(EDIT: this is bigger than this PR) as wellGeom.violin
andGeom.density
now share the sameDensityStatistic
.There is still some ugliness here because I infer whether the user wants aEDIT: I now rely on a manual setting ondensity
orviolin
plot based on what variables are initialized in the aesthetic. I'm not wedded to this approach.orientation
Geom.boxplot
and maybe evenGeom.bar
.Here's a little taste of what's going to be possible soon:
TODO
render()
forGeom.density
Geom.violin
render()