r - Density of each group of weighted geom_density sum to one
How can I group a density plot, and have the density of each group sum to one, when using weighted data?
The ggplot2 documentation for geom_density() suggests a hack for using weighted data: dividing by the sum of the weights. When the data are grouped, however, this means that the combined density of all groups totals one. I would like the density of each group to total one.
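To make the problem concrete, here is a minimal sketch in base R. The toy data frame and its numbers are made up for illustration and are not taken from the movies data. It shows that dividing by the overall sum of the weights gives each group only its share of the total, not a total of one.

```r
# Hypothetical toy data: two groups with very different vote totals.
toy <- data.frame(
  action = c(0, 0, 1, 1),
  votes  = c(10, 30, 20, 140)
)

# Normalise by the overall sum of the weights, as the hack suggests.
toy$w_global <- toy$votes / sum(toy$votes)

# Sum the normalised weights within each group.
per_group <- tapply(toy$w_global, toy$action, sum)
print(per_group)

# The groups together total one, but each group alone does not.
print(sum(per_group))
```

With weights like these, a grouped density plot would draw each curve with an area equal to the group's share of the votes.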
I have found two clumsy ways to do this. The first is to treat each group as a separate dataset:
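library(ggplot2)  # in recent ggplot2 versions the movies data lives in the ggplot2movies package
m <- ggplot()
# one layer per group, so sum(votes) is taken over that group's rows only
m + geom_density(data = movies[movies$Action == 0, ], aes(rating, weight = votes/sum(votes)), fill = NA, colour = "black") +
  geom_density(data = movies[movies$Action == 1, ], aes(rating, weight = votes/sum(votes)), fill = NA, colour = "blue")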
The obvious disadvantages are the manual handling of factor levels and aesthetics. I also tried using the windowing functionality of the data.table package to create a new column with the total votes per Action group, and dividing by that instead:
library(data.table)
movies.dt <- data.table(movies)
setkey(movies.dt, Action)
# add the per-group vote total as a new column, by reference
movies.dt[, votes.per.group := sum(votes), by = Action]
m <- ggplot(movies.dt, aes(x = rating, weight = votes/votes.per.group, group = Action, colour = Action))
m + geom_density(fill = NA)
Are there neater ways to do this? Because of the size of my tables, I'd rather not replicate rows by their weighting for the sake of using frequency.
I think an auxiliary table might be your only option. I had a similar problem. The issue seems to be that, when ggplot uses aggregating functions in aes(...), it applies them to the whole dataset, not to the subsetted data. So when you write aes(weight=votes/sum(votes)), the votes in the numerator is subsetted based on Action, but the votes in the denominator, sum(votes), is not. The same is true for implicit grouping with facets.
If anyone else has a way around this, I'd love to hear it.
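For what it's worth, the auxiliary-column idea can also be sketched in base R with ave(), which computes a per-group total aligned to the original rows. This is only a sketch under assumptions: the toy data frame, its values, and the column names votes_per_group and w are hypothetical stand-ins for the movies table, and the plotting step is skipped if ggplot2 is not installed.

```r
# Hypothetical toy data standing in for the movies table.
toy <- data.frame(
  rating = c(3.5, 6.0, 7.2, 4.1, 5.5, 8.3),
  votes  = c(10, 30, 60, 20, 40, 140),
  action = c(0, 0, 0, 1, 1, 1)
)

# ave() applies FUN within each level of the grouping variable and
# returns a vector with one value per original row.
toy$votes_per_group <- ave(toy$votes, toy$action, FUN = sum)

# Per-group normalised weights, computed before ggplot ever sees the data.
toy$w <- toy$votes / toy$votes_per_group

# Check: the weights of each group should now sum to one.
group_totals <- tapply(toy$w, toy$action, sum)
print(group_totals)

# Build the plot only if ggplot2 is available; call print(p) to draw it.
if (requireNamespace("ggplot2", quietly = TRUE)) {
  p <- ggplot2::ggplot(
    toy,
    ggplot2::aes(x = rating, weight = w, colour = factor(action))
  ) +
    ggplot2::geom_density(fill = NA)
}
```

Because the weights are precomputed as an ordinary column, nothing inside aes(...) needs to aggregate, so the same column should behave the same way under grouping or facets.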