dplyr R group totals differ between factors
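【Posted】: 2015-01-22 18:42:14 【Question】: I have been trying all day to summarise a dataset in R with dplyr, but I seem to be either a) getting inconsistent results or b) no longer thinking straight!
My dataset `resi_type` (see below) captures population data and waste generation rates for a city. I am trying to summarise the figures at three levels: geographic zone, constituency, and socio-economic area (low, medium, high). These nest by size: there are 437 individual areas, each belonging to one of three socio-economic categories; each area lies in exactly one of six constituencies; and each of the six constituencies sits within one of four geo_zones.
I have been impressed by the speed and ease of use of dplyr, and have been using it to summarise my data and add some calculated values. For example, to summarise the whole dataset at constituency level: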
constit_sum <-
  resi_type %.%
  group_by(constit, grouped_types) %.%
  summarise(area = sum(area),
            constit_pop = first(constit_pop),
            wg_rate = first(per_capita)) %.%
  mutate(prop = area/sum(area)*100) %.%
  mutate(pop = (prop/100)*constit_pop) %.%
  mutate(waste = (wg_rate*pop)/1000)
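This gives me a nice table. If I sum the rows, I get a total population of 939,370 and a total "waste" production of 744.3238 (tonnes).
Now, when trying to produce a county-level summary, I have been using two pieces of code. When I sum across the rows, this one gives the same result as above, i.e. a total population of 939,370 and a total "waste" production of 744.3238 (tonnes):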
county_sum <-
  constit_sum %.%
  group_by(grouped_types) %.%
  summarise(area = sum(area),
            waste = sum(waste))
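However, the following block, which is a slightly different approach to the same problem, gives a different result: a total population of 939,370 but a total "waste" production of 757.8447 (tonnes).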
county_sum2 <-
  resi_type %.%
  group_by(grouped_types) %.%
  summarise(area = sum(area),
            wg_rate = first(per_capita)) %.%
  mutate(prop = area/sum(area)*100) %.%
  mutate(pop = (prop/100)*939370) %.%
  mutate(waste = (wg_rate*pop)/1000)
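I seem to be going round in circles, because I assumed that however I slice the data, and at whatever level I do the analysis, the total waste produced should be the same. Or is there something wrong with my code? I have gone cross-eyed looking at this, and I hope someone can help explain it to me!
Here's hoping...!
Thanks in advance
Marty
Dataset = resi_type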
structure(list(geo_zone = structure(c(2L, 2L, 2L, 2L, 2L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 3L, 3L, 3L, 1L, 3L, 3L, 3L, 3L,
3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L,
3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L,
3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L,
3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L,
3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 4L,
4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L,
4L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 4L, 4L, 4L, 4L, 4L, 4L, 4L,
4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L,
4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L,
4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L,
4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L,
4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L,
4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L), .Label = c("Mainland North",
"Mainland South", "Mainland West", "Mombasa Island"), class = "factor"),
constit = c("likoni", "likoni", "likoni", "likoni", "likoni",
"likoni", "likoni", "likoni", "likoni", "likoni", "likoni",
"likoni", "likoni", "likoni", "likoni", "likoni", "likoni",
"likoni", "likoni", "likoni", "likoni", "likoni", "likoni",
"likoni", "likoni", "likoni", "likoni", "likoni", "likoni",
"likoni", "likoni", "likoni", "likoni", "likoni", "likoni",
"likoni", "likoni", "likoni", "likoni", "likoni", "likoni",
"likoni", "likoni", "likoni", "likoni", "likoni", "likoni",
"likoni", "likoni", "likoni", "likoni", "likoni", "likoni",
"likoni", "likoni", "nyali", "nyali", "nyali", "likoni",
"likoni", "likoni", "likoni", "nyali", "nyali", "nyali",
"nyali", "nyali", "nyali", "nyali", "nyali", "nyali", "nyali",
"nyali", "nyali", "nyali", "nyali", "nyali", "nyali", "nyali",
"nyali", "nyali", "nyali", "nyali", "nyali", "nyali", "nyali",
"nyali", "nyali", "nyali", "kisauni", "nyali", "nyali", "nyali",
"nyali", "nyali", "nyali", "nyali", "nyali", "nyali", "nyali",
"nyali", "kisauni", "kisauni", "nyali", "nyali", "nyali",
"nyali", "kisauni", "kisauni", "nyali", "nyali", "nyali",
"nyali", "nyali", "nyali", "nyali", "kisauni", "kisauni",
"kisauni", "kisauni", "kisauni", "nyali", "nyali", "nyali",
"nyali", "nyali", "nyali", "kisauni", "kisauni", "nyali",
"nyali", "nyali", "kisauni", "kisauni", "kisauni", "kisauni",
"kisauni", "kisauni", "nyali", "kisauni", "kisauni", "kisauni",
"kisauni", "kisauni", "kisauni", "kisauni", "nyali", "nyali",
"nyali", "kisauni", "kisauni", "kisauni", "kisauni", "kisauni",
"nyali", "nyali", "nyali", "nyali", "kisauni", "kisauni",
"kisauni", "nyali", "nyali", "nyali", "nyali", "kisauni",
"kisauni", "nyali", "nyali", "nyali", "nyali", "nyali", "nyali",
"nyali", "nyali", "nyali", "kisauni", "kisauni", "kisauni",
"kisauni", "kisauni", "nyali", "nyali", "nyali", "nyali",
"nyali", "nyali", "nyali", "nyali", "nyali", "nyali", "nyali",
"nyali", "nyali", "nyali", "nyali", "nyali", "nyali", "nyali",
"nyali", "kisauni", "nyali", "nyali", "kisauni", "kisauni",
"kisauni", "kisauni", "kisauni", "kisauni", "kisauni", "kisauni",
"nyali", "kisauni", "kisauni", "kisauni", "kisauni", "kisauni",
"kisauni", "kisauni", "kisauni", "kisauni", "kisauni", "jomvu",
"jomvu", "jomvu", "nyali", "jomvu", "jomvu", "jomvu", "jomvu",
"jomvu", "jomvu", "jomvu", "jomvu", "jomvu", "jomvu", "jomvu",
"jomvu", "jomvu", "jomvu", "jomvu", "jomvu", "jomvu", "jomvu",
"jomvu", "jomvu", "jomvu", "jomvu", "jomvu", "changamwe",
"jomvu", "jomvu", "jomvu", "jomvu", "jomvu", "jomvu", "jomvu",
"changamwe", "jomvu", "jomvu", "jomvu", "jomvu", "jomvu",
"jomvu", "jomvu", "jomvu", "jomvu", "jomvu", "jomvu", "jomvu",
"changamwe", "changamwe", "changamwe", "changamwe", "changamwe",
"changamwe", "changamwe", "changamwe", "changamwe", "changamwe",
"changamwe", "changamwe", "changamwe", "changamwe", "changamwe",
"changamwe", "changamwe", "changamwe", "changamwe", "changamwe",
"changamwe", "changamwe", "changamwe", "changamwe", "changamwe",
"changamwe", "changamwe", "changamwe", "changamwe", "changamwe",
"changamwe", "changamwe", "changamwe", "changamwe", "changamwe",
"changamwe", "changamwe", "changamwe", "changamwe", "mvita",
"mvita", "mvita", "mvita", "mvita", "mvita", "mvita", "mvita",
"mvita", "mvita", "mvita", "mvita", "mvita", "mvita", "mvita",
"mvita", "mvita", "mvita", "changamwe", "changamwe", "changamwe",
"changamwe", "changamwe", "changamwe", "changamwe", "changamwe",
"mvita", "mvita", "mvita", "mvita", "mvita", "mvita", "mvita",
"mvita", "mvita", "mvita", "mvita", "mvita", "mvita", "mvita",
"mvita", "mvita", "mvita", "mvita", "mvita", "mvita", "mvita",
"mvita", "mvita", "mvita", "mvita", "mvita", "mvita", "mvita",
"mvita", "mvita", "mvita", "mvita", "mvita", "mvita", "mvita",
"mvita", "mvita", "mvita", "mvita", "mvita", "mvita", "mvita",
"mvita", "mvita", "mvita", "mvita", "mvita", "mvita", "mvita",
"mvita", "mvita", "mvita", "mvita", "mvita", "mvita", "mvita",
"mvita", "mvita", "mvita", "mvita", "mvita", "mvita", "mvita",
"mvita", "mvita", "mvita", "mvita", "mvita", "mvita", "mvita",
"mvita", "mvita", "mvita", "mvita", "mvita", "mvita", "mvita",
"mvita", "mvita", "mvita", "mvita", "mvita", "mvita", "mvita",
"mvita", "mvita", "mvita", "mvita", "mvita", "mvita", "mvita",
"mvita", "mvita", "mvita", "mvita", "mvita", "mvita", "mvita",
"mvita", "mvita", "mvita", "mvita"), constit_pop = c(166008L,
166008L, 166008L, 166008L, 166008L, 166008L, 166008L, 166008L,
166008L, 166008L, 166008L, 166008L, 166008L, 166008L, 166008L,
166008L, 166008L, 166008L, 166008L, 166008L, 166008L, 166008L,
166008L, 166008L, 166008L, 166008L, 166008L, 166008L, 166008L,
166008L, 166008L, 166008L, 166008L, 166008L, 166008L, 166008L,
166008L, 166008L, 166008L, 166008L, 166008L, 166008L, 166008L,
166008L, 166008L, 166008L, 166008L, 166008L, 166008L, 166008L,
166008L, 166008L, 166008L, 166008L, 166008L, 185990L, 185990L,
185990L, 166008L, 166008L, 166008L, 166008L, 185990L, 185990L,
185990L, 185990L, 185990L, 185990L, 185990L, 185990L, 185990L,
185990L, 185990L, 185990L, 185990L, 185990L, 185990L, 185990L,
185990L, 185990L, 185990L, 185990L, 185990L, 185990L, 185990L,
185990L, 185990L, 185990L, 185990L, 194065L, 185990L, 185990L,
185990L, 185990L, 185990L, 185990L, 185990L, 185990L, 185990L,
185990L, 185990L, 194065L, 194065L, 185990L, 185990L, 185990L,
185990L, 194065L, 194065L, 185990L, 185990L, 185990L, 185990L,
185990L, 185990L, 185990L, 194065L, 194065L, 194065L, 194065L,
194065L, 185990L, 185990L, 185990L, 185990L, 185990L, 185990L,
194065L, 194065L, 185990L, 185990L, 185990L, 194065L, 194065L,
194065L, 194065L, 194065L, 194065L, 185990L, 194065L, 194065L,
194065L, 194065L, 194065L, 194065L, 194065L, 185990L, 185990L,
185990L, 194065L, 194065L, 194065L, 194065L, 194065L, 185990L,
185990L, 185990L, 185990L, 194065L, 194065L, 194065L, 185990L,
185990L, 185990L, 185990L, 194065L, 194065L, 185990L, 185990L,
185990L, 185990L, 185990L, 185990L, 185990L, 185990L, 185990L,
194065L, 194065L, 194065L, 194065L, 194065L, 185990L, 185990L,
185990L, 185990L, 185990L, 185990L, 185990L, 185990L, 185990L,
185990L, 185990L, 185990L, 185990L, 185990L, 185990L, 185990L,
185990L, 185990L, 185990L, 194065L, 185990L, 185990L, 194065L,
194065L, 194065L, 194065L, 194065L, 194065L, 194065L, 194065L,
185990L, 194065L, 194065L, 194065L, 194065L, 194065L, 194065L,
194065L, 194065L, 194065L, 194065L, 117487L, 117487L, 117487L,
185990L, 117487L, 117487L, 117487L, 117487L, 117487L, 117487L,
117487L, 117487L, 117487L, 117487L, 117487L, 117487L, 117487L,
117487L, 117487L, 117487L, 117487L, 117487L, 117487L, 117487L,
117487L, 117487L, 117487L, 132692L, 117487L, 117487L, 117487L,
117487L, 117487L, 117487L, 117487L, 132692L, 117487L, 117487L,
117487L, 117487L, 117487L, 117487L, 117487L, 117487L, 117487L,
117487L, 117487L, 117487L, 132692L, 132692L, 132692L, 132692L,
132692L, 132692L, 132692L, 132692L, 132692L, 132692L, 132692L,
132692L, 132692L, 132692L, 132692L, 132692L, 132692L, 132692L,
132692L, 132692L, 132692L, 132692L, 132692L, 132692L, 132692L,
132692L, 132692L, 132692L, 132692L, 132692L, 132692L, 132692L,
132692L, 132692L, 132692L, 132692L, 132692L, 132692L, 132692L,
143128L, 143128L, 143128L, 143128L, 143128L, 143128L, 143128L,
143128L, 143128L, 143128L, 143128L, 143128L, 143128L, 143128L,
143128L, 143128L, 143128L, 143128L, 132692L, 132692L, 132692L,
132692L, 132692L, 132692L, 132692L, 132692L, 143128L, 143128L,
143128L, 143128L, 143128L, 143128L, 143128L, 143128L, 143128L,
143128L, 143128L, 143128L, 143128L, 143128L, 143128L, 143128L,
143128L, 143128L, 143128L, 143128L, 143128L, 143128L, 143128L,
143128L, 143128L, 143128L, 143128L, 143128L, 143128L, 143128L,
143128L, 143128L, 143128L, 143128L, 143128L, 143128L, 143128L,
143128L, 143128L, 143128L, 143128L, 143128L, 143128L, 143128L,
143128L, 143128L, 143128L, 143128L, 143128L, 143128L, 143128L,
143128L, 143128L, 143128L, 143128L, 143128L, 143128L, 143128L,
143128L, 143128L, 143128L, 143128L, 143128L, 143128L, 143128L,
143128L, 143128L, 143128L, 143128L, 143128L, 143128L, 143128L,
143128L, 143128L, 143128L, 143128L, 143128L, 143128L, 143128L,
143128L, 143128L, 143128L, 143128L, 143128L, 143128L, 143128L,
143128L, 143128L, 143128L, 143128L, 143128L, 143128L, 143128L,
143128L, 143128L, 143128L, 143128L, 143128L, 143128L, 143128L,
143128L, 143128L), grouped_types = structure(c(1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 1L, 1L, 1L, 1L, 1L, 3L, 3L,
3L, 3L, 3L, 3L, 3L, 3L, 3L, 1L, 3L, 3L, 3L, 3L, 3L, 1L, 1L,
1L, 2L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 3L, 1L, 1L, 2L, 1L, 1L,
1L, 1L, 2L, 1L, 1L, 1L, 1L, 3L, 3L, 3L, 1L, 1L, 1L, 1L, 3L,
1L, 3L, 3L, 1L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 1L, 3L, 3L, 3L,
3L, 3L, 1L, 1L, 3L, 3L, 1L, 2L, 1L, 1L, 1L, 3L, 3L, 3L, 1L,
1L, 1L, 1L, 1L, 1L, 3L, 1L, 1L, 1L, 1L, 2L, 2L, 3L, 3L, 1L,
1L, 1L, 1L, 2L, 2L, 3L, 2L, 2L, 3L, 3L, 3L, 3L, 3L, 3L, 3L,
3L, 3L, 3L, 2L, 2L, 2L, 1L, 1L, 1L, 2L, 1L, 1L, 2L, 2L, 2L,
2L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 2L, 1L, 1L, 1L, 1L, 2L, 1L, 1L, 1L, 2L, 1L, 1L, 1L, 2L,
3L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 1L, 2L, 1L, 1L, 2L, 2L, 2L,
1L, 3L, 1L, 1L, 2L, 3L, 1L, 2L, 1L, 1L, 2L, 2L, 3L, 2L, 2L,
3L, 3L, 3L, 1L, 1L, 2L, 2L, 1L, 2L, 1L, 1L, 1L, 1L, 3L, 2L,
2L, 2L, 2L, 2L, 2L, 1L, 2L, 1L, 1L, 1L, 1L, 1L, 3L, 2L, 2L,
1L, 1L, 1L, 1L, 2L, 2L, 1L, 2L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 3L,
1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 1L, 1L, 1L, 2L, 3L, 1L, 1L,
1L, 1L, 2L, 2L, 2L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 1L, 1L,
1L, 1L, 2L, 1L, 1L, 2L, 2L, 1L, 1L, 2L, 1L, 1L, 1L, 2L, 2L,
2L, 3L, 2L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 3L, 3L, 3L, 2L,
3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 2L, 2L, 2L, 1L, 1L, 2L,
1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 1L, 2L, 2L, 2L, 2L,
2L, 1L, 2L, 2L, 1L, 2L, 2L, 2L, 1L, 2L, 1L, 2L, 2L, 2L, 1L,
1L, 3L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 3L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 1L, 2L, 2L, 2L, 2L, 2L), .Label = c("low",
"medium", "high"), class = "factor"), area = c(191461L, 168928L,
881707L, 43546L, 39139L, 98823L, 22327L, 86611L, 58418L,
60798L, 56243L, 239978L, 4088L, 4184L, 39670L, 32892L, 33701L,
51806L, 52514L, 50464L, 128450L, 36563L, 7722L, 242320L,
4166L, 36158L, 70785L, 43925L, 70614L, 28876L, 40984L, 725320L,
628762L, 78149L, 39327L, 1040985L, 286288L, 324669L, 237726L,
269182L, 389987L, 291501L, 6729L, 322114L, 154042L, 73540L,
421880L, 7814L, 31154L, 52520L, 20187L, 570102L, 143059L,
1032717L, 238886L, 487527L, 412965L, 44010L, 337910L, 103214L,
898486L, 257766L, 2346497L, 40165L, 286848L, 15204L, 14741L,
33759L, 31367L, 10659L, 16966L, 8623L, 35058L, 5673L, 68267L,
705371L, 949977L, 486446L, 294029L, 134848L, 302860L, 33036L,
53250L, 68122L, 62749L, 68404L, 73999L, 132052L, 276413L,
96435L, 93399L, 11964L, 63652L, 43107L, 41734L, 8942L, 106987L,
73466L, 15260L, 34230L, 26790L, 102071L, 66920L, 53821L,
41854L, 112277L, 35686L, 121283L, 442860L, 161307L, 268652L,
99335L, 132137L, 44353L, 12662L, 46855L, 82968L, 107930L,
508985L, 62745L, 938082L, 22117L, 936310L, 94858L, 73017L,
52606L, 6627L, 24497L, 259809L, 232665L, 54735L, 198384L,
625396L, 210216L, 69182L, 122513L, 116519L, 615935L, 183869L,
191174L, 46026L, 45894L, 115477L, 526100L, 620402L, 409119L,
61246L, 50378L, 382692L, 226353L, 30415L, 140897L, 380308L,
278496L, 140738L, 90931L, 309373L, 675839L, 126616L, 28072L,
359436L, 265410L, 61959L, 14987L, 163470L, 63586L, 202991L,
17073L, 153207L, 10945L, 32625L, 6528L, 66156L, 41513L, 71556L,
59877L, 1672723L, 83184L, 127108L, 225371L, 7148L, 55633L,
23490L, 33440L, 31874L, 57616L, 141644L, 4041L, 16330L, 24755L,
21670L, 17487L, 18316L, 2185L, 57400L, 65359L, 2195L, 3315L,
1921L, 149230L, 27015L, 125189L, 8282L, 123498L, 43192L,
28750L, 53547L, 90610L, 111713L, 1389550L, 46932L, 65567L,
34963L, 157530L, 495876L, 35865L, 69871L, 33037L, 15238L,
187162L, 25614L, 222164L, 208159L, 35492L, 74961L, 119257L,
338205L, 157776L, 269244L, 7808L, 81022L, 117785L, 53097L,
88359L, 120146L, 25215L, 5604L, 44767L, 107211L, 48531L,
62563L, 30239L, 39535L, 18375L, 50473L, 118626L, 6583L, 25521L,
83484L, 37988L, 142569L, 60286L, 29201L, 21472L, 45229L,
218326L, 136081L, 72108L, 44545L, 419200L, 118830L, 293110L,
28082L, 34409L, 28544L, 70845L, 883447L, 50206L, 501740L,
292555L, 11381L, 760137L, 46906L, 45658L, 12068L, 31536L,
26448L, 66957L, 60306L, 63473L, 92440L, 33038L, 40690L, 4358L,
15420L, 35874L, 241844L, 103774L, 231520L, 55938L, 4980L,
15129L, 71766L, 15052L, 51907L, 58131L, 11222L, 84234L, 18250L,
55818L, 354058L, 426951L, 78515L, 69888L, 61814L, 63906L,
10022L, 12016L, 17379L, 41985L, 52158L, 26534L, 6521L, 24839L,
52534L, 6259L, 12217L, 27762L, 21291L, 13541L, 49876L, 10837L,
452375L, 71490L, 51393L, 138363L, 141327L, 22491L, 14763L,
24576L, 49290L, 103754L, 10742L, 85254L, 13606L, 3300L, 18141L,
16879L, 35271L, 82284L, 695830L, 4878L, 18671L, 2561L, 21473L,
31871L, 5064L, 37972L, 11353L, 13481L, 60994L, 10852L, 4068L,
15985L, 1380L, 90067L, 182569L, 111214L, 373818L, 29520L,
92995L, 15263L, 51544L, 14380L, 62169L, 31736L, 33060L, 34099L,
22353L, 3371L, 4004L, 130913L, 6064L, 35123L, 30165L, 32239L,
27727L, 65874L, 17705L, 15342L, 27021L, 32604L, 48038L, 43721L,
21798L, 23312L, 44106L, 34939L, 16593L, 5387L, 12524L, 88162L,
18039L, 475786L, 5763L, 8036L, 79110L, 9284L, 38040L, 63066L,
4939L, 5665L, 47305L, 47126L, 32030L, 10339L, 33543L, 6560L,
119694L, 45586L, 62892L, 56458L, 20836L, 38241L, 132671L,
4101L, 3229L, 32904L, 6779L, 21641L, 95824L, 182038L, 33146L,
11859L, 17811L, 19589L, 28067L, 92553L, 17485L, 25332L, 39660L,
4862L, 13969L, 18905L, 37109L, 37983L, 21651L), per_capita = c(0.55,
0.55, 0.55, 0.55, 0.55, 0.55, 0.55, 0.55, 0.55, 0.89, 0.89,
0.55, 0.55, 0.55, 0.55, 0.55, 1.33, 1.33, 1.33, 1.33, 1.33,
1.33, 1.33, 1.33, 1.33, 0.55, 1.33, 1.33, 1.33, 1.33, 1.33,
0.55, 0.55, 0.55, 0.89, 0.55, 0.55, 0.55, 0.55, 0.55, 0.55,
0.55, 1.33, 0.55, 0.55, 0.89, 0.55, 0.55, 0.55, 0.55, 0.89,
0.55, 0.55, 0.55, 0.55, 1.33, 1.33, 1.33, 0.55, 0.55, 0.55,
0.55, 1.33, 0.55, 1.33, 1.33, 0.55, 1.33, 1.33, 1.33, 1.33,
1.33, 1.33, 1.33, 0.55, 1.33, 1.33, 1.33, 1.33, 1.33, 0.55,
0.55, 1.33, 1.33, 0.55, 0.89, 0.55, 0.55, 0.55, 1.33, 1.33,
1.33, 0.55, 0.55, 0.55, 0.55, 0.55, 0.55, 1.33, 0.55, 0.55,
0.55, 0.55, 0.89, 0.89, 1.33, 1.33, 0.55, 0.55, 0.55, 0.55,
0.89, 0.89, 1.33, 0.89, 0.89, 1.33, 1.33, 1.33, 1.33, 1.33,
1.33, 1.33, 1.33, 1.33, 1.33, 0.89, 0.89, 0.89, 0.55, 0.55,
0.55, 0.89, 0.55, 0.55, 0.89, 0.89, 0.89, 0.89, 0.55, 0.55,
0.55, 0.89, 0.89, 0.89, 0.89, 0.89, 0.55, 0.55, 0.55, 0.55,
0.55, 0.55, 0.55, 0.89, 0.55, 0.55, 0.55, 0.55, 0.89, 0.55,
0.55, 0.55, 0.89, 0.55, 0.55, 0.55, 0.89, 1.33, 0.89, 0.89,
0.89, 0.89, 0.89, 0.89, 0.89, 0.55, 0.89, 0.55, 0.55, 0.89,
0.89, 0.89, 0.55, 1.33, 0.55, 0.55, 0.89, 1.33, 0.55, 0.89,
0.55, 0.55, 0.89, 0.89, 1.33, 0.89, 0.89, 1.33, 1.33, 1.33,
0.55, 0.55, 0.89, 0.89, 0.55, 0.89, 0.55, 0.55, 0.55, 0.55,
1.33, 0.89, 0.89, 0.89, 0.89, 0.89, 0.89, 0.55, 0.89, 0.55,
0.55, 0.55, 0.55, 0.55, 1.33, 0.89, 0.89, 0.55, 0.55, 0.55,
0.55, 0.89, 0.89, 0.55, 0.89, 0.55, 0.55, 0.55, 0.55, 0.55,
0.55, 0.55, 0.55, 0.55, 0.55, 0.55, 0.55, 0.55, 0.55, 0.55,
0.55, 0.55, 0.55, 0.55, 0.55, 0.55, 1.33, 0.55, 0.55, 0.55,
0.55, 0.89, 0.89, 0.89, 0.89, 0.55, 0.55, 0.55, 0.89, 1.33,
0.55, 0.55, 0.55, 0.55, 0.89, 0.89, 0.89, 0.55, 0.55, 0.55,
0.55, 0.55, 0.89, 0.89, 0.89, 0.55, 0.55, 0.55, 0.55, 0.89,
0.55, 0.55, 0.89, 0.89, 0.55, 0.55, 0.89, 0.55, 0.55, 0.55,
0.89, 0.89, 0.89, 1.33, 0.89, 0.55, 0.55, 0.55, 0.89, 0.89,
0.89, 0.89, 0.89, 1.33, 1.33, 1.33, 0.89, 1.33, 1.33, 1.33,
1.33, 1.33, 1.33, 1.33, 1.33, 1.33, 0.89, 0.89, 0.89, 0.55,
0.55, 0.89, 0.55, 0.55, 0.89, 0.89, 0.89, 0.89, 0.89, 0.89,
0.89, 0.89, 0.89, 0.89, 0.89, 0.89, 0.89, 0.89, 0.89, 0.89,
0.89, 0.89, 0.89, 0.89, 0.89, 0.89, 0.89, 0.89, 0.89, 0.89,
0.89, 0.89, 0.89, 0.89, 0.89, 0.89, 0.89, 0.89, 0.55, 0.55,
0.89, 0.89, 0.89, 0.89, 0.89, 0.89, 0.89, 0.89, 0.89, 0.89,
0.89, 0.89, 0.89, 0.89, 0.89, 0.89, 0.89, 0.55, 0.89, 0.89,
0.89, 0.89, 0.89, 0.55, 0.89, 0.89, 0.55, 0.89, 0.89, 0.89,
0.55, 0.89, 0.55, 0.89, 0.89, 0.89, 0.55, 0.55, 1.33, 0.89,
0.89, 0.89, 0.89, 0.89, 0.89, 0.89, 0.89, 0.89, 0.89, 0.89,
1.33, 0.89, 0.89, 0.89, 0.89, 0.89, 0.89, 0.89, 0.89, 0.89,
0.55, 0.89, 0.89, 0.89, 0.89, 0.89)), .Names = c("geo_zone",
"constit", "constit_pop", "grouped_types", "area", "per_capita"
), class = "data.frame", row.names = c(NA, -437L))
【Comments】:
What version of dplyr are you using? Perhaps you should update to the latest CRAN version. I ask because you are using the deprecated `%.%` (replaced by `%>%`).
【Answer 1】:
Without loading the data to check, the difference seems to be that in the first version waste is defined (after all the mutates) as:
area / sum(area) * constit_pop * wg_rate / 1000
whereas in the second it is:
area / sum(area) * 939370 * wg_rate / 1000
In the first version you appear to be applying the percentage to an incomplete population, which is probably why you get the lower number.
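To make the two formulas concrete, here is a small sketch on made-up numbers. The `toy` data frame and its values are hypothetical and are not taken from `resi_type`, and it uses the current `%>%` pipe rather than `%.%`. It only illustrates that apportioning by area within each constituency and apportioning by area across the whole county can both account for the full population and still yield different waste totals.

```r
library(dplyr)

# Hypothetical toy data: two constituencies, two socio-economic types.
toy <- data.frame(
  constit       = c("A", "A", "B", "B"),
  grouped_types = c("low", "high", "low", "high"),
  area          = c(100, 300, 450, 150),
  constit_pop   = c(1000, 1000, 3000, 3000),
  per_capita    = c(0.5, 1.0, 0.5, 1.0)   # kg per person per day
)

# Formula 1: area share within each constituency times that constituency's population.
within_constit <- toy %>%
  group_by(constit) %>%
  mutate(pop   = area / sum(area) * constit_pop,
         waste = per_capita * pop / 1000) %>%
  ungroup()

# Formula 2: area share across the whole county times the total population.
total_pop <- toy %>%
  distinct(constit, constit_pop) %>%
  pull(constit_pop) %>%
  sum()

countywide <- toy %>%
  mutate(pop   = area / sum(area) * total_pop,
         waste = per_capita * pop / 1000)

sum(within_constit$pop)    # 4000
sum(countywide$pop)        # 4000
sum(within_constit$waste)  # 2.75
sum(countywide$waste)      # 2.9
```

Both versions allocate all 4000 people, but they distribute them differently between the low and high types, so the waste totals differ whenever the constituencies have different population densities.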
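Update 1
After playing with your data a bit, I believe the 757 figure is the more accurate one. You are treating the `area` values as actual sizes and using them to apportion the population. Since your area proportions sum to 1, and you want to use those proportions to allocate population, you have to apply them to the total population, not to a constituency's population, which is less than or equal to the total. That is why you were getting an answer lower than 757.
My suggestion is to map out on paper how you would calculate the total waste for a given area from the data you have, and use that to create a column of actual (or expected) waste. Once you have the waste for each area, summarising with `dplyr` should be a breeze. If I understand your question correctly, it could look something like the following. There may be more elegant ways to do it, but I hope this is relatively clear.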
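# tbl_df() is deprecated in current dplyr; as_tibble() is its replacement.
new_resi <- tbl_df(resi_type)
# One row per distinct constit_pop value, then add them up.
# This relies on every constituency having a different population.
total_pop <- group_by(new_resi, constit_pop) %>% summarize() %>% sum()
# Apportion the total population by each area's share of the total area.
new_resi <- new_resi %>%
  mutate(prop = area / sum(area),
         area_pop = total_pop * prop,
         waste = per_capita * area_pop / 1000)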
You should now be able to aggregate along whichever axis you like:
> sum(new_resi$waste)
[1] 757.8447
> new_resi %>% group_by(constit) %>% summarize(sum(waste))
Source: local data frame [6 x 2]
constit sum(waste)
1 changamwe 43.77569
2 jomvu 58.32272
3 kisauni 177.14872
4 likoni 122.57831
5 mvita 102.38562
6 nyali 253.63362
> new_resi %>% group_by(geo_zone) %>% summarize(sum(waste))
Source: local data frame [4 x 2]
geo_zone sum(waste)
1 Mainland North 430.7823
2 Mainland South 122.5783
3 Mainland West 102.0984
4 Mombasa Island 102.3856
> 430.7823+122.5783+102.0984+102.3856
[1] 757.8446
> new_resi %>% group_by(grouped_types) %>% summarize(sum(waste))
Source: local data frame [3 x 2]
grouped_types sum(waste)
1 low 276.8951
2 medium 199.9059
3 high 281.0437
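As a rough cross-check of the idea above, the following sketch computes a row-level `waste` column once and then confirms that the total is the same whichever column is used for grouping. The `toy` data frame, its values, and the `totals` vector are hypothetical and stand in for `resi_type`. It assumes dplyr 1.0 or later, for `across()` inside `group_by()` and the `.groups` argument.

```r
library(dplyr)

# Hypothetical stand-in for resi_type with the same column names.
toy <- data.frame(
  geo_zone      = c("North", "North", "South", "South"),
  constit       = c("A", "A", "B", "B"),
  grouped_types = c("low", "high", "low", "high"),
  area          = c(100, 300, 450, 150),
  constit_pop   = c(1000, 1000, 3000, 3000),
  per_capita    = c(0.5, 1.0, 0.5, 1.0)
)

# Total population: count each constituency once, keyed on its name.
total_pop <- toy %>%
  distinct(constit, constit_pop) %>%
  pull(constit_pop) %>%
  sum()

# Row-level waste, apportioning the total population by share of total area.
toy_waste <- toy %>%
  mutate(prop     = area / sum(area),
         area_pop = total_pop * prop,
         waste    = per_capita * area_pop / 1000)

# Sum the grouped summaries for each grouping column in turn.
totals <- sapply(c("geo_zone", "constit", "grouped_types"), function(col) {
  toy_waste %>%
    group_by(across(all_of(col))) %>%
    summarise(waste = sum(waste), .groups = "drop") %>%
    pull(waste) %>%
    sum()
})

totals
all(abs(totals - sum(toy_waste$waste)) < 1e-9)
```

Because the waste is fixed per row before any grouping happens, every grouping is just a different partition of the same rows, so the grand total cannot change.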
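【Comments】:
Hehe, I used %.%; it seemed to work fine. But @Avraham, your comment struck a chord: I am applying the percentage to an incomplete population. I know the total population, and I know the population of each constituency. What I don't know is the population of each socio-economic group within each constituency, so it has to be calculated from the proportion of area covered by each socio-economic type in each constituency. I can't see it at the moment; could you show me how to make sure these totals come out right? Thanks
@marty_c What exactly do `area` and `per_capita` represent? Is `area` the actual size of the region, or an identifier like a US zip code? Is `per_capita` the tonnes of waste generated per person per year?
Yes, `area` is the physical size of the area (m2), and `per_capita` is the weight of waste generated per person per day, e.g. 0.55 kg.
@marty_c That's what I thought. Please see the update in the answer.
Wow, thank you so much! Exactly what I was after, and I really appreciate your input. Cheers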