Multi-histogram plot with Vega-Lite
【Question】 (posted 2020-10-27 02:20:40): I want to create a single visual that shows multiple histograms. I have simple arrays of values, like this:
"data": {"values": {"foo": [0,0,0,1,1,1,2,2,2], "baz": [2,2,2,3,3,3,4,4,4]}}
I want to show the distributions of the "foo" and "baz" values using differently colored bars. I can make a histogram for "foo" alone like this:
{
  "data": {"values": {"foo": [0,0,0,1,1,1,2,2,2]}},
  "mark": "bar",
  "transform": [{"flatten": ["foo"]}],
  "encoding": {
    "x": {"field": "foo", "type": "quantitative"},
    "y": {"field": "foo", "type": "quantitative", "aggregate": "count"}
  }
}
However, I can't find the right way to flatten both arrays. This doesn't work:
{
  "data": {"values": {"foo": [0,0,0,1,1,1,2,2,2], "bar": [0,0,0,1,1,1,2,2,2]}},
  "mark": "bar",
  "transform": [{"flatten": ["foo", "baz"]}],
  "encoding": {
    "x": {"field": "foo", "type": "quantitative"},
    "y": {"field": "foo", "type": "quantitative", "aggregate": "count"}
  },
  "layer": [
    {
      "mark": "bar",
      "encoding": {
        "y": {"field": "baz", "type": "quantitative", "aggregate": "count"}
      }
    }
  ]
}
https://vega.github.io/editor/#/url/vega-lite/N4IgJghgLhIFygG4QDYFcCmBneoBmA9gfANoAMANJZQIwV10BMFzjAuhSAEYQBep1KvWFMWLNgF8JnALYQATgGt43BSE5R5EAHZZC8maXwpoUDNtIhCxTj36SOIcwGMCYAJbaA5rhAAPXzx3DBQwFWt1ECgATwAHDBUARzQdKHcYNMQE6RBowODQ8KJImPiklO00jPcsyIgvL3kML2gEuBBXNEqQKU4TaIx5IxA5JRUeIc4XN08fBFz8kLD2uxK4tpBk1PToGoTOesbm1pVO7qkJSSA
Inspecting data_0, there is a column for "foo" and its count, but none for "baz".
This doesn't work either:
{
  "data": {
    "values": {
      "foo": [0, 0, 0, 1, 1, 1, 2, 2, 2],
      "baz": [0, 0, 0, 1, 1, 1, 2, 2, 2]
    }
  },
  "mark": "bar",
  "transform": [{"flatten": ["foo"]}, {"flatten": ["baz"]}],
  "encoding": {
    "x": {"field": "foo", "type": "quantitative"},
    "y": {"field": "foo", "type": "quantitative", "aggregate": "count"}
  },
  "layer": [
    {
      "mark": "bar",
      "encoding": {
        "y": {"field": "baz", "type": "quantitative", "aggregate": "count"}
      }
    }
  ]
}
https://vega.github.io/editor/#/url/vega-lite/N4IgJghgLhIFygG4QDYFcCmBneoBmA9gfANoAMANJZQIwV10BMFzjAuhSAEYQBep1KvWFMWLNgF8JnALYQATgGt43BSE5R5EAHZZC8maXwpoUDNtIhCxSRWOnzlnv0kcQ5gMYEwAS20BzXBAADyC8HwwUMBVrdRAoAE8ABwwVAEc0HSgfGGzEVOkQBLCIqJiiOMSU9MztbNyffLiIf395DH9oVLgQLzQ6kClOEwSMeSMQOSUVHnHOT28-QIQiksjonudK5O6QDKyc6EbUzha2jq6VPoGpCUkgA
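This still only gives a column for "foo" and its count, but now the count for each bucket is 27!
How can I produce a multi-histogram plot starting from array data?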
【Answer 1】: You can do this with a flatten transform followed by a fold transform, then use a color encoding to separate the two datasets. For example:
{
  "data": {
    "values": {
      "foo": [0, 0, 1, 1, 1, 1, 2, 2, 2],
      "baz": [4, 4, 5, 5, 6, 6, 6, 6, 7]
    }
  },
  "transform": [{"flatten": ["foo", "baz"]}, {"fold": ["foo", "baz"]}],
  "mark": "bar",
  "encoding": {
    "x": {"field": "value", "type": "quantitative"},
    "y": {
      "field": "value",
      "type": "quantitative",
      "aggregate": "count",
      "stack": null
    },
    "color": {"field": "key", "type": "nominal"}
  }
}
Incidentally, your layer approach can also work if you put the encodings in separate layers, so that the outer "foo" aggregate does not clobber the "baz" data, but it is more verbose than the fold-based approach:
{
  "data": {
    "values": {
      "foo": [0, 0, 1, 1, 1, 1, 2, 2, 2],
      "baz": [4, 4, 5, 5, 6, 6, 6, 6, 7]
    }
  },
  "transform": [{"flatten": ["foo", "baz"]}],
  "layer": [
    {
      "mark": {"type": "bar", "color": "orange"},
      "encoding": {
        "x": {"field": "foo", "type": "quantitative"},
        "y": {"field": "foo", "type": "quantitative", "aggregate": "count"}
      }
    },
    {
      "mark": "bar",
      "encoding": {
        "x": {"field": "baz", "type": "quantitative"},
        "y": {"field": "baz", "type": "quantitative", "aggregate": "count"}
      }
    }
  ]
}
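The layered spec uses a single flatten over both fields, which pairs the two arrays element by element. The question's second attempt chained two separate flattens instead, and a count of 27 per bucket is what a cross product of the two arrays would produce. The Python sketch below illustrates this under the same assumption about flatten's documented behavior. The `flatten` helper is a hypothetical stand-in, not Vega-Lite code.

```python
from collections import Counter

def flatten(rows, fields):
    """Hypothetical stand-in for Vega-Lite's flatten: zip the listed array
    fields index by index, padding shorter arrays with None."""
    out = []
    for row in rows:
        longest = max(len(row[f]) for f in fields)
        for i in range(longest):
            new_row = dict(row)
            for f in fields:
                new_row[f] = row[f][i] if i < len(row[f]) else None
            out.append(new_row)
    return out

data = [{"foo": [0, 0, 0, 1, 1, 1, 2, 2, 2], "baz": [0, 0, 0, 1, 1, 1, 2, 2, 2]}]

# Two chained flattens: every foo element is paired with every baz element
chained = flatten(flatten(data, ["foo"]), ["baz"])
chained_counts = Counter(r["foo"] for r in chained)

# One flatten over both fields: elements are paired by position
joint = flatten(data, ["foo", "baz"])
joint_counts = Counter(r["foo"] for r in joint)

print(len(chained), dict(chained_counts))  # 81 rows, 27 per foo value
print(len(joint), dict(joint_counts))      # 9 rows, 3 per foo value
```

With nine elements per array, the chained version yields 9 x 9 = 81 rows, so each of the three distinct "foo" values is counted 3 x 9 = 27 times, matching the number reported in the question.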