Sorting Box Plots by Median using Plotly Graph Objects
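[Posted]: 2021-10-10 05:23:43
[Question]: I'm pretty much a beginner with plotly/pandas/data, but I'm trying to make this chart, and no matter what I search for I can't find any attribute that is compatible with dicts. The data I'm using is a time series of download speeds for 9 different pieces of software. I'm trying to display the box plots in descending order of median.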
Here is my code:
import pandas as pd
import plotly.graph_objs as go
from plotly.offline import plot
import numpy as np
olddf = pd.read_csv("justice.csv")
df = olddf.interpolate()
col = df.loc[:,'Bfy':'Sfy']
df["1"] = col.mean(axis=1)
col2 = df.loc[:,'Bakamai':'Sakamai']
df["2"] = col2.mean(axis=1)
col4 = df.loc[:,'Bazure':'Sazure']
df["4"] = col4.mean(axis=1)
col5 = df.loc[:,'Bcloudflare':'Scloudflare']
df["5"] = col5.mean(axis=1)
col6 = df.loc[:,'Bfastly':'Sfastly']
df["6"] = col6.mean(axis=1)
col7 = df.loc[:,'BAWS':'SAWS']
df["7"] = col7.mean(axis=1)
col8 = df.loc[:,'Bali':'Sali']
df["8"] = col8.mean(axis=1)
col9 = df.loc[:,'Bgoog':'Sgoog']
df["9"] = col9.mean(axis=1)
trace_one = go.Box(
y=df['1'],
name="Fy",
line = dict(color='#8235EA'),
opacity = 0.8)
trace_two = go.Box(
y=df['2'],
name="Akamai",
line = dict(color='#EA8933'),
opacity = 0.8)
trace_four = go.Box(
y=df['4'],
name="Azure",
line = dict(color='#62F92C'),
opacity = 0.8)
trace_five = go.Box(
y=df['5'],
name="Cloudflare",
line = dict(color='#3548EA'),
opacity = 0.8)
trace_six = go.Box(
y=df['6'],
name="Fastly",
line = dict(color='#D735EA'),
opacity = 0.8)
trace_seven = go.Box(
y=df['7'],
name="AWS Cloudfront",
line = dict(color='#29E5B7'),
opacity = 0.8)
trace_eight = go.Box(
y=df['8'],
name="Alibaba Cloud",
line = dict(color='#3597EA'),
opacity = 0.8)
trace_nine = go.Box(
y=df['9'],
name="Google Cloud",
line = dict(color='#EA4833'),
opacity = 0.8,
)
data=[trace_one, trace_four, trace_seven, trace_eight, trace_nine, trace_five, trace_two]
layout = dict(
title = "CHINA - Software vs Mb loaded per second")
fig = dict(data=data, layout=layout)
plot(fig)
Sample CSV layout:
datetime,Bfy,Sfy,Gfy,Bakamai,Sakamai,Gakamai,Bazuaka,Sazuaka,Gazuaka,Bazure,Sazure,Gazure,Bcloudflare,Scloudflare,Gcloudflare,Bfastly,Sfastly,Gfastly,BAWS,SAWS,GAWS,Bali,Sali,Gali,Bgoog,Sgoog,Ggoog
23/07/21 10:02PM,,,215200,1489,1571,,1897,12400,173600,6551,,,1556,769,,,,749,6124,9347,2179,4160,,4473,4635,906,3426
23/07/21 10:12PM,22653,21520,,,1670,,17360,,,,10850,,,18261,1522,,3414,2010,5148,10447,2030,2667,4160,4119,5837,1592,3216
23/07/21 10:22PM,23911,,,1535,1615,815,3156,13354,177,6313,,,,825,586,873,,885,4280,6458,2114,4039,4119,6303,5629,1072,3283
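[Comments]:
Please provide the dataset, or at least a dummy example.
Hi, I added a bit, is that enough?
[Answer 1]:
Taking a different approach to the data preparation:
for each pair of columns (B and S per provider), calculate the row-wise mean
create a new DataFrame from these pairwise column means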
import plotly.graph_objects as go
import pandas as pd
import io
df = pd.read_csv(io.StringIO("""datetime,Bfy,Sfy,Gfy,Bakamai,Sakamai,Gakamai,Bazuaka,Sazuaka,Gazuaka,Bazure,Sazure,Gazure,Bcloudflare,Scloudflare,Gcloudflare,Bfastly,Sfastly,Gfastly,BAWS,SAWS,GAWS,Bali,Sali,Gali,Bgoog,Sgoog,Ggoog
23/07/21 10:02PM,,,215200,1489,1571,,1897,12400,173600,6551,,,1556,769,,,,749,6124,9347,2179,4160,,4473,4635,906,3426
23/07/21 10:12PM,22653,21520,,,1670,,17360,,,,10850,,,18261,1522,,3414,2010,5148,10447,2030,2667,4160,4119,5837,1592,3216
23/07/21 10:22PM,23911,,,1535,1615,815,3156,13354,177,6313,,,,825,586,873,,885,4280,6458,2114,4039,4119,6303,5629,1072,3283"""))
# different approach to getting means per provider to plot
df2 = pd.DataFrame({c[1:]: df.loc[:, [c, "S" + c[1:]]].mean(axis=1).values for c in df.columns if c[0] == "B"})
# re-order columns on ascending median
df2 = df2.reindex(df2.median().sort_values().index, axis=1)
# display name and line color per provider, keyed by the df2 column name
meta = {'fy': {'color': '#8235EA', 'name': 'Fy'},
        'azure': {'color': '#62F92C', 'name': 'Azure'},
        'AWS': {'color': '#29E5B7', 'name': 'AWS Cloudfront'},
        'ali': {'color': '#3597EA', 'name': 'Alibaba Cloud'},
        'goog': {'color': '#EA4833', 'name': 'Google Cloud'},
        'cloudflare': {'color': '#3548EA', 'name': 'Cloudflare'},
        'akamai': {'color': '#EA8933', 'name': 'Akamai'},
        # next two were missing
        'fastly': {'color': 'pink', 'name': 'Fastly'},
        'azuaka': {'color': 'purple', 'name': 'azuaka'}}
# one box per provider, in the column order established above
fig = go.Figure([go.Box(y=df2[c], name=meta[c]["name"], line={"color": meta[c]["color"]}) for c in df2.columns])
fig.show()
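The question asks for descending order, while the answer's reindex sorts ascending. A minimal sketch of the descending variant follows, assuming a small made-up DataFrame in place of the per-provider means (the numbers are hypothetical and not taken from the CSV above). Passing `ascending=False` to `sort_values` is the only change.

```python
import pandas as pd

# Hypothetical per-provider means (illustrative values only)
df2 = pd.DataFrame({
    "fy": [10.0, 30.0, 20.0],
    "akamai": [1.0, 2.0, 3.0],
    "goog": [5.0, 7.0, 6.0],
})

# Median of each column, largest first
medians = df2.median().sort_values(ascending=False)

# Reorder the columns so they follow the descending medians
df2_desc = df2.reindex(medians.index, axis=1)

print(list(df2_desc.columns))
```

The reordered frame can then be fed to the same `go.Box` list comprehension, since the traces are drawn in column order.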