Sorting Box Plots by Median using Plotly Graph Objects

[Posted]: 2021-10-10 05:23:43

[Question]:

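I'm pretty much a beginner with plotly/pandas/data, but I'm trying to make this chart, and no matter what I search for I can't find any attribute that works with the dict-based figure I'm building. The data is a time series of download speeds for 9 different pieces of software. I'm trying to display the box plots in descending order of median.

Here is my code: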

import pandas as pd
import plotly.graph_objs as go
from plotly.offline import plot
import numpy as np
olddf = pd.read_csv("justice.csv")
df = olddf.interpolate()



col = df.loc[:,'Bfy':'Sfy']
df["1"] = col.mean(axis=1)
col2 = df.loc[:,'Bakamai':'Sakamai']
df["2"] = col2.mean(axis=1)
col4 = df.loc[:,'Bazure':'Sazure']
df["4"] = col4.mean(axis=1)
col5 = df.loc[:,'Bcloudflare':'Scloudflare']
df["5"] = col5.mean(axis=1)
col6 = df.loc[:,'Bfastly':'Sfastly']
df["6"] = col6.mean(axis=1)
col7 = df.loc[:,'BAWS':'SAWS']
df["7"] = col7.mean(axis=1)
col8 = df.loc[:,'Bali':'Sali']
df["8"] = col8.mean(axis=1)
col9 = df.loc[:,'Bgoog':'Sgoog']
df["9"] = col9.mean(axis=1)

trace_one = go.Box(
    y=df['1'],
    name="Fy",
    line = dict(color='#8235EA'),
    opacity = 0.8)
trace_two = go.Box(
    y=df['2'],
    name="Akamai",
    line = dict(color='#EA8933'),
    opacity = 0.8)
trace_four = go.Box(
    y=df['4'],
    name="Azure",
    line = dict(color='#62F92C'),
    opacity = 0.8)
trace_five = go.Box(
    y=df['5'],
    name="Cloudflare",
    line = dict(color='#3548EA'),
    opacity = 0.8)
trace_six = go.Box(
    y=df['6'],
    name="Fastly",
    line = dict(color='#D735EA'),
    opacity = 0.8)
trace_seven = go.Box(
    y=df['7'],
    name="AWS Cloudfront",
    line = dict(color='#29E5B7'),
    opacity = 0.8)
trace_eight = go.Box(
    y=df['8'],
    name="Alibaba Cloud",
    line = dict(color='#3597EA'),
    opacity = 0.8)
trace_nine = go.Box(
    y=df['9'],
    name="Google Cloud",
    line = dict(color='#EA4833'),
    opacity = 0.8,
    )
data=[trace_one, trace_four, trace_seven, trace_eight, trace_nine, trace_five, trace_two]

layout = dict(
    
        title = "CHINA - Software vs Mb loaded per second")

fig = dict(data=data, layout=layout)

plot(fig)



Sample CSV layout:

datetime,Bfy,Sfy,Gfy,Bakamai,Sakamai,Gakamai,Bazuaka,Sazuaka,Gazuaka,Bazure,Sazure,Gazure,Bcloudflare,Scloudflare,Gcloudflare,Bfastly,Sfastly,Gfastly,BAWS,SAWS,GAWS,Bali,Sali,Gali,Bgoog,Sgoog,Ggoog
23/07/21 10:02PM,,,215200,1489,1571,,1897,12400,173600,6551,,,1556,769,,,,749,6124,9347,2179,4160,,4473,4635,906,3426
23/07/21 10:12PM,22653,21520,,,1670,,17360,,,,10850,,,18261,1522,,3414,2010,5148,10447,2030,2667,4160,4119,5837,1592,3216
23/07/21 10:22PM,23911,,,1535,1615,815,3156,13354,177,6313,,,,825,586,873,,885,4280,6458,2114,4039,4119,6303,5629,1072,3283
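
The sample rows contain many empty cells, which matters once paired columns are averaged. A minimal sketch of how pandas treats them, assuming a cut-down frame with only the `Bfy`/`Sfy` pair (the name `pair_mean` is purely illustrative):

```python
import io
import pandas as pd

# Three rows in the same shape as the sample above, reduced to one provider pair.
csv_text = """datetime,Bfy,Sfy
23/07/21 10:02PM,,
23/07/21 10:12PM,22653,21520
23/07/21 10:22PM,23911,
"""
df = pd.read_csv(io.StringIO(csv_text))

# Empty cells are read as NaN; mean(axis=1) skips NaN by default (skipna=True),
# so a row with one value keeps that value and a row with none stays NaN.
pair_mean = df[["Bfy", "Sfy"]].mean(axis=1)
print(pair_mean.tolist())
```

So a row with only one of the two readings still contributes a value, and only rows where both are missing stay empty, unless they are filled beforehand, e.g. with `interpolate()` as in the code above.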

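[Comments]:

Please provide the dataset, or at least a dummy example.
Hi, I added a bit, is that enough?

[Answer 1]:

Took a different approach to data preparation:
    pair up the columns and compute the mean of each pair
    create a new DataFrame from these paired-column means
Ordered the columns of the prepared data by median.
Created the box plots in the same order as the ordered columns.
Found two providers that your code did not plot...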
import plotly.graph_objects as go
import pandas as pd
import io

df = pd.read_csv(io.StringIO("""datetime,Bfy,Sfy,Gfy,Bakamai,Sakamai,Gakamai,Bazuaka,Sazuaka,Gazuaka,Bazure,Sazure,Gazure,Bcloudflare,Scloudflare,Gcloudflare,Bfastly,Sfastly,Gfastly,BAWS,SAWS,GAWS,Bali,Sali,Gali,Bgoog,Sgoog,Ggoog
23/07/21 10:02PM,,,215200,1489,1571,,1897,12400,173600,6551,,,1556,769,,,,749,6124,9347,2179,4160,,4473,4635,906,3426
23/07/21 10:12PM,22653,21520,,,1670,,17360,,,,10850,,,18261,1522,,3414,2010,5148,10447,2030,2667,4160,4119,5837,1592,3216
23/07/21 10:22PM,23911,,,1535,1615,815,3156,13354,177,6313,,,,825,586,873,,885,4280,6458,2114,4039,4119,6303,5629,1072,3283"""))

# different approach to getting means per provider to plot:
# for every "B<provider>" column, average it with the matching "S<provider>" column
df2 = pd.DataFrame({c[1:]: df.loc[:, [c, "S" + c[1:]]].mean(axis=1).values for c in df.columns if c[0] == "B"})

# re-order columns on ascending median
df2 = df2.reindex(df2.median().sort_values().index, axis=1)

# display name and line color per provider, keyed by the column names of df2
meta = {
    'fy': {'color': '#8235EA', 'name': 'Fy'},
    'azure': {'color': '#62F92C', 'name': 'Azure'},
    'AWS': {'color': '#29E5B7', 'name': 'AWS Cloudfront'},
    'ali': {'color': '#3597EA', 'name': 'Alibaba Cloud'},
    'goog': {'color': '#EA4833', 'name': 'Google Cloud'},
    'cloudflare': {'color': '#3548EA', 'name': 'Cloudflare'},
    'akamai': {'color': '#EA8933', 'name': 'Akamai'},
    # next two were missing
    'fastly': {'color': 'pink', 'name': 'Fastly'},
    'azuaka': {'color': 'purple', 'name': 'azuaka'},
}

# one box per provider, created in the (sorted) column order of df2
fig = go.Figure([go.Box(y=df2[c], name=meta[c]["name"], line={"color": meta[c]["color"]}) for c in df2.columns])
fig.show()
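
Note that the reindex step above sorts by ascending median, while the question asks for descending. A minimal sketch of the descending variant, using a small made-up frame in place of `df2` (the values and the name `df2_desc` are illustrative assumptions, not part of the original answer):

```python
import pandas as pd

# Hypothetical per-provider means, standing in for df2 from the answer above.
df2 = pd.DataFrame({
    "fy": [10.0, 30.0, 20.0],
    "akamai": [1.0, 2.0, 3.0],
    "aws": [5.0, 7.0, 6.0],
})

# Median per column, largest first.
order = df2.median().sort_values(ascending=False).index

# Reorder the columns; traces built by iterating df2_desc.columns
# are then created in descending-median order.
df2_desc = df2.reindex(order, axis=1)

print(list(df2_desc.columns))
```

With the default category ordering, Plotly lays the boxes out in the order the traces are created, so feeding `df2_desc.columns` into the same `go.Box` list comprehension should give the highest median on the left.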
