在 seaborn 中绘制不同的组时如何将数据作为一组包含在内
Posted
技术标签:
【中文标题】在 seaborn 中绘制不同的组时如何将数据作为一组包含在内【英文标题】:How to include data as one group, when plotting separate groups in seaborn 【发布时间】:2021-08-12 23:19:31 【问题描述】:我有一个数据框,我必须将具有来自我的数据框(过滤)的特定值的列的中位数与具有原始数据框中所有值的同一列的中位数进行比较。 这是我到达的最远的地方,我展示了两个图表,我认为它们应该在同一个图表中:
我的目标是将这两个图合并到一个图中。 这是我给出该输出的代码。filt_waterfront = df['waterfront'] == 1
fig, axs = plt.subplots(1,2)
sns.boxplot(y='price', data = df[filt_waterfront], ax=axs[0], color= 'red')
sns.boxplot(y='price', data = df, ax=axs[1], color = 'orange')
fig.set_size_inches(9,6)
fig.suptitle('Price plots of properties with waterfront and general properties')
fig.axes[1].set_ylabel("Price")
fig.axes[0].set_ylabel("Price")
fig.axes[1].set_xlabel("General Properties")
fig.axes[0].set_xlabel("Properties with Waterfront") <br>
我的过滤器是具有海滨的属性,右侧的图表显示一般属性,这意味着原始列,左侧的过滤器,我试图找到一种方法将这两个图表合二为一图表(因为它看起来会更干净,除了我没有这样做之外,没有真正的理由展示两个图表)。 任何帮助都非常感谢,在此先感谢!
【问题讨论】:
【参考方案1】: 不涉及手动创建 x 轴和分配不同箱线图的最简单方法是创建一个单独的 DataFrame,其中所有数据都按照所需的 x 轴标签进行标记。 如果您不想要两个原始类别,请过滤掉要使用的所需数据:filtered = df[['price', 'waterfront']][df.waterfront.eq(1)].copy()
以下示例使用来自 OP 的示例 DataFrame,它有很多列。 comb
是通过选择要绘制的特定列来创建的,以防止创建带有不必要信息的潜在大型 DataFrame。
如果 DataFrame 已经只有必需的列,则使用comb = df.assign(waterfront="All")
import pandas as pd
import seaborn as sns
# Using the sample data from the OP, which has many columns
# Create a copy of the columns to plot and all rows, with waterfront as "All"
comb = df[['price', 'waterfront']].assign(waterfront="All")
# Append it to the original columns with the original categories
data = df[['price', 'waterfront']].append(comb).reset_index(drop=True)
# display(data.head())
price waterfront
0 221900.0 0
1 538000.0 1
2 180000.0 0
3 604000.0 1
4 510000.0 0
# display(data.tail())
price waterfront
95 488000.0 All
96 210490.0 All
97 785000.0 All
98 450000.0 All
99 1350000.0 All
# plot
sns.boxplot(data=data, y='price', x='waterfront')
样本数据
data = 'id': [7129300520, 6414100192, 5631500400, 2487200875, 1954400510, 7237550310, 1321400060, 2008000270, 2414600126, 3793500160, 1736800520, 9212900260, 114101516, 6054650070, 1175000570, 9297300055, 1875500060, 6865200140, 16000397, 7983200060, 6300500875, 2524049179, 7137970340, 8091400200, 3814700200, 1202000200, 1794500383, 3303700376, 5101402488, 1873100390, 8562750320, 2426039314, 461000390, 7589200193, 7955080270, 9547205180, 9435300030, 2768000400, 7895500070, 2078500320, 5547700270, 7766200013, 7203220400, 9270200160, 1432701230, 8035350320, 8945200830, 4178300310, 9215400105, 822039084], 'date': ['20141013T000000', '20141209T000000', '20150225T000000', '20141209T000000', '20150218T000000', '20140512T000000', '20140627T000000', '20150115T000000', '20150415T000000', '20150312T000000', '20150403T000000', '20140527T000000', '20140528T000000', '20141007T000000', '20150312T000000', '20150124T000000', '20140731T000000', '20140529T000000', '20141205T000000', '20150424T000000', '20140514T000000', '20140826T000000', '20140703T000000', '20140516T000000', '20141120T000000', '20141103T000000', '20140626T000000', '20141201T000000', '20140624T000000', '20150302T000000', '20141110T000000', '20141201T000000', '20140624T000000', '20141110T000000', '20141203T000000', '20140613T000000', '20140528T000000', '20141230T000000', '20150213T000000', '20140620T000000', '20140715T000000', '20140811T000000', '20140707T000000', '20141028T000000', '20140729T000000', '20140718T000000', '20150325T000000', '20140716T000000', '20150428T000000', '20150311T000000'], 'price': [221900.0, 538000.0, 180000.0, 604000.0, 510000.0, 1225000.0, 257500.0, 291850.0, 229500.0, 323000.0, 662500.0, 468000.0, 310000.0, 400000.0, 530000.0, 650000.0, 395000.0, 485000.0, 189000.0, 230000.0, 385000.0, 2000000.0, 285000.0, 252700.0, 329000.0, 233000.0, 937000.0, 667000.0, 438000.0, 719000.0, 580500.0, 280000.0, 687500.0, 535000.0, 322500.0, 696000.0, 550000.0, 640000.0, 240000.0, 605000.0, 625000.0, 775000.0, 861990.0, 685000.0, 309000.0, 488000.0, 210490.0, 785000.0, 450000.0, 1350000.0], 'bedrooms': [3, 3, 2, 4, 3, 4, 3, 3, 3, 3, 3, 2, 3, 3, 5, 4, 3, 4, 2, 3, 4, 3, 5, 2, 3, 3, 3, 3, 3, 4, 3, 2, 4, 3, 4, 3, 4, 4, 4, 4, 4, 4, 5, 3, 3, 3, 3, 4, 3, 3], 'bathrooms': [1.0, 2.25, 1.0, 3.0, 2.0, 4.5, 2.25, 1.5, 1.0, 2.5, 2.5, 1.0, 1.0, 1.75, 2.0, 3.0, 2.0, 1.0, 1.0, 1.0, 1.75, 2.75, 2.5, 1.5, 2.25, 2.0, 1.75, 1.0, 1.75, 2.5, 2.5, 1.5, 1.75, 1.0, 2.75, 2.5, 1.0, 2.0, 1.0, 2.5, 2.5, 2.25, 2.75, 1.0, 1.0, 2.5, 1.0, 2.5, 1.75, 2.5], 'sqmeters_living': [109.624675, 238.758826, 71.534745, 182.088443, 156.075808, 503.530286, 159.327388, 98.476403, 165.366035, 175.585284, 330.73207, 107.76663, 132.850242, 127.276106, 168.153103, 274.061687, 175.585284, 148.643627, 111.48272, 116.127834, 150.501672, 283.351914, 210.888146, 99.405425, 227.610554, 158.862876, 227.610554, 130.063174, 141.211446, 238.758826, 215.533259, 110.553698, 216.462282, 101.263471, 191.37867, 213.675214, 154.217763, 219.24935, 113.340766, 243.403939, 238.758826, 392.047566, 333.983649, 145.856559, 118.914902, 293.571163, 91.973244, 212.746191, 116.127834, 255.759941], 'sqmeters_lot': [524.897808, 672.798216, 929.022668, 464.511334, 750.650316, 9469.528056, 633.500557, 902.173913, 693.979933, 609.43887, 910.070606, 557.413601, 1848.848012, 899.293943, 450.575994, 464.511334, 1304.347826, 399.479747, 915.087328, 908.026756, 462.653289, 4168.246005, 585.284281, 895.856559, 603.864734, 436.361947, 250.0, 146.878484, 592.716462, 666.38796, 369.751022, 117.521368, 464.511334, 278.7068, 618.636195, 284.280936, 3237.458194, 557.413601, 750.185805, 701.690821, 512.820513, 2246.934225, 523.875883, 211.817168, 897.064288, 1263.749535, 792.270531, 1246.376812, 553.976217, 6039.111854], 'floors': [1.0, 2.0, 1.0, 1.0, 1.0, 1.0, 2.0, 1.0, 1.0, 2.0, 1.0, 1.0, 1.5, 1.0, 1.5, 2.0, 2.0, 1.5, 1.0, 1.0, 1.0, 1.0, 2.0, 1.0, 2.0, 1.5, 2.0, 1.5, 1.0, 2.0, 2.0, 3.0, 1.5, 1.5, 1.0, 1.5, 1.0, 2.0, 1.0, 2.0, 2.0, 1.0, 2.0, 2.0, 1.0, 2.0, 1.0, 2.0, 1.0, 1.0], 'waterfront': [0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1], 'view': [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 3, 0, 0, 0, 0, 0, 4, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 2], 'grade': [7, 7, 6, 7, 8, 11, 7, 7, 7, 7, 8, 7, 7, 7, 7, 9, 7, 7, 7, 7, 7, 9, 8, 7, 8, 6, 8, 8, 7, 8, 8, 7, 7, 8, 7, 8, 5, 8, 7, 8, 9, 8, 9, 7, 6, 8, 6, 9, 7, 9], 'sqmeters_above': [109.624675, 201.597919, 71.534745, 97.54738, 156.075808, 361.389818, 159.327388, 98.476403, 97.54738, 175.585284, 172.798216, 79.895949, 132.850242, 127.276106, 168.153103, 183.946488, 175.585284, 148.643627, 111.48272, 116.127834, 79.895949, 216.462282, 210.888146, 99.405425, 227.610554, 158.862876, 162.578967, 130.063174, 73.392791, 238.758826, 215.533259, 110.553698, 140.282423, 101.263471, 118.914902, 140.282423, 86.399108, 219.24935, 82.683017, 243.403939, 238.758826, 241.545894, 333.983649, 145.856559, 85.470085, 293.571163, 91.973244, 212.746191, 116.127834, 201.133408], 'sqmeters_basement': [0.0, 37.160907, 0.0, 84.541063, 0.0, 142.140468, 0.0, 0.0, 67.818655, 0.0, 157.933854, 27.87068, 0.0, 0.0, 0.0, 90.115199, 0.0, 0.0, 0.0, 0.0, 70.605723, 66.889632, 0.0, 0.0, 0.0, 0.0, 65.031587, 0.0, 67.818655, 0.0, 0.0, 0.0, 76.179859, 0.0, 72.463768, 73.392791, 67.818655, 0.0, 30.657748, 0.0, 0.0, 150.501672, 0.0, 0.0, 33.444816, 0.0, 0.0, 0.0, 0.0, 0.0], 'yr_built': [1955.0, 1951.0, 1933.0, 1965.0, 1987.0, 2001.0, 1995.0, 1963.0, 1960.0, 2003.0, 1965.0, 1942.0, 1927.0, 1977.0, 1900.0, 1979.0, 1994.0, 1916.0, 1921.0, 1969.0, 1947.0, 1968.0, 1995.0, 1985.0, 1985.0, 1941.0, 1915.0, 1909.0, 1948.0, 2005.0, 2003.0, 2005.0, 1929.0, 1929.0, 1981.0, 1930.0, 1933.0, 1904.0, 1969.0, 1996.0, 2000.0, 1984.0, 2014.0, 1922.0, 1959.0, 2003.0, 1966.0, 1981.0, 1953.0, 0.0], 'yr_renovated': [0.0, 1991.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 2002.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0], 'zipcode': [98178.0, 98125.0, 98028.0, 98136.0, 98074.0, 98053.0, 98003.0, 98198.0, 98146.0, 98038.0, 98007.0, 98115.0, 98028.0, 98074.0, 98107.0, 98126.0, 98019.0, 98103.0, 98002.0, 98003.0, 98133.0, 98040.0, 98092.0, 98030.0, 98030.0, 98002.0, 98119.0, 98112.0, 98115.0, 98052.0, 98027.0, 98133.0, 98117.0, 98117.0, 98058.0, 98115.0, 98052.0, 98107.0, 98001.0, 98056.0, 98074.0, 98166.0, 98053.0, 98119.0, 98058.0, 98019.0, 98023.0, 98007.0, 98115.0, 0.0], 'lat': [47.5112, 47.721, 47.7379, 47.5208, 47.6168, 47.6561, 47.3097, 47.4095, 47.5123, 47.3684, 47.6007, 47.69, 47.7558, 47.6127, 47.67, 47.5714, 47.7277, 47.6648, 47.3089, 47.3343, 47.7025, 47.5316, 47.3266, 47.3533, 47.3739, 47.3048, 47.6386, 47.6221, 47.695, 47.7073, 47.5391, 47.7274, 47.6823, 47.6889, 47.4276, 47.6827, 47.6621, 47.6702, 47.3341, 47.5301, 47.6145, 47.445, 47.6848, 47.6413, 47.4485, 47.7443, 47.3066, 47.6194, 47.6796, 0.0], 'long': [-122.257, -122.319, -122.233, -122.393, -122.045, -122.005, -122.327, -122.315, -122.337, -122.031, -122.145, -122.292, -122.229, -122.045, -122.394, -122.375, -121.962, -122.343, -122.21, -122.306, -122.341, -122.233, -122.169, -122.166, -122.172, -122.218, -122.36, -122.314, -122.304, -122.11, -122.07, -122.357, -122.368, -122.375, -122.157, -122.31, -122.132, -122.362, -122.282, -122.18, -122.027, -122.347, -122.016, -122.364, -122.175, -121.977, -122.371, -122.151, -122.301, 0.0], 'sqmeters_living15': [124.489038, 157.004831, 252.694166, 126.347083, 167.22408, 442.21479, 207.915273, 153.28874, 165.366035, 222.036418, 205.31401, 123.560015, 165.366035, 127.276106, 126.347083, 198.810851, 175.585284, 149.57265, 98.476403, 118.914902, 130.063174, 381.828317, 208.101078, 113.340766, 204.384987, 95.689335, 163.50799, 172.798216, 141.211446, 244.332962, 239.687848, 129.134151, 135.63731, 145.856559, 187.662579, 147.714604, 200.668896, 160.720922, 119.843924, 243.403939, 229.468599, 223.894463, 336.770717, 146.785582, 124.489038, 283.351914, 114.083984, 248.978075, 90.115199, 0.0], 'sqmeters_lot15': [524.897808, 709.680416, 748.978075, 464.511334, 697.045708, 9469.528056, 633.500557, 902.173913, 753.716091, 703.27016, 829.152731, 557.413601, 1179.580082, 948.34634, 450.575994, 371.609067, 1302.303976, 399.479747, 473.337049, 822.185061, 462.653289, 1889.260498, 650.780379, 779.07841, 637.774062, 437.105165, 331.939799, 358.695652, 579.245634, 559.82906, 369.751022, 163.136381, 464.511334, 471.943515, 810.107767, 303.232999, 1065.310294, 436.640654, 724.637681, 1104.050539, 526.662951, 2844.388703, 523.875883, 245.261984, 818.283166, 857.673727, 821.256039, 1271.367521, 473.801561, 0.0]
df = pd.DataFrame(data)
# display(df[['price', 'waterfront']].head())
price waterfront
0 221900.0 0
1 538000.0 1
2 180000.0 0
3 604000.0 1
4 510000.0 0
【讨论】:
【参考方案2】:使用箱线图的色调属性:
sns.boxplot(y='price', data = meters_df, ax=axs[1], color = 'orange', hue=filt_waterfront)
【讨论】:
感谢您的回答,我可以理解色调背后的想法,它会显示过滤价格和未过滤价格的箱线图,但不幸的是它不起作用,它显示了我原来的“常规属性”图以上是关于在 seaborn 中绘制不同的组时如何将数据作为一组包含在内的主要内容,如果未能解决你的问题,请参考以下文章