Dataframe column value using max() function

Posted 2023-03-12

【Title】: Dataframe column value using max() function 【Posted】: 2020-08-14 08:50:18 【Question】:

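I am trying to create a column named "Threshold" whose values come from the calculation (df['column']/30)**0.5, but I want the column to have a minimum value of 0.2. So if the calculation gives less than 0.2, I want the column value to be 0.2.

For example: df['column2'] = (df['column']/30)**0.5 or 0.2, whichever is larger.

This is what I have so far: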

df['Historical_MovingAverage_15'] = df['Historical_Average'].rolling(window=15).mean()
df['Threshold'] = max((((df['Historical_MovingAverage_15'])/30)**0.5), 0.2)

It gives me this error:

ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().

【Answer 1】:

Use numpy.maximum, which compares element by element:

df['Threshold'] = np.maximum((((df['Historical_MovingAverage_15'])/30)**0.5), 0.2)

Or use Series.clip with the lower parameter:

df['Threshold'] = (((df['Historical_MovingAverage_15'])/30)**0.5).clip(lower=0.2)

Example:

import numpy as np
import pandas as pd

df = pd.DataFrame({'Historical_MovingAverage_15': [.21, 2, 3]})
df['Threshold'] = np.maximum((df['Historical_MovingAverage_15'] / 30) ** 0.5, 0.2)
print(df)
   Historical_MovingAverage_15  Threshold
0                         0.21   0.200000
1                         2.00   0.258199
2                         3.00   0.316228

Details (the values before the 0.2 lower bound is applied):

print ((((df['Historical_MovingAverage_15'])/30)**0.5))
0    0.083666
1    0.258199
2    0.316228
Name: Historical_MovingAverage_15, dtype: float64
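
As background on the error in the question: the built-in max() compares its arguments with > and then needs a single True/False, which a comparison involving a Series cannot supply. The sketch below reproduces that failure and checks that np.maximum and clip give the same result, including on NaN rows such as the ones a rolling mean leaves at the start of a column. The sample values and the names raw, via_maximum, and via_clip are assumptions made for illustration.

```python
import numpy as np
import pandas as pd

# Illustrative data; NaN stands in for the leading rows of a rolling mean.
df = pd.DataFrame({'Historical_MovingAverage_15': [np.nan, 0.21, 2, 3]})
raw = (df['Historical_MovingAverage_15'] / 30) ** 0.5

# Built-in max() compares the Series with 0.2 and then asks for the truth
# value of the resulting boolean Series, which pandas refuses to give.
try:
    max(raw, 0.2)
    builtin_max_failed = False
except ValueError:
    builtin_max_failed = True

# Both element-wise alternatives apply the 0.2 floor row by row.
via_maximum = np.maximum(raw, 0.2)
via_clip = raw.clip(lower=0.2)

print(builtin_max_failed)
print(via_maximum.equals(via_clip))
print(via_clip.tolist())
```

In this sketch both approaches leave NaN rows as NaN, so the first 14 rows of a 15-row rolling window would stay missing rather than being set to 0.2.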

【Discussion】:

Thanks! I decided to go with .clip() because I don't need to import another library. Works perfectly.

Filtering can also be done with apply: df[df["column"].apply(lambda x: x >= 0.2)]
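
The apply-based filter from the comment can usually be written as a plain boolean mask, which avoids calling a Python function once per row. A small sketch, assuming a hypothetical column named "column" with made-up values:

```python
import pandas as pd

# Made-up values for illustration.
df = pd.DataFrame({"column": [0.05, 0.2, 0.7]})

# Row filter via apply, as in the comment above.
filtered_apply = df[df["column"].apply(lambda x: x >= 0.2)]

# Equivalent vectorized boolean mask.
filtered_mask = df[df["column"] >= 0.2]

print(filtered_mask)
```

Note that filtering drops the rows below 0.2 instead of raising them to 0.2, so it answers a different question than clip or np.maximum.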
