Error when using ```apply()``` in pandas

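I am trying to calculate the percentile associated with each entry in a DataFrame (using the distribution of values in its column). I am sure I am missing something basic, but I cannot figure out why I get an error when I run the following code: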

    from scipy.stats import percentileofscore as pctl
    import pandas as pd
    import numpy as np

    data = np.arange(100).reshape(20,5)
    df = pd.DataFrame(data)

    def f(series):
        r= series.index
        return pctl(series.values, series.iloc[r])

    df.apply(f)

This is the error I get:

    ---------------------------------------------------------------------------
    ValueError                                Traceback (most recent call last)
    <ipython-input-7-4d3ad4c6f441> in <module>
    ----> 1 df.apply(f)

    C:PythonMinicondaenvsleiaplibsite-packagespandascoreframe.py in apply(self, func, axis, broadcast, raw, reduce, result_type, args, **kwds)
       6012                          args=args,
       6013                          kwds=kwds)
    -> 6014         return op.get_result()
       6015
       6016     def applymap(self, func):

    C:PythonMinicondaenvsleiaplibsite-packagespandascoreapply.py in get_result(self)
        316                                       *self.args, **self.kwds)
        317
    --> 318         return super(FrameRowApply, self).get_result()
        319
        320     def apply_broadcast(self):

    C:PythonMinicondaenvsleiaplibsite-packagespandascoreapply.py in get_result(self)
        140             return self.apply_raw()
        141
    --> 142         return self.apply_standard()
        143
        144     def apply_empty_result(self):

    C:PythonMinicondaenvsleiaplibsite-packagespandascoreapply.py in apply_standard(self)
        246
        247         # compute the result using the series generator
    --> 248         self.apply_series_generator()
        249
        250         # wrap results

    C:PythonMinicondaenvsleiaplibsite-packagespandascoreapply.py in apply_series_generator(self)
        275         try:
        276             for i, v in enumerate(series_gen):
    --> 277                 results[i] = self.f(v)
        278                 keys.append(v.name)
        279         except Exception as e:

    <ipython-input-6-347aa35ccd44> in f(series)
          1 def f(series):
          2     r= series.index
    ----> 3     return pctl(series.values, series.iloc[r])

    C:PythonMinicondaenvsleiaplibsite-packagesscipystatsstats.py in percentileofscore(a, score, kind)
       1785
       1786     """
    -> 1787     if np.isnan(score):
       1788         return np.nan
       1789     a = np.asarray(a)

    C:PythonMinicondaenvsleiaplibsite-packagespandascoregeneric.py in __nonzero__(self)
       1574         raise ValueError("The truth value of a {0} is ambiguous. "
       1575                          "Use a.empty, a.bool(), a.item(), a.any() or a.all()."
    -> 1576                          .format(self.__class__.__name__))
       1577
       1578     __bool__ = __nonzero__

    ValueError: ('The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().', 'occurred at index 0')

Answer

How about a vectorized NumPy solution:

    import numpy as np
    import pandas as pd

    data = np.arange(100).reshape(20,5)
    df = pd.DataFrame(data)

    def f(series):
        return np.percentile(series,series)

    df.apply(f)
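
For context, the traceback stops at `if np.isnan(score)` inside `percentileofscore`: `series.iloc[r]` hands over the whole column as a Series rather than a single number, so the `if` test has no single truth value to evaluate. Also note that `np.percentile(a, q)` treats `q` as percentages and returns the values found at those percentages, which is the inverse of looking up the percentile of a given score. Below is a minimal sketch, assuming the goal is the percentile rank of each entry within its own column. The helper name `percentile_ranks` and the variables `via_scipy` and `via_rank` are made up for illustration and do not come from the original post.

```python
import numpy as np
import pandas as pd
from scipy.stats import percentileofscore

data = np.arange(100).reshape(20, 5)
df = pd.DataFrame(data)

def percentile_ranks(series):
    # Hypothetical helper: score one scalar at a time, so percentileofscore
    # never has to evaluate the truth value of a whole Series.
    values = series.to_numpy()
    return series.map(lambda score: percentileofscore(values, score))

# Column-wise apply; each column is ranked against its own distribution.
via_scipy = df.apply(percentile_ranks)

# pandas-only alternative: average ranks expressed as percentages.
via_rank = df.rank(pct=True) * 100
```

With the default `kind='rank'` in `percentileofscore` and the default `method='average'` in `DataFrame.rank`, both approaches should agree on this data. The `rank` version avoids a Python-level loop over every entry, which may matter on larger frames.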
