How do I apply df.loc[] to multiple rows and apply a transformation?


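I'm trying to apply a transformation to every row in df["Text_str"] so that I can use my padding function.

So far I can test it manually with df.loc[i, "Text_str"], but I need to go through a number of text rows and append the results to df.

How can I turn df.loc[i, "Text_str"] into a function or, better yet, apply the padding function to it?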

import re
import nltk
from nltk.corpus import stopwords

nltk.download('stopwords')  # download once, not on every call
STOPWORDS = set(stopwords.words('english'))

# fraction of words that are considered stopwords
def padding(text):
    # [A-Za-z] matches letters only; [A-z] would also match [ \ ] ^ _ `
    words = re.findall('[A-Za-z]+', text)
    if not words:  # avoid dividing by zero on text with no words
        return 0.0
    content = [w for w in words if w.lower() in STOPWORDS]
    return round(len(content) / len(words), 2)

test_data = df.loc[1, "Text_str"]

print(padding(test_data))

Error:

TypeError: expected string or bytes-like object

[One-row sample of df["Text_str"]]

0    Parker-Hannifin Corp. (NYSE: Q2 2016 Earnings Call January 26, 2016 11:00 am ET Executives Jon P. Marten - Executive Vice President-Finance & Adminstration and Chief Financial Officer Thomas L. Williams - Chairman & Chief Executive Officer Lee C. Banks - President and Chief Operating Officer Analysts James A. Picariello - KeyBanc Capital Markets, Inc. Nicole Deblase - Morgan Stanley & Co. LLC Eli Lustgarten - Longbow Research LLC Andrew M. Casey - Wells Fargo Securities LLC Ann P. Duignan - JPMorgan Securities LLC Jamie L. Cook - Credit Suisse Securities (NYSE: Joseph Alfred Ritchie - Goldman Sachs & Co. Nathan Jones - Stifel, Nicolaus & Co., Inc. Andrew Burris Obin - Bank of America Merrill Lynch Joel Gifford Tiss - BMO Capital Markets (United States) Operator Good day, ladies and gentlemen, and welcome to the Parker-Hannifin Corp. Fiscal 2016 Second Quarter Earnings Conference Call. At this time, all participants are in a listen-only mode. Later, we will conduct a question-and-an...

Type:

<class 'pandas.core.series.Series'>
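
The type shown above is consistent with the error: `re.findall` expects a single string, so handing it a whole `Series` raises a `TypeError`. A minimal reproduction, assuming a made-up two-row `Series` (not the data from the question):

```python
import re
import pandas as pd

# Hypothetical sample data, for illustration only
s = pd.Series(["some text", "more text"])

try:
    re.findall("[A-Za-z]+", s)       # a Series is not a string
    raised = False
except TypeError as exc:
    raised = True
    print("TypeError:", exc)

# A single element is a plain str, so this works
first_words = re.findall("[A-Za-z]+", s.iloc[0])
print(first_words)
```

So the function needs to be called once per row rather than on the column as a whole.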
Answer

I figured it out:

df['Padding'] = [padding(x) for x in list(df.loc[:,'Text_str'])]
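
An equivalent and arguably more idiomatic form is `Series.apply`, which calls the function once per row. The sketch below is self-contained, so it makes a few assumptions that are not in the original post: a tiny hard-coded `STOPWORDS` set stands in for the NLTK list, the three-row `df` is made up, and the `isinstance` guard is an addition for cells that are not strings (for example missing values).

```python
import re
import pandas as pd

# Assumption: a tiny stand-in for nltk.corpus.stopwords.words('english')
STOPWORDS = {"the", "is", "a", "of", "and", "to", "in"}

def padding(text):
    """Return the fraction of words in `text` that are stopwords."""
    if not isinstance(text, str):        # e.g. a missing value
        return float("nan")
    words = re.findall(r"[A-Za-z]+", text)
    if not words:                        # no words, nothing to divide by
        return 0.0
    hits = [w for w in words if w.lower() in STOPWORDS]
    return round(len(hits) / len(words), 2)

# Hypothetical sample data
df = pd.DataFrame(
    {"Text_str": ["The cat is in the hat", "Quarterly earnings rose", None]}
)

# Call padding() on each row and store the result in a new column
df["Padding"] = df["Text_str"].apply(padding)
print(df)
```

The list comprehension in the answer and `apply` both loop over the rows in Python, so the choice between them is mostly a matter of readability.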
