Python: Value returned by function not getting updated in pandas dataframe
Posted: 2021-06-06 12:47:21
Question: I have a fruits dataframe with columns (Name, Color) and a sentence dataframe with a single column (Sentence).
Fruits dataframe:
         Name   Color
0       Apple     Red
1       Mango  Yellow
2      Grapes   Green
3  Strawberry    Pink
Sentence dataframe:
                      Sentence
0  I like Apple, Mango, Grapes
1            I like ripe Mango
2             Grapes are juicy
3           Oranges are citric
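I need to compare every row of the fruits dataframe with every row of the sentence dataframe. If a fruit name appears in a sentence as an exact word, its color should be inserted directly before the fruit name in that sentence.
This is what I did using dataframe.apply():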
import pandas as pd
import regex as re

# create fruit dataframe
fruit_data = [['Apple', 'Red'], ['Mango', 'Yellow'], ['Grapes', 'Green']]
fruit_df = pd.DataFrame(fruit_data, columns = ['Name', 'Color'])
print(fruit_df)

# create sentence dataframe
sentence = ['I like Apple, Mango, Grapes', 'I like ripe Mango', 'Grapes are juicy']
sentence_df = pd.DataFrame(sentence, columns = ['Sentence'])
print(sentence_df)

def search(desc, name, color):
    if re.findall(r"\b" + name + r"\b", desc):
        # for loop is used because fruit can appear more than once in sentence
        all_indexes = []
        for match in re.finditer(r"\b" + name + r"\b", desc):
            all_indexes.append(match.start())

        arr = list(desc)
        for idx in sorted(all_indexes, reverse=True):
            arr.insert(idx, color + " ")

        new_desc = ''.join(arr)
        return new_desc

def compare(name, color):
    sentence_df['Result'] = sentence_df['Sentence'].apply(lambda x: search(x, name, color))

fruit_df.apply(lambda x: compare(x['Name'], x['Color']), axis=1)
print ("The final result is: ")
print(sentence_df['Result'])
The result I get:
                      Sentence Result
0  I like Apple, Mango, Grapes   None
1            I like ripe Mango   None
2             Grapes are juicy   None
3           Oranges are citric   None
Expected result:
                      Sentence                                        Result
0  I like Apple, Mango, Grapes  I like Red Apple, Yellow Mango, Green Grapes
1            I like ripe Mango                      I like ripe Yellow Mango
2             Grapes are juicy                        Green Grapes are juicy
3           Oranges are citric
I also tried iterating over fruit_df with itertuples(), but the result is still the same:
for row in fruit_df.itertuples():
    result = sentence_df['Sentence'].apply(lambda x: search(x, getattr(row, 'Name'), getattr(row, 'Color')))
    print(result)
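I don't understand why the value returned by the search function is not stored in the new column. Is this the right approach, or am I missing something?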
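Answer 1: The problem is that you call compare for every row of fruit_df, but each pass uses the same input: the original Sentence column.
I added a few debug prints to the compare function to see what is happening: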
def compare(name, color):
    print(name, color)
    sentence_df['Result'] = sentence_df['Sentence'].apply(lambda x: search(x, name, color))
    print(sentence_df['Result'])
This prints:
Apple Red
0 I like Red Apple, Mango, Grapes
1 None
2 None
Name: Result, dtype: object
Mango Yellow
0 I like Apple, Yellow Mango, Grapes
1 I like ripe Yellow Mango
2 None
Name: Result, dtype: object
Grapes Green
0 I like Apple, Mango, Green Grapes
1 None
2 Green Grapes are juicy
Name: Result, dtype: object
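So the color is added correctly when the fruit is present, but search returns None when it is not. In addition, every pass starts again from the original Sentence column and overwrites Result, so only the output of the last pass is kept.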
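The two effects described above can be reproduced without pandas. The sketch below is an illustration under stated assumptions, not the asker's code: `search_without_fallback` and `last_pass_only` are hypothetical names, and `re.sub` stands in for the index-insertion loop. It also assumes the real data included the Strawberry row from the question's table, which would explain why every Result came back as None.

```python
import re

def search_without_fallback(desc, name, color):
    # Same shape as the question's search(): only the matching branch returns.
    pattern = r"\b" + re.escape(name) + r"\b"
    if re.search(pattern, desc):
        return re.sub(pattern, color + " " + name, desc)
    # Falling off the end of a function returns None implicitly.

def last_pass_only(sentences, fruits):
    # Every pass reads the ORIGINAL sentences and overwrites the previous result.
    result = None
    for name, color in fruits:
        result = [search_without_fallback(s, name, color) for s in sentences]
    return result

sentences = [
    "I like Apple, Mango, Grapes",
    "I like ripe Mango",
    "Grapes are juicy",
    "Oranges are citric",
]
fruits = [("Apple", "Red"), ("Mango", "Yellow"), ("Grapes", "Green"), ("Strawberry", "Pink")]

three = last_pass_only(sentences, fruits[:3])  # last pass is Grapes
four = last_pass_only(sentences, fruits)       # last pass is Strawberry

print(three)  # ['I like Apple, Mango, Green Grapes', None, 'Green Grapes are juicy', None]
print(four)   # [None, None, None, None]
```

With three fruits the final column only reflects the Grapes pass. With Strawberry as the last row, no sentence matches, so every entry is None.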
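How to fix it:
First, add the missing return desc at the end of search, so that sentences without a match come back unchanged instead of as None: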
def search(desc, name, color):
    if re.findall(r"\b" + name + r"\b", desc):
        ...
        new_desc = ''.join(arr)
        return new_desc
    return desc
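Then initialize sentence_df['Result'] before applying compare, and use that column as the input so each pass builds on the previous one: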
def compare(name, color):
    sentence_df['Result'] = sentence_df['Result'].apply(lambda x: search(x, name, color))

sentence_df['Result'] = sentence_df['Sentence']
fruit_df.apply(lambda x: compare(x['Name'], x['Color']), axis=1)
This finally gives the expected result:
The final result is:
0 I like Red Apple, Yellow Mango, Green Grapes
1 I like ripe Yellow Mango
2 Green Grapes are juicy
Name: Result, dtype: object
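For reference, both fixes can be combined into one self-contained script. This is a sketch rather than the answer's exact code: it swaps the index-insertion loop for `re.sub`, which returns the input unchanged when nothing matches, adds `re.escape` as a precaution, and replaces `fruit_df.apply` with a plain loop. The fourth sentence is taken from the question's table.

```python
import re
import pandas as pd

fruit_df = pd.DataFrame(
    [["Apple", "Red"], ["Mango", "Yellow"], ["Grapes", "Green"]],
    columns=["Name", "Color"],
)
sentence_df = pd.DataFrame(
    {
        "Sentence": [
            "I like Apple, Mango, Grapes",
            "I like ripe Mango",
            "Grapes are juicy",
            "Oranges are citric",
        ]
    }
)

def search(desc, name, color):
    # re.escape guards against names that contain regex metacharacters.
    pattern = r"\b" + re.escape(name) + r"\b"
    # re.sub leaves desc untouched when there is no match, so None never leaks out.
    return re.sub(pattern, lambda m: color + " " + m.group(0), desc)

# Start from the original sentences, then feed each pass the previous result.
sentence_df["Result"] = sentence_df["Sentence"]
for name, color in zip(fruit_df["Name"], fruit_df["Color"]):
    sentence_df["Result"] = sentence_df["Result"].apply(lambda s: search(s, name, color))

print(sentence_df["Result"].tolist())
```

The loop over `zip(...)` is a design choice for readability: `compare` was only ever called for its side effect on `sentence_df`, so `DataFrame.apply` added nothing beyond the iteration.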
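Comments:
Great explanation!
Thanks for the solution! Initializing the Result column did the trick.
Answer 2: We can build a mapping series from the fruits dataframe, then pass it to Series.replace to replace every fruit name that occurs in the Sentence column with its corresponding replacement from the mapping series (Color + Fruit name):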
fruit = r'\b' + fruits['Name'] + r'\b'
fruit_replacement = list(fruits['Color'] + ' ' + fruits['Name'])
mapping = pd.Series(fruit_replacement, index=fruit)
sentence['Result'] = sentence['Sentence'].replace(mapping, regex=True)
>>> sentence
Sentence Result
0 I like Apple, Mango, Grapes I like Red Apple, Yellow Mango, Green Grapes
1 I like ripe Mango I like ripe Yellow Mango
2 Grapes are juicy Green Grapes are juicy
3 Oranges are citric Oranges are citric
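The snippet above relies on `fruits` and `sentence` already existing as dataframes. A self-contained version might look like the sketch below. The dataframe contents are copied from the question, and the `re.escape` call is an added precaution that is not part of the original answer.

```python
import re
import pandas as pd

fruits = pd.DataFrame(
    {"Name": ["Apple", "Mango", "Grapes"], "Color": ["Red", "Yellow", "Green"]}
)
sentence = pd.DataFrame(
    {
        "Sentence": [
            "I like Apple, Mango, Grapes",
            "I like ripe Mango",
            "Grapes are juicy",
            "Oranges are citric",
        ]
    }
)

# One whole-word regex per fruit name; escaping keeps the name literal.
patterns = [r"\b" + re.escape(name) + r"\b" for name in fruits["Name"]]
replacements = (fruits["Color"] + " " + fruits["Name"]).tolist()

# Index = pattern to search for, value = text to substitute.
mapping = pd.Series(replacements, index=patterns)

sentence["Result"] = sentence["Sentence"].replace(mapping, regex=True)
print(sentence["Result"].tolist())
```

Sentences that contain no fruit name pass through unchanged, so no separate fallback is needed.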
Comments:
Thanks for the solution! This approach takes less time than my current one.
@Animeartist Happy coding!
Answer 3: Create a mapping dictionary, then replace.
Try:
di = {fr: f"{co} {fr}" for fr, co in fruit_df.values}
res = sentence_df.replace(di, regex=True)
res:
Sentence
0 I like Red Apple, Yellow Mango, Green Grapes
1 I like ripe Yellow Mango
2 Green Grapes are juicy
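One caveat: the dictionary keys here are bare fruit names, so they also match inside longer words, whereas the question asks for exact matches. The sketch below compares bare keys with word-boundary keys. The "Grapeseed" sentence is a made-up example, and `plain` and `bounded` are hypothetical names.

```python
import re
import pandas as pd

fruit_df = pd.DataFrame(
    [["Apple", "Red"], ["Mango", "Yellow"], ["Grapes", "Green"]],
    columns=["Name", "Color"],
)
# Second sentence is invented to show a partial-word match.
s = pd.Series(["Grapes are juicy", "Grapeseed oil is light"])

# Bare names as keys: any substring occurrence is replaced.
plain = {fr: f"{co} {fr}" for fr, co in fruit_df.values}

# Word-boundary keys: only whole-word occurrences are replaced.
bounded = {rf"\b{re.escape(fr)}\b": f"{co} {fr}" for fr, co in fruit_df.values}

res_plain = s.replace(plain, regex=True).tolist()
res_bounded = s.replace(bounded, regex=True).tolist()

print(res_plain)
print(res_bounded)
```

With bare keys "Grapeseed" becomes "Green Grapeseed", while the word-boundary keys leave it alone.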
Comments:
Thanks for the solution.