pandas: replace only the word and not the entire sentence


【Title】pandas: replace only the word and not the entire sentence 【Posted】2022-01-13 09:04:19 【Question】:

I have a dataframe like the following (e.g.):

import pandas as pd
df = pd.DataFrame({'text': ['Lary Page is visiting on Saturday', ' On Monday his boss, Maria Jackson is here .']})

I want to replace every occurrence of the days of the week listed below with a random day generated by the Faker library. This is my setup:

from faker import Faker
import numpy as np
fake = Faker()

days_list = ['Saturday','Monday','Tuesday']

I tried the following approaches, but both return the replacement day instead of the whole sentence:

df.text = np.where(df.text.str.contains('|'.join(days_list)),
               fake.day_of_week(), df.text)

df.text.str.replace('|'.join(days_list), fake.day_of_week())

Desired output:

print(df) (e.g.):
'Lary Page is visiting on Tuesday'
'On Thursday his boss, Maria Jackson is here .'


【Answer 1】:

Use a lambda function as the replacement callback, so that a new random day is generated for each match:

regex = '|'.join(days_list)
df['text'] = df.text.str.replace(regex, lambda x: fake.day_of_week(), regex=True)
print(df)
                                             text
0                Lary Page is visiting on Tuesday
1   On Thursday his boss, Maria Jackson is here .
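
As a rough sketch of why the callback matters: when `repl` is a callable, `str.replace` calls it once per regex match (the same mechanism as `re.sub`), so each occurrence can get its own random day, whereas passing `fake.day_of_week()` directly evaluates it once and reuses that one string everywhere. The standard-library example below illustrates this. `WEEKDAYS` and `random_day` are hypothetical stand-ins for Faker, and the `\b` word boundaries and `re.escape` are additions that are not part of the answer above.

```python
import random
import re

# Hypothetical stand-in for fake.day_of_week(), so the sketch needs no third-party packages.
WEEKDAYS = ['Monday', 'Tuesday', 'Wednesday', 'Thursday',
            'Friday', 'Saturday', 'Sunday']

def random_day(match=None):
    """Return a random weekday name; accepts the match object re.sub passes to callbacks."""
    return random.choice(WEEKDAYS)

days_list = ['Saturday', 'Monday', 'Tuesday']

# \b keeps the pattern from matching inside longer words;
# re.escape guards against regex metacharacters in the list entries.
pattern = re.compile(r'\b(?:' + '|'.join(map(re.escape, days_list)) + r')\b')

sentences = ['Lary Page is visiting on Saturday',
             ' On Monday his boss, Maria Jackson is here .']

# The callback runs once per match; the rest of each sentence is left untouched.
replaced = [pattern.sub(random_day, s) for s in sentences]
print(replaced)
```

The same pattern string (`pattern.pattern`) should also work as the first argument of `df.text.str.replace(..., regex=True)` in place of the plain `'|'.join(days_list)`.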


【Answer 2】:
from faker import Faker
import pandas as pd
df = pd.DataFrame({'text': ['Lary Page is visiting on Saturday', ' On Monday his boss, Maria Jackson is here .']})

fake = Faker()
days_list = ['Saturday','Monday','Tuesday']
df['text'] = df['text'].apply(lambda x: ' '.join(fake.day_of_week() if i in days_list else i for i in x.split()))

print(df)

Output:

                                          text
0             Lary Page is visiting on Tuesday
1  On Monday his boss, Maria Jackson is here .
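
One caveat with splitting on whitespace, sketched below without pandas or Faker: a token is replaced only when it equals a list entry exactly, so a day with punctuation attached (for example `Monday,`) is left alone, and `' '.join` drops leading or repeated whitespace. `replace_tokens` and the fixed replacement `'Friday'` are hypothetical, chosen only to make the result deterministic.

```python
days_list = ['Saturday', 'Monday', 'Tuesday']

def replace_tokens(sentence, new_day='Friday'):
    """Replace whitespace-separated tokens that exactly equal a listed day."""
    return ' '.join(new_day if tok in days_list else tok
                    for tok in sentence.split())

plain = replace_tokens(' On Monday his boss is here .')
with_comma = replace_tokens('See you Monday, then')

print(plain)       # the leading space is lost through split() and join()
print(with_comma)  # 'Monday,' is not an exact match, so it stays unchanged
```

If the text may contain punctuation directly after a day name, a regex-based replacement is probably the safer choice.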
