Iterate over columns and rows in a pandas dataframe and convert string to float

Posted 2023-03-24


【Title】: Iterate over columns and rows in a pandas dataframe and convert string to float 【Posted】: 2022-01-14 17:28:00 【Question】:

I have the following dataframe:

col1  col2  col3
25,4  34,2  33,2
33,25 30.2  10,2
.................

I want to iterate over all columns and rows in this dataset.

df_range = len(df)

for column in df:
  for i in range(df_range):
    str.replace(',', '.').astype(float)
    print(df)

I get the following error:

TypeError                                 Traceback (most recent call last)

<ipython-input-38-47f6c96d2e67> in <module>()
      3 for column in df2:
      4   for i in range(df_range):
----> 5     str.replace(',', '.').astype(float)
      6 
      7     print(df)

TypeError: replace() takes at least 2 arguments (1 given)


【Answer 1】:

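Why would str.replace(',', '.').astype(float) give you anything useful? Nothing in that expression refers to what you are iterating over. Even if it evaluated without an error, it would evaluate to the same thing on every iteration of the loop.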

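If you use df.loc[i,column].replace(',','.'), then replace is a method of the string object df.loc[i,column] and takes two arguments, old and new. When you write str.replace(',','.'), however, replace is a method of the str type rather than of a string instance, so it needs the arguments self, old, and new. The first argument ',' is interpreted as self, '.' as old, and nothing is left for new. When you use replace, you must either pass the original string to it as an argument or take the replace method from the original string.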
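The difference between calling replace on an instance and on the type can be sketched in a few lines. The value `s` is a hypothetical sample, not the asker's actual data:

```python
s = "25,4"  # hypothetical sample value

# Called on an instance: the string itself is bound as self.
bound = s.replace(",", ".")

# Called on the type: the original string must be passed explicitly first.
unbound = str.replace(s, ",", ".")

print(bound, unbound, float(bound))

# Leaving out the original string reproduces the mistake from the question:
# ',' is taken as self, '.' as old, and new is missing.
try:
    str.replace(",", ".")
except TypeError as exc:
    print("TypeError:", exc)
```

The exact wording of the TypeError message depends on the Python version, so it may differ from the traceback in the question.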

Also, you shouldn't iterate over df by index. Use applymap instead.
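As a rough comparison, the sketch below runs a corrected index-based loop next to the element-wise approach on a small frame modeled on the question's sample rows. The frame contents and the names `looped`, `elementwise`, and `converted` are assumptions made for illustration. In pandas 2.1 and later, DataFrame.applymap is deprecated in favor of DataFrame.map, so the sketch picks whichever is available:

```python
import pandas as pd

# Hypothetical frame mirroring the sample rows in the question.
df = pd.DataFrame({
    "col1": ["25,4", "33,25"],
    "col2": ["34,2", "30.2"],
    "col3": ["33,2", "10,2"],
})

# Index-based loop: works once replace is taken from each cell,
# but it touches every cell individually from Python.
looped = df.copy()
for column in looped.columns:
    for i in looped.index:
        looped.loc[i, column] = looped.loc[i, column].replace(",", ".")
looped = looped.astype(float)

# Element-wise alternative: use DataFrame.map when it exists (pandas >= 2.1),
# otherwise fall back to DataFrame.applymap.
elementwise = df.map if hasattr(df, "map") else df.applymap
converted = elementwise(lambda x: x.replace(",", ".")).astype(float)

print(converted)
```

Both versions should yield the same float frame; the element-wise call is shorter and leaves the iteration to pandas.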


【Answer 2】:

Assuming you want to change commas to dots in all rows and columns, you should do this:

df = df.applymap(lambda x: x.replace(',','.')).astype(float)

For a specific column, you can do this:

df['col1'] = df['col1'].str.replace(',','.').astype(float)

or

df['col1'] = df['col1'].map(lambda x: x.replace(',','.')).astype(float)

or

df['col1'] = df['col1'].apply(lambda x: x.replace(',','.')).astype(float)
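The three per-column variants should produce the same result. Here is a small self-contained check, assuming a hypothetical sample column `col` of comma-decimal strings. Passing regex=False is an addition here that makes the literal match explicit:

```python
import pandas as pd

# Hypothetical sample column with comma decimal separators.
col = pd.Series(["25,4", "33,25"], name="col1")

# Vectorized string accessor.
via_str = col.str.replace(",", ".", regex=False).astype(float)

# Python-level lambdas applied per element.
via_map = col.map(lambda x: x.replace(",", ".")).astype(float)
via_apply = col.apply(lambda x: x.replace(",", ".")).astype(float)

print(via_str.tolist())
```

One practical difference: the .str accessor passes missing values (NaN) through unchanged, while the lambda versions would fail on them because a float has no replace method.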
