Insert value into column which is named in known column pandas

【Title】Insert value into column which is named in known column pandas 【Posted】2017-04-26 09:14:37 【Question】:

I am preparing data for machine learning. The data is in a pandas DataFrame that looks like this:

Column   v1    v2
first    1      2
second   3      4
third    5      6

Now I want to turn it into:

Column  v1  v2  first-v1  first-v2  second-v1  second-v2  third-v1  third-v2
first   1   2     1        2         NaN        NaN        NaN       NaN
second  3   4     NaN      NaN       3          4          NaN       NaN
third   5   6     NaN      NaN       NaN        NaN        5         6

I tried doing something like this:

# we know how many values there are but 
# length can be changed into length of [1, 2, 3, ...] values
values = ['v1', 'v2']

# data with description from above is saved in data 
for value in values:
    data[ str(data['Column'] + '-' + value)] = data[ value]

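The result is columns whose names are ['first-v1' 'second-v1'..] and ['first-v2' 'second-v2'..], which do hold the correct values, instead of one column per row label. What am I doing wrong? And since my data is large, is there a more efficient way to do this?

Thanks for your time!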

【Answer 1】:

You can use unstack, then swap and sort the levels of the resulting MultiIndex in the columns:

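# Pivot 'Column' into the column axis, then order the levels as (label, value).
df = (data.set_index('Column', append=True)[values]
          .unstack()
          .swaplevel(0, 1, axis=1)
          .sort_index(axis=1))
df.columns = df.columns.map('-'.join)  # flatten to names like 'first-v1'
print(df)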
   first-v1  first-v2  second-v1  second-v2  third-v1  third-v2
0       1.0       2.0        NaN        NaN       NaN       NaN
1       NaN       NaN        3.0        4.0       NaN       NaN
2       NaN       NaN        NaN        NaN       5.0       6.0

Or use stack + unstack:

df = data.set_index('Column', append=True).stack().unstack([1,2])
df.columns = df.columns.map('-'.join)
print (df)
   first-v1  first-v2  second-v1  second-v2  third-v1  third-v2
0       1.0       2.0        NaN        NaN       NaN       NaN
1       NaN       NaN        3.0        4.0       NaN       NaN
2       NaN       NaN        NaN        NaN       5.0       6.0
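
A third variant, not part of the original answer, is sketched below with `DataFrame.pivot`. It assumes the sample frame from the question with its default integer index, and the name `wide` is only illustrative. With `index` omitted, pivot keeps the existing row index, so the result should line up with `data` for the later join.

```python
import pandas as pd

# Sample frame from the question (assumed to have the default RangeIndex).
data = pd.DataFrame({
    "Column": ["first", "second", "third"],
    "v1": [1, 3, 5],
    "v2": [2, 4, 6],
})
values = ["v1", "v2"]

# pivot spreads the labels in 'Column' across the column axis; with a list of
# values the columns become a MultiIndex of (value, label) pairs.
wide = data.pivot(columns="Column", values=values)

# Reorder the levels to (label, value), sort, and flatten to 'first-v1', ...
wide = wide.swaplevel(0, 1, axis=1).sort_index(axis=1)
wide.columns = wide.columns.map("-".join)
print(wide)
```

Like the unstack versions, this avoids a Python-level loop over rows, which matters more as the frame grows.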

Finally, join the result back to the original:

df = data.join(df)
print (df)
   Column  v1  v2  first-v1  first-v2  second-v1  second-v2  third-v1  \
0   first   1   2       1.0       2.0        NaN        NaN       NaN   
1  second   3   4       NaN       NaN        3.0        4.0       NaN   
2   third   5   6       NaN       NaN        NaN        NaN       5.0   

   third-v2  
0       NaN  
1       NaN  
2       6.0  
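
As for what went wrong in the question's loop: this note is an addition, not part of the original answer. Calling `str()` on a Series returns the printed form of the whole Series as one string, so each pass of the loop created a single column with one very long name. The sketch below shows that, followed by a loop that does build one column per label and value. The names `bad_name` and `result` are illustrative, and the loop assumes the number of distinct labels is small.

```python
import pandas as pd

# Sample frame from the question.
data = pd.DataFrame({
    "Column": ["first", "second", "third"],
    "v1": [1, 3, 5],
    "v2": [2, 4, 6],
})
values = ["v1", "v2"]

# str() on a Series gives its multi-line printed representation, so this is
# one string containing every label, not one name per row.
bad_name = str(data["Column"] + "-" + values[0])
print(repr(bad_name))

# Loop sketch: one new column per (label, value) pair. Series.where keeps the
# value on rows whose 'Column' equals the label and puts NaN everywhere else.
result = data.copy()
for label in data["Column"].unique():
    for value in values:
        result[f"{label}-{value}"] = data[value].where(data["Column"] == label)
print(result)
```

This loop creates one column at a time, so with many distinct labels the unstack approach above is likely the better choice.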

【Comments】:

Wow, thank you for the answer. I would not have figured this out myself. Thanks again!
