Attach index from list to a list of lists to create pandas df

Posted 2023-03-12


【Question posted】: 2020-09-29 01:08:45 【Question】:

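I'd like to know whether it is possible to create a dataframe from a list of lists, where each item in index_list is attached as the index to every value in the corresponding sublist of lst: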

index_list = ['phase1', 'phase2', 'phase3']
lst = [['a', 'b', 'c'], ['d', 'e', 'f', 'g'], ['h', 'i', 'j']]

Thanks for your help!!

Edit: the inner lists are not necessarily the same size.

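【Comments】:

This looks like this question: ***.com/q/62284286/12416453. Are the inner sublists always the same size?

Not quite, because that one is about nested arrays, whereas here I have a list and a list of lists. And no, the inner sublists may differ in size. I'll update the question to reflect that, thanks.

No problem. I'm glad the question is clearer now. I've also posted an answer that works for sublists of different sizes.

【Answer 1】: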

You can use pd.Series.explode here.

pd.Series(lst,index=index_list).explode()
phase1    a
phase1    b
phase1    c
phase2    d
phase2    e
phase2    f
phase2    g
phase3    h
phase3    i
phase3    j
dtype: object
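
Since the question asks for a dataframe rather than a Series, the exploded result can be converted in one more step. This is a minimal sketch; the column names 'value' and 'phase' are illustrative assumptions, not part of the original post, and Series.explode needs pandas 0.25 or later.

```python
import pandas as pd

index_list = ['phase1', 'phase2', 'phase3']
lst = [['a', 'b', 'c'], ['d', 'e', 'f', 'g'], ['h', 'i', 'j']]

# One row per list element; the index label is repeated for each element.
s = pd.Series(lst, index=index_list).explode()

# Keep the phases as the index ('value' is an assumed column name).
df = s.to_frame(name='value')

# Or move the phases into a regular column ('phase' is an assumed name).
df_flat = s.rename_axis('phase').reset_index(name='value')
```

The first form keeps the repeated labels as the index, which matches the output shown above; the second is often easier to filter or group on.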

Another solution, using np.repeat and np.concatenate:

r_len = [len(r) for r in lst]
pd.Series(np.concatenate(lst), index=np.repeat(index_list,r_len))

phase1    a
phase1    b
phase1    c
phase2    d
phase2    e
phase2    f
phase2    g
phase3    h
phase3    i
phase3    j
dtype: object
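
One thing to keep in mind is that np.concatenate converts all values to a common NumPy dtype, so sublists that mix types (for example integers and strings) would be coerced. The sketch below is a variation, not the answerer's code: it swaps np.concatenate for itertools.chain so the original Python objects are kept as they are, while np.repeat still builds the index.

```python
from itertools import chain

import numpy as np
import pandas as pd

index_list = ['phase1', 'phase2', 'phase3']
lst = [['a', 'b', 'c'], ['d', 'e', 'f', 'g'], ['h', 'i', 'j']]

# Number of elements in each sublist: how often each label is repeated.
r_len = [len(r) for r in lst]

# Flatten the sublists without converting the values to a NumPy dtype.
values = list(chain.from_iterable(lst))

s = pd.Series(values, index=np.repeat(index_list, r_len))
```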

Timeit results:


In [501]: %%timeit
     ...: pd.Series(lst,index=index_list).explode()
     ...:
     ...:
363 µs ± 16.5 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

In [503]: %%timeit
     ...: r_len = [len(r) for r in lst]
     ...: pd.Series(np.concatenate(lst), index=np.repeat(index_list,r_len))
     ...:
     ...:
236 µs ± 17.8 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
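
The timings above come from IPython's %%timeit and will vary with machine and library versions. To repeat the comparison in a plain Python script, something like the following sketch could be used; the function names and the repetition count are arbitrary choices for illustration.

```python
import timeit

import numpy as np
import pandas as pd

index_list = ['phase1', 'phase2', 'phase3']
lst = [['a', 'b', 'c'], ['d', 'e', 'f', 'g'], ['h', 'i', 'j']]

def with_explode():
    return pd.Series(lst, index=index_list).explode()

def with_repeat():
    r_len = [len(r) for r in lst]
    return pd.Series(np.concatenate(lst), index=np.repeat(index_list, r_len))

n = 200  # arbitrary number of repetitions
t_explode = timeit.timeit(with_explode, number=n) / n
t_repeat = timeit.timeit(with_repeat, number=n) / n

print(f"explode: {t_explode * 1e6:.0f} µs per call")
print(f"repeat/concatenate: {t_repeat * 1e6:.0f} µs per call")
```

With only ten elements the difference is small in absolute terms, so it is worth measuring again on data of realistic size before choosing one approach for speed.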


【Answer 2】:

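This question looks similar to R's expand.grid() function, which is listed in the pandas cookbook (at the bottom of the page). The function lets you create a dataframe from all combinations of the given input values.

First define a function: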

import itertools
import pandas as pd

def expand_grid(data_dict):
    rows = itertools.product(*data_dict.values())
    return pd.DataFrame.from_records(rows, columns=data_dict.keys())

Then you can use it like this:

df = expand_grid({'index': ['phase1', 'phase2', 'phase3'],
                  'Col1': [['a', 'b', 'c'], ['d', 'e', 'f', 'g'], ['h', 'i', 'j']]})
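
To see what this returns for the data in the question, the sketch below runs the function and checks the shape. Because itertools.product pairs every index value with every sublist, the result has 3 × 3 = 9 rows and each 'Col1' cell holds a whole sublist. For comparison, the last lines show a one-to-one pairing followed by DataFrame.explode; that part is an addition for illustration, not part of this answer.

```python
import itertools

import pandas as pd

def expand_grid(data_dict):
    # One row per combination of the input value lists (Cartesian product).
    rows = list(itertools.product(*data_dict.values()))
    return pd.DataFrame.from_records(rows, columns=list(data_dict))

index_list = ['phase1', 'phase2', 'phase3']
lst = [['a', 'b', 'c'], ['d', 'e', 'f', 'g'], ['h', 'i', 'j']]

grid = expand_grid({'index': index_list, 'Col1': lst})
print(grid.shape)  # every phase is combined with every sublist

# Illustrative alternative: pair each phase with its own sublist only,
# then give each element of the sublist its own row.
paired = pd.DataFrame({'index': index_list, 'Col1': lst}).explode('Col1')
print(paired.shape)
```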

