How do I split data from a merged cell into the other cells in the same row of a Python DataFrame?

I have a sample DataFrame that looks like this:

+---+--------------------------------------------------------------------------------------+---------------+--------------------------------------------+
|   | Date                                                                                 | Professional  | Description                                |
+---+--------------------------------------------------------------------------------------+---------------+--------------------------------------------+
| 0 | 2019-12-19 00:00:00                                                                  | Katie Cool    | Travel to Space ...                        |
+---+--------------------------------------------------------------------------------------+---------------+--------------------------------------------+
| 1 | 2019-12-20 00:00:00                                                                  | Jenn Blossoms | Review stuff; prepare cancellations of ... |
+---+--------------------------------------------------------------------------------------+---------------+--------------------------------------------+
| 2 | 2019-12-27 00:00:00                                                                  | Jenn Blossoms | Review lots of stuff/o...                  |
+---+--------------------------------------------------------------------------------------+---------------+--------------------------------------------+
| 3 | 2019-12-27 00:00:00                                                                  | Jenn Blossoms | Draft email to world leader...             |
+---+--------------------------------------------------------------------------------------+---------------+--------------------------------------------+
| 4 | 2019-12-30 00:00:00                                                                  | Jenn Blossoms | Review this thing.                         |
+---+--------------------------------------------------------------------------------------+---------------+--------------------------------------------+
| 5 | 12-30-2019 Jenn Blossoms Telephone   Call   to   A.   Bell   return   her   multiple | NaN           | NaN                                        |
|   | voicemails.                                                                          |               |                                            |
+---+--------------------------------------------------------------------------------------+---------------+--------------------------------------------+

In the last row, most of the row's data has ended up in the Date cell.

I would like the sample to look like this:

+---+---------------------+---------------+-------------------------------------------------------------+
|   | Date                | Professional  | Description                                                 |
+---+---------------------+---------------+-------------------------------------------------------------+
| 0 | 2019-12-19 00:00:00 | Katie Cool    | Travel to Space ...                                         |
+---+---------------------+---------------+-------------------------------------------------------------+
| 1 | 2019-12-20 00:00:00 | Jenn Blossoms | Review stuff; prepare cancellations of ...                  |
+---+---------------------+---------------+-------------------------------------------------------------+
| 2 | 2019-12-27 00:00:00 | Jenn Blossoms | Review lots of stuff/o...                                   |
+---+---------------------+---------------+-------------------------------------------------------------+
| 3 | 2019-12-27 00:00:00 | Jenn Blossoms | Draft email to world leader...                              |
+---+---------------------+---------------+-------------------------------------------------------------+
| 4 | 2019-12-30 00:00:00 | Jenn Blossoms | Review this thing.                                          |
+---+---------------------+---------------+-------------------------------------------------------------+
| 5 | 12-30-2019          | Jenn Blossoms | Telephone   Call   to   A.   Bell   return   her   multiple |
|   |                     |               | voicemails.                                                 |
+---+---------------------+---------------+-------------------------------------------------------------+

I have tried this code:

date = dftopdata['Date'].str.extract(r'(\d{2}-\d{2}-\d{4})(\s\w+\s\w+)\s(\w+.*)')[0]
name = dftopdata['Date'].str.extract(r'(\d{2}-\d{2}-\d{4})(\s\w+\s\w+)\s(\w+.*)')[1]
description = dftopdata['Date'].str.extract(r'(\d{2}-\d{2}-\d{4})(\s\w+\s\w+)\s(\w+.*)')[2]

dftopdata.loc[pd.to_datetime(dftopdata['Date'],errors='coerce').isnull(),'Professional'] = name
dftopdata.loc[pd.to_datetime(dftopdata['Date'],errors='coerce').isnull(),'Description'] = description
dftopdata.loc[pd.to_datetime(dftopdata['Date'],errors='coerce').isnull(),'Date'] = date

But when I run the code above, the sample DataFrame looks like this:

+---+------------+---------------+--------------------------------------------+
|   | Date       | Professional  | Description                                |
+---+------------+---------------+--------------------------------------------+
| 0 | 12/19/2019 | Katie Cool    | Travel to space ...                        |
+---+------------+---------------+--------------------------------------------+
| 1 | 12/20/2019 | Jenn Blossoms | Review stuff; prepare cancellations of ... |
+---+------------+---------------+--------------------------------------------+
| 2 | 12/27/2019 | Jenn Blossoms | Review lots of stuff/o…                    |
+---+------------+---------------+--------------------------------------------+
| 3 | 12/27/2019 | Jenn Blossoms | Draft email to world leader...             |
+---+------------+---------------+--------------------------------------------+
| 4 | 12/30/2019 | Jenn Blossoms | Review this thing.                         |
+---+------------+---------------+--------------------------------------------+
| 5 | NaN        | NaN           | NaN                                        |
+---+------------+---------------+--------------------------------------------+

Answer

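You can use the str.split method to split the string into "words".

dftopdata['list_of_words'] = dftopdata['Date'].str.split()

If there is a pattern that separates the professional and description parts within list_of_words, you can use it. In the sample row, for example, the first word is the date and the next two words make up the professional's name, so you could do something like this:

# Only touch rows whose Professional cell is empty; those hold the merged text.
merged = dftopdata['Professional'].isna()
dftopdata.loc[merged, 'Professional'] = dftopdata.loc[merged, 'list_of_words'].str[1:3].str.join(' ')
dftopdata.loc[merged, 'Description'] = dftopdata.loc[merged, 'list_of_words'].str[3:].str.join(' ')

Note that joining the words with a single space collapses the repeated spaces and the line break in the original text.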
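
For comparison, here is a minimal sketch of the regex route the question was attempting, applied only to the merged rows. It rests on several assumptions: the sample frame is a hypothetical reconstruction of the one in the question, merged rows are identified by empty Professional and Description cells, and every name is exactly two words long. It is one possible approach, not a definitive fix.

```python
import pandas as pd

# Hypothetical reconstruction of the sample; column names follow the question.
dftopdata = pd.DataFrame({
    "Date": [
        pd.Timestamp("2019-12-19"),
        pd.Timestamp("2019-12-30"),
        "12-30-2019 Jenn Blossoms Telephone   Call   to   A.   Bell   "
        "return   her   multiple\nvoicemails.",
    ],
    "Professional": ["Katie Cool", "Jenn Blossoms", None],
    "Description": ["Travel to Space ...", "Review this thing.", None],
})

# Assumption: a merged row is one where the other two columns are empty.
merged = dftopdata["Professional"].isna() & dftopdata["Description"].isna()

# Date, then a two-word name, then the remaining text.
# (?s) lets "." match across the line break inside the cell.
pattern = r"(?s)^\s*(\d{2}-\d{2}-\d{4})\s+(\w+\s+\w+)\s+(.*)$"

# Run the extraction once, on the merged rows only.
parts = dftopdata.loc[merged, "Date"].astype(str).str.extract(pattern)

# Assignments align on the index, so untouched rows keep their values.
dftopdata.loc[merged, "Professional"] = parts[1]
dftopdata.loc[merged, "Description"] = parts[2].str.strip()
dftopdata.loc[merged, "Date"] = parts[0]
```

Unlike the split-and-join approach, this keeps the original spacing inside the description. Names with more or fewer than two words would need a different pattern.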
