Creating a Contingency Table in Pandas

Posted: 2019-03-11 19:15:09

Question:

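I want to create a contingency table in Pandas. The code below does it, but I'd like to know whether there is a pandas function that does this for me.

Here is a reproducible example: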

toy_data  # as JSON
'{"Light":{"321":"no_light","476":"night_light","342":"lamp","454":"lamp","25":"night_light","53":"night_light","120":"night_light","346":"night_light","360":"lamp","55":"no_light","391":"night_light","243":"no_light","101":"night_light","377":"night_light","124":"no_light","368":"lamp","400":"no_light","247":"night_light","270":"lamp","208":"night_light"},"Nearsightedness":{"321":"No","476":"Yes","342":"Yes","454":"Yes","25":"No","53":"Yes","120":"Yes","346":"No","360":"No","55":"Yes","391":"Yes","243":"No","101":"No","377":"Yes","124":"No","368":"No","400":"No","247":"No","270":"Yes","208":"No"}}'

toy_data.head()
     Light        Nearsightedness
321  no_light     No
476  night_light  Yes
342  lamp         Yes
454  lamp         Yes
25   night_light  No
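
For anyone who wants to run the example, the JSON dump can be loaded back into a DataFrame. This is a minimal sketch, assuming the dump uses the default column-oriented layout (column, then index label, then value) with its outer and inner braces in place; `toy_json` is a name introduced here for illustration.

```python
import io

import pandas as pd

# Assumption: the JSON dump from the question, with its braces restored.
toy_json = (
    '{"Light":{"321":"no_light","476":"night_light","342":"lamp","454":"lamp",'
    '"25":"night_light","53":"night_light","120":"night_light","346":"night_light",'
    '"360":"lamp","55":"no_light","391":"night_light","243":"no_light",'
    '"101":"night_light","377":"night_light","124":"no_light","368":"lamp",'
    '"400":"no_light","247":"night_light","270":"lamp","208":"night_light"},'
    '"Nearsightedness":{"321":"No","476":"Yes","342":"Yes","454":"Yes","25":"No",'
    '"53":"Yes","120":"Yes","346":"No","360":"No","55":"Yes","391":"Yes","243":"No",'
    '"101":"No","377":"Yes","124":"No","368":"No","400":"No","247":"No","270":"Yes",'
    '"208":"No"}}'
)

# Wrap the string in StringIO so read_json treats it as a file-like object.
toy_data = pd.read_json(io.StringIO(toy_json))
print(toy_data.shape)
```

The result has 20 rows and the two columns `Light` and `Nearsightedness`.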

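import pandas as pd

# Count each (Light, Nearsightedness) pair, then pivot Nearsightedness into columns.
df = pd.DataFrame(toy_data.groupby(['Light', 'Nearsightedness']).size())
df = df.unstack('Nearsightedness')
df.columns = df.columns.droplevel()  # drop the unnamed outer column level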

df
Nearsightedness No  Yes
Light       
lamp             2  3
night_light      5  5
no_light         4  1

Comments:

It's great to see a perfect question: a great MCVE, a working solution, and the desired output!

Answer 1:

pd.crosstab does the job (applied to the original data, toy_data in the question):

pd.crosstab(toy_data.Light, toy_data.Nearsightedness)

Output:

Nearsightedness  No  Yes
Light
lamp              2    3
night_light       5    5
no_light          4    1
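
If row and column totals are also wanted, `pd.crosstab` accepts `margins=True`. The sketch below is illustrative only: it assumes a 20-row stand-in for `toy_data`, rebuilt from the counts in the output above rather than taken from the original rows, and the names `counts`, `rows`, and `table` are introduced here.

```python
import pandas as pd

# Assumption: rebuild 20 rows from the counts shown in the output above.
counts = {
    ("lamp", "No"): 2, ("lamp", "Yes"): 3,
    ("night_light", "No"): 5, ("night_light", "Yes"): 5,
    ("no_light", "No"): 4, ("no_light", "Yes"): 1,
}
rows = [pair for pair, n in counts.items() for _ in range(n)]
toy_data = pd.DataFrame(rows, columns=["Light", "Nearsightedness"])

# margins=True appends an "All" row and an "All" column holding the totals.
table = pd.crosstab(toy_data["Light"], toy_data["Nearsightedness"], margins=True)
print(table)
```

The inner cells match the table above, and the bottom-right `All` cell is the total number of rows.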

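Answer 2:

You can use pd.crosstab. Comparing the column with 'Yes' first makes the column labels boolean (here the original data is toy_data from the question):

res = pd.crosstab(toy_data['Light'], toy_data['Nearsightedness'].eq('Yes'))

print(res)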

Nearsightedness  False  True 
Light                        
lamp                 2      3
night_light          5      5
no_light             4      1
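
The same boolean column can also be turned into per-row proportions with the `normalize="index"` option of `pd.crosstab`. This is a sketch under the same assumption as before: the 20 rows are rebuilt from the counts shown above, and `counts`, `rows`, `nearsighted`, and `shares` are names made up for this example.

```python
import pandas as pd

# Assumption: rebuild 20 rows from the counts shown in the tables above.
counts = {
    ("lamp", "No"): 2, ("lamp", "Yes"): 3,
    ("night_light", "No"): 5, ("night_light", "Yes"): 5,
    ("no_light", "No"): 4, ("no_light", "Yes"): 1,
}
rows = [pair for pair, n in counts.items() for _ in range(n)]
toy_data = pd.DataFrame(rows, columns=["Light", "Nearsightedness"])

# Boolean series: True where the child is nearsighted.
nearsighted = toy_data["Nearsightedness"].eq("Yes")

# normalize="index" divides each row by its row total, so every row sums to 1.
shares = pd.crosstab(toy_data["Light"], nearsighted, normalize="index")
print(shares)
```

Each row then reads as the share of each outcome within that light group, for example 3 of 5 (0.6) nearsighted for `lamp`.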
