Finding the maximum col2 value for each col1 value in a Python list

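I am new to Python. For each of the values 'men', 'women', and 'people' in col1 of my list, I want to find the maximum value in col2. For example, ['men', 12, '1946-Truman.txt'], ['women', 7, '1946-Truman.txt'] and ['people', 49, '1946-Truman.txt'] hold the maximum col2 values for men, women, and people.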

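One possible solution is to split this list into three separate arrays for men, women, and people, and then take the maximum of each array. However, I would like a better solution.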

Data:

[['men', 2, '1945-Truman.txt'],
 ['women', 2, '1945-Truman.txt'],
 ['people', 10, '1945-Truman.txt'],
 ['men', 12, '1946-Truman.txt'],
 ['women', 7, '1946-Truman.txt'],
 ['people', 49, '1946-Truman.txt'],
 ['men', 7, '1947-Truman.txt'],
 ['women', 2, '1947-Truman.txt'],
 ['people', 12, '1947-Truman.txt'],
 ['men', 4, '1948-Truman.txt'],
 ['women', 1, '1948-Truman.txt'],
 ['people', 22, '1948-Truman.txt'],
 ['men', 2, '1949-Truman.txt'],
 ['women', 1, '1949-Truman.txt'],
 ['people', 15, '1949-Truman.txt'],
 ['men', 6, '1950-Truman.txt'],
 ['women', 2, '1950-Truman.txt'],
 ['people', 15, '1950-Truman.txt'],
 ['men', 8, '1951-Truman.txt'],
 ['women', 2, '1951-Truman.txt'],
 ['people', 9, '1951-Truman.txt'],
 ['men', 3, '1953-Eisenhower.txt'],
 ['women', 0, '1953-Eisenhower.txt'],
 ['people', 17, '1953-Eisenhower.txt']]

Thanks in advance.

Answer

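pandas is a good option, but you can also use max with a lambda:

# Each call returns the whole winning row, e.g. ['men', 12, '1946-Truman.txt'].
# Rows from other categories get -inf as their key, so they cannot win
# even when the category's own maximum is 0.
men = max(data, key=lambda x: x[1] if x[0] == 'men' else float('-inf'))
women = max(data, key=lambda x: x[1] if x[0] == 'women' else float('-inf'))
people = max(data, key=lambda x: x[1] if x[0] == 'people' else float('-inf'))

Another answer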

You can use the pandas package. Start by defining a DataFrame:

import pandas as pd
df = pd.DataFrame([['men', 2, '1945-Truman.txt'],
                   ['women', 2, '1945-Truman.txt'],
                   ['people', 10, '1945-Truman.txt'],
                   ['men', 12, '1946-Truman.txt'],
                   ['women', 7, '1946-Truman.txt'],
                   ['people', 49, '1946-Truman.txt'],
                   ['men', 7, '1947-Truman.txt'],
                   ['women', 2, '1947-Truman.txt'],
                   ['people', 12, '1947-Truman.txt'],
                   ['men', 4, '1948-Truman.txt'],
                   ['women', 1, '1948-Truman.txt'],
                   ['people', 22, '1948-Truman.txt'],
                   ['men', 2, '1949-Truman.txt'],
                   ['women', 1, '1949-Truman.txt'],
                   ['people', 15, '1949-Truman.txt'],
                   ['men', 6, '1950-Truman.txt'],
                   ['women', 2, '1950-Truman.txt'],
                   ['people', 15, '1950-Truman.txt'],
                   ['men', 8, '1951-Truman.txt'],
                   ['women', 2, '1951-Truman.txt'],
                   ['people', 9, '1951-Truman.txt'],
                   ['men', 3, '1953-Eisenhower.txt'],
                   ['women', 0, '1953-Eisenhower.txt'],
                   ['people', 17, '1953-Eisenhower.txt']])

Then

df.groupby([0], sort=False)[1].max()

returns

0 
men       12
women      7
people    49
Name: 1, dtype: int64

Is that what you wanted?

Another answer

If you are working with a list of lists like this:

lst=[['men', 2123, '1945-Truman.txt'],
['women', 2, '1945-Truman.txt'],
['people', 10, '1945-Truman.txt'],
['men', 12, '1946-Truman.txt'],
['women', 7, '1946-Truman.txt'],
['people', 49, '1946-Truman.txt'],
['men', 7, '1947-Truman.txt'],
['women', 2, '1947-Truman.txt']]

then you can use the following code.

# Assumes all counts are non-negative, so 0 is a safe starting maximum.
max_men = 0
max_women = 0
max_people = 0
for item in lst:
    if item[0] == "men" and item[1] > max_men:
        max_men = item[1]
    elif item[0] == "women" and item[1] > max_women:
        max_women = item[1]
    elif item[0] == "people" and item[1] > max_people:
        max_people = item[1]

print(max_men)
print(max_women)
print(max_people)

This visits each inner list in the list named lst and finds the maximum value for men, women, and people.
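
The loop above hard-codes the three categories and starts each maximum at 0. A minimal sketch of the same single pass that works for any set of categories, assuming rows shaped like [category, count, filename]; the function name max_by_category is hypothetical and not part of the original answer:

```python
def max_by_category(rows):
    """Return {category: row with the largest count} in one pass over rows."""
    best = {}
    for row in rows:
        category, count = row[0], row[1]
        # Keep the first row seen for a category, then replace it only
        # when a strictly larger count shows up.
        if category not in best or count > best[category][1]:
            best[category] = row
    return best


# Sample rows copied from the question's data.
rows = [
    ['men', 2, '1945-Truman.txt'],
    ['women', 2, '1945-Truman.txt'],
    ['people', 10, '1945-Truman.txt'],
    ['men', 12, '1946-Truman.txt'],
    ['women', 7, '1946-Truman.txt'],
    ['people', 49, '1946-Truman.txt'],
    ['women', 0, '1953-Eisenhower.txt'],
]

result = max_by_category(rows)
for category, row in result.items():
    print(category, row[1], row[2])
```

Because the dictionary stores whole rows, the file name of each maximum stays available, and no starting value such as 0 has to be chosen.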

Another answer

You can build a set from the first column and then find the maximum for each key:

data = [
    ['men', 2, '1945-Truman.txt'],
    ['women', 2, '1945-Truman.txt'],
    ...
]

keys = {col[0] for col in data}

for k in keys:
    print(k, max(col[1] for col in data if col[0] == k))

Returns (a set has no fixed order, so the lines may appear in a different order):

women 7
people 49
men 12

Another answer

You can use itertools.groupby:

import itertools
new_data = [(a, list(b)) for a, b in itertools.groupby(sorted(data, key=lambda x:x[0]), key=lambda x:x[0])]
new_final_data = [max(b, key=lambda x:x[1]) for a, b in new_data]

Output:

[['men', 12, '1946-Truman.txt'], ['people', 49, '1946-Truman.txt'], ['women', 7, '1946-Truman.txt']]

Or, as a dictionary keyed by the type of person:

new_final_data = {a:max(b, key=lambda x:x[1]) for a, b in new_data}

Output:

{'women': ['women', 7, '1946-Truman.txt'], 'men': ['men', 12, '1946-Truman.txt'], 'people': ['people', 49, '1946-Truman.txt']}
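
The sorted call in the snippet above matters: itertools.groupby only merges consecutive rows with equal keys, so unsorted input yields one group per run rather than one per category. A small sketch of the difference, using a few rows from the question (the variable names are illustrative, not from the original answer):

```python
import itertools
from operator import itemgetter

rows = [
    ['men', 2, '1945-Truman.txt'],
    ['women', 2, '1945-Truman.txt'],
    ['men', 12, '1946-Truman.txt'],
    ['women', 7, '1946-Truman.txt'],
]

by_category = itemgetter(0)

# Without sorting, each run of equal keys becomes its own group,
# so 'men' and 'women' each show up twice here.
unsorted_keys = [key for key, _ in itertools.groupby(rows, key=by_category)]

# After sorting by category, every category forms exactly one group.
sorted_rows = sorted(rows, key=by_category)
maxima = {
    key: max(group, key=itemgetter(1))
    for key, group in itertools.groupby(sorted_rows, key=by_category)
}

print(unsorted_keys)
print(maxima)
```

If the rows are grouped without sorting and then collected into a dictionary, later runs silently overwrite earlier ones, which can hide the true maximum.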

Another answer

You can use pandas; I am assuming data is a list of lists:

import pandas as pd

df = pd.DataFrame(data)

df.loc[df.groupby([0])[1].idxmax()]

        0   1                2
3     men  12  1946-Truman.txt
5  people  49  1946-Truman.txt
4   women   7  1946-Truman.txt

To get the result in the same format as the input:

df.loc[df.groupby([0])[1].idxmax()].values.tolist()

[['men', 12, '1946-Truman.txt'], ['people', 49, '1946-Truman.txt'], ['women', 7, '1946-Truman.txt']]

Another answer

import operator

# Split the rows by category.
men = [t for t in yourlist if t[0] == 'men']
women = [t for t in yourlist if t[0] == 'women']
people = [t for t in yourlist if t[0] == 'people']
# Sort each group by count (column 1), largest first; [0][1] is the maximum count.
sorted(men, key=operator.itemgetter(1), reverse=True)[0][1]
sorted(women, key=operator.itemgetter(1), reverse=True)[0][1]
sorted(people, key=operator.itemgetter(1), reverse=True)[0][1]
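
Sorting each sublist only to read its first element does more work than needed. A possible alternative, assuming the same yourlist structure, is to let max scan the matching rows once; the helper name max_count and the sample rows are hypothetical additions for illustration:

```python
from operator import itemgetter

# A few rows from the question, standing in for the full list.
yourlist = [
    ['men', 2, '1945-Truman.txt'],
    ['women', 2, '1945-Truman.txt'],
    ['people', 10, '1945-Truman.txt'],
    ['men', 12, '1946-Truman.txt'],
    ['women', 7, '1946-Truman.txt'],
    ['people', 49, '1946-Truman.txt'],
]


def max_count(rows, category):
    """Largest count (column 1) among rows whose column 0 equals category."""
    matching = (row for row in rows if row[0] == category)
    # itemgetter(1) compares rows by count; the trailing [1] pulls the
    # count out of the winning row.
    return max(matching, key=itemgetter(1))[1]


for name in ('men', 'women', 'people'):
    print(name, max_count(yourlist, name))
```

Note that max raises ValueError when no row matches the category, so a caller that cannot rule this out may want to handle that case.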
