Append items to dictionary Python

Posted: 2015-08-22 02:40:23

Question:

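I am trying to write a function in Python that opens a file and parses it into a dictionary. I want the first item in the list block to be the key for each entry in the dictionary data, and each value to be the rest of the list block, minus the first item. For some reason, when I run the following function, it parses the file incorrectly. The output is shown below. How can I parse it the way I described above? Any help would be greatly appreciated.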

Function:

def parseData() :
    filename="testdata.txt"
    file=open(filename,"r+")

    block=[]
    for line in file:
        block.append(line)
        if line in ('\n', '\r\n'):
            album=block.pop(1)
            data[block[1]]=album
            block=[]
    print data

Input:

Bob Dylan
1966 Blonde on Blonde
-Rainy Day Women #12 & 35
-Pledging My Time
-Visions of Johanna
-One of Us Must Know (Sooner or Later)
-I Want You
-Stuck Inside of Mobile with the Memphis Blues Again
-Leopard-Skin Pill-Box Hat
-Just Like a Woman
-Most Likely You Go Your Way (And I'll Go Mine)
-Temporary Like Achilles
-Absolutely Sweet Marie
-4th Time Around
-Obviously 5 Believers
-Sad Eyed Lady of the Lowlands

Output:

{'-Rainy Day Women #12 & 35\n': '1966 Blonde on Blonde\n',
 '-Whole Lotta Love\n': '1969 II\n', '-In the Evening\n': '1979 In Through the Outdoor\n'}
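For reference, the song-to-album pairs in this output follow from the list indices used in the function. The sketch below replays the two indexing steps on a shortened, hand-built block (the list contents and the name fixed are illustrative assumptions, not part of the original code) and then shows what indexing from 0 would give instead.

```python
# A shortened block, as the question's loop would have built it
# (the blank separator line is appended before the check runs).
block = [
    "Bob Dylan\n",
    "1966 Blonde on Blonde\n",
    "-Rainy Day Women #12 & 35\n",
    "-Pledging My Time\n",
    "\n",
]

data = {}
album = block.pop(1)    # removes and returns the album line (index 1)
data[block[1]] = album  # index 1 is now the first song, not the artist
print(data)

# Assumed intent: the artist (index 0) is the key and everything else,
# except the trailing blank line, is the value.
fixed = {block[0]: [album] + block[1:-1]}
print(fixed)
```

Python lists are zero-indexed, so block[0] is the artist line; after pop(1) removes the album, block[1] refers to the first song.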

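Comments:

Where are you initializing "data"? Add that line to the code. More importantly, you should provide an example of what the output should look like.

Sorry, but I don't understand this: "I am trying to make the first item in the list block the key for each item in the dictionary data."

As Jay said, you should show what the output should look like.

Can you add the expected output as an example?

Do you want to create an artist: album mapping?

Answer 1: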

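You can use groupby to group the data, using the blank lines as delimiters, and a defaultdict to handle repeated keys: after extracting the key (the first element) from each group that groupby returns, extend that key's list with the remaining values.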

from itertools import groupby
from collections import defaultdict
d = defaultdict(list)
with open("file.txt") as f:
    for k, val in groupby(f, lambda x: x.strip() != ""):
        # if k is True we have a section
        if k:
            # get the key "k", which is the first line of each section;
            # v will be the remaining lines (star-unpacking needs Python 3)
            k, *v = val
            # create the key/value pairing or add to the existing one
            d[k].extend(map(str.rstrip, v))
from pprint import pprint as pp
pp(d)

Output:

{'Bob Dylan\n': ['1966 Blonde on Blonde',
                 '-Rainy Day Women #12 & 35',
                 '-Pledging My Time',
                 '-Visions of Johanna',
                 '-One of Us Must Know (Sooner or Later)',
                 '-I Want You',
                 '-Stuck Inside of Mobile with the Memphis Blues Again',
                 '-Leopard-Skin Pill-Box Hat',
                 '-Just Like a Woman',
                 "-Most Likely You Go Your Way (And I'll Go Mine)",
                 '-Temporary Like Achilles',
                 '-Absolutely Sweet Marie',
                 '-4th Time Around',
                 '-Obviously 5 Believers',
                 '-Sad Eyed Lady of the Lowlands'],
 'Led Zeppelin\n': ['1979 In Through the Outdoor',
                    '-In the Evening',
                    '-South Bound Saurez',
                    '-Fool in the Rain',
                    '-Hot Dog',
                    '-Carouselambra',
                    '-All My Love',
                    "-I'm Gonna Crawl",
                    '1969 II',
                    '-Whole Lotta Love',
                    '-What Is and What Should Never Be',
                    '-The Lemon Song',
                    '-Thank You',
                    '-Heartbreaker',
                    "-Living Loving Maid (She's Just a Woman)",
                    '-Ramble On',
                    '-Moby Dick',
                    '-Bring It on Home']}
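To see the grouping step in isolation, here is a minimal sketch that runs groupby over an in-memory list instead of a file. The list lines and the name sections are stand-ins chosen for illustration; the key function is the same one used above.

```python
from itertools import groupby

# Hypothetical in-memory stand-in for the lines of the input file.
lines = [
    "Bob Dylan\n",
    "1966 Blonde on Blonde\n",
    "-Rainy Day Women #12 & 35\n",
    "\n",
    "Led Zeppelin\n",
    "1969 II\n",
    "-Whole Lotta Love\n",
]

# The key function is True for non-blank lines and False for blank ones,
# so each run of consecutive non-blank lines forms one group.
sections = [
    [line.rstrip() for line in group]
    for is_section, group in groupby(lines, lambda x: x.strip() != "")
    if is_section
]

for section in sections:
    print(section)
```

Each printed list is one section: the artist first, then the album, then the songs. Note that groupby only merges consecutive items with the same key, which is why the blank lines act as separators here.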

For Python 2, the unpacking syntax is slightly different:

with open("file.txt") as f:
    for k, val in groupby(f, lambda x: x.strip() != ""):
        if k:
            k, v = next(val), val
            d[k].extend(map(str.rstrip, v))

If you want to keep the newlines, remove the map(str.rstrip, ...) call.

If you want the albums and songs stored separately for each artist:

from itertools import groupby
from collections import defaultdict

d = defaultdict(lambda: defaultdict(list))
with open("file.txt") as f:
    for k, val in groupby(f, lambda x: x.strip() != ""):
        if k:
            k, alb, songs = next(val),next(val), val
            d[k.rstrip()][alb.rstrip()] = list(map(str.rstrip, songs))

from pprint import pprint as pp

pp(d)



{'Bob Dylan': {'1966 Blonde on Blonde': ['-Rainy Day Women #12 & 35',
                                         '-Pledging My Time',
                                         '-Visions of Johanna',
                                         '-One of Us Must Know (Sooner or '
                                         'Later)',
                                         '-I Want You',
                                         '-Stuck Inside of Mobile with the '
                                         'Memphis Blues Again',
                                         '-Leopard-Skin Pill-Box Hat',
                                         '-Just Like a Woman',
                                         '-Most Likely You Go Your Way '
                                         "(And I'll Go Mine)",
                                         '-Temporary Like Achilles',
                                         '-Absolutely Sweet Marie',
                                         '-4th Time Around',
                                         '-Obviously 5 Believers',
                                         '-Sad Eyed Lady of the Lowlands']},
 'Led Zeppelin': {'1969 II': ['-Whole Lotta Love',
                              '-What Is and What Should Never Be',
                              '-The Lemon Song',
                              '-Thank You',
                              '-Heartbreaker',
                              "-Living Loving Maid (She's Just a Woman)",
                              '-Ramble On',
                              '-Moby Dick',
                              '-Bring It on Home'],
                  '1979 In Through the Outdoor': ['-In the Evening',
                                                  '-South Bound Saurez',
                                                  '-Fool in the Rain',
                                                  '-Hot Dog',
                                                  '-Carouselambra',
                                                  '-All My Love',
                                                  "-I'm Gonna Crawl"]}}
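One possible variation on the nested version, sketched here under a few assumptions: every line of a section is stripped up front, and sections with fewer than two lines (no album) are skipped as incomplete. The function name parse_albums, the sample text, and the "Orphan Line" record are made up for illustration, and the star-unpacking requires Python 3.

```python
from collections import defaultdict
from io import StringIO
from itertools import groupby


def parse_albums(lines):
    """Return {artist: {album: [songs]}}, skipping incomplete sections."""
    result = defaultdict(dict)
    for is_section, group in groupby(lines, lambda x: x.strip() != ""):
        if not is_section:
            continue
        parts = [line.strip() for line in group]
        # A complete record needs at least an artist and an album line.
        if len(parts) > 1:
            artist, album, *songs = parts
            result[artist][album] = songs
    return dict(result)


# Hypothetical sample; StringIO yields lines the same way a file does.
sample = StringIO(
    "Bob Dylan\n1966 Blonde on Blonde\n-I Want You\n"
    "\n"
    "Orphan Line\n"
    "\n"
    "Led Zeppelin\n1969 II\n-Ramble On\n-Moby Dick\n"
)
albums = parse_albums(sample)
print(albums)
```

Stripping first means neither the artist nor the album key keeps a trailing newline, and the length check quietly drops the one-line "Orphan Line" section instead of raising an error.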

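Comments:

Shouldn't that be: d[k].append(list(map(str.rstrip, v)))? Otherwise, you would have to re-parse the list to find all the albums. And why do the keys have trailing newlines? A defaultdict(dict) might be better, and then the for-loop block becomes simply: i = list(map(str.rstrip, val)); if len(i) > 1: d[i[0]][i[1]] = i[2:]

@ekhumoro, the second part of the answer adds the albums and songs as separate elements. I used extend in the first part because I did not initially see the relationship between the data. The newline is there because I forgot to rstrip, which the OP can add if needed. I also don't see if len(i) > 1: d[i[0]][i[1]] = i[2:] as any more readable than what I used, or why I would use list(map(str.rstrip, val)) unless I actually knew there was something to add, which is what if k does.

Creating a list means you can strip all the parts and check its length (which guards against incomplete records). I don't want to make too much of all this, though. I was just suggesting some improvements that I thought the OP might want.

Is there a way to remove the trailing newline in the artist and album names? Sorry, I forgot to specify that.

@user4959809, I updated the second part of the answer to strip.

Answer 2: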

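I guess this is what you want?

Even if this is not the format you wanted, there are a few things you might learn from the answer:

- Use with for file handling.
- Nice to have:
  - PEP8-compliant code, see http://pep8online.com/
  - a shebang
  - numpydoc
  - if __name__ == '__main__'

And SE does not like a list being followed directly by code...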

#!/usr/bin/env python

"""Parse text files with songs, grouped by album and artist."""


def add_to_data(data, block):
    """
    Parameters
    ----------
    data : dict
    block : list

    Returns
    -------
    dict
    """
    artist = block[0]
    album = block[1]
    songs = block[2:]
    if artist in data:
        data[artist][album] = songs
    else:
        data[artist] = {album: songs}
    return data


def parseData(filename='testdata.txt'):
    """
    Parameters
    ----------
    filename : string
        Path to a text file.

    Returns
    -------
    dict
    """
    data = {}
    with open(filename) as f:
        block = []
        for line in f:
            line = line.strip()
            if line == '':
                data = add_to_data(data, block)
                block = []
            else:
                block.append(line)
        data = add_to_data(data, block)
    return data

if __name__ == '__main__':
    data = parseData()
    import pprint
    pp = pprint.PrettyPrinter(indent=4)
    pp.pprint(data)

gives:

{   'Bob Dylan': {   '1966 Blonde on Blonde': [   '-Rainy Day Women #12 & 35',
                                                  '-Pledging My Time',
                                                  '-Visions of Johanna',
                                                  '-One of Us Must Know (Sooner or Later)',
                                                  '-I Want You',
                                                  '-Stuck Inside of Mobile with the Memphis Blues Again',
                                                  '-Leopard-Skin Pill-Box Hat',
                                                  '-Just Like a Woman',
                                                  "-Most Likely You Go Your Way (And I'll Go Mine)",
                                                  '-Temporary Like Achilles',
                                                  '-Absolutely Sweet Marie',
                                                  '-4th Time Around',
                                                  '-Obviously 5 Believers',
                                                  '-Sad Eyed Lady of the Lowlands']},
    'Led Zeppelin': {   '1969 II': [   '-Whole Lotta Love',
                                       '-What Is and What Should Never Be',
                                       '-The Lemon Song',
                                       '-Thank You',
                                       '-Heartbreaker',
                                       "-Living Loving Maid (She's Just a Woman)",
                                       '-Ramble On',
                                       '-Moby Dick',
                                       '-Bring It on Home'],
                        '1979 In Through the Outdoor': [   '-In the Evening',
                                                           '-South Bound Saurez',
                                                           '-Fool in the Rain',
                                                           '-Hot Dog',
                                                           '-Carouselambra',
                                                           '-All My Love',
                                                           "-I'm Gonna Crawl"]}}
