Append items to dictionary Python
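Posted: 2015-08-22 02:40:23
Question: I am trying to write a function in Python that opens a file and parses it into a dictionary. I want the first item in the list `block` to be the key for each entry in the dictionary `data`. Each value should then be the rest of the list `block`, minus the first item. For some reason, when I run the following function, it parses the file incorrectly. I have provided the output below. How can I parse it the way I described above? Any help would be greatly appreciated.
Function: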
def parseData() :
    filename="testdata.txt"
    file=open(filename,"r+")
    block=[]
    for line in file:
        block.append(line)
        if line in ('\n', '\r\n'):
            album=block.pop(1)
            data[block[1]]=album
            block=[]
    print data
Input:
Bob Dylan
1966 Blonde on Blonde
-Rainy Day Women #12 & 35
-Pledging My Time
-Visions of Johanna
-One of Us Must Know (Sooner or Later)
-I Want You
-Stuck Inside of Mobile with the Memphis Blues Again
-Leopard-Skin Pill-Box Hat
-Just Like a Woman
-Most Likely You Go Your Way (And I'll Go Mine)
-Temporary Like Achilles
-Absolutely Sweet Marie
-4th Time Around
-Obviously 5 Believers
-Sad Eyed Lady of the Lowlands
Output:
{'-Rainy Day Women #12 & 35\n': '1966 Blonde on Blonde\n',
 '-Whole Lotta Love\n': '1969 II\n', '-In the Evening\n': '1979 In Through the Outdoor\n'}
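Comments:
Where do you initialize `data`? Add that line to the code.
More importantly, you should provide an example of what the output should look like.
Sorry, but I don't understand this: "I am trying to make the first item in the list block the key for each item in the dictionary data."
Like Jay said, you should show what the output should look like.
Can you add the expected output as an example?
Do you want to create an artist: album-like list?

Answer 1:
You can use `groupby` to group the data, using the empty lines as delimiters, and a `defaultdict` to handle repeated keys. After extracting the key (the first element of each group), extend that key's list with the remaining values of the group returned by `groupby`.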
from itertools import groupby
from collections import defaultdict

d = defaultdict(list)
with open("file.txt") as f:
    for k, val in groupby(f, lambda x: x.strip() != ""):
        # if k is True we have a section
        if k:
            # get key "k" which is the first line
            # from each section, val will be the remaining lines
            k,*v = val
            # add or add to the existing key/value pairing
            d[k].extend(map(str.rstrip,v))

from pprint import pprint as pp
pp(d)
Output:
{'Bob Dylan\n': ['1966 Blonde on Blonde',
                 '-Rainy Day Women #12 & 35',
                 '-Pledging My Time',
                 '-Visions of Johanna',
                 '-One of Us Must Know (Sooner or Later)',
                 '-I Want You',
                 '-Stuck Inside of Mobile with the Memphis Blues Again',
                 '-Leopard-Skin Pill-Box Hat',
                 '-Just Like a Woman',
                 "-Most Likely You Go Your Way (And I'll Go Mine)",
                 '-Temporary Like Achilles',
                 '-Absolutely Sweet Marie',
                 '-4th Time Around',
                 '-Obviously 5 Believers',
                 '-Sad Eyed Lady of the Lowlands'],
 'Led Zeppelin\n': ['1979 In Through the Outdoor',
                    '-In the Evening',
                    '-South Bound Saurez',
                    '-Fool in the Rain',
                    '-Hot Dog',
                    '-Carouselambra',
                    '-All My Love',
                    "-I'm Gonna Crawl",
                    '1969 II',
                    '-Whole Lotta Love',
                    '-What Is and What Should Never Be',
                    '-The Lemon Song',
                    '-Thank You',
                    '-Heartbreaker',
                    "-Living Loving Maid (She's Just a Woman)",
                    '-Ramble On',
                    '-Moby Dick',
                    '-Bring It on Home']}
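The grouping step can also be tried without a file. The following is a minimal sketch, not the answer's exact code: `SAMPLE_LINES` and `group_by_first_line` are hypothetical names introduced here, the data is a shortened stand-in for the input file, and this variant strips the key as well, so the trailing newline seen in the keys above does not appear.

```python
from collections import defaultdict
from itertools import groupby

# Hypothetical sample data standing in for the lines of the input file.
SAMPLE_LINES = [
    "Bob Dylan\n",
    "1966 Blonde on Blonde\n",
    "-Rainy Day Women #12 & 35\n",
    "\n",
    "Led Zeppelin\n",
    "1979 In Through the Outdoor\n",
    "-In the Evening\n",
    "\n",
    "Led Zeppelin\n",
    "1969 II\n",
    "-Whole Lotta Love\n",
]


def group_by_first_line(lines):
    """Map the first line of each blank-line-separated section to the rest."""
    result = defaultdict(list)
    # groupby yields runs of consecutive lines that share the same key:
    # True for non-blank lines (a section), False for blank separators.
    for is_section, section in groupby(lines, lambda x: x.strip() != ""):
        if is_section:
            # Strip every line, including the key, so no "\n" survives.
            key, *rest = map(str.strip, section)
            result[key].extend(rest)
    return dict(result)


flat = group_by_first_line(SAMPLE_LINES)
print(flat)
```

Because both Led Zeppelin sections share one key, their albums and songs end up in a single flat list, which is the limitation the nested version later in this answer addresses.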
For Python 2, the unpacking syntax is slightly different:
with open("file.txt") as f:
    for k, val in groupby(f, lambda x: x.strip() != ""):
        if k:
            k, v = next(val), val
            d[k].extend(map(str.rstrip, v))
If you want to keep the newlines, remove the map(str.rstrip..
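The two ways of splitting off the first line can be compared side by side. This is a small sketch with hypothetical helper names (`split_head_star`, `split_head_next`); it only illustrates that the starred assignment and the `next()` form pick the same head and tail.

```python
def split_head_star(lines):
    # Python 3 only: the starred target collects the remaining items in a list.
    head, *rest = iter(lines)
    return head, rest


def split_head_next(lines):
    # Works on Python 2 and 3: consume the first item, then take what is left.
    it = iter(lines)
    head = next(it)
    return head, list(it)


section = ["Led Zeppelin\n", "1969 II\n", "-Whole Lotta Love\n"]
print(split_head_star(section) == split_head_next(section))
```

One difference worth keeping in mind: the starred form builds a list immediately, while `next(val), val` leaves the rest as a lazy iterator that can be consumed only once.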
If you want the albums and songs stored separately for each artist:
from itertools import groupby
from collections import defaultdict

d = defaultdict(lambda: defaultdict(list))
with open("file.txt") as f:
    for k, val in groupby(f, lambda x: x.strip() != ""):
        if k:
            k, alb, songs = next(val),next(val), val
            d[k.rstrip()][alb.rstrip()] = list(map(str.rstrip, songs))

from pprint import pprint as pp
pp(d)
{'Bob Dylan': {'1966 Blonde on Blonde': ['-Rainy Day Women #12 & 35',
                                         '-Pledging My Time',
                                         '-Visions of Johanna',
                                         '-One of Us Must Know (Sooner or '
                                         'Later)',
                                         '-I Want You',
                                         '-Stuck Inside of Mobile with the '
                                         'Memphis Blues Again',
                                         '-Leopard-Skin Pill-Box Hat',
                                         '-Just Like a Woman',
                                         '-Most Likely You Go Your Way '
                                         "(And I'll Go Mine)",
                                         '-Temporary Like Achilles',
                                         '-Absolutely Sweet Marie',
                                         '-4th Time Around',
                                         '-Obviously 5 Believers',
                                         '-Sad Eyed Lady of the Lowlands']},
 'Led Zeppelin': {'1969 II': ['-Whole Lotta Love',
                              '-What Is and What Should Never Be',
                              '-The Lemon Song',
                              '-Thank You',
                              '-Heartbreaker',
                              "-Living Loving Maid (She's Just a Woman)",
                              '-Ramble On',
                              '-Moby Dick',
                              '-Bring It on Home'],
                  '1979 In Through the Outdoor': ['-In the Evening',
                                                  '-South Bound Saurez',
                                                  '-Fool in the Rain',
                                                  '-Hot Dog',
                                                  '-Carouselambra',
                                                  '-All My Love',
                                                  "-I'm Gonna Crawl"]}}
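The nested artist/album layout can be exercised on in-memory text as well. The sketch below is an illustration under stated assumptions rather than the answer's code: `SAMPLE_TEXT` and `parse_albums` are hypothetical names, plain dicts with `setdefault` replace the nested `defaultdict`, and sections with fewer than two lines are skipped as incomplete. With the `next(val), next(val)` form, a one-line section would raise `StopIteration` instead.

```python
import io
from itertools import groupby

# Hypothetical sample text; the last section is deliberately incomplete.
SAMPLE_TEXT = """Led Zeppelin
1979 In Through the Outdoor
-In the Evening
-South Bound Saurez

Led Zeppelin
1969 II
-Whole Lotta Love

Bob Dylan
"""


def parse_albums(stream):
    """Return {artist: {album: [songs]}} from blank-line-separated sections."""
    artists = {}
    for is_section, section in groupby(stream, lambda x: x.strip() != ""):
        if not is_section:
            continue  # a run of blank separator lines
        lines = [line.strip() for line in section]
        if len(lines) < 2:
            # Assumption: a section without an album line is incomplete.
            continue
        artist, album, songs = lines[0], lines[1], lines[2:]
        artists.setdefault(artist, {})[album] = songs
    return artists


albums = parse_albums(io.StringIO(SAMPLE_TEXT))
print(albums)
```

Using plain dicts means a lookup of an unknown artist raises `KeyError` instead of silently creating an empty entry, which may or may not be what a caller wants.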
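Comments:
Shouldn't it be `d[k].append(list(map(str.rstrip, v)))`? Otherwise you have to re-parse the list to find all the albums. And why do the keys have trailing newlines? A `defaultdict(dict)` might be better; the body of the for loop then becomes simply: `i = list(map(str.rstrip, val)); if len(i) > 1: d[i[0]][i[1]] = i[2:]`.
@ekhumoro, the second part of the answer adds the albums and songs as separate elements. I used extend in the first part because I did not initially see the relationship in the data. The newline is there because I forgot to rstrip; the OP can add that if needed. I also don't think `if len(i) > 1: d[i[0]][i[1]] = i[2:]` is any more readable than what I used, nor do I see why I would use list(map(str.rstrip, val)) unless I actually knew it added something to what `if k` already does.
Creating a list means you can strip all the parts and check its length (which guards against incomplete records). I don't want to make too much of all this, though; I am just suggesting some improvements that I think the OP might want.
Is there a way to remove the trailing newlines from the artist and album names? Sorry, I forgot to specify that.
@user4959809, I updated the second part of the answer to strip them.

Answer 2: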
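I guess this is what you want?
Even if this is not the format you wanted, there are a few things you can learn from the answer:
Use `with` for handling files
Nice to have:
PEP8-compliant code, see http://pep8online.com/
a shebang
numpydoc
if __name__ == '__main__'
And SE doesn't like a list being followed directly by code...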
#!/usr/bin/env python
"""Parse text files with songs, grouped by album and artist."""


def add_to_data(data, block):
    """
    Add one block (artist, album, songs...) to data.

    Parameters
    ----------
    data : dict
    block : list

    Returns
    -------
    dict
    """
    artist = block[0]
    album = block[1]
    songs = block[2:]
    if artist in data:
        data[artist][album] = songs
    else:
        data[artist] = {album: songs}
    return data


def parseData(filename='testdata.txt'):
    """
    Parameters
    ----------
    filename : string
        Path to a text file.

    Returns
    -------
    dict
    """
    data = {}
    with open(filename) as f:
        block = []
        for line in f:
            line = line.strip()
            if line == '':
                # Skip empty blocks caused by consecutive blank lines.
                if block:
                    data = add_to_data(data, block)
                block = []
            else:
                block.append(line)
    # Add the last block, unless the file ended with a blank line.
    if block:
        data = add_to_data(data, block)
    return data


if __name__ == '__main__':
    data = parseData()
    import pprint
    pp = pprint.PrettyPrinter(indent=4)
    pp.pprint(data)
Gives:
{   'Bob Dylan': {   '1966 Blonde on Blonde': [   '-Rainy Day Women #12 & 35',
                                                  '-Pledging My Time',
                                                  '-Visions of Johanna',
                                                  '-One of Us Must Know (Sooner or Later)',
                                                  '-I Want You',
                                                  '-Stuck Inside of Mobile with the Memphis Blues Again',
                                                  '-Leopard-Skin Pill-Box Hat',
                                                  '-Just Like a Woman',
                                                  "-Most Likely You Go Your Way (And I'll Go Mine)",
                                                  '-Temporary Like Achilles',
                                                  '-Absolutely Sweet Marie',
                                                  '-4th Time Around',
                                                  '-Obviously 5 Believers',
                                                  '-Sad Eyed Lady of the Lowlands']},
    'Led Zeppelin': {   '1969 II': [   '-Whole Lotta Love',
                                       '-What Is and What Should Never Be',
                                       '-The Lemon Song',
                                       '-Thank You',
                                       '-Heartbreaker',
                                       "-Living Loving Maid (She's Just a Woman)",
                                       '-Ramble On',
                                       '-Moby Dick',
                                       '-Bring It on Home'],
                        '1979 In Through the Outdoor': [   '-In the Evening',
                                                           '-South Bound Saurez',
                                                           '-Fool in the Rain',
                                                           '-Hot Dog',
                                                           '-Carouselambra',
                                                           '-All My Love',
                                                           "-I'm Gonna Crawl"]}}
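For comparison, the index arithmetic in the question's function can be traced on one block to see why a song title ended up as the key. This is a sketch on a hypothetical four-line block (`block`, `buggy` and the other names are introduced here for illustration); it mirrors only the `pop(1)` and `[1]` steps from the question, not the file handling.

```python
# One block as the question's loop has collected it when the blank line arrives.
block = [
    "Bob Dylan\n",
    "1966 Blonde on Blonde\n",
    "-Rainy Day Women #12 & 35\n",
    "\n",
]

# What the question's code does with it:
buggy = list(block)
buggy_value = buggy.pop(1)  # removes the album line; later items shift left
buggy_key = buggy[1]        # index 1 now holds the first song, not the artist

# What was intended: index 0 is the artist, everything after it is the value.
# The slice stops before the trailing blank separator line.
artist = block[0].strip()
rest = [line.strip() for line in block[1:-1]]

print(repr(buggy_key), repr(buggy_value))
print(artist, rest)
```

The buggy key/value pair matches the first entry of the output shown in the question, which suggests the parsing loop itself was fine and only the indices were off.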