Dictionaries and Sets
Posted neozheng
1. Handling missing keys with setdefault
    import sys
    import re

    WORD_RE = re.compile(r'\w+')

    index = {}
    print(sys.argv)  # show the command-line arguments

    # Example 3-2
    with open(sys.argv[1], encoding='utf-8') as fp:
        for line_no, line in enumerate(fp, 1):
            for match in WORD_RE.finditer(line):
                # finditer yields match objects such as
                # <_sre.SRE_Match object; span=(0, 4), match='User'>.
                # Each one carries both the matched text and its position:
                # match.start() and match.end() give the start and end offsets.
                word = match.group()  # the matched text, e.g. 'User'
                column_no = match.start() + 1
                location = (line_no, column_no)

                # The conventional approach:
                occurrences = index.get(word, [])
                occurrences.append(location)
                index[word] = occurrences

    for word in sorted(index, key=str.upper):  # iterate over the keys in case-insensitive order
        print(word, index[word])

    print("-----------------------")

    # Example 3-4: handling missing keys with setdefault
    index2 = {}
    with open(sys.argv[1], encoding='utf-8') as fp:
        for line_no, line in enumerate(fp, 1):
            for match in WORD_RE.finditer(line):
                word = match.group()
                column_no = match.start() + 1
                location = (line_no, column_no)
                # setdefault: use the existing value if the key is present, otherwise set it.
                # Get the list of occurrences for word, or set it to [] if not found;
                # setdefault returns the value, so it can be updated without requiring a second search.
                index2.setdefault(word, []).append(location)

    for word in sorted(index2, key=str.upper):
        print(word, index2[word])

    # Sample output:
    # flasgger [(3, 6), (4, 6)]
    # flask [(2, 6)]
    # Flask [(2, 19)]
    # from [(2, 1), (3, 1), (4, 1)]
    # import [(1, 1), (2, 12), (3, 15), (4, 21)]
    # jsonify [(2, 26)]
    # random [(1, 8)]
    # request [(2, 35)]
    # Swagger [(3, 22)]
    # swag_from [(4, 28)]
    # utils [(4, 15)]

The result of this line:

    my_dict.setdefault(key, []).append(new_value)

is the same as running:

    if key not in my_dict:
        my_dict[key] = []
    my_dict[key].append(new_value)

except that the latter code performs at least two searches for key (three if it is not found), while setdefault does it all with a single lookup.
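The equivalence between the get-based version and the setdefault version can be checked without an input file. The following is a minimal sketch; the `pairs` list is made-up sample data, not output from the script above.

```python
# Hypothetical sample data: (word, location) pairs, not taken from a real file.
pairs = [("flask", (2, 6)), ("from", (2, 1)), ("from", (3, 1))]

# Version 1: get, append, then assign the list back to the key.
index_get = {}
for word, location in pairs:
    occurrences = index_get.get(word, [])
    occurrences.append(location)
    index_get[word] = occurrences

# Version 2: setdefault returns the stored list, inserting [] first if needed.
index_setdefault = {}
for word, location in pairs:
    index_setdefault.setdefault(word, []).append(location)

print(index_get == index_setdefault)  # True
print(index_setdefault["from"])       # [(2, 1), (3, 1)]
```

Both loops build the same dictionary; the setdefault form simply avoids the separate assignment step.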
2. Mapping with Flexible Key Lookup
2.1 defaultdict: Another Take on Missing Keys
Example code:

    import re
    import sys
    import collections

    WORD_RE = re.compile(r'\w+')

    index = collections.defaultdict(list)
    with open(sys.argv[1], encoding='utf-8') as fp:
        for line_no, line in enumerate(fp, 1):
            for match in WORD_RE.finditer(line):
                word = match.group()
                column_no = match.start() + 1
                location = (line_no, column_no)

                # With defaultdict, a missing key gets a new list automatically:
                index[word].append(location)

    for word in sorted(index, key=str.upper):
        print(word, index[word])

    # Output:
    # flasgger [(3, 6), (4, 6)]
    # flask [(2, 6)]
    # Flask [(2, 19)]
    # from [(2, 1), (3, 1), (4, 1)]
    # import [(1, 1), (2, 12), (3, 15), (4, 21)]
    # jsonify [(2, 26)]
    # random [(1, 8)]
    # request [(2, 35)]
    # Swagger [(3, 22)]
    # swag_from [(4, 28)]
    # utils [(4, 15)]

How defaultdict works:

When instantiating a defaultdict, you provide a callable that is used to produce a default value whenever __getitem__ is passed a nonexistent key argument.

For example, given an empty defaultdict created as dd = defaultdict(list), if 'new_key' is not in dd, the expression dd['new_key'] does the following steps:

1. Calls list() to create a new list.
2. Inserts the list into dd using 'new_key' as key.
3. Returns a reference to that list.

The callable that produces the default values is held in an instance attribute called default_factory. If no default_factory is provided, the usual KeyError is raised for missing keys.

The default_factory of a defaultdict is only invoked to provide default values for __getitem__ calls, and not for the other methods. For example, if dd is a defaultdict, and k is a missing key, dd[k] will call the default_factory to create a default value, but dd.get(k) still returns None.

The mechanism that makes defaultdict work by calling default_factory is actually the __missing__ special method, a feature supported by all standard mapping types.
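The behaviour described above can be observed directly in a short session. This is a minimal sketch using only the standard library; the key names are arbitrary.

```python
from collections import defaultdict

dd = defaultdict(list)

# __getitem__ on a missing key calls default_factory, stores the result, and returns it.
values = dd["new_key"]
values.append(1)
print(dd["new_key"])        # [1]
print(dd.default_factory)   # <class 'list'>

# get() does not trigger default_factory and does not insert the key.
print(dd.get("other_key"))  # None
print("other_key" in dd)    # False

# Without a default_factory, a missing key raises KeyError as usual.
plain = defaultdict()
try:
    plain["missing"]
except KeyError as exc:
    print("KeyError:", exc)
```

Note that `values` and `dd["new_key"]` refer to the same list object, which is why appending to one is visible through the other.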
2.2 The __missing__ Method
Example code:

    """StrKeyDict0 converts nonstring keys to str on lookup."""

    class StrKeyDict0(dict):

        def __missing__(self, key):
            if isinstance(key, str):
                # Without this check, self[str(key)] would call __missing__ again
                # for an absent string key and recurse without end.
                raise KeyError(key)
            return self[str(key)]

        def get(self, key, default=None):
            """
            The get method delegates to __getitem__ by using the self[key] notation;
            that gives the opportunity for our __missing__ to act.
            """
            try:
                return self[key]
            except KeyError:
                return default

        def __contains__(self, key):
            # Do not test `key in self` here (self is a StrKeyDict0 instance, i.e. a dict):
            # `k in d` itself calls __contains__, so that would recurse without end.
            return key in self.keys() or str(key) in self.keys()

    # A better way to create a user-defined mapping type is to subclass collections.UserDict instead of dict.

Underlying the way mappings deal with missing keys is the aptly named __missing__ method. This method is not defined in the base dict class, but dict is aware of it: if you subclass dict and provide a __missing__ method, the standard dict.__getitem__ will call it whenever a key is not found, instead of raising KeyError.

The __missing__ method is only called by __getitem__ (i.e., for the d[k] operator). The presence of a __missing__ method has no effect on the behavior of other methods that look up keys, such as get or __contains__.
Summary: there are three ways to handle keys that are missing from a dictionary:
1. setdefault
2. collections.defaultdict
3. the __missing__ method
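The three approaches can be compared side by side on the same small grouping task. This is only a sketch: the `pairs` data and the `ListDict` class are hypothetical names introduced for illustration, and `ListDict` is one possible way to use __missing__, not the only one.

```python
from collections import defaultdict

# Hypothetical sample data: (key, value) pairs to group by key.
pairs = [("a", 1), ("b", 2), ("a", 3)]

# 1. setdefault on a plain dict
by_setdefault = {}
for key, value in pairs:
    by_setdefault.setdefault(key, []).append(value)

# 2. collections.defaultdict
by_defaultdict = defaultdict(list)
for key, value in pairs:
    by_defaultdict[key].append(value)


# 3. A dict subclass with __missing__ (ListDict is a hypothetical name)
class ListDict(dict):
    def __missing__(self, key):
        # Called only by dict.__getitem__; store and return a fresh list.
        self[key] = value = []
        return value


by_missing = ListDict()
for key, value in pairs:
    by_missing[key].append(value)

print(by_setdefault)                                              # {'a': [1, 3], 'b': [2]}
print(by_setdefault == dict(by_defaultdict) == dict(by_missing))  # True

# get() and `in` do not go through __missing__, so nothing is created here.
print(by_missing.get("z"))  # None
print("z" in by_missing)    # False
```

The last two lines show why StrKeyDict0 in section 2.2 overrides get and __contains__ itself: defining __missing__ alone changes only the d[k] lookup.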