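1. Introduction to the collections module
The collections module provides additional data types on top of Python's built-in ones:
- namedtuple: creates tuple subclasses whose elements can be accessed by name
- deque: a double-ended queue with fast appends and pops at either end
- Counter: a counter, used mainly for tallying occurrences
- OrderedDict: an ordered dictionary
- defaultdict: a dictionary with default values
2. Usage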
2.1 namedtuple
namedtuple is mainly used to create data objects whose elements can be accessed by name, which usually makes code more readable. For example:
In [1]: from collections import namedtuple
In [2]: websites = [
   ...:     ('google', 'http://www.google.com/', 'search engine'),
   ...:     ('Sina', 'http://www.sina.com.cn/', 'blog'),
   ...:     ('taobao', 'http://www.taobao.com/', 'shopping store')
   ...: ]
In [3]: Website = namedtuple('Website', ['name', 'url', 'remark'])
In [4]: for website in websites:
   ...:     website = Website._make(website)
   ...:     print website
   ...:
Website(name='google', url='http://www.google.com/', remark='search engine')
Website(name='Sina', url='http://www.sina.com.cn/', remark='blog')
Website(name='taobao', url='http://www.taobao.com/', remark='shopping store')
In [5]: p = Website('w3', 'www.w3school.com.cn', 'study')
In [6]: p
Out[6]: Website(name='w3', url='www.w3school.com.cn', remark='study')
In [7]: d = p._asdict()
In [8]: d['name']
Out[8]: 'w3'
In [9]: d['url']
Out[9]: 'www.w3school.com.cn'
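The session above builds instances with _make and converts them with _asdict, but does not show access by field name, which is the main reason to use a namedtuple. The following is a minimal sketch of that, reusing the Website type from above; the variable names (site, updated, and so on) are illustrative only, and it uses print() calls, so it targets Python 3 rather than the Python 2 session.

```python
from collections import namedtuple

# Same field layout as the Website type in the session above.
Website = namedtuple('Website', ['name', 'url', 'remark'])

site = Website('w3', 'www.w3school.com.cn', 'study')

# Fields can be read by name or by position, like any tuple.
name_by_attr = site.name
name_by_index = site[0]

# Instances are immutable; _replace returns a modified copy
# and leaves the original untouched.
updated = site._replace(remark='tutorials')

# _fields lists the field names in declaration order.
field_names = Website._fields

print(name_by_attr)
print(updated)
print(field_names)
```

Because a namedtuple is still a tuple, it can be unpacked and indexed wherever a plain tuple is accepted.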
2.2 deque
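A deque is a double-ended queue: objects can be added or removed at the head of the queue as well as at the tail. Inserting or removing at the front of a list is O(n), while a deque handles both ends in O(1), so it runs faster when the amount of data is large.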
In [1]: from collections import deque
In [2]: p = deque([1, 2, 3, 4])
In [3]: p.append(5)
In [4]: p.appendleft(0)
In [5]: p
Out[5]: deque([0, 1, 2, 3, 4, 5])
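The session above only appends. As a rough sketch of the other half of the API, the example below removes items from both ends, rotates the queue, and uses maxlen to keep a bounded window. The names queue and recent are illustrative and not from the original session.

```python
from collections import deque

queue = deque([1, 2, 3, 4])
queue.append(5)       # add on the right
queue.appendleft(0)   # add on the left

first = queue.popleft()  # remove from the left end
last = queue.pop()       # remove from the right end

# rotate(1) moves the rightmost element to the front.
queue.rotate(1)

# With maxlen set, appending to a full deque silently drops
# the element at the opposite end.
recent = deque(maxlen=3)
for i in range(5):
    recent.append(i)

print(first)
print(last)
print(queue)
print(recent)
```

The maxlen form is a convenient way to keep only the last few items of a stream without trimming by hand.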
2.3 defaultdict
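defaultdict accepts a factory function. When a key that does not exist in the dictionary is requested, the factory is called and its result is stored as that key's default value, which avoids the KeyError a standard dict would raise.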
In [1]: from collections import defaultdict
In [2]: d = defaultdict(lambda: 'None')
In [3]: d[1]
Out[3]: 'None'
In [5]: members = [
   ...:     ['male', 'John'],
   ...:     ['male', 'Jack'],
   ...:     ['female', 'Lily'],
   ...:     ['male', 'Pony'],
   ...:     ['female', 'Lucy'],
   ...: ]
In [6]: result = defaultdict(list)
In [7]: for sex, name in members:
   ...:     result[sex].append(name)
   ...:
In [8]: print result
defaultdict(<type 'list'>, {'male': ['John', 'Jack', 'Pony'], 'female': ['Lily', 'Lucy']})
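To make the benefit concrete, the sketch below groups the same members data with a plain dict and setdefault, then with defaultdict(list), and also counts entries with defaultdict(int). It also shows one behaviour worth knowing: indexing a missing key creates it, while .get does not. The variable names (plain, grouped, counts, probe) are illustrative, and the print() calls assume Python 3.

```python
from collections import defaultdict

members = [
    ['male', 'John'],
    ['male', 'Jack'],
    ['female', 'Lily'],
    ['male', 'Pony'],
    ['female', 'Lucy'],
]

# Plain dict: the default has to be supplied on every access.
plain = {}
for sex, name in members:
    plain.setdefault(sex, []).append(name)

# defaultdict(list): a missing key starts out as an empty list.
grouped = defaultdict(list)
for sex, name in members:
    grouped[sex].append(name)

# defaultdict(int): a missing key starts out as 0, handy for counting.
counts = defaultdict(int)
for sex, _name in members:
    counts[sex] += 1

# Indexing a missing key inserts it; .get() does not.
probe = defaultdict(list)
got = probe.get('unknown')
present_after_get = 'unknown' in probe
created = probe['unknown']
present_after_index = 'unknown' in probe

print(dict(grouped))
print(dict(counts))
```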
2.4 OrderedDict
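Unlike a regular dict, an OrderedDict remembers the order in which keys were inserted and iterates in that order. The session below runs on Python 2, where a regular dict makes no ordering guarantee; since Python 3.7 the built-in dict also preserves insertion order.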
In [1]: from collections import OrderedDict
In [2]: items = (
   ...:     ('A', 1),
   ...:     ('B', 2),
   ...:     ('C', 3),
   ...:     ('D', 4)
   ...: )
In [3]: regular_dict = dict(items)
In [4]: ordered_dict = OrderedDict(items)
In [5]: for k, v in regular_dict.iteritems():
   ...:     print "regular dict %s = %s" % (k, v)
   ...:
regular dict A = 1
regular dict C = 3
regular dict B = 2
regular dict D = 4
In [6]: for k, v in ordered_dict.iteritems():
   ...:     print "ordered dict %s = %s" % (k, v)
   ...:
ordered dict A = 1
ordered dict B = 2
ordered dict C = 3
ordered dict D = 4
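Beyond iteration order, OrderedDict has a few order-aware behaviours that a short example can illustrate: updating an existing key keeps its position, new keys go to the end, equality between two OrderedDicts takes order into account, and popitem(last=False) removes the oldest entry. This is a sketch with illustrative variable names, written with print() for Python 3.

```python
from collections import OrderedDict

items = (('A', 1), ('B', 2), ('C', 3), ('D', 4))
ordered = OrderedDict(items)

keys_in_order = list(ordered.keys())

# Updating an existing key keeps its position;
# a new key is appended at the end.
ordered['B'] = 20
ordered['E'] = 5
keys_after_update = list(ordered.keys())

# Comparing two OrderedDicts is order-sensitive.
same_order = OrderedDict([('x', 1), ('y', 2)]) == OrderedDict([('x', 1), ('y', 2)])
diff_order = OrderedDict([('x', 1), ('y', 2)]) == OrderedDict([('y', 2), ('x', 1)])

# popitem(last=False) removes and returns the oldest entry.
oldest = ordered.popitem(last=False)

print(keys_in_order)
print(keys_after_update)
print(oldest)
```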
2.5 Counter
Counter is a counter; for example, it can be used to count how many times each character appears.
In [1]: from collections import Counter
In [2]: c = Counter()
In [3]: for s in '111345eeabab':
   ...:     c[s] = c[s] + 1
   ...:
In [4]: c
Out[4]: Counter({'1': 3, '3': 1, '4': 1, '5': 1, 'a': 2, 'b': 2, 'e': 2})
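The manual loop above works because a Counter returns 0 for missing keys. As a shorter alternative, a Counter can be built directly from any iterable. The sketch below does that with the same string and also shows most_common and update; the variable names are illustrative, and print() assumes Python 3.

```python
from collections import Counter

# Passing an iterable counts its elements in one step.
c = Counter('111345eeabab')

# most_common(n) returns the n highest counts as (element, count) pairs.
top = c.most_common(1)

# A missing element reads as 0 and is not added to the counter.
missing = c['z']
has_z = 'z' in c

# update() adds counts instead of replacing them.
c.update('aa')
a_count = c['a']

print(top)
print(missing)
print(a_count)
```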