The Python collections package

Author: 仁暮 | Published 2017-10-15 21:52


defaultdict
Counter
deque
namedtuple
OrderedDict

defaultdict

With Python's built-in dict, accessing d[key] raises a KeyError when the given key does not exist.

from collections import defaultdict

members = [
    # sex, name
    ['male', 'John'],
    ['male', 'Jack'],
    ['female', 'Lily'],
    ['male', 'Pony'],
    ['female', 'Lucy'],
]

result = defaultdict(list)
for sex, name in members:
    result[sex].append(name)  # works even when the key does not exist yet

print(result)

# Result:
defaultdict(<class 'list'>, {'male': ['John', 'Jack', 'Pony'], 'female': ['Lily', 'Lucy']})
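
For comparison, here is a minimal sketch of the same grouping done with a plain dict. The `pairs` data is a shortened, made-up version of the members list above; `setdefault()` is the usual plain-dict workaround.

```python
from collections import defaultdict

# Hypothetical sample data, shortened from the members list above.
pairs = [('male', 'John'), ('female', 'Lily'), ('male', 'Jack')]

# Plain dict: d[key] on a missing key raises KeyError.
plain = {}
try:
    plain['male'].append('John')
except KeyError:
    key_error_raised = True
else:
    key_error_raised = False

# Plain dict workaround: setdefault() creates the list on first use.
grouped_plain = {}
for sex, name in pairs:
    grouped_plain.setdefault(sex, []).append(name)

# defaultdict: the missing list is created by the factory automatically.
grouped_dd = defaultdict(list)
for sex, name in pairs:
    grouped_dd[sex].append(name)

print(key_error_raised)             # True
print(grouped_plain == grouped_dd)  # True
```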

If you want a default value returned when the key does not exist, you can use defaultdict.

>>> from collections import defaultdict
>>> dd = defaultdict(lambda: 'N/A')
>>> dd['key1'] = 'abc'
>>> dd['key1']  # key1 exists
'abc'
>>> dd['key2']  # key2 does not exist, so the default value is returned
'N/A'

Note that the default value is produced by calling a function, and that function is passed in when the defaultdict object is created.
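
A small sketch of when that function actually runs may help. The `make_default` factory and the `calls` list are hypothetical names used only for illustration. It also shows a side effect that is easy to miss: looking up a missing key with `dd[key]` stores the key, while `get()` and `in` do not.

```python
from collections import defaultdict

calls = []  # records each time the factory is invoked

def make_default():
    # Hypothetical factory, used only for this illustration.
    calls.append(1)
    return 'N/A'

dd = defaultdict(make_default)
dd['key1'] = 'abc'

print(dd['key2'])      # N/A   (factory called; 'key2' is now stored)
print('key2' in dd)    # True
print(dd.get('key3'))  # None  (get() does not call the factory)
print('key3' in dd)    # False
print(len(calls))      # 1
```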

Counter

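Counter is a simple counter, for example for counting how many times each character occurs.

Creating a Counter

    >>> from collections import Counter
    >>> c = Counter()  # create an empty Counter
    >>> c = Counter('gallahad')  # from an iterable (list, tuple, string, etc.)
    >>> c = Counter({'a': 4, 'b': 2})  # from a dict mapping elements to counts
    >>> c = Counter(a=4, b=2)  # from keyword arguments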

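Accessing counts and missing keys

When the requested key does not exist, a Counter returns 0 instead of raising KeyError; otherwise it returns the key's count.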

>>> c = Counter("abcdefgab")
>>> c["a"]
2
>>> c["c"]
1
>>> c["h"]
0
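
As a small sketch of the behaviour just described: unlike defaultdict, looking up a missing key in a Counter returns 0 without storing the key.

```python
from collections import Counter

c = Counter("abcdefgab")
missing = c["h"]  # missing key: returns 0, no KeyError

print(missing)    # 0
print("h" in c)   # False (the lookup did not insert the key)
print(len(c))     # 7
```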

Updating a counter (update and subtract)

Use the update() method to add counts

>>> c = Counter('which')
>>> c.update('witch')  # update from another iterable
>>> c['h']
3
>>> d = Counter('watch')
>>> c.update(d)  # update from another Counter
>>> c['h']
4

Use the subtract() method to remove counts

>>> c = Counter('which')
>>> c.subtract('witch')  # subtract the counts of another iterable
>>> c['h']
1
>>> d = Counter('watch')
>>> c.subtract(d)  # subtract the counts of another Counter
>>> c['a']
-1

Deleting keys
A count of 0 does not mean the element has been removed; to remove an element, use del.

>>> c = Counter("abcdcba")
>>> c
Counter({'a': 2, 'b': 2, 'c': 2, 'd': 1})
>>> c["b"] = 0
>>> c
Counter({'a': 2, 'c': 2, 'd': 1, 'b': 0})
>>> del c["a"]
>>> c
Counter({'c': 2, 'd': 1, 'b': 0})

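elements()
Returns an iterator in which each element is repeated as many times as its count. Elements are yielded in the order they were first encountered (Python 3.7+; arbitrary on older versions), and elements whose count is less than 1 are skipped.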

>>> c = Counter(a=4, b=2, c=0, d=-2)
>>> list(c.elements())
['a', 'a', 'a', 'a', 'b', 'b']

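most_common([n])
Returns a top-N list of the n most common elements and their counts. If n is not specified, all elements are returned. Elements with equal counts keep the order in which they were first encountered (Python 3.7+; on older versions their order is arbitrary).

>>> c = Counter('abracadabra')
>>> c.most_common()
[('a', 5), ('b', 2), ('r', 2), ('c', 1), ('d', 1)]
>>> c.most_common(3)
[('a', 5), ('b', 2), ('r', 2)]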

Shallow copy: copy()

>>> c = Counter("abcdcba")
>>> c
Counter({'a': 2, 'b': 2, 'c': 2, 'd': 1})
>>> d = c.copy()
>>> d
Counter({'a': 2, 'b': 2, 'c': 2, 'd': 1})

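Arithmetic and set operations
The +, -, & and | operators also work on Counters. & and | return, for each element, the minimum and the maximum of the two Counters' counts respectively. Note that the resulting Counter drops every element whose count is less than 1.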

>>> c = Counter(a=3, b=1)
>>> d = Counter(a=1, b=2)
>>> c + d  # c[x] + d[x]
Counter({'a': 4, 'b': 3})
>>> c - d  # subtract (keeps only elements with positive counts)
Counter({'a': 2})
>>> c & d  # intersection:  min(c[x], d[x])
Counter({'a': 1, 'b': 1})
>>> c | d  # union:  max(c[x], d[x])
Counter({'a': 3, 'b': 2})
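
The difference between the `-` operator and the subtract() method is easy to overlook, so here is a minimal sketch using the same two Counters. The names `diff_op` and `in_place` are just for illustration. The operator returns a new Counter without non-positive counts, while subtract() works in place and keeps them.

```python
from collections import Counter

c = Counter(a=3, b=1)
d = Counter(a=1, b=2)

diff_op = c - d        # new Counter; b would be -1, so it is dropped

in_place = Counter(c)  # copy, so that c itself is left untouched
in_place.subtract(d)   # in place; the negative count for b is kept

print(diff_op)    # Counter({'a': 2})
print(in_place)   # Counter({'a': 2, 'b': -1})
print(+in_place)  # Counter({'a': 2})  (unary + strips non-positive counts)
```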

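Common operations

sum(c.values())  # total of all counts
c.clear()  # reset the Counter by removing all of its entries; the object itself is not deleted
list(c)  # list of the keys in c
set(c)  # set of the keys in c
dict(c)  # plain dict of the key/count pairs in c
c.items()  # the (elem, cnt) pairs (a view in Python 3; wrap it in list() to get a list)
Counter(dict(list_of_pairs))  # build a Counter from a list of (elem, cnt) pairs
c.most_common()[:-n:-1]  # the n-1 least common elements (use [:-n-1:-1] for n of them)
c += Counter()  # remove zero and negative counts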
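
The one-liners above are fragments, so here is a runnable sketch of a few of them. The `list_of_pairs` data and the choice of `n = 2` are made up for the example.

```python
from collections import Counter

# Hypothetical sample data in (elem, cnt) form.
list_of_pairs = [('a', 3), ('b', 1), ('c', 0), ('d', -2)]
c = Counter(dict(list_of_pairs))

total = sum(c.values())                      # 3 + 1 + 0 - 2 = 2
n = 2
n_minus_1_least = c.most_common()[:-n:-1]    # the n-1 least common
n_least = c.most_common()[:-n - 1:-1]        # the n least common

c += Counter()                               # drop zero and negative counts
remaining = sorted(c)

print(total)            # 2
print(n_minus_1_least)  # [('d', -2)]
print(n_least)          # [('d', -2), ('c', 0)]
print(remaining)        # ['a', 'b']
```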

deque

deque provides a double-ended queue: you can add or remove elements at either the head or the tail. To use it, first import the deque class from collections:

from collections import deque

You can create a deque object

d = deque()

It is used much like a Python list and provides similar methods

d = deque()
d.append('1')
d.append('2')
d.append('3')

print(len(d))

## Output: 3

print(d[0])

## Output: 1

print(d[-1])

## Output: 3

You can take items off either end

d = deque(range(5))
print(len(d))

## Output: 5

d.popleft()

## Returns: 0

d.pop()

## Returns: 4

print(d)

## Output: deque([1, 2, 3])

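We can also limit the size of the deque. Once the limit you set is exceeded, items are pushed out (popped) from the opposite end of the queue.
An example explains it best: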

d = deque(maxlen=30)

Now, once the deque holds 30 items, each new item appended on the right causes the leftmost item to be dropped from the queue.
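
A minimal sketch of this behaviour, using maxlen=3 instead of 30 only to keep the output short:

```python
from collections import deque

d = deque(maxlen=3)     # a small limit chosen for illustration
for i in range(5):
    d.append(i)         # appending on the right drops items from the left
after_append = list(d)

d.appendleft(9)         # appending on the left drops from the right instead
after_appendleft = list(d)

print(after_append)      # [2, 3, 4]
print(after_appendleft)  # [9, 2, 3]
print(d.maxlen)          # 3
```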
You can also extend the queue with several items at either end:

d = deque([1,2,3,4,5])
d.extendleft([0])
d.extend([6,7,8])
print(d)

## Output: deque([0, 1, 2, 3, 4, 5, 6, 7, 8])

Method reference (stubs carrying the official docstrings)

def append(self, *args, **kwargs): # real signature unknown
    """ Add an element to the right side of the deque. """
    pass

def appendleft(self, *args, **kwargs): # real signature unknown
    """ Add an element to the left side of the deque. """
    pass

def clear(self, *args, **kwargs): # real signature unknown
    """ Remove all elements from the deque. """
    pass

def copy(self, *args, **kwargs): # real signature unknown
    """ Return a shallow copy of a deque. """
    pass

def count(self, value): # real signature unknown; restored from __doc__
    """ D.count(value) -> integer -- return number of occurrences of value """
    return 0

def extend(self, *args, **kwargs): # real signature unknown
    """ Extend the right side of the deque with elements from the iterable """
    pass

def extendleft(self, *args, **kwargs): # real signature unknown
    """ Extend the left side of the deque with elements from the iterable """
    pass

def index(self, value, start=None, stop=None): # real signature unknown; restored from __doc__
    """
    D.index(value, [start, [stop]]) -> integer -- return first index of value.
    Raises ValueError if the value is not present.
    """
    return 0

def insert(self, index, p_object): # real signature unknown; restored from __doc__
    """ D.insert(index, object) -- insert object before index """
    pass

def pop(self, *args, **kwargs): # real signature unknown
    """ Remove and return the rightmost element. """
    pass

def popleft(self, *args, **kwargs): # real signature unknown
    """ Remove and return the leftmost element. """
    pass

def remove(self, value): # real signature unknown; restored from __doc__
    """ D.remove(value) -- remove first occurrence of value. """
    pass

def reverse(self): # real signature unknown; restored from __doc__
    """ D.reverse() -- reverse *IN PLACE* """
    pass

def rotate(self, *args, **kwargs): # real signature unknown
    """ Rotate the deque n steps to the right (default n=1).  If n is negative, rotates left. """
    pass
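
A few of the methods listed above are easier to follow with concrete values. The following is a small sketch (the variable names are illustrative); note in particular that extendleft() adds the items one at a time, so they end up in reverse order.

```python
from collections import deque

d = deque([1, 2, 3, 4, 5])

d.rotate(2)              # rotate two steps to the right
rotated_right = list(d)

d.rotate(-2)             # a negative n rotates to the left
rotated_back = list(d)

d.extendleft([0, -1])    # 0 is added first, then -1 in front of it
extended = list(d)

d.reverse()              # reverses in place
reversed_items = list(d)

print(rotated_right)   # [4, 5, 1, 2, 3]
print(rotated_back)    # [1, 2, 3, 4, 5]
print(extended)        # [-1, 0, 1, 2, 3, 4, 5]
print(reversed_items)  # [5, 4, 3, 2, 1, 0, -1]
```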

namedtuple

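A tuple is an immutable list that stores a sequence of values. A named tuple (namedtuple) is very similar to a plain tuple, with a few key differences.
The main similarity is that you cannot modify the data in either one. With a plain tuple, you have to use integer indexes to get at the data.
So what is a namedtuple? It turns a tuple into a convenient container for simple tasks. You do not have to use integer indexes to access its data: you can access fields by name (for example perry.name), much as you would look up keys in a dict, but a namedtuple is immutable.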
from collections import namedtuple

Animal = namedtuple('Animal', 'name age type')
perry = Animal(name="perry", age=31, type="cat")

print(perry)

## Output: Animal(name='perry', age=31, type='cat')

print(perry.name)

## Output: perry
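
To illustrate the immutability mentioned above, here is a minimal sketch using the same Animal type. The names `assignment_failed` and `older` are made up for the example. Assigning to a field fails, and `_replace()` returns a new tuple instead of changing the old one.

```python
from collections import namedtuple

Animal = namedtuple('Animal', 'name age type')
perry = Animal(name="perry", age=31, type="cat")

# Fields cannot be reassigned: a namedtuple is immutable.
try:
    perry.age = 32
except AttributeError:
    assignment_failed = True
else:
    assignment_failed = False

older = perry._replace(age=32)  # returns a new Animal; perry is unchanged

print(assignment_failed)    # True
print(perry[1], perry.age)  # 31 31  (index and name reach the same field)
print(older)                # Animal(name='perry', age=32, type='cat')
print(Animal._fields)       # ('name', 'age', 'type')
```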

Converting a named tuple to a dictionary
from collections import namedtuple

Animal = namedtuple('Animal', 'name age type')
perry = Animal(name="Perry", age=31, type="cat")
print(perry._asdict())

## Output (Python 3.7 and earlier): OrderedDict([('name', 'Perry'), ('age', 31), ...
## On Python 3.8+ _asdict() returns a regular dict instead.

Official example (from the namedtuple docstring)

"""Returns a new subclass of tuple with named fields.

>>> Point = namedtuple('Point', ['x', 'y'])
>>> Point.__doc__                   # docstring for the new class
'Point(x, y)'
>>> p = Point(11, y=22)             # instantiate with positional args or keywords
>>> p[0] + p[1]                     # indexable like a plain tuple
33
>>> x, y = p                        # unpack like a regular tuple
>>> x, y
(11, 22)
>>> p.x + p.y                       # fields also accessible by name
33
>>> d = p._asdict()                 # convert to a dictionary
>>> d['x']
11
>>> Point(**d)                      # convert from a dictionary
Point(x=11, y=22)
>>> p._replace(x=100)               # _replace() is like str.replace() but targets named fields
Point(x=100, y=22)

"""

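OrderedDict

Before Python 3.7, the built-in dict did not guarantee any iteration order, because of how hashing works. OrderedDict in the collections module provides a dictionary object that remembers insertion order. (Since Python 3.7 the built-in dict also preserves insertion order.)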

from collections import OrderedDict

items = (
    ('A', 1),
    ('B', 2),
    ('C', 3)
)
regular_dict = dict(items)
ordered_dict = OrderedDict(items)

print('Regular Dict:')
for k, v in regular_dict.items():
    print(k, v)

print('Ordered Dict:')
for k, v in ordered_dict.items():
    print(k, v)



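# Result (on an older Python with arbitrary dict order; on Python 3.7+ the regular dict also prints A, B, C):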
Regular Dict:
A 1
C 3
B 2
Ordered Dict:
A 1
B 2
C 3

Note that an OrderedDict orders its keys by insertion order, not by the keys themselves.
Official example
Getting a sorted dictionary

>>> # regular unsorted dictionary
>>> d = {'banana': 3, 'apple': 4, 'pear': 1, 'orange': 2}

>>> # dictionary sorted by key
>>> OrderedDict(sorted(d.items(), key=lambda t: t[0]))
OrderedDict([('apple', 4), ('banana', 3), ('orange', 2), ('pear', 1)])

>>> # dictionary sorted by value
>>> OrderedDict(sorted(d.items(), key=lambda t: t[1]))
OrderedDict([('pear', 1), ('orange', 2), ('banana', 3), ('apple', 4)])

>>> # dictionary sorted by length of the key string
>>> OrderedDict(sorted(d.items(), key=lambda t: len(t[0])))
OrderedDict([('pear', 1), ('apple', 4), ('orange', 2), ('banana', 3)])
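
As noted earlier, an OrderedDict follows insertion order rather than key order. The following sketch, with made-up fruit data, shows this. It also shows two things that still set OrderedDict apart from a regular dict on Python 3: `move_to_end()` and order-sensitive equality.

```python
from collections import OrderedDict

od = OrderedDict()
od['banana'] = 3
od['apple'] = 4
od['pear'] = 1
insertion_order = list(od)   # insertion order, not alphabetical

od['banana'] = 10            # updating a key keeps its position
od.move_to_end('banana')     # explicitly move it to the end
after_move = list(od)

# Equality between two OrderedDicts takes order into account.
same_items_other_order = OrderedDict([('pear', 1), ('apple', 4), ('banana', 10)])
order_sensitive_equal = (od == same_items_other_order)
plain_dict_equal = (dict(od) == dict(same_items_other_order))

print(insertion_order)        # ['banana', 'apple', 'pear']
print(after_move)             # ['apple', 'pear', 'banana']
print(order_sensitive_equal)  # False
print(plain_dict_equal)       # True
```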
