Effective Python: Notes and Excerpts, Part 2

Author: KyoDante | Published: 2022-04-20 21:06


Item 5: Write helper functions instead of complex expressions. (DRY principle: Don't Repeat Yourself. Wrap frequently used logic in a function.)

# Decode the query string of a URL
from urllib.parse import parse_qs
my_values = parse_qs('red=5&blue=0&green=',
                     keep_blank_values=True)
print(repr(my_values))
>>>
{'red': ['5'], 'blue': ['0'], 'green': ['']}

# Use get to look up each parameter
print('Red:     ', my_values.get('red'))
print('Green:   ', my_values.get('green'))
print('Opacity: ', my_values.get('opacity'))
>>>
Red:      ['5']
Green:    ['']
Opacity:  None

# For query string 'red=5&blue=0&green='
red = my_values.get('red', [''])[0] or 0
green = my_values.get('green', [''])[0] or 0
opacity = my_values.get('opacity', [''])[0] or 0
print(f'Red:     {red!r}')
print(f'Green:   {green!r}')
print(f'Opacity: {opacity!r}')
>>>
Red:     '5'
Green:   0
Opacity: 0

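# This relies on an empty string being falsy: `or 0` supplies a default of 0, and int() converts the result.
# The expression is compact but hard to read; you have to split it apart mentally to follow it.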
red = int(my_values.get('red', [''])[0] or 0)

# A ternary conditional expression is slightly better, but still not as clear as a dedicated function.
red_str = my_values.get('red', [''])
red = int(red_str[0]) if red_str[0] else 0

# Splitting it into a full if/else statement is more readable
green_str = my_values.get('green', [''])
if green_str[0]:
    green = int(green_str[0])
else:
    green = 0

# Finally, abstract the logic into a function
def get_first_int(values, key, default=0):
    found = values.get(key, [''])
    if found[0]:
        return int(found[0])
    return default

# Now a single call does the job, and the helper is easy to reuse:
green = get_first_int(my_values, 'green')

Item 6: Prefer multiple-assignment unpacking over indexing

# tuple() turns the items of a mutable dict into an immutable tuple of key-value pairs.
snack_calories = {
    'chips': 140,
    'popcorn': 80,
    'nuts': 190,
}
items = tuple(snack_calories.items())
print(items)
>>>
(('chips', 140), ('popcorn', 80), ('nuts', 190))

# Tuple elements can be accessed by index.
item = ('Peanut butter', 'Jelly')
first = item[0]
second = item[1]
print(first, 'and', second)
>>>
Peanut butter and Jelly

# But tuples do not support item assignment (they are immutable)
pair = ('Chocolate', 'Peanut butter')
pair[0] = 'Honey'
>>>
Traceback ...
TypeError: 'tuple' object does not support item assignment

# Unpacking is another way to get at the elements of a tuple
item = ('Peanut butter', 'Jelly')
first, second = item # Unpacking
print(first, 'and', second)
>>>
Peanut butter and Jelly

# Nested structures can be unpacked the same way, although this is not recommended:
favorite_snacks = {
    'salty': ('pretzels', 100),
    'sweet': ('cookies', 180),
    'veggie': ('carrots', 20),
}
((type1, (name1, cals1)),
 (type2, (name2, cals2)),
 (type3, (name3, cals3))) = favorite_snacks.items()
print(f'Favorite {type1} is {name1} with {cals1} calories')
print(f'Favorite {type2} is {name2} with {cals2} calories')
print(f'Favorite {type3} is {name3} with {cals3} calories')
>>>
Favorite salty is pretzels with 100 calories
Favorite sweet is cookies with 180 calories
Favorite veggie is carrots with 20 calories

The least obvious but most useful application of unpacking is swapping values in place.

# Traditional swap with a temporary variable
def bubble_sort(a):
    for _ in range(len(a)):
        for i in range(1, len(a)):
            if a[i] < a[i-1]:
                temp = a[i]
                a[i] = a[i-1]
                a[i-1] = temp
names = ['pretzels', 'carrots', 'arugula', 'bacon']
bubble_sort(names)
print(names)
>>>
['arugula', 'bacon', 'carrots', 'pretzels']

# Swap using unpacking
def bubble_sort(a):
    for _ in range(len(a)):
        for i in range(1, len(a)):
            if a[i] < a[i-1]:
                a[i-1], a[i] = a[i], a[i-1] # Swap
names = ['pretzels', 'carrots', 'arugula', 'bacon']
bubble_sort(names)
print(names)
>>>
['arugula', 'bacon', 'carrots', 'pretzels']

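It works like this: the right-hand side is evaluated first and packed into a temporary, unnamed tuple (a[i], a[i-1]); that tuple is then unpacked, and its values are assigned to a[i-1] and a[i] respectively.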
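
The same evaluation order can be seen with more than two values. The following is a minimal sketch (the list `values` is made up for illustration): the whole right-hand side is built before any target is assigned, so three items can be rotated in one statement.

```python
# Sketch: the right-hand side is evaluated in full before any assignment.
values = ['x', 'y', 'z']  # hypothetical data, for illustration only

# The tuple (values[1], values[2], values[0]) is built first,
# then unpacked into the three targets from left to right.
values[0], values[1], values[2] = values[1], values[2], values[0]
print(values)  # ['y', 'z', 'x']
```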

Unpacking also helps when looping, as the following example shows:

# Without unpacking
snacks = [('bacon', 350), ('donut', 240), ('muffin', 190)]
for i in range(len(snacks)):
    item = snacks[i]
    name = item[0]
    calories = item[1]
    print(f'#{i+1}: {name} has {calories} calories')
>>>
#1: bacon has 350 calories
#2: donut has 240 calories
#3: muffin has 190 calories

# With unpacking
for rank, (name, calories) in enumerate(snacks, 1):
    print(f'#{rank}: {name} has {calories} calories')
>>>
#1: bacon has 350 calories
#2: donut has 240 calories
#3: muffin has 190 calories

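When a tuple is short and the meaning of each element is clear, unpacking it directly into named variables expresses the intent better than indexing, and the code is more readable.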

Item 7: Prefer enumerate over range

# Use range to loop a fixed number of times
from random import randint
random_bits = 0
for i in range(32):
    if randint(0, 1):
        random_bits |= 1 << i
print(bin(random_bits))
>>>
0b11101000100100000111000010000001

# Iterate over a sequence directly with for-in
flavor_list = ['vanilla', 'chocolate', 'pecan', 'strawberry']
for flavor in flavor_list:
    print(f'{flavor} is delicious')
>>>
vanilla is delicious
chocolate is delicious
pecan is delicious
strawberry is delicious

# When the index is also needed, it is tempting to reach for range
for i in range(len(flavor_list)):
    flavor = flavor_list[i]
    print(f'{i + 1}: {flavor}')
>>>
1: vanilla
2: chocolate
3: pecan
4: strawberry

# enumerate returns an iterator; each call to next yields a tuple of (index, item).
it = enumerate(flavor_list)
print(next(it))
print(next(it))
>>>
(0, 'vanilla')
(1, 'chocolate')

# enumerate combines naturally with for-in and unpacking.
for i, flavor in enumerate(flavor_list, 1):  # the second argument sets the starting index
    print(f'{i}: {flavor}')
>>>
1: vanilla
2: chocolate
3: pecan
4: strawberry

Item 8: Use zip to process iterators in parallel

# A list comprehension derives a new list from an existing one
names = ['Cecilia', 'Lise', 'Marie']
counts = [len(n) for n in names]
print(counts)
>>>
[7, 4, 5]

# Finding the longest name with a plain range loop is clumsy.
longest_name = None
max_count = 0
for i in range(len(names)):
    count = counts[i]
    if count > max_count:
        longest_name = names[i]
        max_count = count
print(longest_name)
>>>
Cecilia

# enumerate is slightly better
for i, name in enumerate(names):
    count = counts[i]
    if count > max_count:
        longest_name = name
        max_count = count

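# zip handles both lists at once and is the cleanest. (It wraps two or more iterators in a lazy generator; each step yields a tuple holding the next value from every iterator.)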
for name, count in zip(names, counts):
    if count > max_count:
        longest_name = name
        max_count = count

Be careful, though: the iterators passed to zip may have different lengths.

# For example, append a new name to names without updating counts: zip stops as soon as the shortest input is exhausted.
names.append('Rosalind')
for name, count in zip(names, counts):
    print(name)
>>>
Cecilia
Lise
Marie

# If you are not sure the lengths match, use zip_longest from itertools instead. (The last name has no matching count, so None is filled in.)
import itertools
for name, count in itertools.zip_longest(names, counts):
    print(f'{name}: {count}')
>>>
Cecilia: 7
Lise: 4
Marie: 5
Rosalind: None
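
If None is not a convenient placeholder, zip_longest also accepts a fillvalue argument. A minimal sketch, reusing the names and counts from above and assuming 0 is an acceptable default count:

```python
import itertools

names = ['Cecilia', 'Lise', 'Marie', 'Rosalind']
counts = [7, 4, 5]  # deliberately one item short

# fillvalue replaces the default None for inputs that run out early.
pairs = list(itertools.zip_longest(names, counts, fillvalue=0))
print(pairs[-1])  # ('Rosalind', 0)
```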

Item 9: Avoid else blocks after for and while loops

for i in range(3):
    print('Loop', i)
else:
    print('Else block!')
>>>
Loop 0
Loop 1
Loop 2
Else block!

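# else is counterintuitive here: in if/else and try/except, the second block runs only if the first one did not run (or failed).
# In try/finally, the second block always runs, whatever happened in the first.
# for/else does the opposite of what these suggest: the else block runs only when the loop completes without hitting break.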
for i in range(3):
    print('Loop', i)
    if i == 1:
        break
else:
    print('Else block!')
>>>
Loop 0
Loop 1

# Looping over an empty sequence runs the else block immediately.
for x in []:
    print('Never runs')
else:
    print('For Else block!')
>>>
For Else block!

# while/else behaves the same way
while False:
    print('Never runs')
else:
    print('While Else block!')
>>>
While Else block!

# If the loop finishes without break, the two numbers are coprime.
a = 4
b = 9
for i in range(2, min(a, b) + 1):
    print('Testing', i)
    if a % i == 0 and b % i == 0:
        print('Not coprime')
        break
else:
    print('Coprime')
>>>
Testing 2
Testing 3
Testing 4
Coprime

# (for/else fits the case above, but it is still not recommended. A helper function that does the calculation is a better fit.)
def coprime(a, b):
    for i in range(2, min(a, b) + 1):
        if a % i == 0 and b % i == 0:
            return False
    return True
assert coprime(4, 9)
assert not coprime(3, 6)

# Or use a result variable to record whether the numbers are coprime.
def coprime_alternate(a, b):
    is_coprime = True
    for i in range(2, min(a, b) + 1):
        if a % i == 0 and b % i == 0:
            is_coprime = False
            break
    return is_coprime
assert coprime_alternate(4, 9)
assert not coprime_alternate(3, 6)

A helper function is more readable than a for/else statement.
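
As a side note, once the check lives in a helper, its body can be swapped freely. A sketch of one alternative, assuming both inputs are positive integers (`coprime_gcd` is a name chosen here for illustration, not from the book): it uses math.gcd from the standard library instead of a trial-division loop.

```python
import math

def coprime_gcd(a, b):
    # Two positive integers are coprime exactly when
    # their greatest common divisor is 1.
    return math.gcd(a, b) == 1

assert coprime_gcd(4, 9)
assert not coprime_gcd(3, 6)
```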

Item 10: Prevent repetition with assignment expressions (the walrus operator, available since Python 3.8; the name comes from ":=" resembling a walrus's eyes and tusks)

# What is in the fruit basket
fresh_fruit = {
    'apple': 10,
    'banana': 8,
    'lemon': 5,
}
# Make lemonade
def make_lemonade(count):
    ...
def out_of_stock():
    ...
# Original implementation (version 1)
count = fresh_fruit.get('lemon', 0)
if count:
    make_lemonade(count)
else:
    out_of_stock()
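# count is only needed inside the if block; defining it before the if gives it more prominence than it deserves.
# Version 2: the same snippet rewritten with the walrus operator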
if count := fresh_fruit.get('lemon', 0):
    make_lemonade(count)
else:
    out_of_stock()

def make_cider(count):
    ...
count = fresh_fruit.get('apple', 0)
if count >= 4:
    make_cider(count)
else:
    out_of_stock()

# The same rewrite with the walrus operator (note the parentheses around the assignment expression).
if (count := fresh_fruit.get('apple', 0)) >= 4:
    make_cider(count)
else:
    out_of_stock()

def slice_bananas(count):
    ...
class OutOfBananas(Exception):
    pass
def make_smoothies(count):
    ...
pieces = 0
count = fresh_fruit.get('banana', 0)
if count >= 2:
    pieces = slice_bananas(count)
try:
    smoothies = make_smoothies(pieces)
except OutOfBananas:
    out_of_stock()

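# Alternative: to keep the logic easy to follow, assign pieces in both branches of the if/else instead of defining a default above it.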
count = fresh_fruit.get('banana', 0)
if count >= 2:
    pieces = slice_bananas(count)
else:
    pieces = 0
try:
    smoothies = make_smoothies(pieces)
except OutOfBananas:
    out_of_stock()

# Both variants rewritten with the walrus operator
pieces = 0
if (count := fresh_fruit.get('banana', 0)) >= 2:
    pieces = slice_bananas(count)
try:
    smoothies = make_smoothies(pieces)
except OutOfBananas:
    out_of_stock()

if (count := fresh_fruit.get('banana', 0)) >= 2:
    pieces = slice_bananas(count)
else:
    pieces = 0
try:
    smoothies = make_smoothies(pieces)
except OutOfBananas:
    out_of_stock()

# Emulating a switch statement with nested if/else is hard to read
count = fresh_fruit.get('banana', 0)
if count >= 2:
    pieces = slice_bananas(count)
    to_enjoy = make_smoothies(pieces)
else:
    count = fresh_fruit.get('apple', 0)
    if count >= 4:
        to_enjoy = make_cider(count)
    else:
        count = fresh_fruit.get('lemon', 0)
        if count:
            to_enjoy = make_lemonade(count)
        else:
            to_enjoy = 'Nothing'

# With the walrus operator it reads somewhat better
if (count := fresh_fruit.get('banana', 0)) >= 2:
    pieces = slice_bananas(count)
    to_enjoy = make_smoothies(pieces)
elif (count := fresh_fruit.get('apple', 0)) >= 4:
    to_enjoy = make_cider(count)
elif count := fresh_fruit.get('lemon', 0):
    to_enjoy = make_lemonade(count)
else:
    to_enjoy = 'Nothing'

# while loop
def pick_fruit():
    ...
def make_juice(fruit, count):
    ...
bottles = []
fresh_fruit = pick_fruit()
while fresh_fruit:
    for fruit, count in fresh_fruit.items():
        batch = make_juice(fruit, count)
        bottles.extend(batch)
    fresh_fruit = pick_fruit()

# This version uses the loop-and-a-half idiom.
bottles = []
while True:                    # Loop
    fresh_fruit = pick_fruit()
    if not fresh_fruit:        # And a half
        break
    for fruit, count in fresh_fruit.items():
        batch = make_juice(fruit, count)
        bottles.extend(batch)

# The walrus operator removes the need for that idiom and improves readability.
bottles = []
while fresh_fruit := pick_fruit():
    for fruit, count in fresh_fruit.items():
        batch = make_juice(fruit, count)
        bottles.extend(batch)
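
The same assign-then-test pattern shows up outside the fruit examples. A hedged sketch, assuming Python 3.8+ and some made-up log lines: the match object is assigned and checked in one expression, so no separate variable is needed before the if.

```python
import re

# Hypothetical input lines, used only for illustration.
lines = ['level=3', 'no match here', 'level=12']

levels = []
for line in lines:
    # Assign the match object and test it in a single expression.
    if (match := re.search(r'level=(\d+)', line)):
        levels.append(int(match.group(1)))
print(levels)  # [3, 12]
```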

Lists and Dictionaries

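A common way to handle sequence-related work is with a list.
dict is a natural complement to list. It stores key-value pairs, a data structure also known as an associative array or hash table, and provides amortized constant-time assignment and lookup.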

Item 11: Know how to slice sequences

a = ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h']
print('Middle two:  ', a[3:5])
print('All but ends:', a[1:7])
>>>
Middle two:   ['d', 'e']
All but ends: ['b', 'c', 'd', 'e', 'f', 'g']

# For readability, omit a start index of 0 and an end index of len(a).
assert a[:5] == a[0:5]
assert a[5:] == a[5:len(a)]

a[:]      # ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h']
a[:5]     # ['a', 'b', 'c', 'd', 'e']
a[:-1]    # ['a', 'b', 'c', 'd', 'e', 'f', 'g']
a[4:]     #                     ['e', 'f', 'g', 'h']
a[-3:]    #                          ['f', 'g', 'h']
a[2:5]    #           ['c', 'd', 'e']
a[2:-1]   #           ['c', 'd', 'e', 'f', 'g']
a[-3:-1]  #                          ['f', 'g']

# Slicing is forgiving: start and end indices beyond the list boundaries are handled silently.
first_twenty_items = a[:20]
last_twenty_items = a[-20:]

# Indexing a position that does not exist, by contrast, raises an exception:
a[20]
>>>
Traceback ...
IndexError: list index out of range

# Slicing produces a new list (a shallow copy); modifying it does not affect the original list.
a = ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h']
b = a[3:]
print('Before:   ', b)
b[1] = 99
print('After:    ', b)
print('No change:', a)
>>>
Before:    ['d', 'e', 'f', 'g', 'h']
After:     ['d', 99, 'f', 'g', 'h']
No change: ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h']

# In slice assignment the lengths need not match; the list shrinks or grows accordingly.
print('Before ', a)
a[2:7] = [99, 22, 14]
print('After  ', a)
>>>
Before  ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h']
After   ['a', 'b', 99, 22, 14, 'h']


print('Before ', a)
a[2:3] = [47, 11]
print('After  ', a)
>>>
Before  ['a', 'b', 99, 22, 14, 'h']
After   ['a', 'b', 47, 11, 22, 14, 'h']

# b = a[:] is a shallow copy: the contents are equal, but b is a different list object.
b = a[:]
assert b == a and b is not a

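# In contrast to the shallow copy above: assigning to a slice with no start or end replaces the contents of the existing list in place instead of creating a new list: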
b = a
print('Before a', a)
print('Before b', b)
a[:] = [101, 102, 103]
assert a is b             # Still the same list object
print('After a ', a)      # Now has different contents
print('After b ', b)      # Same list, so same contents as a
>>>
Before a ['a', 'b', 47, 11, 22, 14, 'h']
Before b ['a', 'b', 47, 11, 22, 14, 'h']
After a  [101, 102, 103]
After b  [101, 102, 103]
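
This difference matters most when a list is passed into a function. A minimal sketch (both function names are made up for illustration): rebinding the parameter leaves the caller's list alone, while slice assignment changes it.

```python
def clear_by_rebinding(items):
    # Rebinding the parameter only changes the local name.
    items = []

def clear_in_place(items):
    # Slice assignment replaces the contents of the caller's list object.
    items[:] = []

data = [1, 2, 3]
clear_by_rebinding(data)
after_rebinding = list(data)   # snapshot of the current contents
clear_in_place(data)
after_in_place = list(data)
print(after_rebinding, after_in_place)  # [1, 2, 3] []
```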

Item 12: Avoid striding and slicing in a single expression
List slicing syntax: somelist[start:end:stride]. Prefer a positive stride, and avoid combining a stride with start and end indices.

# Use a stride to pick out the odd and even positions (counting from 1)
x = ['red', 'orange', 'yellow', 'green', 'blue', 'purple']
odds = x[::2]
evens = x[1::2]
print(odds)
print(evens)
>>>
['red', 'yellow', 'blue']
['orange', 'green', 'purple']

# A stride of -1 reverses a sequence (this works for both bytes and Unicode strings)
x = b'mongoose'
y = x[::-1]
print(y)
>>>
b'esoognom'

x = '寿司'
y = x[::-1]
print(y)
>>>
司寿

# But it breaks when Unicode data has been encoded as a UTF-8 byte string:
w = '寿司'
x = w.encode('utf-8')
y = x[::-1]
z = y.decode('utf-8')
>>>
Traceback ...
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xb8 in position 0: invalid start byte

Are negative strides other than -1 useful? Consider the following examples:

x = ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h']
x[::2]   # ['a', 'c', 'e', 'g']
x[::-2]  # ['h', 'f', 'd', 'b']

# What about combining a stride with start and end indices?
x[2::2]     # ['c', 'e', 'g']
x[-2::-2]   # ['g', 'e', 'c', 'a']
x[-2:2:-2]  # ['g', 'e']
x[2:2:-2]   # []

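The syntax is too dense, so readability suffers. It is clearer to stride first and slice afterwards. (This creates an extra shallow copy; if that matters, consider islice from itertools.)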

y = x[::2]   # ['a', 'c', 'e', 'g']
z = y[1:-1]  # ['c', 'e']
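
For reference, here is a sketch of the islice variant, assuming the same list x as above. islice takes start, stop and step arguments but does not accept negative values, so the stop is computed from len(x).

```python
from itertools import islice

x = ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h']

# islice(iterable, start, stop, step) yields items lazily
# and does not build an intermediate copy of the list.
z = list(islice(x, 2, len(x) - 2, 2))
print(z)  # ['c', 'e']
```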

Item 13: Prefer catch-all unpacking over slicing

# When unpacking, the number of variables must match the number of items.
car_ages = [0, 9, 4, 8, 7, 20, 19, 1, 6, 15]
car_ages_descending = sorted(car_ages, reverse=True)
oldest, second_oldest = car_ages_descending
>>>
Traceback ...
ValueError: too many values to unpack (expected 2)

# Indexing and slicing work, but they are noisy.
oldest = car_ages_descending[0]
second_oldest = car_ages_descending[1]
others = car_ages_descending[2:]
print(oldest, second_oldest, others)
>>>
20 19 [15, 9, 8, 7, 6, 4, 1, 0]

# A starred expression collects the remaining values of the unpacking.
oldest, second_oldest, *others = car_ages_descending
print(oldest, second_oldest, others)
>>>
20 19 [15, 9, 8, 7, 6, 4, 1, 0]

oldest, *others, youngest = car_ages_descending
print(oldest, youngest, others)
*others, second_youngest, youngest = car_ages_descending
print(youngest, second_youngest, others)
>>>
20 0 [19, 15, 9, 8, 7, 6, 4, 1]
0 1 [20, 19, 15, 9, 8, 7, 6, 4]

# A starred expression cannot be used on its own.
*others = car_ages_descending
>>>
Traceback ...
SyntaxError: starred assignment target must be in a list or tuple

# Nor can a single-level unpacking contain more than one starred expression.
first, *middle, *second_middle, last = [1, 2, 3, 4]
>>>
Traceback ...
SyntaxError: two starred expressions in assignment

# The following example helps build intuition for starred expressions in nested unpacking.
car_inventory = {
    'Downtown': ('Silver Shadow', 'Pinto', 'DMC'),
    'Airport': ('Skyline', 'Viper', 'Gremlin', 'Nova'),
}
((loc1, (best1, *rest1)),
 (loc2, (best2, *rest2))) = car_inventory.items()
print(f'Best at {loc1} is {best1}, {len(rest1)} others')
print(f'Best at {loc2} is {best2}, {len(rest2)} others')
>>>
Best at Downtown is Silver Shadow, 2 others
Best at Airport is Skyline, 3 others

If nothing is left over for the starred expression, it becomes an empty list.

short_list = [1, 2]
first, second, *rest = short_list
print(first, second, rest)
>>>
1 2 []

Any iterator can be unpacked:

it = iter(range(1, 3))
first, second = it
print(f'{first} and {second}')
>>>
1 and 2

Finally, a CSV-processing example shows unpacking and starred expressions in practice:

def generate_csv():
    yield ('Date', 'Make', 'Model', 'Year', 'Price')
    ...

# Split header and rows without a starred expression
all_csv_rows = list(generate_csv())
header = all_csv_rows[0]
rows = all_csv_rows[1:]
print('CSV Header:', header)
print('Row count: ', len(rows))
>>>
CSV Header: ('Date', 'Make', 'Model', 'Year', 'Price')
Row count:  200

# Split header and rows with a starred expression
it = generate_csv()
header, *rows = it
print('CSV Header:', header)
print('Row count: ', len(rows))
>>>
CSV Header: ('Date', 'Make', 'Model', 'Year', 'Price')
Row count:  200

Be careful: a starred expression always produces a list, so unpacking a large iterator this way can exhaust memory.
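
When the data may not fit in memory, one option is to take the header with next and stream the rest. A sketch under that assumption (`generate_rows` and its contents are invented stand-ins for a real CSV source):

```python
def generate_rows():
    # Hypothetical stand-in for a large CSV source.
    yield ('Date', 'Make', 'Model', 'Year', 'Price')
    for year in range(2000, 2005):
        yield ('2024-01-01', 'Acme', 'Roadster', year, 1000)

it = generate_rows()
header = next(it)      # consume only the first row
row_count = 0
for row in it:         # remaining rows are handled one at a time
    row_count += 1
print(header, row_count)
```

Only one row is held at a time, at the cost of no longer having a rows list to index into.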

Item 14: Sort by complex criteria using the key parameter
The sort method sorts a list in place.

numbers = [93, 86, 11, 68, 70]
numbers.sort()
print(numbers)
>>>
[11, 68, 70, 86, 93]

class Tool:
    def __init__(self, name, weight):
        self.name = name
        self.weight = weight
    def __repr__(self):
        return f'Tool({self.name!r}, {self.weight})'
tools = [
    Tool('level', 3.5),
    Tool('hammer', 1.25),
    Tool('screwdriver', 0.5),
    Tool('chisel', 0.25),
]

# Without a key (and without comparison methods on the class), sorting fails
tools.sort()
>>>
Traceback ...
TypeError: '<' not supported between instances of 'Tool' and 'Tool'

# Sort by each Tool's name.
print('Unsorted:', repr(tools))
tools.sort(key=lambda x: x.name)
print('\nSorted: ', tools)
>>>
Unsorted: [Tool('level',        3.5),
           Tool('hammer',       1.25),
           Tool('screwdriver',  0.5),
           Tool('chisel',       0.25)]
Sorted: [Tool('chisel',         0.25),
         Tool('hammer',         1.25),
         Tool('level',          3.5),
         Tool('screwdriver',    0.5)]

# Sorting by weight works the same way.
tools.sort(key=lambda x: x.weight)
print('By weight:', tools)
>>>
By weight: [Tool('chisel',      0.25),
            Tool('screwdriver', 0.5),
            Tool('hammer',      1.25),
            Tool('level',       3.5)]

# Lowercase each string with lower() in the key so the sort is alphabetical regardless of case
places = ['home', 'work', 'New York', 'Paris']
places.sort()
print('Case sensitive: ', places)
places.sort(key=lambda x: x.lower())
print('Case insensitive:', places)
>>>
Case sensitive: ['New York', 'Paris', 'home', 'work']
Case insensitive: ['home', 'New York', 'Paris', 'work']

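How do you sort by several criteria at once? The simplest way is a tuple: tuples are comparable by default and have a natural ordering, meaning they implement the special methods (such as __lt__) that sort requires.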

saw = (5, 'circular saw')
jackhammer = (40, 'jackhammer')
assert not (jackhammer < saw) # Matches expectations

drill = (4, 'drill')
sander = (4, 'sander')
assert drill[0] == sander[0] # Same weight
assert drill[1] < sander[1]  # Alphabetically less
assert drill < sander        # Thus, drill comes first

power_tools = [
    Tool('drill', 4),
    Tool('circular saw', 5),
    Tool('jackhammer', 40),
    Tool('sander', 4),
]
power_tools.sort(key=lambda x: (x.weight, x.name))
print(power_tools)
>>>
[Tool('drill',        4),
 Tool('sander',       4),
 Tool('circular saw', 5),
 Tool('jackhammer',   40)]

# reverse=True makes all criteria descending.
power_tools.sort(key=lambda x: (x.weight, x.name),
                 reverse=True)
print(power_tools)
>>>
[Tool('jackhammer',   40),
 Tool('circular saw', 5),
 Tool('sander',       4),
 Tool('drill',        4)]

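# What if one criterion should be descending and another ascending? (The unary minus operator helps for numeric values, but not every type supports it.)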
power_tools.sort(key=lambda x: (-x.weight, x.name))
print(power_tools)
>>>
[Tool('jackhammer',   40),
 Tool('circular saw', 5),
 Tool('drill',        4),
 Tool('sander',       4)]

power_tools.sort(key=lambda x: (x.weight, -x.name),
                 reverse=True)
>>>
Traceback ...
TypeError: bad operand type for unary -: 'str'

Only as a last resort should you combine multiple sort calls.

power_tools.sort(key=lambda x: x.name) # Name ascending
power_tools.sort(key=lambda x: x.weight, # Weight descending
                 reverse=True)
print(power_tools)

>>>
[Tool('jackhammer',   40),
 Tool('circular saw', 5),
 Tool('drill',        4),
 Tool('sander',       4)]

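Sorting by name first and then by weight gives the final result because list.sort is stable: items that tie on weight keep the name order from the first pass. The calls therefore run in the reverse of the priority order, lowest-priority key first. Conversely, sorting by weight first and then by name makes name the primary criterion; choose the order that matches your needs.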
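
The multi-pass approach can be written as a loop over (key, reverse) pairs. This is only a sketch: the Tool class is repeated so the snippet runs on its own, and attrgetter from the operator module stands in for the lambdas.

```python
from operator import attrgetter

class Tool:
    def __init__(self, name, weight):
        self.name = name
        self.weight = weight
    def __repr__(self):
        return f'Tool({self.name!r}, {self.weight})'

power_tools = [
    Tool('drill', 4),
    Tool('circular saw', 5),
    Tool('jackhammer', 40),
    Tool('sander', 4),
]

# Apply the lowest-priority key first and the highest-priority key last.
# list.sort is stable, so ties on weight keep their name ordering.
for key, reverse in [('name', False), ('weight', True)]:
    power_tools.sort(key=attrgetter(key), reverse=reverse)

names = [tool.name for tool in power_tools]
print(names)  # ['jackhammer', 'circular saw', 'drill', 'sander']
```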

Item 15: Be cautious when relying on dict insertion ordering

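In Python 3.5 and earlier, iterating over a dict returned keys in arbitrary order, which did not match the order in which the items were inserted. This made test cases hard to reproduce and bugs hard to track down. (The cause was the hash table implementation, which combined the built-in hash function with a random seed assigned when the interpreter started.)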

# Python 3.5
baby_names = {
    'cat': 'kitten',
    'dog': 'puppy',
}
print(baby_names)
>>>
{'dog': 'puppy', 'cat': 'kitten'}

From Python 3.6 onward, dicts preserve insertion order (an official language guarantee since Python 3.7):

baby_names = {
    'cat': 'kitten',
    'dog': 'puppy',
}
print(baby_names)
>>>
{'cat': 'kitten', 'dog': 'puppy'}

As a result, before 3.6 every dict method that depends on iteration order (keys, values, items, popitem) held the same surprise:

# Python 3.5
print(list(baby_names.keys()))
print(list(baby_names.values()))
print(list(baby_names.items()))
print(baby_names.popitem())  # Randomly chooses an item

>>>
['dog', 'cat']
['puppy', 'kitten']
[('dog', 'puppy'), ('cat', 'kitten')]
('dog', 'puppy')

# Python 3.6 and later
print(list(baby_names.keys()))
print(list(baby_names.values()))
print(list(baby_names.items()))
print(baby_names.popitem()) # Last item inserted
>>>
['cat', 'dog']
['kitten', 'puppy']
[('cat', 'kitten'), ('dog', 'puppy')]
('dog', 'puppy')

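For example, **kwargs captures all keyword arguments, and in older versions their seemingly random order made function calls harder to debug.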

# Python 3.5
def my_func(**kwargs):
    for key, value in kwargs.items():
        print('%s = %s' % (key, value))
my_func(goose='gosling', kangaroo='joey')
>>>
kangaroo = joey
goose = gosling

def my_func(**kwargs):
    for key, value in kwargs.items():
        print(f'{key} = {value}')
my_func(goose='gosling', kangaroo='joey')
>>>
goose = gosling
kangaroo = joey

Classes also use a dict for the instance dictionary, so earlier versions had the same ordering problem there.

# Python 3.5
class MyClass:
    def __init__(self):
        self.alligator = 'hatchling'
        self.elephant = 'calf'
a = MyClass()
for key, value in a.__dict__.items():
    print('%s = %s' % (key, value))
>>>
elephant = calf
alligator = hatchling

# Later versions
class MyClass:
    def __init__(self):
        self.alligator = 'hatchling'
        self.elephant = 'calf'
a = MyClass()
for key, value in a.__dict__.items():
    print(f'{key} = {value}')
>>>
alligator = hatchling
elephant = calf

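Although dict has behaved much like OrderedDict from collections since Python 3.7, OrderedDict may still be the better fit when you insert and popitem frequently (for example, to implement an LRU cache).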
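
To make the LRU remark concrete, here is a minimal sketch built on OrderedDict.move_to_end and popitem(last=False). The LRUCache class and its method names are hypothetical and only meant to illustrate the idea.

```python
from collections import OrderedDict

class LRUCache:
    """Hypothetical minimal LRU cache, for illustration only."""
    def __init__(self, capacity):
        self.capacity = capacity
        self.data = OrderedDict()

    def get(self, key, default=None):
        if key not in self.data:
            return default
        self.data.move_to_end(key)          # mark as most recently used
        return self.data[key]

    def put(self, key, value):
        self.data[key] = value
        self.data.move_to_end(key)
        if len(self.data) > self.capacity:
            self.data.popitem(last=False)   # evict the least recently used

cache = LRUCache(2)
cache.put('a', 1)
cache.put('b', 2)
cache.get('a')       # 'a' becomes the most recent entry
cache.put('c', 3)    # evicts 'b'
print(list(cache.data))  # ['a', 'c']
```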

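Here is another example. Given a dict of vote counts for animals, the requirement is to find the animal with the most votes, the winner. One implementation: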

votes = {
    'otter': 1281,
    'polar bear': 587,
    'fox': 863,
}
def populate_ranks(votes, ranks):
    names = list(votes.keys())
    names.sort(key=votes.get, reverse=True)  # sort names by their vote counts
    for i, name in enumerate(names, 1):
        ranks[name] = i
def get_winner(ranks):
    return next(iter(ranks))

ranks = {}
populate_ranks(votes, ranks)
print(ranks)
winner = get_winner(ranks)
print(winner)
>>>
{'otter': 1, 'fox': 2, 'polar bear': 3}
otter

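Now suppose the requirement changes: iteration should follow the alphabetical order of the names rather than the vote ranking. You can define a new dict-like class with collections.abc (this works, but a dict-like type can introduce subtle bugs):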

from collections.abc import MutableMapping
class SortedDict(MutableMapping):
    def __init__(self):
        self.data = {}
    def __getitem__(self, key):
        return self.data[key]
    def __setitem__(self, key, value):
        self.data[key] = value
    def __delitem__(self, key):
        del self.data[key]
    def __iter__(self):
        keys = list(self.data.keys())
        keys.sort()
        for key in keys:
            yield key
    def __len__(self):
        return len(self.data)

sorted_ranks = SortedDict()
populate_ranks(votes, sorted_ranks)
print(sorted_ranks.data)
winner = get_winner(sorted_ranks)
print(winner)
>>>
{'otter': 1, 'fox': 2, 'polar bear': 3}
fox

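The problem is that get_winner makes a hidden assumption about its argument, namely the dict's iteration order: it relies on the insertion order established by populate_ranks, and SortedDict breaks that assumption. There are three ways to fix this: 1) rewrite get_winner so it no longer depends on iteration order, 2) raise an exception when the argument is not a dict, or 3) declare the parameter types explicitly with annotations and check them with mypy in strict mode.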

def get_winner(ranks):
    for name, rank in ranks.items():
        if rank == 1:
            return name
winner = get_winner(sorted_ranks)
print(winner)
>>>
otter

def get_winner(ranks):
    if not isinstance(ranks, dict):
        raise TypeError('must provide a dict instance')
    return next(iter(ranks))
get_winner(sorted_ranks)
>>>
Traceback ...
TypeError: must provide a dict instance

from typing import Dict, MutableMapping
def populate_ranks(votes: Dict[str, int],
                   ranks: Dict[str, int]) -> None:
    names = list(votes.keys())
    names.sort(key=votes.get, reverse=True)
    for i, name in enumerate(names, 1):
        ranks[name] = i
def get_winner(ranks: Dict[str, int]) -> str:
    return next(iter(ranks))
class SortedDict(MutableMapping[str, int]):
    ...
votes = {
    'otter': 1281,
    'polar bear': 587,
    'fox': 863,
}
sorted_ranks = SortedDict()
populate_ranks(votes, sorted_ranks)
print(sorted_ranks.data)
winner = get_winner(sorted_ranks)
print(winner)

$ python3 -m mypy --strict example.py
.../example.py:48: error: Argument 2 to "populate_ranks" has incompatible type "SortedDict"; expected "Dict[str, int]"
.../example.py:50: error: Argument 1 to "get_winner" has incompatible type "SortedDict"; expected "Dict[str, int]"
