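Python Language Tips

1. List Comprehensions and Generator Expressions

1.1 Extracting a Subset of a Dictionary

Suppose you want to build a dictionary that is a subset of another dictionary. The simplest way is a dictionary comprehension:^[1]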

python

prices = {
    'ACME': 45.23,
    'AAPL': 612.78,
    'IBM': 205.55,
    'HPQ': 37.20,
    'FB': 10.75
}

# Make a dictionary of all prices over 200
p1 = {key: value for key, value in prices.items() if value > 200}

# Make a dictionary of tech stocks
tech_names = {'AAPL', 'IBM', 'HPQ', 'MSFT'}
p2 = {key: value for key, value in prices.items() if key in tech_names}

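In most cases, whatever a dictionary comprehension can do can also be done by building a sequence of tuples and passing it to the dict() function: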

python

p1 = dict((key, value) for key, value in prices.items() if value > 200)

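However, the dictionary comprehension states the intent more clearly and also runs faster (in this example, measured timings show it to be almost twice as fast as the dict() version).

There is often more than one way to do the same thing. For instance, the second example could also be rewritten like this: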

python

# Make a dictionary of tech stocks
tech_names = { 'AAPL', 'IBM', 'HPQ', 'MSFT' }
p2 = { key: prices[key] for key in prices.keys() & tech_names }

However, timing tests show that this approach is roughly 1.6 times slower than the first one.
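
The timing claims above depend on the interpreter version and the machine, so it is worth re-measuring them locally. The following is a minimal sketch using the standard timeit module; the three function names are illustrative and not part of the original text.

```python
import timeit

prices = {
    'ACME': 45.23,
    'AAPL': 612.78,
    'IBM': 205.55,
    'HPQ': 37.20,
    'FB': 10.75,
}
tech_names = {'AAPL', 'IBM', 'HPQ', 'MSFT'}


def by_comprehension():
    # Dictionary comprehension over items()
    return {key: value for key, value in prices.items() if key in tech_names}


def by_dict_call():
    # Generator of (key, value) tuples passed to dict()
    return dict((key, value) for key, value in prices.items() if key in tech_names)


def by_key_intersection():
    # Intersect the key view with the set, then look each key up again
    return {key: prices[key] for key in prices.keys() & tech_names}


# All three approaches build the same dictionary.
assert by_comprehension() == by_dict_call() == by_key_intersection()

# Absolute numbers vary between runs; compare only the relative order.
for func in (by_comprehension, by_dict_call, by_key_intersection):
    seconds = timeit.timeit(func, number=100_000)
    print(f"{func.__name__}: {seconds:.3f}s")
```

On a dictionary this small the differences are tiny in absolute terms, so readability is usually the better reason to prefer the comprehension.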

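1.2 Filtering and Transforming with List Comprehensions

A list comprehension can filter and transform in a single expression:

python

# Squares of all even numbers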
squares = [x ** 2 for x in range(10) if x % 2 == 0]
print(squares)  # [0, 4, 16, 36, 64]

# Nested loops: one inner list per value of i
matrix = [[i * j for j in range(1, 4)] for i in range(1, 4)]
print(matrix)  # [[1, 2, 3], [2, 4, 6], [3, 6, 9]]

# Flatten a nested list
nested = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
flat = [item for sublist in nested for item in sublist]
print(flat)  # [1, 2, 3, 4, 5, 6, 7, 8, 9]

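1.3 Saving Memory with Generator Expressions

For large data sets, a generator expression uses far less memory than a list comprehension:

python

# List comprehension: builds the whole list up front, using a lot of memory
squares_list = [x ** 2 for x in range(1000000)]

# Generator expression: produces values on demand, saving memory
squares_gen = (x ** 2 for x in range(1000000))

# Can be passed directly to anything that accepts an iterable
sum_of_squares = sum(x ** 2 for x in range(1000000))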
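
To make the memory difference concrete, the sketch below compares the size of the two container objects with sys.getsizeof. Note that getsizeof reports only the container itself, not the integers it refers to, so it understates the list's real footprint. The sketch also shows the main trade-off: a generator can be consumed only once.

```python
import sys

n = 1_000_000
squares_list = [x ** 2 for x in range(n)]
squares_gen = (x ** 2 for x in range(n))

# Size of the container objects only; exact byte counts vary by Python build.
list_bytes = sys.getsizeof(squares_list)
gen_bytes = sys.getsizeof(squares_gen)
print(f"list: {list_bytes} bytes, generator: {gen_bytes} bytes")

# Both produce the same values.
total = sum(squares_gen)
print(total == sum(squares_list))  # True

# The generator is now exhausted, so a second pass sees no items.
print(sum(squares_gen))  # 0
```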

2. Unpacking

2.1 Star-Expression Unpacking

A starred expression makes it easy to unpack sequences of arbitrary length:

python

# Basic unpacking
first, *middle, last = [1, 2, 3, 4, 5]
print(first)   # 1
print(middle)  # [2, 3, 4]
print(last)    # 5

# In function parameters
def func(a, b, *args, **kwargs):
    print(f"a={a}, b={b}")
    print(f"args={args}")
    print(f"kwargs={kwargs}")

func(1, 2, 3, 4, x=5, y=6)
# a=1, b=2
# args=(3, 4)
# kwargs={'x': 5, 'y': 6}

# Ignore values you do not need
first, *_, last = [1, 2, 3, 4, 5]
print(first, last)  # 1 5

2.2 Dictionary Unpacking

Python 3.5+ supports dictionary unpacking inside dict literals:

python

# Merge dictionaries
dict1 = {'a': 1, 'b': 2}
dict2 = {'c': 3, 'd': 4}
merged = {**dict1, **dict2}
print(merged)  # {'a': 1, 'b': 2, 'c': 3, 'd': 4}

# Override default values (later mappings win)
defaults = {'host': 'localhost', 'port': 8080}
user_config = {'port': 9000}
config = {**defaults, **user_config}
print(config)  # {'host': 'localhost', 'port': 9000}
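
The order of the unpacked dictionaries matters, which the merge examples above rely on implicitly. As a small sketch reusing the same defaults and user_config values: when a key appears more than once, the rightmost value wins, and the source dictionaries are left unchanged.

```python
defaults = {'host': 'localhost', 'port': 8080}
user_config = {'port': 9000}

# user_config comes last, so its port overrides the default.
config = {**defaults, **user_config}
print(config)  # {'host': 'localhost', 'port': 9000}

# Reversing the order lets the defaults overwrite the user's choice.
reversed_config = {**user_config, **defaults}
print(reversed_config['port'])  # 8080

# Unpacking builds a new dict; the inputs are not modified.
print(defaults)  # {'host': 'localhost', 'port': 8080}
```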

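3. Defaults and Handling Empty Values

3.1 Providing a Default with or

python

# Example inputs (hypothetical values standing in for real user input)
user_input = ""
user_count = 0

# Fall back to a default when the value is None or empty
name = user_input or "Anonymous"

# Caution: 0, False, and the empty string are all falsy
count = user_count or 10  # a legitimate user_count of 0 is replaced by 10

# Safer: fall back only when the value is None
count = user_count if user_count is not None else 10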

3.2 Using defaultdict

python

from collections import defaultdict

# Missing keys get a default value automatically
word_count = defaultdict(int)
for word in ["apple", "banana", "apple", "cherry"]:
    word_count[word] += 1  # no need to check whether the key exists

print(dict(word_count))  # {'apple': 2, 'banana': 1, 'cherry': 1}

# Use list as the default factory
groups = defaultdict(list)
for name, age in [("Alice", 25), ("Bob", 30), ("Alice", 26)]:
    groups[name].append(age)

print(dict(groups))  # {'Alice': [25, 26], 'Bob': [30]}

3.3 Using the get Method

python

config = {'timeout': 30}

# The traditional way
if 'timeout' in config:
    timeout = config['timeout']
else:
    timeout = 60

# Using get
timeout = config.get('timeout', 60)

# get also avoids KeyError
value = config.get('missing_key')  # returns None instead of raising

4. Functional Programming Techniques

4.1 Using map, filter, and reduce

python

# map: transform every element of a sequence
numbers = [1, 2, 3, 4, 5]
squares = list(map(lambda x: x ** 2, numbers))
print(squares)  # [1, 4, 9, 16, 25]

# filter: keep the elements that pass a test
evens = list(filter(lambda x: x % 2 == 0, numbers))
print(evens)  # [2, 4]

# reduce: fold a sequence into a single value
from functools import reduce
product = reduce(lambda x, y: x * y, numbers)
print(product)  # 120

# A list comprehension is usually the more Pythonic choice
squares = [x ** 2 for x in numbers]
evens = [x for x in numbers if x % 2 == 0]

4.2 Fixing Function Arguments with partial

python

from functools import partial

def power(base, exponent):
    return base ** exponent

# Create specialized versions of the function
square = partial(power, exponent=2)
cube = partial(power, exponent=3)

print(square(5))  # 25
print(cube(5))    # 125
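
Two details of partial are easy to miss: positional arguments are bound from the left, and a keyword fixed by partial can still be overridden at call time. The sketch below illustrates both with the same power function; the name two_to_the is illustrative.

```python
from functools import partial


def power(base, exponent):
    return base ** exponent


# A positional argument binds the leftmost parameter, here base, not exponent.
two_to_the = partial(power, 2)
print(two_to_the(10))  # 1024

# A keyword fixed by partial acts as a default that the caller may override.
square = partial(power, exponent=2)
print(square(5))              # 25
print(square(5, exponent=3))  # 125

# The partial object records the wrapped function and the bound arguments.
print(square.func is power)  # True
print(square.keywords)       # {'exponent': 2}
```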

4.3 Caching Results with lru_cache

python

from functools import lru_cache

@lru_cache(maxsize=128)
def fibonacci(n):
    if n < 2:
        return n
    return fibonacci(n - 1) + fibonacci(n - 2)

# The first call computes the result; later calls with the same argument return it from the cache
print(fibonacci(100))  # fast

# Inspect cache statistics
print(fibonacci.cache_info())
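
To see what the cache actually saves, the following sketch records every argument for which the function body really runs and compares that with the statistics from cache_info(). The computed list is an illustrative addition, and maxsize=None (an unbounded cache) is chosen here only to keep the counts easy to predict.

```python
from functools import lru_cache

computed = []  # arguments for which the function body actually ran


@lru_cache(maxsize=None)
def fibonacci(n):
    computed.append(n)
    if n < 2:
        return n
    return fibonacci(n - 1) + fibonacci(n - 2)


print(fibonacci(30))   # 832040
print(len(computed))   # 31: each n from 0 to 30 is computed exactly once

info = fibonacci.cache_info()
print(info.hits, info.misses)  # 28 31

# cache_clear() empties the cache, for example between tests.
fibonacci.cache_clear()
print(fibonacci.cache_info().currsize)  # 0
```

Without the decorator, the same call would run the function body more than a million times.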

5. Context Managers

5.1 Writing a Custom Context Manager

python

import time
from contextlib import contextmanager

@contextmanager
def timer():
    start = time.perf_counter()  # monotonic clock intended for measuring intervals
    try:
        yield
    finally:
        end = time.perf_counter()
        print(f"Elapsed time: {end - start:.4f}s")

# Usage
with timer():
    # Run some time-consuming work
    sum(range(1000000))
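
The @contextmanager decorator is shorthand for the context manager protocol, which consists of the __enter__ and __exit__ methods. As a sketch of what the generator-based timer above corresponds to, here is a class-based version; the class name Timer and its elapsed attribute are illustrative.

```python
import time


class Timer:
    """Class-based counterpart of the generator-based timer."""

    def __enter__(self):
        self.start = time.perf_counter()
        return self  # this object is bound to the name after "as"

    def __exit__(self, exc_type, exc_value, traceback):
        # Runs on normal exit and on exceptions, like the finally clause.
        self.elapsed = time.perf_counter() - self.start
        print(f"Elapsed time: {self.elapsed:.4f}s")
        return False  # do not swallow exceptions raised inside the block


with Timer() as t:
    total = sum(range(1_000_000))

print(total)  # 499999500000
```

The class form is more verbose, but it keeps the measurement available afterwards as t.elapsed.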

5.2 The suppress Context Manager

python

import os
from contextlib import suppress

# Ignore a specific exception
with suppress(FileNotFoundError):
    os.remove('nonexistent_file.txt')  # no exception propagates

# Equivalent to
try:
    os.remove('nonexistent_file.txt')
except FileNotFoundError:
    pass

6. String Handling Techniques

6.1 f-string Formatting

python

name = "Alice"
age = 30

# Basic usage
print(f"Name: {name}, Age: {age}")

# Expressions
print(f"Next year: {age + 1}")

# Number formatting
pi = 3.14159
print(f"Pi: {pi:.2f}")  # Pi: 3.14

# Alignment
print(f"{name:>10}")   # right-aligned
print(f"{name:<10}")   # left-aligned
print(f"{name:^10}")   # centered

# Debugging (Python 3.8+)
x = 10
print(f"{x=}")  # x=10

6.2 Chaining String Methods

python

text = "  hello world  "

# Method chaining
result = text.strip().upper().replace("WORLD", "PYTHON")
print(result)  # HELLO PYTHON

# Splitting and joining
words = "apple,banana,cherry".split(',')
joined = ' | '.join(words)
print(joined)  # apple | banana | cherry

7. Iterators and Iterables

7.1 Iterating with an Index Using enumerate

python

fruits = ['apple', 'banana', 'cherry']

# Get the index and the value
for i, fruit in enumerate(fruits):
    print(f"{i}: {fruit}")

# Specify the starting index
for i, fruit in enumerate(fruits, start=1):
    print(f"{i}: {fruit}")

7.2 Parallel Iteration with zip

python

names = ['Alice', 'Bob', 'Charlie']
ages = [25, 30, 35]
cities = ['New York', 'London', 'Tokyo']

# Iterate over several sequences in parallel
for name, age, city in zip(names, ages, cities):
    print(f"{name} is {age} years old and lives in {city}")

# Build a dictionary
person_dict = dict(zip(names, ages))
print(person_dict)  # {'Alice': 25, 'Bob': 30, 'Charlie': 35}
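
The zip examples above use inputs of equal length. When the lengths differ, zip stops at the shortest input without raising an error, which can hide missing data. A short sketch of that behavior, with itertools.zip_longest as the padding alternative:

```python
from itertools import zip_longest

names = ['Alice', 'Bob', 'Charlie']
ages = [25, 30]  # one value short

# zip stops at the shortest input, silently dropping 'Charlie'.
pairs = list(zip(names, ages))
print(pairs)  # [('Alice', 25), ('Bob', 30)]

# zip_longest pads the shorter inputs with fillvalue instead.
padded = list(zip_longest(names, ages, fillvalue=None))
print(padded)  # [('Alice', 25), ('Bob', 30), ('Charlie', None)]
```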

7.3 The itertools Toolbox

python

from itertools import chain, islice, cycle, groupby

# chain: concatenate several iterables
list1 = [1, 2, 3]
list2 = [4, 5, 6]
combined = list(chain(list1, list2))
print(combined)  # [1, 2, 3, 4, 5, 6]

# islice: slice any iterable
data = range(100)
first_10 = list(islice(data, 10))
print(first_10)  # [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

# cycle: repeat endlessly
colors = ['red', 'green', 'blue']
color_cycle = cycle(colors)
print([next(color_cycle) for _ in range(5)])  # ['red', 'green', 'blue', 'red', 'green']

# groupby: group consecutive items that share a key (the input must already be sorted by that key)
data = [('A', 1), ('A', 2), ('B', 1), ('B', 2), ('C', 1)]
for key, group in groupby(data, key=lambda x: x[0]):
    print(f"{key}: {list(group)}")
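
The groupby example above works because its data is already ordered by key. groupby only merges adjacent items, so unsorted input produces repeated groups. The sketch below shows the difference; the helper name first is illustrative.

```python
from itertools import groupby

data = [('A', 1), ('B', 1), ('A', 2), ('C', 1), ('B', 2)]


def first(item):
    """Key function: group by the first element of each tuple."""
    return item[0]


# Without sorting, every run of equal keys becomes its own group.
unsorted_keys = [key for key, _ in groupby(data, key=first)]
print(unsorted_keys)  # ['A', 'B', 'A', 'C', 'B']

# Sorting by the same key first yields exactly one group per key.
grouped = {
    key: [value for _, value in group]
    for key, group in groupby(sorted(data, key=first), key=first)
}
print(grouped)  # {'A': [1, 2], 'B': [1, 2], 'C': [1]}
```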

8. Class and Object Techniques

8.1 Simplifying Class Definitions with dataclass

python

from dataclasses import dataclass

@dataclass
class Point:
    x: float
    y: float
    
    def distance(self):
        return (self.x ** 2 + self.y ** 2) ** 0.5

p = Point(3, 4)
print(p)  # Point(x=3, y=4)
print(p.distance())  # 5.0

8.2 The property Decorator

python

class Temperature:
    def __init__(self, celsius):
        self._celsius = celsius
    
    @property
    def celsius(self):
        return self._celsius
    
    @celsius.setter
    def celsius(self, value):
        if value < -273.15:
            raise ValueError("Temperature below absolute zero")
        self._celsius = value
    
    @property
    def fahrenheit(self):
        return self._celsius * 9/5 + 32

temp = Temperature(25)
print(temp.celsius)     # 25
print(temp.fahrenheit)  # 77.0
temp.celsius = 30       # goes through the setter

8.3 Saving Memory with __slots__

python

class Point:
    __slots__ = ['x', 'y']  # only these attributes are allowed

    def __init__(self, x, y):
        self.x = x
        self.y = y

p = Point(1, 2)
# p.z = 3  # would raise AttributeError
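
The memory saving comes from the fact that instances of a class with __slots__ have no per-instance __dict__. The following sketch compares a slotted class with a regular one; the class names SlottedPoint and RegularPoint are illustrative.

```python
class SlottedPoint:
    __slots__ = ('x', 'y')

    def __init__(self, x, y):
        self.x = x
        self.y = y


class RegularPoint:
    def __init__(self, x, y):
        self.x = x
        self.y = y


slotted = SlottedPoint(1, 2)
regular = RegularPoint(1, 2)

# A regular instance stores its attributes in a per-instance dict.
print(hasattr(regular, '__dict__'))  # True
print(hasattr(slotted, '__dict__'))  # False

# Without a __dict__, attributes outside __slots__ cannot be added.
try:
    slotted.z = 3
except AttributeError as exc:
    print(f"AttributeError: {exc}")
```

The trade-off is flexibility: code that relies on adding attributes dynamically will not work with slotted instances.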

9. Exception Handling Techniques

9.1 The else and finally Clauses

python

try:
    result = 10 / 2
except ZeroDivisionError:
    print("Cannot divide by zero")
else:
    # Runs only if no exception was raised
    print(f"Result: {result}")
finally:
    # Always runs, whether or not an exception was raised
    print("Cleanup")

9.2 Custom Exceptions

python

class ValidationError(Exception):
    """Custom validation error."""
    pass

def validate_age(age):
    if age < 0:
        raise ValidationError(f"Age cannot be negative: {age}")
    if age > 150:
        raise ValidationError(f"Age too large: {age}")
    return age

try:
    validate_age(-5)
except ValidationError as e:
    print(f"Validation failed: {e}")
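
A custom exception is often raised in response to a lower-level error. In that case, raise ... from keeps the original error attached as __cause__, so the traceback shows both. The sketch below assumes a hypothetical parse_age helper that is not part of the original example.

```python
class ValidationError(Exception):
    """Custom validation error."""


def parse_age(text):
    """Convert text to a non-negative age (hypothetical helper)."""
    try:
        age = int(text)
    except ValueError as exc:
        # "from exc" stores the original ValueError on __cause__.
        raise ValidationError(f"Not a valid age: {text!r}") from exc
    if age < 0:
        raise ValidationError(f"Age cannot be negative: {age}")
    return age


try:
    parse_age("abc")
except ValidationError as e:
    print(f"Validation failed: {e}")
    print(f"Caused by: {type(e.__cause__).__name__}")  # Caused by: ValueError
```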

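10. Performance Optimization Techniques

10.1 Using Sets for Membership Tests

python

# Slow: O(n), scans the list element by element
large_list = list(range(10000))
print(5000 in large_list)

# Fast: O(1) on average, a hash lookup
large_set = set(range(10000))
print(5000 in large_set)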
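
The complexity difference above can be checked with a quick measurement. This is a minimal sketch using timeit; the absolute numbers depend on the machine, and the target value is chosen near the end of the list because that is the worst case for a linear scan.

```python
import timeit

large_list = list(range(10000))
large_set = set(large_list)

# The last element forces the list scan to visit every item.
target = 9999

list_time = timeit.timeit(lambda: target in large_list, number=1000)
set_time = timeit.timeit(lambda: target in large_set, number=1000)
print(f"list: {list_time:.5f}s, set: {set_time:.5f}s")
```

Building the set costs O(n) itself, so the conversion pays off when the same collection is tested many times.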

10.2 Using Generators Instead of Lists

python

# Uses a lot of memory
def get_numbers():
    return [i for i in range(1000000)]

# Saves memory
def get_numbers_gen():
    return (i for i in range(1000000))
    # Or use yield:
    # for i in range(1000000):
    #     yield i

10.3 Using Local Variables

python

# Slow: looks up the global name math and its sqrt attribute on every iteration
import math

def slow_function():
    result = []
    for i in range(1000):
        result.append(math.sqrt(i))
    return result

# Fast: local variable
def fast_function():
    result = []
    sqrt = math.sqrt  # bind to a local name once
    for i in range(1000):
        result.append(sqrt(i))
    return result

11. References

[1] "Extracting a Subset of a Dictionary", Python Cookbook, https://python3-cookbook.readthedocs.io/zh_CN/latest/c01/p17_extract_subset_of_dict.html
