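Python Language Tips

1. List comprehensions and generator expressions

1.1 Extracting a subset of a dictionary

To build a dictionary that is a subset of another dictionary, the simplest approach is a dictionary comprehension: [1]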
```python
prices = {
    'ACME': 45.23,
    'AAPL': 612.78,
    'IBM': 205.55,
    'HPQ': 37.20,
    'FB': 10.75
}
# Make a dictionary of all prices over 200
p1 = {key: value for key, value in prices.items() if value > 200}
# Make a dictionary of tech stocks
tech_names = {'AAPL', 'IBM', 'HPQ', 'MSFT'}
p2 = {key: value for key, value in prices.items() if key in tech_names}
```

In most cases, anything a dictionary comprehension can do can also be done by building a sequence of tuples and passing it to the dict() function:
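```python
p1 = dict((key, value) for key, value in prices.items() if value > 200)
```

However, the dictionary comprehension states the intent more clearly and also runs faster: in the source's measurement for this example it was almost twice as fast as the dict() version.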
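There is often more than one way to do the same thing. For example, the second example can also be rewritten like this:

```python
# Make a dictionary of tech stocks
tech_names = {'AAPL', 'IBM', 'HPQ', 'MSFT'}
p2 = {key: prices[key] for key in prices.keys() & tech_names}
```

However, timing tests show that this version runs about 1.6 times slower than the first approach.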
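The timing figures quoted above come from the cited source and will differ by machine and Python version. A minimal sketch for re-measuring the three variants with the standard timeit module follows. The function names (with_comprehension, with_dict_call, with_key_intersection) are illustrative and not from the original, and no timings are shown because they are not reproducible.

```python
import timeit

prices = {'ACME': 45.23, 'AAPL': 612.78, 'IBM': 205.55, 'HPQ': 37.20, 'FB': 10.75}
tech_names = {'AAPL', 'IBM', 'HPQ', 'MSFT'}

# Illustrative wrappers around the three variants discussed in this subsection
def with_comprehension():
    return {key: value for key, value in prices.items() if key in tech_names}

def with_dict_call():
    return dict((key, value) for key, value in prices.items() if key in tech_names)

def with_key_intersection():
    return {key: prices[key] for key in prices.keys() & tech_names}

expected = with_comprehension()
timings = {}
for func in (with_comprehension, with_dict_call, with_key_intersection):
    # All three variants should build the same dictionary
    assert func() == expected
    timings[func.__name__] = timeit.timeit(func, number=50_000)

for name, seconds in timings.items():
    print(f"{name}: {seconds:.3f}s")
```

Treat the relative ordering as something to confirm on your own interpreter rather than a fixed ratio.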
1.2 Filtering and transforming with list comprehensions

A list comprehension can filter and transform in a single expression:

```python
# Squares of all even numbers
squares = [x ** 2 for x in range(10) if x % 2 == 0]
print(squares)  # [0, 4, 16, 36, 64]

# Nested loops
matrix = [[i * j for j in range(1, 4)] for i in range(1, 4)]
print(matrix)  # [[1, 2, 3], [2, 4, 6], [3, 6, 9]]

# Flatten a nested list
nested = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
flat = [item for sublist in nested for item in sublist]
print(flat)  # [1, 2, 3, 4, 5, 6, 7, 8, 9]
```

1.3 Generator expressions save memory
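A generator expression produces its items one at a time, so the generator object itself stays small however many items it will eventually yield. The sketch below is a rough check of that, assuming CPython. Note that sys.getsizeof reports only the size of the container object, not of the integers it refers to, and the byte counts vary between builds, so none are shown.

```python
import sys

n = 100_000
squares_list = [x ** 2 for x in range(n)]   # all n results exist at once
squares_gen = (x ** 2 for x in range(n))    # nothing computed yet

list_size = sys.getsizeof(squares_list)
gen_size = sys.getsizeof(squares_gen)
print(f"list: {list_size} bytes, generator: {gen_size} bytes")

# The trade-off: a generator can be consumed only once
total = sum(squares_gen)
leftover = list(squares_gen)  # the generator is already exhausted
print(leftover)  # []
```

If the values are needed more than once, or need indexing or len(), a list is still the right choice.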
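For large datasets, a generator expression uses less memory than a list comprehension:

```python
# List comprehension: builds the full list, using a lot of memory
squares_list = [x ** 2 for x in range(1000000)]

# Generator expression: produces values on demand, saving memory
squares_gen = (x ** 2 for x in range(1000000))

# Can be passed directly wherever an iterable is expected
sum_of_squares = sum(x ** 2 for x in range(1000000))
```

2. Unpacking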
2.1 Star-expression unpacking

A star expression makes it easy to handle sequences of arbitrary length:

```python
# Basic unpacking
first, *middle, last = [1, 2, 3, 4, 5]
print(first)   # 1
print(middle)  # [2, 3, 4]
print(last)    # 5

# In function parameters
def func(a, b, *args, **kwargs):
    print(f"a={a}, b={b}")
    print(f"args={args}")
    print(f"kwargs={kwargs}")

func(1, 2, 3, 4, x=5, y=6)
# a=1, b=2
# args=(3, 4)
# kwargs={'x': 5, 'y': 6}

# Ignore values you do not need
first, *_, last = [1, 2, 3, 4, 5]
print(first, last)  # 1 5
```

2.2 Dictionary unpacking
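Two properties of the {**a, **b} form are worth keeping in mind when reading the merge examples in this subsection: the right-most dictionary wins when keys collide, and the merge is shallow, so nested mutable values are shared with the inputs. The sketch below illustrates both; the 'tags' key is an illustrative addition, not part of the original examples.

```python
defaults = {'host': 'localhost', 'port': 8080, 'tags': ['web']}
user_config = {'port': 9000}

# Later entries override earlier ones, so the order of unpacking matters
config = {**defaults, **user_config}           # user_config wins
reversed_config = {**user_config, **defaults}  # defaults win
print(config['port'], reversed_config['port'])  # 9000 8080

# The inputs themselves are not modified by the merge
print(defaults['port'])  # 8080

# The merge is shallow: the list under 'tags' is the same object in both
config['tags'].append('debug')
print(defaults['tags'])  # ['web', 'debug']
```

When the nested values must stay independent, copy them explicitly (for example with copy.deepcopy) before or after merging.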
Python 3.5+ supports dictionary unpacking:

```python
# Merge dictionaries
dict1 = {'a': 1, 'b': 2}
dict2 = {'c': 3, 'd': 4}
merged = {**dict1, **dict2}
print(merged)  # {'a': 1, 'b': 2, 'c': 3, 'd': 4}

# Override default values
defaults = {'host': 'localhost', 'port': 8080}
user_config = {'port': 9000}
config = {**defaults, **user_config}
print(config)  # {'host': 'localhost', 'port': 9000}
```

3. Default values and handling missing values
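3.1 Using or to supply a default

```python
# Example inputs (stand-ins for values that would come from a user)
user_input = ""
user_count = 0

# Supply a default for None or an empty value
name = user_input or "Anonymous"

# Caution: 0, False, and the empty string are all falsy
count = user_count or 10  # Problem: a legitimate user_count of 0 becomes 10

# Safer: fall back only when the value is None
count = user_count if user_count is not None else 10
```

3.2 Using defaultdict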
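One behavior of defaultdict that is easy to miss: merely reading a missing key with square brackets calls the factory and stores the result, so lookups can silently grow the dictionary. The sketch below, with made-up keys, shows the effect and two ways to look up a key without inserting it.

```python
from collections import defaultdict

word_count = defaultdict(int)
word_count["apple"] += 1

# Indexing a missing key calls int() and stores the result under that key
missing = word_count["durian"]
keys_after_lookup = sorted(word_count)
print(missing, keys_after_lookup)  # 0 ['apple', 'durian']

# Membership tests and .get() do not insert anything
has_elderberry = "elderberry" in word_count
fallback = word_count.get("elderberry", 0)
keys_after_get = sorted(word_count)
print(has_elderberry, fallback, keys_after_get)  # False 0 ['apple', 'durian']
```

This matters mostly when the dictionary is later iterated, serialized, or compared, since the extra keys then show up in the output.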
```python
from collections import defaultdict

# Default values are created automatically
word_count = defaultdict(int)
for word in ["apple", "banana", "apple", "cherry"]:
    word_count[word] += 1  # No need to check whether the key exists
print(dict(word_count))  # {'apple': 2, 'banana': 1, 'cherry': 1}

# Use list as the default factory
groups = defaultdict(list)
for name, age in [("Alice", 25), ("Bob", 30), ("Alice", 26)]:
    groups[name].append(age)
print(dict(groups))  # {'Alice': [25, 26], 'Bob': [30]}
```

3.3 Using the get method
```python
config = {'timeout': 30}

# Traditional approach
if 'timeout' in config:
    timeout = config['timeout']
else:
    timeout = 60

# Using get
timeout = config.get('timeout', 60)

# get also avoids a KeyError
value = config.get('missing_key')  # Returns None instead of raising
```

4. Functional programming techniques
4.1 Using map, filter, and reduce

```python
# map: transform every element of a sequence
numbers = [1, 2, 3, 4, 5]
squares = list(map(lambda x: x ** 2, numbers))
print(squares)  # [1, 4, 9, 16, 25]

# filter: keep only the elements that pass a test
evens = list(filter(lambda x: x % 2 == 0, numbers))
print(evens)  # [2, 4]

# reduce: fold the sequence into a single value
from functools import reduce
product = reduce(lambda x, y: x * y, numbers)
print(product)  # 120

# A list comprehension is usually more Pythonic
squares = [x ** 2 for x in numbers]
evens = [x for x in numbers if x % 2 == 0]
```

4.2 Using partial to fix function arguments
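functools.partial returns a callable that remembers some arguments of an existing function, and keyword arguments fixed this way can still be overridden at call time. The sketch below shows that on a built-in (int) and on a hypothetical helper named clamp, which is an illustration and not part of the original text.

```python
from functools import partial

def clamp(value, low, high):
    """Hypothetical helper: restrict value to the range [low, high]."""
    return max(low, min(value, high))

# Fix keyword arguments of our own function
clamp_percent = partial(clamp, low=0, high=100)
print(clamp_percent(150))  # 100
print(clamp_percent(-5))   # 0

# Fixed keyword arguments can still be overridden per call
print(clamp_percent(150, high=200))  # 150

# partial also works with built-ins that accept keyword arguments
parse_binary = partial(int, base=2)
print(parse_binary('1010'))  # 10

# The partial object exposes what it wraps
print(clamp_percent.func is clamp)  # True
print(clamp_percent.keywords)       # {'low': 0, 'high': 100}
```

Compared with a lambda, a partial object keeps the wrapped function and fixed arguments inspectable, which can help when debugging.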
```python
from functools import partial

def power(base, exponent):
    return base ** exponent

# Create specialized versions of the function
square = partial(power, exponent=2)
cube = partial(power, exponent=3)
print(square(5))  # 25
print(cube(5))    # 125
```

4.3 Using lru_cache to cache results
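To see what the cache actually does, it helps to count how often the decorated function body runs. The sketch below uses a hypothetical function slow_square and a list named calls to record real invocations; both names are illustrative. Keep in mind that lru_cache keys on the arguments, so they must be hashable.

```python
from functools import lru_cache

calls = []  # records every time the function body really executes

@lru_cache(maxsize=None)
def slow_square(n):
    calls.append(n)
    return n * n

results = [slow_square(4), slow_square(4), slow_square(5)]
print(results)     # [16, 16, 25]
print(len(calls))  # 2 (the second slow_square(4) came from the cache)

info = slow_square.cache_info()
print(info.hits, info.misses)  # 1 2

# cache_clear() empties the cache, e.g. between tests
slow_square.cache_clear()
size_after_clear = slow_square.cache_info().currsize
print(size_after_clear)  # 0
```

Caching is only safe for functions whose result depends solely on their arguments; a function that reads changing external state would return stale values.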
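```python
from functools import lru_cache

@lru_cache(maxsize=128)
def fibonacci(n):
    if n < 2:
        return n
    return fibonacci(n - 1) + fibonacci(n - 2)

# The first call computes the value; repeated calls return the cached result
print(fibonacci(100))  # Fast, because each subproblem is computed only once

# Inspect cache statistics
print(fibonacci.cache_info())
```

5. Context managers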
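5.1 Writing a custom context manager

```python
import time
from contextlib import contextmanager

@contextmanager
def timer():
    # perf_counter is a monotonic clock intended for measuring intervals
    start = time.perf_counter()
    try:
        yield
    finally:
        end = time.perf_counter()
        print(f"Elapsed time: {end - start:.4f}s")

# Usage
with timer():
    # Run some time-consuming operation
    sum(range(1000000))
```

5.2 The suppress context manager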
```python
import os
from contextlib import suppress

# Ignore a specific exception
with suppress(FileNotFoundError):
    os.remove('nonexistent_file.txt')  # No exception propagates

# Equivalent to
try:
    os.remove('nonexistent_file.txt')
except FileNotFoundError:
    pass
```

6. String handling techniques
6.1 f-string formatting

```python
name = "Alice"
age = 30

# Basic usage
print(f"Name: {name}, Age: {age}")

# Expressions
print(f"Next year: {age + 1}")

# Number formatting
pi = 3.14159
print(f"Pi: {pi:.2f}")  # Pi: 3.14

# Alignment within a field of width 10
print(f"{name:>10}")  # Right-aligned
print(f"{name:<10}")  # Left-aligned
print(f"{name:^10}")  # Centered

# Debugging (Python 3.8+)
x = 10
print(f"{x=}")  # x=10
```

6.2 Chaining string methods
```python
text = "  hello world  "

# Method chaining
result = text.strip().upper().replace("WORLD", "PYTHON")
print(result)  # HELLO PYTHON

# Split and join
words = "apple,banana,cherry".split(',')
joined = ' | '.join(words)
print(joined)  # apple | banana | cherry
```

7. Iterators and iterables
7.1 Iterating with an index using enumerate

```python
fruits = ['apple', 'banana', 'cherry']

# Get both the index and the value
for i, fruit in enumerate(fruits):
    print(f"{i}: {fruit}")

# Choose the starting index
for i, fruit in enumerate(fruits, start=1):
    print(f"{i}: {fruit}")
```

7.2 Parallel iteration with zip
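zip stops as soon as its shortest input is exhausted, which silently drops trailing items when the inputs have different lengths. The sketch below, using deliberately mismatched lists, shows that behavior, the itertools.zip_longest alternative, and how zip(*pairs) reverses the pairing.

```python
from itertools import zip_longest

names = ['Alice', 'Bob', 'Charlie']
ages = [25, 30]  # deliberately one item short

# zip truncates to the shortest input: 'Charlie' is dropped
pairs = list(zip(names, ages))
print(pairs)  # [('Alice', 25), ('Bob', 30)]

# zip_longest pads the shorter inputs with fillvalue instead
padded = list(zip_longest(names, ages, fillvalue=None))
print(padded)  # [('Alice', 25), ('Bob', 30), ('Charlie', None)]

# zip(*pairs) turns a list of pairs back into separate tuples
unzipped_names, unzipped_ages = zip(*pairs)
print(unzipped_names)  # ('Alice', 'Bob')
print(unzipped_ages)   # (25, 30)
```

When equal lengths are a requirement rather than a convenience, checking the lengths explicitly before zipping makes a mismatch visible early.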
```python
names = ['Alice', 'Bob', 'Charlie']
ages = [25, 30, 35]
cities = ['New York', 'London', 'Tokyo']

# Iterate over several sequences in parallel
for name, age, city in zip(names, ages, cities):
    print(f"{name} is {age} years old and lives in {city}")

# Build a dictionary
person_dict = dict(zip(names, ages))
print(person_dict)  # {'Alice': 25, 'Bob': 30, 'Charlie': 35}
```

7.3 The itertools toolbox
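A point to know before using groupby from this module: it only groups consecutive items with equal keys, so unsorted input yields the same key more than once. The sketch below, with sample data invented for illustration, contrasts the unsorted result with the usual fix of sorting by the same key first.

```python
from itertools import groupby
from operator import itemgetter

# Sample data where equal keys are NOT adjacent
data = [('A', 1), ('B', 1), ('A', 2), ('C', 1), ('B', 2)]
first = itemgetter(0)

# Without sorting, 'A' and 'B' each appear as two separate groups
unsorted_keys = [key for key, _ in groupby(data, key=first)]
print(unsorted_keys)  # ['A', 'B', 'A', 'C', 'B']

# Sorting by the grouping key first gives one group per key
grouped = {
    key: [value for _, value in group]
    for key, group in groupby(sorted(data, key=first), key=first)
}
print(grouped)  # {'A': [1, 2], 'B': [1, 2], 'C': [1]}
```

Each group is itself a lazy iterator tied to the underlying data, so it should be consumed (for example with a list comprehension, as here) before advancing to the next group.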
```python
from itertools import chain, islice, cycle, groupby

# chain: concatenate several iterables
list1 = [1, 2, 3]
list2 = [4, 5, 6]
combined = list(chain(list1, list2))
print(combined)  # [1, 2, 3, 4, 5, 6]

# islice: slice an iterator
data = range(100)
first_10 = list(islice(data, 10))
print(first_10)  # [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

# cycle: repeat endlessly
colors = ['red', 'green', 'blue']
color_cycle = cycle(colors)
print([next(color_cycle) for _ in range(5)])  # ['red', 'green', 'blue', 'red', 'green']

# groupby: group consecutive items (this data is already ordered by key)
data = [('A', 1), ('A', 2), ('B', 1), ('B', 2), ('C', 1)]
for key, group in groupby(data, key=lambda x: x[0]):
    print(f"{key}: {list(group)}")
```

8. Classes and objects
8.1 Simplifying class definitions with dataclass

```python
from dataclasses import dataclass

@dataclass
class Point:
    x: float
    y: float

    def distance(self):
        # Distance from the origin
        return (self.x ** 2 + self.y ** 2) ** 0.5

p = Point(3, 4)
print(p)             # Point(x=3, y=4)
print(p.distance())  # 5.0
```

8.2 The property decorator
```python
class Temperature:
    def __init__(self, celsius):
        # Assign through the property so the setter's validation also runs here
        self.celsius = celsius

    @property
    def celsius(self):
        return self._celsius

    @celsius.setter
    def celsius(self, value):
        if value < -273.15:
            raise ValueError("Temperature below absolute zero")
        self._celsius = value

    @property
    def fahrenheit(self):
        return self._celsius * 9/5 + 32

temp = Temperature(25)
print(temp.celsius)     # 25
print(temp.fahrenheit)  # 77.0
temp.celsius = 30       # Goes through the setter
```

8.3 Saving memory with __slots__
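Declaring __slots__ tells Python to store attributes in fixed slots instead of a per-instance __dict__, which is where the memory saving comes from and why undeclared attributes are rejected. The sketch below contrasts a slotted class with a regular one; the class names SlottedPoint and RegularPoint are illustrative, chosen to avoid clashing with other examples.

```python
class SlottedPoint:
    __slots__ = ('x', 'y')  # the only attributes instances may have

    def __init__(self, x, y):
        self.x = x
        self.y = y

class RegularPoint:
    def __init__(self, x, y):
        self.x = x
        self.y = y

slotted = SlottedPoint(1, 2)
regular = RegularPoint(1, 2)

# Slotted instances carry no per-instance __dict__
print(hasattr(slotted, '__dict__'))  # False
print(hasattr(regular, '__dict__'))  # True

# Assigning an undeclared attribute fails on the slotted class only
regular.z = 3
try:
    slotted.z = 3
except AttributeError as exc:
    error_message = str(exc)
    print(f"AttributeError: {error_message}")
```

The saving is per instance, so it tends to matter only when a program creates very large numbers of small objects.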
```python
class Point:
    __slots__ = ['x', 'y']  # Only these attributes are allowed

    def __init__(self, x, y):
        self.x = x
        self.y = y

p = Point(1, 2)
# p.z = 3  # Would raise AttributeError
```

9. Exception handling techniques
9.1 The else and finally clauses

```python
try:
    result = 10 / 2
except ZeroDivisionError:
    print("Cannot divide by zero")
else:
    # Runs only when no exception was raised
    print(f"Result: {result}")
finally:
    # Runs whether or not an exception was raised
    print("Cleanup")
```

9.2 Custom exceptions
```python
class ValidationError(Exception):
    """Custom validation error."""
    pass

def validate_age(age):
    if age < 0:
        raise ValidationError(f"Age cannot be negative: {age}")
    if age > 150:
        raise ValidationError(f"Age too large: {age}")
    return age

try:
    validate_age(-5)
except ValidationError as e:
    print(f"Validation failed: {e}")
```

10. Performance optimization techniques
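10.1 Using a set for membership tests

```python
# Slow: a list membership test scans the elements, O(n)
large_list = list(range(10000))
print(5000 in large_list)

# Fast: a set membership test is a hash lookup, O(1) on average
large_set = set(range(10000))
print(5000 in large_set)
```

10.2 Using generators instead of lists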
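```python
# Uses a lot of memory: builds the whole list up front
def get_numbers():
    return [i for i in range(1000000)]

# Saves memory: returns a generator expression
def get_numbers_gen():
    return (i for i in range(1000000))

# Or write it as a generator function with yield:
# def get_numbers_gen():
#     for i in range(1000000):
#         yield i
```

10.3 Using local variables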
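```python
import math

# Slower: every iteration looks up the global name math, then its sqrt attribute
def slow_function():
    result = []
    for i in range(1000):
        result.append(math.sqrt(i))
    return result

# Faster: bind math.sqrt to a local name once
# (the gain is usually modest; measure before relying on it)
def fast_function():
    result = []
    sqrt = math.sqrt  # Local alias
    for i in range(1000):
        result.append(sqrt(i))
    return result
```

11. References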
[1] Extracting a subset of a dictionary, Python Cookbook, https://python3-cookbook.readthedocs.io/zh_CN/latest/c01/p17_extract_subset_of_dict.html