- Personal blog of 张芷铭

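LPIPS (Learned Perceptual Image Patch Similarity) measures perceptual similarity by comparing features extracted with a pretrained neural network. Its scores agree with human visual judgments more closely than traditional metrics such as PSNR and SSIM.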

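Definition and Background

LPIPS was introduced in the CVPR 2018 paper "The Unreasonable Effectiveness of Deep Features as a Perceptual Metric". It passes both images through a deep CNN and evaluates their similarity as a distance in feature space.

Key points: LPIPS is a distance, so a lower value means the two images are more similar, and its rankings track human perception better than PSNR/SSIM.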

Mathematical Formulation

$$d(x, x_0) = \sum_{l} \frac{1}{H_l W_l} \sum_{h,w} \left\| w_l \odot \left( \hat{y}_{hw}^l - \hat{y}_{0hw}^l \right) \right\|_2^2$$

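where $\hat{y}^l$ and $\hat{y}_0^l$ are the activations extracted at layer $l$ for $x$ and $x_0$ (unit-normalized along the channel dimension), $H_l$ and $W_l$ are the spatial dimensions of that layer, and $w_l$ is a per-channel scaling weight vector.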
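To make the formula concrete, here is a minimal NumPy sketch that evaluates it on feature maps that have already been extracted. It follows the equation as written above rather than the `lpips` package's internal code; the function names `unit_normalize` and `lpips_like_distance`, the random stand-in features, and the all-ones weights are assumptions made for illustration only.

```python
import numpy as np

def unit_normalize(feat, eps=1e-10):
    """Scale the channel vector at every spatial position to unit L2 norm.

    feat has shape (C, H, W).
    """
    norm = np.sqrt(np.sum(feat ** 2, axis=0, keepdims=True))
    return feat / (norm + eps)

def lpips_like_distance(feats_x, feats_x0, weights):
    """Evaluate d(x, x0) from per-layer features (hypothetical helper).

    feats_x, feats_x0: lists of (C_l, H_l, W_l) arrays, one per layer.
    weights: list of (C_l,) arrays, the per-channel weights w_l.
    """
    total = 0.0
    for fx, fx0, w in zip(feats_x, feats_x0, weights):
        diff = unit_normalize(fx) - unit_normalize(fx0)
        weighted = w[:, None, None] * diff          # w_l applied channel-wise
        per_position = np.sum(weighted ** 2, axis=0)  # squared L2 norm over channels
        total += per_position.mean()                # average over the H_l x W_l grid
    return float(total)

# Random arrays stand in for CNN activations from two layers.
rng = np.random.default_rng(0)
feats_a = [rng.normal(size=(4, 8, 8)), rng.normal(size=(6, 4, 4))]
feats_b = [rng.normal(size=(4, 8, 8)), rng.normal(size=(6, 4, 4))]
weights = [np.ones(4), np.ones(6)]

d_same = lpips_like_distance(feats_a, feats_a, weights)
d_diff = lpips_like_distance(feats_a, feats_b, weights)
```

Identical feature stacks give a distance of zero, and the distance grows as the normalized features diverge. In the real metric the features come from a pretrained network and the weights $w_l$ are learned from human similarity judgments.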

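Choosing a Network

Network	Model size	Characteristics
AlexNet	9.1MB	Fastest; the recommended default
VGG	58.9MB	Largest and slowest; closer to a traditional perceptual loss when used for optimization
SqueezeNet	2.8MB	Lightweight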

Usage

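import lpips
import torch
from PIL import Image
from torchvision import transforms

# Build the metric with the AlexNet backbone
loss_fn = lpips.LPIPS(net='alex')

def preprocess_image(image_path):
    """Load an image as a 1x3xHxW RGB tensor scaled to [-1, 1]."""
    transform = transforms.Compose([
        transforms.ToTensor(),  # HWC uint8 in [0, 255] -> CHW float in [0, 1]
        transforms.Normalize(mean=[0.5, 0.5, 0.5], std=[0.5, 0.5, 0.5])  # [0, 1] -> [-1, 1]
    ])
    image = Image.open(image_path).convert('RGB')
    return transform(image).unsqueeze(0)  # add the batch dimension

# Both images must have the same height and width
img0 = preprocess_image('path/to/image0.jpg')
img1 = preprocess_image('path/to/image1.jpg')

with torch.no_grad():  # gradients are not needed for evaluation
    distance = loss_fn(img0, img1)
print(distance.item())  # lower means more similar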

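Key Considerations

Image preprocessing: RGB input, normalized to [-1, 1]
Image size: both images must have the same spatial dimensions
Interpreting the result: smaller values mean more similar images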
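The first two points can be checked before any images reach the metric. The sketch below uses plain NumPy and assumes the images are already decoded into `(H, W, 3)` uint8 RGB arrays; `to_lpips_range` and `check_same_size` are hypothetical helper names, not part of the `lpips` package.

```python
import numpy as np

def to_lpips_range(image_uint8):
    """Convert an (H, W, 3) uint8 RGB array to a (1, 3, H, W) float32 array in [-1, 1]."""
    if image_uint8.ndim != 3 or image_uint8.shape[2] != 3:
        raise ValueError("expected an RGB array of shape (H, W, 3)")
    scaled = image_uint8.astype(np.float32) / 255.0   # [0, 255] -> [0, 1]
    normalized = (scaled - 0.5) / 0.5                 # [0, 1] -> [-1, 1]
    return normalized.transpose(2, 0, 1)[None, ...]   # HWC -> NCHW with N = 1

def check_same_size(a, b):
    """Raise if two NCHW arrays differ in height or width."""
    if a.shape[-2:] != b.shape[-2:]:
        raise ValueError(f"spatial sizes differ: {a.shape[-2:]} vs {b.shape[-2:]}")

black = to_lpips_range(np.zeros((4, 6, 3), dtype=np.uint8))
white = to_lpips_range(np.full((4, 6, 3), 255, dtype=np.uint8))
check_same_size(black, white)  # passes: both are 4 x 6
```

A pure black image maps to -1 everywhere and a pure white image to 1, which is the same range produced by `ToTensor` followed by `Normalize` with mean and std of 0.5. When two images differ in size, resize or crop one of them before comparing.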

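Applications

Image quality assessment
Perceptual loss for GAN training
Image style transfer
Evaluating image inpainting and restoration
Evaluating NeRF novel view synthesis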

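Best Practices

Combine LPIPS with PSNR and SSIM to evaluate from multiple angles
Use net='alex' by default for a balance of speed and accuracy
Process images in batches for efficiency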
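As a companion to the first recommendation, here is a minimal NumPy sketch of PSNR that can be reported next to an LPIPS score. It assumes both images are float arrays in [0, 1] (not the [-1, 1] range LPIPS expects); the function name `psnr` and the synthetic test images are illustrative assumptions.

```python
import numpy as np

def psnr(reference, test, max_value=1.0):
    """Peak signal-to-noise ratio in dB for two arrays of the same shape."""
    if reference.shape != test.shape:
        raise ValueError("images must have the same shape")
    diff = reference.astype(np.float64) - test.astype(np.float64)
    mse = np.mean(diff ** 2)
    if mse == 0:
        return float("inf")  # identical images
    return float(10.0 * np.log10(max_value ** 2 / mse))

# A uniform brightness shift of 0.1 gives an MSE of 0.01, i.e. about 20 dB.
reference = np.zeros((8, 8, 3))
shifted = reference + 0.1
psnr_shifted = psnr(reference, shifted)
psnr_identical = psnr(reference, reference)
```

PSNR rises as images get closer while LPIPS falls, so the two should be read in opposite directions. Reporting both helps catch cases where a pixel-wise metric and a perceptual metric disagree.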