- 张芷铭's personal blog

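FPN fuses deep semantic information with shallow geometric information through lateral connections, addressing the multi-scale problem in object detection.

Core idea

The Feature Pyramid Network (FPN) builds a pyramid structure with lateral connections to obtain an efficient multi-scale feature representation.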

$$P_i = \text{Conv}_{1\times1}(C_i) + \text{Upsample}(P_{i+1})$$

Here $P_i$ is the pyramid feature at level $i$, and $C_i$ is the backbone feature at the same level.
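
A minimal sketch of a single merge step from the formula above, assuming PyTorch. The tensor sizes (a 512-channel $C_4$ at 14×14 and an already projected 256-channel $P_5$ at 7×7) and the variable names are made up for illustration.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Hypothetical sizes: C4 from the backbone, P5 already projected to 256 channels.
c4 = torch.randn(1, 512, 14, 14)
p5 = torch.randn(1, 256, 7, 7)

lateral = nn.Conv2d(512, 256, kernel_size=1)  # Conv_{1x1}: align the channel count

# P4 = Conv_{1x1}(C4) + Upsample(P5)
p4 = lateral(c4) + F.interpolate(p5, scale_factor=2, mode="nearest")
print(p4.shape)  # torch.Size([1, 256, 14, 14])
```

The sum is only valid because the lateral convolution brings $C_4$ to the same channel count as $P_5$ and the upsampling brings $P_5$ to the same spatial size as $C_4$.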

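History

2015: Faster R-CNN predicts from a single high-level feature map and largely ignores multi-scale information
2017: FAIR proposes the FPN architecture
2018 onward: variants such as PANet and BiFPN appear, and FPN becomes a standard detector component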

Architecture

C5 ──Conv1x1──> P5 ──3x3──> Output5
                 │
               Upsample
                 ↓
C4 ──Conv1x1──> (+) ──3x3──> Output4
                 │
               Upsample
                 ↓
C3 ──Conv1x1──> (+) ──3x3──> Output3

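Top-down pathway: deep features are upsampled step by step
Lateral connections: a 1×1 convolution aligns the channel count before element-wise addition
Feature refinement: a 3×3 convolution reduces the aliasing introduced by upsampling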
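
One practical detail of the top-down pathway: when the input resolution is not divisible by the backbone's total stride, adjacent levels are not exactly 2× apart, and a fixed `scale_factor=2` can produce a size that no longer matches the lateral map. A hedged sketch, with sizes chosen purely for illustration, of upsampling to the lateral map's size instead:

```python
import torch
import torch.nn.functional as F

# Illustrative sizes: a 25x25 level fed by a 13x13 level (13 * 2 != 25).
lateral_c4 = torch.randn(1, 256, 25, 25)
p5 = torch.randn(1, 256, 13, 13)

fixed = F.interpolate(p5, scale_factor=2, mode="nearest")
print(fixed.shape[-2:])  # torch.Size([26, 26]), cannot be added to a 25x25 map

# Upsampling to the target size keeps the element-wise sum valid.
matched = F.interpolate(p5, size=lateral_c4.shape[-2:], mode="nearest")
p4 = lateral_c4 + matched
print(p4.shape[-2:])  # torch.Size([25, 25])
```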

Mathematical formulation

$$ \begin{cases} P_5 = \text{Conv}_{1\times1}(C_5) \\ P_i = \text{Conv}_{1\times1}(C_i) + \text{Upsample}(P_{i+1}) \\ \text{Output}_i = \text{Conv}_{3\times3}(P_i) \end{cases} $$

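Application scenarios

Task	Typical model	Role of FPN
Object detection	Faster R-CNN	Improves small-object AP
Instance segmentation	Mask R-CNN	Produces high-quality RoI features
Keypoint detection	PoseNet	Cross-scale joint localization
Semantic segmentation	DeepLab	Sharpens edge detail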

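Practical tips

Channels: usually 256 (a balance between compute and accuracy)
Upsampling: bilinear interpolation tends to be more stable than transposed convolution, which can introduce checkerboard artifacts
Training: unfreeze layers gradually from shallow to deep, and use GN (GroupNorm) in place of BN (BatchNorm), which helps at small batch sizes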
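
The first two tips and the GN suggestion can be sketched as follows. This is only an illustration under assumptions: the 1024-channel input, the 32 groups, and the tensor sizes are not taken from the post, and the unfreezing schedule is not shown.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

out_channels = 256

# Lateral projection followed by GroupNorm; 32 groups is an assumed setting.
# bias=False because the normalization layer has its own learnable shift.
lateral = nn.Sequential(
    nn.Conv2d(1024, out_channels, kernel_size=1, bias=False),
    nn.GroupNorm(num_groups=32, num_channels=out_channels),
)

c4 = torch.randn(2, 1024, 14, 14)
p5 = torch.randn(2, out_channels, 7, 7)

# Bilinear upsampling has no learnable parameters, unlike nn.ConvTranspose2d.
up = F.interpolate(p5, size=c4.shape[-2:], mode="bilinear", align_corners=False)
p4 = lateral(c4) + up
print(p4.shape)  # torch.Size([2, 256, 14, 14])
```

GroupNorm computes its statistics per sample, so its behavior does not depend on the batch size.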

PyTorch implementation

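import torch.nn as nn
import torch.nn.functional as F


class FPN(nn.Module):
    def __init__(self, backbone_channels=(256, 512, 1024, 2048), out_channels=256):
        super().__init__()
        # Lateral 1x1 convs project each backbone level C_i to out_channels.
        self.lateral_convs = nn.ModuleList([
            nn.Conv2d(ch, out_channels, 1) for ch in backbone_channels
        ])
        # 3x3 convs smooth each merged map to reduce upsampling aliasing.
        self.output_convs = nn.ModuleList([
            nn.Conv2d(out_channels, out_channels, 3, padding=1)
            for _ in backbone_channels
        ])

    def forward(self, features):
        # features: backbone maps ordered shallow to deep, e.g. [C2, C3, C4, C5].
        laterals = [conv(f) for conv, f in zip(self.lateral_convs, features)]
        # Top-down pathway: start from the deepest level and merge downward.
        pyramid = [laterals[-1]]
        for lateral in reversed(laterals[:-1]):
            # Upsample to the lateral map's size so odd input sizes still align.
            top_down = F.interpolate(pyramid[-1], size=lateral.shape[-2:], mode="nearest")
            pyramid.append(lateral + top_down)
        # Reverse to shallow-to-deep order so index i matches output_convs[i].
        return [conv(p) for conv, p in zip(self.output_convs, reversed(pyramid))]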

Directions for improvement

Weighted feature fusion (BiFPN):

$$P_i^{out} = \frac{w_1 \cdot C_i + w_2 \cdot \text{Upsample}(P_{i+1})}{w_1+w_2+\epsilon}$$
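
A minimal sketch of the weighted fusion above for two inputs, assuming PyTorch. The class name `WeightedFusion` is a hypothetical helper, not part of any library; the weights are passed through ReLU so they stay non-negative, and `eps=1e-4` is an assumed default.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class WeightedFusion(nn.Module):
    """Normalized weighted sum of two same-shaped feature maps (hypothetical helper)."""

    def __init__(self, eps=1e-4):
        super().__init__()
        self.weights = nn.Parameter(torch.ones(2))  # learnable w_1, w_2
        self.eps = eps

    def forward(self, lateral, top_down):
        w = F.relu(self.weights)  # keep the weights non-negative
        return (w[0] * lateral + w[1] * top_down) / (w.sum() + self.eps)


fuse = WeightedFusion()
c4 = torch.randn(1, 256, 14, 14)  # assumed to be already projected to 256 channels
p5 = torch.randn(1, 256, 7, 7)
p4_out = fuse(c4, F.interpolate(p5, size=c4.shape[-2:], mode="nearest"))
print(p4_out.shape)  # torch.Size([1, 256, 14, 14])
```

With both weights initialized to 1, the module starts out as a plain average of its two inputs, and training can then shift the balance between levels.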

Other improvements: content-aware upsampling (CARAFE) and lightweight designs (depthwise separable convolution).
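
To give a sense of what the lightweight variant saves, the sketch below compares a standard 3×3 output convolution with a depthwise separable replacement at 256 channels. It assumes PyTorch; `count_params` is a helper defined here for illustration.

```python
import torch
import torch.nn as nn


def count_params(module):
    """Total number of learnable parameters in a module."""
    return sum(p.numel() for p in module.parameters())


channels = 256
standard = nn.Conv2d(channels, channels, kernel_size=3, padding=1)
separable = nn.Sequential(
    # Depthwise: one 3x3 filter per channel (groups == channels).
    nn.Conv2d(channels, channels, kernel_size=3, padding=1, groups=channels),
    # Pointwise: 1x1 conv mixes information across channels.
    nn.Conv2d(channels, channels, kernel_size=1),
)

x = torch.randn(1, channels, 28, 28)
assert standard(x).shape == separable(x).shape  # drop-in replacement shape-wise

print(count_params(standard))   # 590080
print(count_params(separable))  # 68352
```

The separable version uses roughly one ninth of the parameters at this width; whether accuracy holds up depends on the task and is not shown here.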