1. Abstract

feature pyramid
- 다양한 scale의 object를 찾기 위한 방법
- 하지만, 연산이 많다는 단점이 존재
FPN(Feature Pyramid Network)
- top-down architecture with lateral connection 사용

2. Introduction

fig1-2

Featurized image pyramid
- hand-engineered feature를 사용하여 pyramid 구성하기 때문에 다양한 scale에 대하여 학습
- 각 level에서 독립적으로 feature를 추출하여 object detection
- 하지만, 연산량과 시간 관점에서 비효율적이며 현실에 적용하기 어려움
Single feature map
- conv layer가 scale 변화에 robust하기 때문에, conv layer를 통해서 feature를 압축
- 하지만, multi-scale feature representation을 사용하지 않으므로 성능이 부족
Pyramidal feature hierarchy
- 서로 다른 scale의 feature map을 사용하여 multi-scale feature representation 추출
- 각 level에서 독립적으로 feature를 추출하여 object detection
- 하지만, 이미 계산되어 있는 higher-resolution map을 재사용하지 않음
Feature Pyramid Network
- top-down pathway과 lateral connection을 이용하여 저해상도와 고해상도를 묶음
- 각 level에서 독립적으로 feature를 추출하여 object detection
- multi-scale feature representation을 효율적으로 사용

Hand-engineered features and early neural networks
Deep ConvNet object detectors
Methods using multiple layers

4. Feature Pyramid Networks

fig1

4.1 Bottom-up pathway

Bottom-up pathway
- 위의 level로 올라갈수록 scale이 작아지며, semantic 정보를 압축하는 역할
- 각 level은 convolution block 혹은 res block으로 이루질 수 있음
- 각 level의 마지막 layer를 각각 $C_2$ , $C_3$ , $C_4$ , $C_5$ 라고 함

4.2 Top-down pathway and lateral connections

fig3

Top-down pathway
- upsampling을 통해서 고해상도의 semantic feature를 추출
- $C_2$ , $C_3$ , $C_4$ , $C_5$ 와 대응되는 layer를 각각 $P_2$ , $P_3$ , $P_4$ , $P_5$ 라고 함
lateral connection
- spatial 정보가 정확한 bottom-up feature와 semantic 정보가 정확한 top-down feature를 합치는 역할
- element-wise addition

5. Applications

5.1 Feature Pyramid Networks for RPN

fig2

RPN을 위하여 3x3 conv와 sibling 1x1 conv을 각 level에 추가 구성
- 즉, 각 level의 feature마다 bounding box regression과 label classification 사용
각각의 에 single scale anchor box를 사용
- 작은 feature map에는 작은 anchor box, 큰 feature map에는 큰 anchor box 적용
IoU threshold
- 0.7 이상 : positive
- 0.3 이하 : negative
sharing parameter
- sibling 1x1 conv의 paramter를 모든 FPN이 공유

5.2 Feature Pyramid Networks for Fast R-CNN

fig3

FPN을 통해 다양한 scale의 feature map 추출
가장 해상도가 높은 feature map을 이용하여 RPN 적용
각 region proposal을 적당한 크기의 feature map 에 적용
- $k$ = $\left \lfloor k_0 + log_2 (\sqrt{wh}/224) \right \rfloor$
- $(w,h)$ = width and height of RoI
- $224$ = size of input image
RoI pooling
bounding box regression and label classification

6. Experiments on Object Detection

MS COCO
pre-trained ResNet

6.1 Region Proposal with RPN

synchronized SGD
mini-batch = 2

6.1.1 Ablation Experiments

table1

6.1.1.1 Comparisons with baselines

RPN에 single layer를 하나 더 사용하는 것은 큰 효과가 없음
- coarser resolutions과 stronger semantics의 trade-off 때문
- 다양한 scale을 제공하는 FPN이 효율적

6.1.1.2 How important is top-down enrichment?

top-down pathway를 제거하고, bottom-up pathway에서 낮은 level의 feature map을 재사용
- level마다 semantic gap이 존재하기 때문에 큰 영향을 미치지 못함

6.1.1.3 How important are lateral connections?

lateral connection을 삭제
- down sampling과 up sampling을 여러 번 했기 때문에 location정보가 부정확함
- lateral connection이 있어야 bottom-up pathway의 위치정보가 잘 전달됨

6.1.1.4 How important are pyramid representations?

top-down pathway의 마지막 feature map으로만 예측
- baseline보다는 좋은 성능을 보여주지만, FPN보다는 안좋은 성능
- scale 변화에 덜 robust하기 때문
anchor box의 개수를 늘림
- anchor box의 종류와 크기를 다양하게 사용해도 성능의 큰 향상은 없음
- 즉, 단순히 anchor box를 늘리는 것만으로는 성능을 올리는 것이 힘듦

6.2 Object Detection with Fast/Faster R-CNN

6.2.1 Fast R-CNN (on fixed proposals)

table2

6.2.2 Faster R-CNN (on consistent proposals)

table3

table5

6.2.2.2 Running time

기존의 모델보다 느리기는 하지만, 큰 차이는 없음

6.2.3 Comparing with COCO Competition Winners

table4

7. Extensions: Segmentation Proposals

fig4

segmentation proposal을 만들기 위하여 FPN을 사용

7.1 Segmentation Proposal Results

table6

8. Conclusion

multi-scale representation을 사용하는 것이 성능에 좋다.

DevelopHyun

FPN[1] Feature Pyramid Networks for Object Detection(2016) - Review

1. Abstract

2. Introduction

4. Feature Pyramid Networks

4.1 Bottom-up pathway

4.2 Top-down pathway and lateral connections

5. Applications

5.1 Feature Pyramid Networks for RPN

5.2 Feature Pyramid Networks for Fast R-CNN

6. Experiments on Object Detection

6.1 Region Proposal with RPN

6.1.1 Ablation Experiments

6.1.1.1 Comparisons with baselines

6.1.1.2 How important is top-down enrichment?

6.1.1.3 How important are lateral connections?

6.1.1.4 How important are pyramid representations?

6.2 Object Detection with Fast/Faster R-CNN

6.2.1 Fast R-CNN (on fixed proposals)

6.2.2 Faster R-CNN (on consistent proposals)

6.2.2.2 Running time

6.2.3 Comparing with COCO Competition Winners

7. Extensions: Segmentation Proposals

7.1 Segmentation Proposal Results

8. Conclusion

9. Reference

Related Posts

DevelopHyun

1. Abstract

2. Introduction

3. Related Work

4. Feature Pyramid Networks

4.1 Bottom-up pathway

4.2 Top-down pathway and lateral connections

5. Applications

5.1 Feature Pyramid Networks for RPN

5.2 Feature Pyramid Networks for Fast R-CNN

6. Experiments on Object Detection

6.1 Region Proposal with RPN

6.1.1 Ablation Experiments

6.1.1.1 Comparisons with baselines

6.1.1.2 How important is top-down enrichment?

6.1.1.3 How important are lateral connections?

6.1.1.4 How important are pyramid representations?

6.2 Object Detection with Fast/Faster R-CNN

6.2.1 Fast R-CNN (on fixed proposals)

6.2.2 Faster R-CNN (on consistent proposals)

6.2.2.1 Sharing features

6.2.2.2 Running time

6.2.3 Comparing with COCO Competition Winners

7. Extensions: Segmentation Proposals

7.1 Segmentation Proposal Results

8. Conclusion

9. Reference

Related Posts